Bootstrapping Image Classification with Sample Evaluation
Sanjiban Choudhury, Ruta Desai, Venkatraman Narayanan
The Robotics Institute, Carnegie Mellon University
In this work, we look at the problem of multi-class image classification in a semi-supervised learning framework. Given a small set of labeled images and a much larger set of unlabeled images, we propose a bootstrapping-based semi-supervised learning method that uses independent and discriminating evaluators to overcome semantic drift. Results show the usefulness of such evaluators in learning difficult examples.
METHOD OVERVIEW
[Fig 1: performance vs. iteration number (every 1000 images) for bootstrapping with sample evaluation.]
Fig 1. Independent and discriminative evaluation continue to improve with new samples.
Experiment Setup
A small set of labeled seed images and a large pool of unlabeled images (drawn from the SUN database [2]) are used; batches of unlabeled images are fed to the bootstrapping procedure.
Bootstrapping
1. Learn an initial hypothesis from the labeled seed examples.
2. Classify unlabeled images using the current hypothesis.
3. Re-train the hypothesis using the self-labeled images.
4. Repeat from step 2.
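The loop above can be sketched in a few lines. The nearest-centroid classifier here is only a stand-in for the actual image classifier, and the function names and batching scheme are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Learn one centroid per class (a stand-in for the real image classifier)."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(model, X):
    """Assign each sample the label of its nearest class centroid."""
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def bootstrap(X_seed, y_seed, X_unlabeled, batch_size=2, n_iters=3):
    """Self-training loop: classify a batch of unlabeled data with the
    current hypothesis, add the self-labeled batch to the training set,
    and retrain (steps 1-4 above)."""
    X_train, y_train = X_seed.copy(), y_seed.copy()
    pool = X_unlabeled.copy()
    for _ in range(n_iters):
        if len(pool) == 0:
            break
        model = nearest_centroid_fit(X_train, y_train)   # (re)train hypothesis
        batch, pool = pool[:batch_size], pool[batch_size:]
        y_hat = nearest_centroid_predict(model, batch)   # self-label a batch
        X_train = np.vstack([X_train, batch])            # augment training set
        y_train = np.concatenate([y_train, y_hat])
    return nearest_centroid_fit(X_train, y_train)
```

With well-separated seed classes, the self-labeled batches pull the centroids toward the true class means over iterations.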
[Illustration: difficult-to-classify examples are evaluated before promotion. Iteration 1 (seed data): Apple, Apple, Apple, Berry, Berry, Berry (scene examples: church, mountain). Iteration 2: Apple, Apple, Apple, Apple, Berry, Berry. Iteration 1000: Apple, Apple, Apple, Apple, Apple, Apple — the Berry class has been drowned out by semantic drift.]
Semantic Drift Elimination
Popular techniques: coupled learning, mutual exclusion, co-training [1].
Our approach: independent evaluation, pairwise discrimination, hard negative promotion.
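The pairwise-discrimination check selects confusable class pairs from a confusion matrix computed on a validation set. A hypothetical sketch of that selection step (the function name and the symmetric ranking rule are assumptions, not the authors' code):

```python
import numpy as np

def confusable_pairs(conf_mat, top_k=2):
    """Return the top_k most confusable (i, j) class pairs, ranked by the
    symmetric off-diagonal mass conf_mat[i, j] + conf_mat[j, i].
    These pairs would each get a dedicated one-versus-one classifier."""
    n = conf_mat.shape[0]
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            pairs.append(((i, j), conf_mat[i, j] + conf_mat[j, i]))
    pairs.sort(key=lambda p: p[1], reverse=True)
    return [p[0] for p in pairs[:top_k]]
```

For example, with rows as true classes and columns as predictions, a matrix where classes 0 and 1 trade many errors while class 2 is clean yields `(0, 1)` as the top pair.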
Sample Evaluations
[Diagram: evaluated samples (+, +, +, −) are used to retrain the classifier; illustrated on kitchen vs. living room scenes.]
SUMMARY
To suppress semantic drift, the data to be promoted is subjected to the following evaluations:
1. Hard negative promotion: only unlabeled images that are classified with low confidence are promoted.
2. Independent evaluation: an independent classifier is trained on a different view (a different feature set) of the data; only unlabeled instances on which the main and independent classifiers agree are promoted.
3. Enforcing discriminative constraints: one-versus-one SVM classifiers are trained to discriminate between pairs of confusable classes; these pairs are determined from the confusion matrix computed on a validation set.
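A minimal sketch of how evaluations 1 and 2 above combine into a promotion filter. The confidence threshold and the exact agreement rule are illustrative assumptions, not the authors' settings:

```python
import numpy as np

def promote(main_probs, indep_preds, conf_threshold=0.7):
    """Select which self-labeled samples to promote into the training set.

    main_probs : (n, k) class probabilities from the main classifier
    indep_preds: (n,) labels from an independent classifier (different view)

    A sample is promoted only if
      * the two classifiers agree on its label (independent evaluation), and
      * the main classifier's confidence is below conf_threshold, so that
        'hard' examples rather than easy ones drive the next round of
        training (hard negative promotion).
    Returns the promoted indices and the self-labels for all samples.
    """
    main_preds = main_probs.argmax(axis=1)
    confidence = main_probs.max(axis=1)
    agree = main_preds == indep_preds
    hard = confidence < conf_threshold
    return np.flatnonzero(agree & hard), main_preds
```

An easy, confidently classified sample is filtered out even when both classifiers agree, while a low-confidence sample survives only if the independent view confirms its label.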
- Hard negative promotion: above baseline
- Independent evaluator: combines multiple views
- Discriminating evaluator: eliminates pairwise confusion
- Independent + discriminative: combines both evaluations

REFERENCES
[1] A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory, pages 92–100. ACM, 1998.
[2] J. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba. SUN database: Large-scale scene recognition from abbey to zoo. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3485–3492. IEEE, 2010.
[3] A. Shrivastava, S. Singh, and A. Gupta. Constrained semi-supervised learning using attributes and comparative attributes. In Computer Vision – ECCV 2012, pages 369–383, 2012.