Multiple-Instance Learning With Generalized Support Vector Machines

Stuart Andrews
Department of Computer Science, Brown University
www.cs.brown.edu/~stu

Joint work with:

Ioannis Tsochantaridis & Thomas Hofmann


Brown University 05/12/2005

Multiple Instance Learning (MIL)

• Informal definition: "Classification problem where labels are not directly associated with patterns, but with sets of patterns."
  § Incomplete association (ambiguity) between patterns and labels
  § Less information than in supervised classification, more than in unsupervised learning: semi-supervised learning

[Figure: training sample of four bags of patterns, two labeled −1 and two labeled +1]


Multiple Instance Learning (MIL)

• Semantics: a set or bag is a member of a concept if it contains at least one pattern that is a member
• Asymmetry: a single positive pattern makes a bag positive; negative bags contain only negative patterns

[Figure: the true pattern-level discriminant separating + and − regions, shown with the training sample of four bags]


Applications of MIL

• Drug design:
  § Predict the efficacy of a drug; difficulty: different conformations of the same molecule.
  § Bag = molecule, pattern = conformation

• Content-based image retrieval:
  § Image annotations reference objects in the image, but are typically not associated with particular parts of the image.
  § Bag = image, pattern = region or blob

• Text categorization:
  § Documents may be categorized/filtered based on a relevant passage.
  § Bag = document, pattern = passage or paragraph


Applications of MIL: Drug Design

§ At any time, a molecule may be in one of many different low-energy conformations, or shapes
§ Shape plays an important role in chemical reactions; however, such events are difficult to observe or measure


§ Classification by shape can help predict chemical reactions


Applications of MIL: Drug Design

• Approach: Estimate molecule conformations and learn from experimentally tested molecules

Training data: $\{(B_i, \pm 1)\}_i$, where $B_i$ is the bag of conformations of the $i$-th molecule


Applications of MIL: Content-Based Image Retrieval

§ From text to images
§ Support of a closed vocabulary (at least several thousand keywords)
§ Ultimately: richer types of text queries ("a tiger next to a tree", "a tiger looking to the right", "a lazy tiger", …)

[Figure: example image annotated with the keyword "tiger"]


Applications of MIL: Content-Based Image Retrieval

• Approach: Automatic image indexing/classification starting from a seed set of annotated images

[Figure: a hand-labeled training set of "Tigers" and images annotated automatically]

• Ultimately: using only annotations from the text "surrounding" an image, for example on the WWW

Applications of MIL: Text Categorization

[Figure: "Multiple Instance Learning" / "Kernel Methods" / "Support Vector Machines"]


MIL – Previous Work

• Dietterich, Lathrop & Lozano-Perez 1997
  § Concepts modeled by axis-parallel rectangles (APR)
  § Explicit feature selection
  § Good accuracy on drug prediction, but a "custom-built" solution

• Maron & Lozano-Perez 1999
  § Diverse density (DD): circular region in pattern space close to examples from many positive bags and far from most negative bags

• Zhang & Goldman 2002
  § EM-DD: efficient extension of DD that searches for a circular region after nominating examples to represent each bag

• Gärtner, Flach, Kowalczyk & Smola 2002
  § MI-Kernels: kernels defined at the bag level, used to separate bags

• Our Contributions:
  § Generalize Support Vector Machine learning to MIL

MIL Definitions

Input patterns $x_1, \ldots, x_n \in \mathbb{R}^d$ are grouped into bags $B_1, \ldots, B_m$, with $B_I = \{x_i : i \in I\}$ for given index sets $I \subseteq \{1, \ldots, n\}$.

Labels $Y_I \in \{-1, +1\}$ are associated with bags $(B_I, Y_I)$.

True pattern-level labels $y_i$ are only indirectly accessible through the bag labels and the constraint $Y_I = \max_{i \in I} y_i$.

The goal is to induce a classifier $f : \mathcal{X} \to \mathcal{Y}$. A function $f : \mathcal{X} \to \mathbb{R}$ is called MI-separating w.r.t. a multiple-instance data set if $\operatorname{sgn} \max_{i \in I} f(x_i) = Y_I$ for all bags $B_I$.
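As a concrete illustration of these definitions, here is a minimal sketch, assuming NumPy and a hypothetical scoring function f, of how bag labels relate to pattern labels and how MI-separation would be checked:

```python
import numpy as np

def bag_label_from_pattern_labels(y):
    """A bag is positive iff it contains a positive pattern: Y_I = max_i y_i."""
    return int(np.max(y))

def is_mi_separating(f, bags, bag_labels):
    """Check sgn(max_i f(x_i)) == Y_I for every bag B_I.

    f          : callable mapping an (n_I, d) array of patterns to n_I real scores
    bags       : list of (n_I, d) arrays, one per bag
    bag_labels : list of +1 / -1 bag labels
    """
    for X_I, Y_I in zip(bags, bag_labels):
        if np.sign(np.max(f(X_I))) != Y_I:
            return False
    return True
```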


Max. Pattern Margin Formulation

• Standard SVM formulation with additional constraints on the unknown labels $y_i$ (a mixed-integer problem)
  § Given an assignment to the unknown variables, the margin of every instance influences the solution


Max. Pattern Margin Formulation

• Primal form: joint optimization over labels and hyperplane:

mi-SVM:
$$\min_{\{y_i\}} \; \min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_i \xi_i$$

subject to the slack variable constraints
$$y_i \left( \langle w, x_i \rangle + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0 \qquad \forall i,$$

the bag constraints
$$Y_I = \max_{i \in I} y_i \qquad \forall I,$$

and the unknown integer variables
$$y_i \in \{-1, +1\}.$$


Max. Bag Margin Formulation

• Alternative bag-centered formulation
• Define the functional margin of a bag as:

$$\gamma_I = Y_I \max_{i \in I} \left( \langle w, x_i \rangle + b \right)$$

• Only the most positive instance, the witness $x_{s(I)}$, matters in determining the bag margin (see the sketch below)
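A minimal sketch of this definition, assuming NumPy and a linear discriminant with hypothetical weight vector w and offset b:

```python
import numpy as np

def bag_margin_and_witness(X_I, Y_I, w, b):
    """Functional margin of a bag and the index of its witness.

    X_I : (n_I, d) array of the patterns in bag B_I
    Y_I : bag label, +1 or -1
    Returns (gamma_I, s_I), where gamma_I = Y_I * max_i (<w, x_i> + b)
    and s_I indexes the most positive instance (the witness).
    """
    scores = X_I @ w + b          # <w, x_i> + b for each pattern in the bag
    s_I = int(np.argmax(scores))  # witness: the most positive instance
    return Y_I * scores[s_I], s_I
```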


Max. Bag Margin Formulation

• Primal form: joint optimization over bag witnesses and hyperplane:

MI-SVM:
$$\min_{\{s(I)\}} \; \min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_I \xi_I$$

subject to the bag and slack variable constraints
$$Y_I \max_{i \in I} \left( \langle w, x_i \rangle + b \right) \ge 1 - \xi_I, \qquad \xi_I \ge 0 \qquad \forall I,$$

rewritten for negative bags as
$$-\langle w, x_i \rangle - b \ge 1 - \xi_I \qquad \forall I, \; i \in I,$$

and for positive bags as
$$\langle w, x_{s(I)} \rangle + b \ge 1 - \xi_I \qquad \forall I.$$


MIL-SVM Optimization

• Solve approximately using the following general alternating optimization scheme (a sketch follows below):

• Loop until the integer variables have converged:
  § For given integer variables (labels or bag witnesses), solve the SVM-QP to find the optimal discriminant
  § For the given discriminant, update the integer variables in a way that (locally) minimizes the objective

• The problems may be relaxed and an EM-like update applied
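As an illustration of this heuristic, here is a minimal sketch of the pattern-level (mi-SVM-style) variant, assuming NumPy and scikit-learn's SVC; the data layout and the function name mi_svm are my own illustration, not the authors' code:

```python
import numpy as np
from sklearn.svm import SVC

def mi_svm(bags, bag_labels, C=1.0, max_iter=20):
    """Alternating-optimization heuristic in the spirit of mi-SVM (a sketch).

    bags       : list of (n_I, d) arrays of patterns
    bag_labels : sequence of +1 / -1 bag labels
    """
    X = np.vstack(bags)
    bag_idx = np.concatenate([np.full(len(B), I) for I, B in enumerate(bags)])
    in_neg_bag = np.concatenate([np.full(len(B), Y == -1) for B, Y in zip(bags, bag_labels)])
    # Initialize pattern labels with the labels of their bags.
    y = np.where(in_neg_bag, -1, 1)

    clf = None
    for _ in range(max_iter):
        clf = SVC(kernel="linear", C=C).fit(X, y)    # solve the SVM-QP for fixed labels
        scores = clf.decision_function(X)
        y_new = np.where(scores >= 0, 1, -1)
        y_new[in_neg_bag] = -1                       # negative bags contain only negative patterns
        for I, Y_I in enumerate(bag_labels):         # every positive bag needs at least one positive pattern
            members = np.where(bag_idx == I)[0]
            if Y_I == 1 and not np.any(y_new[members] == 1):
                y_new[members[np.argmax(scores[members])]] = 1
        if np.array_equal(y_new, y):                 # integer variables converged
            break
        y = y_new
    return clf
```

The MI-SVM variant follows the same loop, but the integer variables are the witnesses s(I): train on the current witnesses plus all patterns from negative bags, then re-select each positive bag's witness as its highest-scoring pattern.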


Synthetic Data Set in 2D (One)


Synthetic Data Set in 2D (Two)


Results on Synthetic Data Sets

• Accuracy at the bag level (training) and pattern level (testing):

[Bar chart: bag and pattern accuracy of SVM, SVM-MIL, and SVM* on Example One and Example Two]

MUSK Molecule Representation

• Molecule
  § A molecular sample may be thought of as a bag containing multiple conformations of the molecule

• Representation
  § Individual conformations are described by surface shape descriptors

• Datasets
  § MUSK1: 92 bags, 476 instances, 166 features
  § MUSK2: 102 bags, 6600 instances, 166 features
  § Molecules labeled as having a musky odor, or not
  § By Dietterich, Lathrop and Lozano-Perez

Results on MUSK Data Set (≅ 100 bags, ≅ 476/6600 instances, 166 features)

[Bar chart: accuracy on MUSK 1 and MUSK 2 for EM-DD, DD, MI-NN, IAPR, mi-SVM, and MI-SVM]

Blobworld Image Representation

• Bags
  § Each image is treated as a bag of segments
  § Segments are obtained by clustering color and texture attributes using a Gaussian mixture model (an illustrative sketch follows below)
  § Used Blobworld code from the group at UC Berkeley [Carson, Belongie, Greenspan, Malik '99]

• Sub-instance representation
  § Segments represented by texture, color and shape features

• Dataset
  § Corel Photo 1M database CD 4 (1000 photos)
  § 200 bags, 1300 instances, 230 features
  § Images annotated by categories ("elephant", "fox", "tiger")
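Not the Blobworld code itself, just a minimal sketch of the clustering step described above, assuming scikit-learn's GaussianMixture and a hypothetical per-pixel feature array:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def segment_image(pixel_features, image_shape, n_segments=5):
    """Cluster per-pixel color/texture features with a Gaussian mixture model.

    pixel_features : (H*W, d) array, one color/texture feature vector per pixel
    image_shape    : (H, W) of the original image
    Returns an (H, W) map of segment indices; each segment then becomes one
    instance (blob) in the image's bag.
    """
    gmm = GaussianMixture(n_components=n_segments, covariance_type="full", random_state=0)
    labels = gmm.fit_predict(pixel_features)
    return labels.reshape(image_shape)
```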

Examples: Blobworld Representation

[Figure: a "TIGER" image segmented into regions, each segment mapped to a feature vector $x_1, x_2, \ldots$]

• Color histogram
• Texture features
• Region shape descriptors

More examples …

[Figures: segmented example images labeled "SNOW", "WOLF", and "ELEPHANT"]


Automatic Image Annotation (≅ 200 bags, ≅ 1300 instances, ≅ 230 features)

[Bar chart: accuracy on the Elephant, Fox, and Tiger categories for EM-DD and for mi-SVM / MI-SVM with linear, polynomial, and RBF kernels]

MEDLINE Document Representation

• Bags
  § Documents are treated as bags of consecutive, overlapping 50-word passages (windows); see the sketch below

• Sub-instances
  § Term frequency vectors encode each window

• Dataset
  § TREC9/OHSUMED data set
  § 400 bags, 3300 instances, 6800 features
  § Documents are annotated with medical subject heading (MeSH) categories (we tested the first 7 pre-test categories, each containing 100 positive bags)
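A minimal sketch of this bag construction, assuming scikit-learn's CountVectorizer; the stride of 25 words is an illustrative guess, since the slide only states 50-word overlapping windows:

```python
from sklearn.feature_extraction.text import CountVectorizer

def document_to_bag(text, vectorizer, window=50, stride=25):
    """Split a document into overlapping 50-word passages and encode each as a term-frequency vector."""
    words = text.split()
    passages = [" ".join(words[i:i + window])
                for i in range(0, max(len(words) - window, 0) + 1, stride)]
    return vectorizer.transform(passages)   # one instance (row) per window

# Usage: fit the vocabulary on the full corpus, then build one bag per document.
# vectorizer = CountVectorizer().fit(corpus)
# bags = [document_to_bag(doc, vectorizer) for doc in corpus]
```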


Text Categorization (≅ 400 bags, ≅ 3300 instances, ≅ 6800 features)

[Bar chart: accuracy on categories TST1, 2, 3, 4, 7, 9, and 10 for EM-DD and for mi-SVM / MI-SVM with linear, polynomial, and RBF kernels]

Contributions

• Presented novel maximum-margin formulations of MIL (pattern & bag)
• Generalized the SVM, thereby making kernel methods available for MIL
• Outlined EM-like optimization heuristics
• Created new and varied MIL datasets

Future Work

• Simulated / deterministic annealing
• Larger problems
• Rigorous testing and evaluation
• CBIR application


Conclusions

• Competitive results across a wide range of problems show the promise of maximum-margin formulations
• Modified optimization techniques should improve classification accuracy
• Learning is still possible in domains where labeled data is difficult or impossible to obtain