Experiments with Semantic Similarity Measures based on LDA and LSA

Experiments with Semantic Similarity Measures based on LDA and LSA
Nobal Niraula, Rajendra Banjade, Dan Stefanescu, and Vasile Rus
[email protected]
The University of Memphis

Semantic Similarity • A practical approach to language understanding • Basic idea – Compare a target text with a benchmark/expert text whose meaning is known

• Semantic similarity vs. true understanding – True understanding is intractable and scales poorly

Current Study • How does Latent Dirichlet Allocation (LDA) compare to Latent Semantic Analysis (LSA)? • LDA – Probabilistic method – Latent topics – Blei, D.M., Ng, A.Y., & Jordan, M.I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022. • LSA – Algebraic method – Latent concepts – Landauer, T., McNamara, D.S., Dennis, S., & Kintsch, W. (2007). Handbook of Latent Semantic Analysis. Mahwah, NJ: Erlbaum.

Why LDA and LSA? • We have used LSA for more than a decade • LDA and LSA are fully automated – All you need is a large collection of texts

Practical purpose • To promote Deep Learning … – By humans

• Understand student utterances in dialogue-based Intelligent Tutoring Systems

DeepTutor

www.deeptutor.org

DeepTutor • Promoting deep learning of science topics through quality instruction and interaction – Deep Natural Language Processing • For now, just advanced semantic similarity methods • In the long run: deep, true understanding, e.g. using semantic nets

– Funded by the Institute of Education Sciences

• Started in 2009 • Dialogue-based • First Intelligent Tutoring System based on Learning Progressions

DeepTutor • Topic: Conceptual Physics • Target Population: – High school students (335 students; d=0.908; online unsupervised learning) – Non-science majors in college (30 students; d=0.786)

Semantic Similarity: LSA vs. LDA • LSA vs. LDA (vs. ESA vs. etc.) – Fully automated – LSA: Latent concepts • Concept = abstract dimension in a reduced-dimensionality abstract space • Word = vector/point in the latent space – All senses of a word map to the same vector

– LDA: latent topics • Topic = distribution over words • Word contributes to many topics – A topic may encode a particular word sense

LDA: Latent Dirichlet Allocation


Blei, D.M., Ng, A.Y., & Jordan, M.I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022.

Semantic Similarity between Short Texts • From Microsoft Research Paraphrase corpus (Dolan, Quirk, & Brockett, 2004) – Text A: York had no problem with MTA’s insisting the decision to shift funds had been within its legal rights. – Text B: York had no problem with MTA’s saying the decision to shift funds was within its powers.

• ITS data: – Student Response: An object that has a zero force acting on it will have zero acceleration. – Expert Answer: If an object moves with a constant velocity, the net force on the object is zero.

W2W versus T2T Similarity Measures • W2W (word-to-word) – LSA: cosine between corresponding vectors – LDA: similarity of word contributions to all topics T in the LDA model
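To make the two W2W measures concrete, here is a minimal Python sketch (not the authors' implementation). It assumes the LSA word vectors and the per-word contributions over LDA topics have already been extracted from trained models as NumPy arrays; the function names and the use of cosine for the LDA case are illustrative choices.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def lsa_w2w(word_vec_a, word_vec_b):
    """LSA word-to-word similarity: cosine of the two words' LSA vectors."""
    return cosine(word_vec_a, word_vec_b)

def lda_w2w(topic_contrib_a, topic_contrib_b):
    """LDA word-to-word similarity: compare the two words' contributions
    across all topics; cosine is used here as one illustrative choice."""
    return cosine(topic_contrib_a, topic_contrib_b)
```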

• T2T (text-to-text) – Direct methods • LSA – Step 1: obtain an LSA vector for the text by summing individual words’ vectors – Step 2: cosine between text vectors
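A minimal sketch of the direct LSA text-to-text method just described, assuming a `lsa_vectors` lookup (word to NumPy vector) built from the trained LSA space; out-of-vocabulary words are simply skipped, and the `dims` parameter is an assumption matching the 300-dimensional space used later.

```python
import numpy as np

def lsa_t2t(words_a, words_b, lsa_vectors, dims=300):
    """Direct LSA T2T: sum the words' LSA vectors, then take the cosine."""
    vec_a = sum((lsa_vectors[w] for w in words_a if w in lsa_vectors), np.zeros(dims))
    vec_b = sum((lsa_vectors[w] for w in words_b if w in lsa_vectors), np.zeros(dims))
    denom = np.linalg.norm(vec_a) * np.linalg.norm(vec_b)
    return float(np.dot(vec_a, vec_b) / denom) if denom > 0 else 0.0
```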

• LDA – Step 1: Similarity of documents (distributions over topics) – Step 2: Similarity of topics (distributions over words) – Step 3: Take the product
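As one concrete instance of the direct LDA comparison (corresponding to the LDA-Hellinger variant in the results), here is a sketch that compares the two texts' inferred distributions over topics using the Hellinger distance. It covers only the document-level comparison (step 1); the full method on the slide also folds in a topic-level (word-distribution) similarity and takes the product, which is omitted here.

```python
import numpy as np

def hellinger_similarity(p, q):
    """1 minus the Hellinger distance between two probability distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return 1.0 - np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def lda_t2t_hellinger(doc_topics_a, doc_topics_b):
    """Direct LDA T2T: similarity of the texts' topic distributions (step 1)."""
    return float(hellinger_similarity(doc_topics_a, doc_topics_b))
```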

– Expanding W2W measures • Greedy • Optimal (Hungarian algorithm or Kuhn-Munkres algorithm; Munkres, 1957)
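Below is a sketch of how a W2W measure can be expanded to full texts. `w2w` stands for any word-to-word similarity (such as the LSA or LDA versions sketched above); the optimal variant relies on SciPy's implementation of the Hungarian (Kuhn-Munkres) algorithm.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def greedy_t2t(words_a, words_b, w2w):
    """Greedy expansion: each word in A is matched to its most similar word in B."""
    return float(np.mean([max(w2w(a, b) for b in words_b) for a in words_a]))

def optimal_t2t(words_a, words_b, w2w):
    """Optimal expansion: one-to-one pairing maximizing total similarity
    (Hungarian / Kuhn-Munkres algorithm)."""
    sim = np.array([[w2w(a, b) for b in words_b] for a in words_a])
    rows, cols = linear_sum_assignment(-sim)  # negate to maximize
    return float(sim[rows, cols].mean())
```

The greedy variant is asymmetric (it aligns A onto B); in practice it can be symmetrized by averaging the two directions.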

Experiments • Several LSA and LDA models • W2W and T2T methods • 2 datasets – Microsoft Research Paraphrase corpus • Sentences extracted from news articles

– User Language Paraphrase corpus • Sentences from student-computer tutor interactions

LDA vs. LSA Models • LSA model – Typically derived using N=300 latent dimensions/concepts
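As a rough sketch of how such an LSA space is typically derived (not the specific space used in this work): build a weighted term-document matrix from a large corpus and keep the top N=300 singular vectors. The random matrix below is only a placeholder for that corpus matrix.

```python
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Placeholder term-document matrix; in practice it is built (and weighted,
# e.g. with log-entropy or tf-idf) from a large text collection.
term_doc = sparse_random(5000, 2000, density=0.01, format="csr")

# Truncated SVD: keep the 300 largest singular values/vectors.
U, S, Vt = svds(term_doc, k=300)

# Each row of U * S is a word's 300-dimensional LSA vector.
word_vectors = U * S
```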

• LDA model – N=300 topics; to compare with same number of latent concepts used in LSA – Optimized number of topics (N=100) • Maximizing topic coherence based on Point-wise Mutual Information (PMI) derived from Wikipedia
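A minimal sketch of the PMI-based coherence used to pick the number of topics; the pair counts, word counts, and window total are assumed to come from sliding-window co-occurrence statistics over Wikipedia (the counting itself is not shown).

```python
import itertools
import math

def topic_coherence_pmi(top_words, pair_count, word_count, total_windows):
    """Average PMI over all pairs of a topic's top words.

    pair_count(w1, w2) and word_count(w) return window counts from the
    reference corpus (here assumed to be Wikipedia)."""
    pmis = []
    for w1, w2 in itertools.combinations(top_words, 2):
        p_xy = pair_count(w1, w2) / total_windows
        p_x = word_count(w1) / total_windows
        p_y = word_count(w2) / total_windows
        if p_xy > 0:
            pmis.append(math.log(p_xy / (p_x * p_y)))
    return sum(pmis) / len(pmis) if pmis else 0.0

# Model selection: train LDA for several values of N and keep the N whose
# topics have the highest average coherence.
```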

Results: Accuracy/Kappa/F-measure
(MSRP = Microsoft Research Paraphrase corpus, ULPC = User Language Paraphrase corpus; T = number of LDA topics)

Method          MSRP (T=300)        MSRP (T=100)        ULPC (T=300)        ULPC (T=100)
LDA-IR          71.17/16.17/81.94   68.24/3.09/80.92    67.47/4.52/79.87    67.01/3.15/79.98
LDA-Hellinger   71.32/18.85/81.75   68.24/2.46/80.99    67.36/4.39/79.73    67.18/3.50/80.04
LDA-Manhattan   71.07/10.10/82.50   71.21/23.41/81.16   66.78/3.56/79.91    67.18/4.04/80.04
LDA-Greedy      77.32/34.40/85.75   76.85/37.89/84.94   73.04/35.01/81.31   73.10/34.27/81.32
LDA-Optimal     76.97/36.96/85.06   75.96/36.75/84.14   73.27/36.74/80.71   73.15/36.86/80.71
LSA-Greedy      77.22/33.82/85.73   same                72.86/33.89/81.11   same
LSA-Optimal     77.12/36.80/85.24   same                73.04/35.95/80.80   same
LSA             77.47/37.54/85.50   same                73.56/34.61/81.83   same

(LSA does not use LDA topics, so its results are identical for T=300 and T=100.)

Conclusions • W2W LDA-based measures perform well and are comparable to LSA-based measures • T2T LDA-based measures have limitations for short texts

SEMILAR: A Semantic Similarity Toolkit • It implements a number of semantic similarity methods:

– Preprocessing – Overlap (n-gram) – WordNet-based similarity measures – LDA – LSA – Syntax – Negation – Etc.

• Library available for download • GUI-based application to be released soon • DEMO at ACL 2013

www.semanticsimilarity.org

Acknowledgments • This research was supported in part by the Institute of Education Sciences under award R305A100875

Thank You!

www.deeptutor.org