ACL Wiki - User contributions [en]

RG-65 Test Collection (State of the art)

2015-07-24T11:52:06Z

Taher:

* state of the art on the Rubenstein & Goodenough (RG-65) dataset
* 65 word pairs
* the similarity of each pair is scored on a scale from 0 to 4 (the higher the "similarity of meaning", the higher the number)
* the similarity values in the dataset are the means of judgments made by 51 subjects (Rubenstein and Goodenough, 1965)
* see also: [[Similarity (State of the art)]]

== Table of results ==

* '''Listed in order of decreasing [http://en.wikipedia.org/wiki/Spearman_rank_correlation Spearman's rho].'''

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for reported results
! Type
! [http://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient Spearman correlation] (ρ)
! [http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient Pearson correlation] (r)
|-
| ADW
| Pilehvar and Navigli (2015)
| Pilehvar and Navigli (2015)
| Knowledge-based (Wiktionary)
| 0.920
| 0.910
|-
| Y&Q
| Yih and Qazvinian (2012)
| Yih and Qazvinian (2012)
| Hybrid
| 0.890
| -
|-
| NASARI
| Camacho-Collados et al. (2015)
| Camacho-Collados et al. (2015)
| Hybrid
| 0.880
| 0.910
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| Knowledge-based (WordNet)
| 0.868
| 0.810
|-
| PPR
| Hughes and Ramage (2007)
| Hughes and Ramage (2007)
| Knowledge-based
| 0.838
| -
|-
| SSA
| Hassan and Mihalcea (2011)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.833
| 0.861
|-
| PPR
| Agirre et al. (2009)
| Agirre et al. (2009)
| Knowledge-based
| 0.830
| -
|-
| H&S
| Hirst and St-Onge (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.813
| 0.732
|-
| Roget
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.818
|-
| J&C
| Jiang and Conrath (1997)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.731
|-
| WNE
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.801
| 0.787
|-
| L&C
| Leacock and Chodorow (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.797
| 0.852
|-
| Lin
| Lin (1998)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.788
| 0.834
|-
| ESA*
| Gabrilovich and Markovitch (2007)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.749
| 0.716
|-
| SOCPMI*
| Islam and Inkpen (2006)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.741
| 0.729
|-
| Resnik
| Resnik (1995)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.731
| 0.800
|-
| WLM
| Milne and Witten (2008)
| Milne and Witten (2008)
| Knowledge-based
| 0.640
| -
|-
| LSA*
| Landauer et al. (1997)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.609
| 0.644
|-
| WikiRelate
| Strube and Ponzetto (2006)
| Strube and Ponzetto (2006)
| Knowledge-based
| -
| 0.530
|}

Note: values reported by Hassan and Mihalcea (2011) are "based on the collected raw data from the respective authors"; algorithms marked with an asterisk (*) are re-implementations.
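The two correlation columns can be reproduced in outline as follows. This is a minimal Python sketch, not the evaluation script of any paper cited here: the function names and the toy scores are illustrative assumptions, and the real RG-65 judgments would have to be loaded from the dataset itself.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation (r) between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy numbers for illustration only (NOT taken from RG-65).
human_scores = [0.1, 1.5, 2.9, 3.9]       # mean human judgments on the 0-4 scale
system_scores = [0.05, 0.20, 0.60, 0.95]  # output of some similarity measure

print("rho =", spearman(human_scores, system_scores))
print("r   =", pearson(human_scores, system_scores))
```

Because Spearman's rho depends only on the ordering of the pairs, a system can rank the pairs perfectly and still have a Pearson r below 1 if its scores are not linearly related to the human means.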

== References ==

* '''Listed alphabetically.'''

Agirre, Eneko, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca, Aitor Soroa: [http://www.aclweb.org/anthology/N09-1003 A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches]. HLT-NAACL 2009: 19-27

Camacho-Collados, José, Pilehvar, Mohammad Taher, and Navigli, Roberto: [http://aclweb.org/anthology/N/N15/N15-1059.pdf NASARI: a Novel Approach to a Semantically-Aware Representation of Items]. NAACL 2015, pp. 567-577, Denver, USA.

Gabrilovich, Evgeniy, and Shaul Markovitch, [http://www.cs.technion.ac.il/~gabr/papers/ijcai-2007-sim.pdf Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis], Proceedings of The 20th International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India, 2007.

Hassan, Samer, and Rada Mihalcea: [http://www.cse.unt.edu/~rada/papers/hassan.aaai11.pdf Semantic Relatedness Using Salient Semantic Analysis]. AAAI 2011.

Hirst, Graeme and David St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 305–332, 1998.

Hughes, Thad, Daniel Ramage, Lexical Semantic Relatedness with Random Graph Walks. EMNLP-CoNLL 2007: 581-589.

Islam, A., and Inkpen, D. 2006. [http://www.site.uottawa.ca/~mdislam/publications/LREC_06_242.pdf Second order co-occurrence pmi for determining the semantic similarity of words]. Proceedings of the International Conference on Language Resources and Evaluation (LREC 2006) 1033–1038.

Jarmasz, M. 2003. [http://www.arxiv.org/pdf/1204.0140 Roget’s thesaurus as a Lexical Resource for Natural Language Processing]. Ph.D. Dissertation, Ottawa Carleton Institute for Computer Science, School of Information Technology and Engineering, University of Ottawa.

Jiang, Jay J. and David W. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics (ROCLING X), Taiwan, pages 19–33, 1997.

Landauer, T. K.; Laham, D.; Rehder, B.; and Schreiner, M. E. 1997. How well can passage meaning be derived without using word order? A comparison of latent semantic analysis and humans.

Leacock, Claudia and Martin Chodorow. Combining local context and WordNet similarity for word sense identification. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 265–283, 1998.

Lin, Dekang. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, pages 296–304, 1998.

Milne, David, and Ian H. Witten, An Effective, Low-Cost Measure of Semantic Relatedness Obtained from Wikipedia Links, In Proceedings of AAAI 2008.

Pilehvar, M.T., Jurgens, D. and Navigli, R. [http://wwwusers.di.uniroma1.it/~navigli/pubs/ACL_2013_Pilehvar_Jurgens_Navigli.pdf Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity]. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 4-9, 2013, pp. 1341-1351.

Pilehvar, M.T. and Navigli, R. [http://www.sciencedirect.com/science/article/pii/S000437021500106X From Senses to Texts: An All-in-one Graph-based Approach for Measuring Semantic Similarity]. Artificial Intelligence, Elsevier, 2015.

Resnik, Philip. Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal, Canada, 1995.

Rubenstein, Herbert, and John B. Goodenough. Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633, 1965.

Strube, Michael, Simone Paolo Ponzetto: WikiRelate! Computing Semantic Relatedness Using Wikipedia. AAAI 2006: 1419-1424

Yih, W. and Qazvinian, V. (2012). [http://aclweb.org/anthology/N/N12/N12-1077.pdf Measuring Word Relatedness Using Heterogeneous Vector Space Models]. Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2012).

[[Category:State of the art]]
[[Category:Similarity]]

TOEFL Synonym Questions (State of the art)

2014-02-13T11:44:33Z

Taher: /* References */

* TOEFL = Test of English as a Foreign Language
* 80 multiple-choice synonym questions; 4 choices per question
* the TOEFL questions are available on request by contacting [http://lsa.colorado.edu/mail_sub.html LSA Support at CU Boulder], the people who manage the [http://lsa.colorado.edu/ LSA web site at Colorado]
* introduced in Landauer and Dumais (1997) as a way of evaluating algorithms for measuring degree of similarity between words
* subsequently used by many other researchers
* see also: [[Similarity (State of the art)]]

== Sample question ==

::{| border="0" cellpadding="1" cellspacing="1"
|-
! Stem:
|
| levied
|-
! Choices:
| (a)
| imposed
|-
|
| (b)
| believed
|-
|
| (c)
| requested
|-
|
| (d)
| correlated
|-
! Solution:
| (a)
| imposed
|-
|}
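A similarity measure is typically applied to a question of this form by scoring each choice against the stem and picking the highest-scoring one. The Python sketch below illustrates that procedure only; the function names and the similarity scores are made up for the example and do not come from any system in the results table.

```python
def answer_question(stem, choices, similarity):
    """Return the choice that the similarity function rates closest to the stem."""
    return max(choices, key=lambda choice: similarity(stem, choice))

def accuracy(questions, similarity):
    """Fraction of questions for which the top-scoring choice is the solution."""
    correct = sum(
        answer_question(q["stem"], q["choices"], similarity) == q["solution"]
        for q in questions
    )
    return correct / len(questions)

# Hypothetical similarity scores, invented for illustration.
TOY_SCORES = {
    ("levied", "imposed"): 0.81,
    ("levied", "believed"): 0.12,
    ("levied", "requested"): 0.35,
    ("levied", "correlated"): 0.07,
}

def toy_similarity(word_a, word_b):
    """Look up a made-up score; unknown pairs get 0.0."""
    return TOY_SCORES.get((word_a, word_b), 0.0)

sample_question = {
    "stem": "levied",
    "choices": ["imposed", "believed", "requested", "correlated"],
    "solution": "imposed",
}

print(answer_question("levied", sample_question["choices"], toy_similarity))
print(accuracy([sample_question], toy_similarity))
```

Any real measure (corpus-based, lexicon-based, or hybrid) would take the place of <code>toy_similarity</code>; the rest of the procedure stays the same.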

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for experiment
! Type
! Correct
! 95% confidence
|-
| RES
| Resnik (1995)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 20.31%
| 12.89–31.83%
|-
| LC
| Leacock and Chodorow (1998)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 21.88%
| 13.91–33.21%
|-
| LIN
| Lin (1998)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 24.06%
| 15.99–35.94%
|-
| Random
| Random guessing
| 1 / 4 = 25.00%
| Random
| 25.00%
| 15.99–35.94%
|-
| JC
| Jiang and Conrath (1997)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 25.00%
| 15.99–35.94%
|-
| LSA
| Landauer and Dumais (1997)
| Landauer and Dumais (1997)
| Corpus-based
| 64.38%
| 52.90–74.80%
|-
| Human
| Average US college applicant from a non-English-speaking country
| Landauer and Dumais (1997)
| Human
| 64.50%
| 53.01–74.88%
|-
| RI
| Karlgren and Sahlgren (2001)
| Karlgren and Sahlgren (2001)
| Corpus-based
| 72.50%
| 61.38–81.90%
|-
| DS
| Pado and Lapata (2007)
| Pado and Lapata (2007)
| Corpus-based
| 73.00%
| 62.72–82.96%
|-
| PMI-IR
| Turney (2001)
| Turney (2001)
| Corpus-based
| 73.75%
| 62.72–82.96%
|-
| PairClass
| Turney (2008)
| Turney (2008)
| Corpus-based
| 76.25%
| 65.42–85.06%
|-
| HSO
| Hirst and St-Onge (1998)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 77.91%
| 68.17–87.11%
|-
| JS
| Jarmasz and Szpakowicz (2003)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 78.75%
| 68.17–87.11%
|-
| PMI-IR
| Terra and Clarke (2003)
| Terra and Clarke (2003)
| Corpus-based
| 81.25%
| 70.97–89.11%
|-
| CWO
| Ruiz-Casado et al. (2005)
| Ruiz-Casado et al. (2005)
| Web-based
| 82.55%
| 72.38–90.09%
|-
| PPMIC
| Bullinaria and Levy (2007)
| Bullinaria and Levy (2007)
| Corpus-based
| 85.00%
| 75.26–92.00%
|-
| GLSA
| Matveeva et al. (2005)
| Matveeva et al. (2005)
| Corpus-based
| 86.25%
| 76.73–92.93%
|-
| LSA
| Rapp (2003)
| Rapp (2003)
| Corpus-based
| 92.50%
| 84.39–97.20%
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| WordNet graph-based (unsupervised)
| 96.25%
| 89.43–99.22%
|-
| PR
| Turney et al. (2003)
| Turney et al. (2003)
| Hybrid
| 97.50%
| 91.26–99.70%
|-
| PCCP
| Bullinaria and Levy (2012)
| Bullinaria and Levy (2012)
| Corpus-based
| 100.00%
| 96.32–100.00%
|}

== Explanation of table ==

* '''Algorithm''' = name of algorithm
* '''Reference for algorithm''' = where to find out more about given algorithm
* '''Reference for experiment''' = where to find out more about evaluation of given algorithm with TOEFL questions
* '''Type''' = general type of algorithm, e.g. corpus-based, lexicon-based, or hybrid
* '''Correct''' = percent of 80 questions that given algorithm answered correctly
* '''95% confidence''' = confidence interval calculated using the [[Statistical calculators|Binomial Exact Test]]
* table rows sorted in order of increasing percent correct
* several WordNet-based similarity measures are implemented in [http://www.d.umn.edu/~tpederse/ Ted Pedersen]'s [http://www.d.umn.edu/~tpederse/similarity.html WordNet::Similarity] package
* LSA = Latent Semantic Analysis
* PCCP = Principal Component vectors with Caron P
* PMI-IR = Pointwise Mutual Information - Information Retrieval
* PR = Product Rule
* PPMIC = Positive Pointwise Mutual Information with Cosine
* GLSA = Generalized Latent Semantic Analysis
* CWO = Context Window Overlapping
* DS = Dependency Space
* RI = Random Indexing
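The calculator behind the '''95% confidence''' column is not reproduced on this page, so the following is only a sketch of one common exact binomial interval, the Clopper-Pearson interval, in plain Python. That the table was computed exactly this way is an assumption, and the function names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Two-sided exact (Clopper-Pearson) interval for k successes in n trials."""

    def solve(too_low):
        # Bisection on [0, 1]: move the lower end up while p is still too low.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if too_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) rises to alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2
    )
    # Upper bound: the p at which P(X <= k) falls to alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: binom_cdf(k, n, p) > alpha / 2
    )
    return lower, upper

# 20 of 80 correct corresponds to the 25.00% rows of the table.
low, high = clopper_pearson(20, 80)
print(f"{100 * low:.2f}-{100 * high:.2f}%")
```

The result for 20 of 80 can be compared against the interval listed for 25.00%. At the boundary the conventions may differ: for 80 of 80 this two-sided sketch gives a lower bound of 0.025^(1/80), about 0.955, whereas the 96.32% in the table matches 0.05^(1/80), which suggests a one-sided bound was used for that row.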

== Notes ==

* the performance of a corpus-based algorithm depends on the corpus, so the difference in performance between two corpus-based systems may be due to the different corpora, rather than the different algorithms
* the TOEFL questions include nouns, verbs, and adjectives, but some of the WordNet-based algorithms were only designed to work with nouns; this explains some of the lower scores
* some of the algorithms may have been tuned on the TOEFL questions; read the references for details
* Landauer and Dumais (1997) report scores that were corrected for guessing by subtracting a penalty of 1/3 for each incorrect answer; they report a score of 52.5% when this penalty is applied; when the penalty is removed, their performance is 64.4% correct
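The relation between the two LSA figures in the last note can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name is illustrative, and the count of 51.5 correct answers is back-computed from 64.38% of 80 questions rather than quoted from the paper.

```python
def corrected_score(num_correct, num_questions, num_choices=4):
    """Fraction correct after subtracting 1/(num_choices - 1) per wrong answer."""
    num_wrong = num_questions - num_correct
    penalty = num_wrong / (num_choices - 1)
    return (num_correct - penalty) / num_questions

# 51.5 correct out of 80 is inferred from the 64.38% in the table.
print(51.5 / 80)                  # uncorrected fraction correct
print(corrected_score(51.5, 80))  # with the 1/3 penalty per wrong answer
print(corrected_score(20, 80))    # expected score of random guessing
```

With 51.5 correct, the 28.5 wrong answers cost 9.5 points, leaving 42 of 80, or 52.5%. The same correction maps the 25% expected from random guessing to zero, which is its purpose.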

== References ==

Bullinaria, J.A., and Levy, J.P. (2007). [http://www.cs.bham.ac.uk/~jxb/PUBS/BRM.pdf Extracting semantic representations from word co-occurrence statistics: A computational study]. ''Behavior Research Methods'', 39(3), 510-526.

Bullinaria, J.A., and Levy, J.P. (2012). [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.228.9582&rep=rep1&type=pdf Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD]. ''Behavior Research Methods'', 44(3):890-907.

Hirst, G., and St-Onge, D. (1998). [http://mirror.eacoss.org/documentation/ITLibrary/IRIS/Data/1997/Hirst/Lexical/1997-Hirst-Lexical.pdf Lexical chains as representations of context for the detection and correction of malapropisms]. In C. Fellbaum (ed.), ''WordNet: An Electronic Lexical Database''. Cambridge: MIT Press, pp. 305-332.

Jarmasz, M., and Szpakowicz, S. (2003). [http://www.csi.uottawa.ca/~szpak/recent_papers/TR-2003-01.pdf Roget’s thesaurus and semantic similarity], ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03)'', Borovets, Bulgaria, September, pp. 212-219.

Jiang, J.J., and Conrath, D.W. (1997). [http://wortschatz.uni-leipzig.de/~sbordag/aalw05/Referate/03_Assoziationen_BudanitskyResnik/Jiang_Conrath_97.pdf Semantic similarity based on corpus statistics and lexical taxonomy]. ''Proceedings of the International Conference on Research in Computational Linguistics'', Taiwan.

Karlgren, J. and Sahlgren, M. (2001). [http://www.sics.se/~jussi/Artiklar/2001_RWIbook/KarlgrenSahlgren2001.pdf From Words to Understanding]. In Uesaka, Y., Kanerva, P., & Asoh, H. (Eds.), ''Foundations of Real-World Intelligence'', Stanford: CSLI Publications, pp. 294–308.

Landauer, T.K., and Dumais, S.T. (1997). [http://lsa.colorado.edu/papers/plato/plato.annote.html A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge]. ''Psychological Review'', 104(2):211–240.

Leacock, C., and Chodorow, M. (1998). [http://books.google.ca/books?id=Rehu8OOzMIMC&lpg=PA265&ots=IpnaLkZUec&lr&pg=PA265#v=onepage&q&f=false Combining local context and WordNet similarity for word sense identification]. In C. Fellbaum (ed.), ''WordNet: An Electronic Lexical Database''. Cambridge: MIT Press, pp. 265-283.

Lin, D. (1998). [http://www.cs.ualberta.ca/~lindek/papers/sim.pdf An information-theoretic definition of similarity]. ''Proceedings of the 15th International Conference on Machine Learning (ICML-98)'', Madison, WI, pp. 296-304.

Matveeva, I., Levow, G., Farahat, A., and Royer, C. (2005). [http://people.cs.uchicago.edu/~matveeva/SynGLSA_ranlp_final.pdf Generalized latent semantic analysis for term representation]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-05)'', Borovets, Bulgaria.

Pado, S., and Lapata, M. (2007). [http://www.nlpado.de/~sebastian/pub/papers/cl07_pado.pdf Dependency-based construction of semantic space models]. ''Computational Linguistics'', 33(2), 161-199.

Pilehvar, M.T., Jurgens, D., and Navigli, R. (2013). [http://wwwusers.di.uniroma1.it/~navigli/pubs/ACL_2013_Pilehvar_Jurgens_Navigli.pdf Align, disambiguate and walk: A unified approach for measuring semantic similarity]. ''Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013)'', Sofia, Bulgaria.

Rapp, R. (2003). [http://www.amtaweb.org/summit/MTSummit/FinalPapers/19-Rapp-final.pdf Word sense discovery based on sense descriptor dissimilarity]. ''Proceedings of the Ninth Machine Translation Summit'', pp. 315-322.

Resnik, P. (1995). [http://citeseer.ist.psu.edu/resnik95using.html Using information content to evaluate semantic similarity]. ''Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95)'', Montreal, pp. 448-453.

Ruiz-Casado, M., Alfonseca, E., and Castells, P. (2005). [http://alfonseca.org/pubs/2005-ranlp1.pdf Using context-window overlapping in Synonym Discovery and Ontology Extension]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-2005)'', Borovets, Bulgaria.

Terra, E., and Clarke, C.L.A. (2003). [http://acl.ldc.upenn.edu/N/N03/N03-1032.pdf Frequency estimates for statistical word similarity measures]. ''Proceedings of the Human Language Technology and North American Chapter of Association of Computational Linguistics Conference 2003 (HLT/NAACL 2003)'', pp. 244–251.

Turney, P.D. (2001). [http://arxiv.org/abs/cs.LG/0212033 Mining the Web for synonyms: PMI-IR versus LSA on TOEFL]. ''Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001)'', Freiburg, Germany, pp. 491-502.

Turney, P.D., Littman, M.L., Bigham, J., and Shnayder, V. (2003). [http://arxiv.org/abs/cs.CL/0309035 Combining independent modules to solve multiple-choice synonym and analogy problems]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03)'', Borovets, Bulgaria, pp. 482-489.

Turney, P.D. (2008). [http://arxiv.org/abs/0809.0124 A uniform approach to analogies, synonyms, antonyms, and associations]. ''Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)'', Manchester, UK, pp. 905-912.

[[Category:State of the art]]

TOEFL Synonym Questions (State of the art)

2014-02-13T11:44:11Z

Taher: /* References */

* TOEFL = Test of English as a Foreign Language
* 80 multiple-choice synonym questions; 4 choices per question
* the TOEFL questions are available on request by contacting [http://lsa.colorado.edu/mail_sub.html LSA Support at CU Boulder], the people who manage the [http://lsa.colorado.edu/ LSA web site at Colorado]
* introduced in Landauer and Dumais (1997) as a way of evaluating algorithms for measuring degree of similarity between words
* subsequently used by many other researchers
* see also: [[Similarity (State of the art)]]

== Sample question ==

::{| border="0" cellpadding="1" cellspacing="1"
|-
! Stem:
|
| levied
|-
! Choices:
| (a)
| imposed
|-
|
| (b)
| believed
|-
|
| (c)
| requested
|-
|
| (d)
| correlated
|-
! Solution:
| (a)
| imposed
|-
|}

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for experiment
! Type
! Correct
! 95% confidence
|-
| RES
| Resnik (1995)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 20.31%
| 12.89–31.83%
|-
| LC
| Leacock and Chodrow (1998)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 21.88%
| 13.91–33.21%
|-
| LIN
| Lin (1998)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 24.06%
| 15.99–35.94%
|-
| Random
| Random guessing
| 1 / 4 = 25.00%
| Random
| 25.00%
| 15.99–35.94%
|-
| JC
| Jiang and Conrath (1997)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 25.00%
| 15.99–35.94%
|-
| LSA
| Landauer and Dumais (1997)
| Landauer and Dumais (1997)
| Corpus-based
| 64.38%
| 52.90–74.80%
|-
| Human
| Average non-English US college applicant
| Landauer and Dumais (1997)
| Human
| 64.50%
| 53.01–74.88%
|-
| RI
| Karlgren and Sahlgren (2001)
| Karlgren and Sahlgren (2001)
| Corpus-based
| 72.50%
| 61.38-81.90%
|-
| DS
| Pado and Lapata (2007)
| Pado and Lapata (2007)
| Corpus-based
| 73.00%
| 62.72-82.96%
|-
| PMI-IR
| Turney (2001)
| Turney (2001)
| Corpus-based
| 73.75%
| 62.72–82.96%
|-
| PairClass
| Turney (2008)
| Turney (2008)
| Corpus-based
| 76.25%
| 65.42-85.06%
|-
| HSO
| Hirst and St.-Onge (1998)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 77.91%
| 68.17–87.11%
|-
| JS
| Jarmasz and Szpakowicz (2003)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 78.75%
| 68.17–87.11%
|-
| PMI-IR
| Terra and Clarke (2003)
| Terra and Clarke (2003)
| Corpus-based
| 81.25%
| 70.97–89.11%
|-
| CWO
| Ruiz-Casado et al. (2005)
| Ruiz-Casado et al. (2005)
| Web-based
| 82.55%
| 72.38–90.09%
|-
| PPMIC
| Bullinaria and Levy (2007)
| Bullinaria and Levy (2007)
| Corpus-based
| 85.00%
| 75.26-92.00%
|-
| GLSA
| Matveeva et al. (2005)
| Matveeva et al. (2005)
| Corpus-based
| 86.25%
| 76.73-92.93%
|-
| LSA
| Rapp (2003)
| Rapp (2003)
| Corpus-based
| 92.50%
| 84.39-97.20%
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| WordNet graph-based (unsupervised)
| 96.25%
| 89.43-99.22%
|-
| PR
| Turney et al. (2003)
| Turney et al. (2003)
| Hybrid
| 97.50%
| 91.26–99.70%
|-
| PCCP
| Bullinaria and Levy (2012)
| Bullinaria and Levy (2012)
| Corpus-based
| 100.00%
| 96.32-100.00%
|}

== Explanation of table ==

* '''Algorithm''' = name of algorithm
* '''Reference for algorithm''' = where to find out more about given algorithm
* '''Reference for experiment''' = where to find out more about evaluation of given algorithm with TOEFL questions
* '''Type''' = general type of algorithm: corpus-based, lexicon-based, hybrid
* '''Correct''' = percent of 80 questions that given algorithm answered correctly
* '''95% confidence''' = confidence interval calculated using the [[Statistical calculators|Binomial Exact Test]]
* table rows sorted in order of increasing percent correct
* several WordNet-based similarity measures are implemented in [http://www.d.umn.edu/~tpederse/ Ted Pedersen]'s [http://www.d.umn.edu/~tpederse/similarity.html WordNet::Similarity] package
* LSA = Latent Semantic Analysis
* PCCP = Principal Component vectors with Caron P
* PMI-IR = Pointwise Mutual Information - Information Retrieval
* PR = Product Rule
* PPMIC = Positive Pointwise Mutual Information with Cosine
* GLSA = Generalized Latent Semantic Analysis
* CWO = Context Window Overlapping
* DS = Dependency Space
* RI = Random Indexing

== Notes ==

* the performance of a corpus-based algorithm depends on the corpus, so the difference in performance between two corpus-based systems may be due to the different corpora, rather than the different algorithms
* the TOEFL questions include nouns, verbs, and adjectives, but some of the WordNet-based algorithms were only designed to work with nouns; this explains some of the lower scores
* some of the algorithms may have been tuned on the TOEFL questions; read the references for details
* Landauer and Dumais (1997) report scores that were corrected for guessing by subtracting a penalty of 1/3 for each incorrect answer; they report a score of 52.5% when this penalty is applied; when the penalty is removed, their performance is 64.4% correct

== References ==

Bullinaria, J.A., and Levy, J.P. (2007). [http://www.cs.bham.ac.uk/~jxb/PUBS/BRM.pdf Extracting semantic representations from word co-occurrence statistics: A computational study]. ''Behavior Research Methods'', 39(3), 510-526.

Bullinaria, J.A., and Levy, J.P. (2012). [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.228.9582&rep=rep1&type=pdf Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD]. ''Behavior Research Methods'', 44(3):890-907.

Hirst, G., and St-Onge, D. (1998). [http://mirror.eacoss.org/documentation/ITLibrary/IRIS/Data/1997/Hirst/Lexical/1997-Hirst-Lexical.pdf Lexical chains as representation of context for the detection and correction of malapropisms]. In C. Fellbaum (ed.), ''WordNet: An Electronic Lexical Database''. Cambridge: MIT Press, 305-332.

Jarmasz, M., and Szpakowicz, S. (2003). [http://www.csi.uottawa.ca/~szpak/recent_papers/TR-2003-01.pdf Roget’s thesaurus and semantic similarity], ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03)'', Borovets, Bulgaria, September, pp. 212-219.

Jiang, J.J., and Conrath, D.W. (1997). [http://wortschatz.uni-leipzig.de/~sbordag/aalw05/Referate/03_Assoziationen_BudanitskyResnik/Jiang_Conrath_97.pdf Semantic similarity based on corpus statistics and lexical taxonomy]. ''Proceedings of the International Conference on Research in Computational Linguistics'', Taiwan.

Karlgren, J. and Sahlgren, M. (2001). [http://www.sics.se/~jussi/Artiklar/2001_RWIbook/KarlgrenSahlgren2001.pdf From Words to Understanding]. In Uesaka, Y., Kanerva, P., & Asoh, H. (Eds.), ''Foundations of Real-World Intelligence'', Stanford: CSLI Publications, pp. 294–308.

Landauer, T.K., and Dumais, S.T. (1997). [http://lsa.colorado.edu/papers/plato/plato.annote.html A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge]. ''Psychological Review'', 104(2):211–240.

Leacock, C., and Chodorow, M. (1998). [http://books.google.ca/books?id=Rehu8OOzMIMC&lpg=PA265&ots=IpnaLkZUec&lr&pg=PA265#v=onepage&q&f=false Combining local context and WordNet similarity for word sense identification]. In C. Fellbaum (ed.), ''WordNet: An Electronic Lexical Database''. Cambridge: MIT Press, pp. 265-283.

Lin, D. (1998). [http://www.cs.ualberta.ca/~lindek/papers/sim.pdf An information-theoretic definition of similarity]. ''Proceedings of the 15th International Conference on Machine Learning (ICML-98)'', Madison, WI, pp. 296-304.

Matveeva, I., Levow, G., Farahat, A., and Royer, C. (2005). [http://people.cs.uchicago.edu/~matveeva/SynGLSA_ranlp_final.pdf Generalized latent semantic analysis for term representation]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-05)'', Borovets, Bulgaria.

Pado, S., and Lapata, M. (2007). [www.nlpado.de/~sebastian/pub/papers/cl07_pado.pdf Dependency-based construction of semantic space models]. ''Computational Linguistics'', 33(2), 161-199.

Pilehvar, M.T., Jurgens D., and Navigli R. (2013). [http://wwwusers.di.uniroma1.it/~navigli/pubs/ACL_2013_Pilehvar_Jurgens_Navigli.pdf Align, disambiguate and walk: A unified approach for measuring semantic similarity]. ''Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013),'' Sofia, Bulgaria.

Rapp, R. (2003). [http://www.amtaweb.org/summit/MTSummit/FinalPapers/19-Rapp-final.pdf Word sense discovery based on sense descriptor dissimilarity]. ''Proceedings of the Ninth Machine Translation Summit'', pp. 315-322.

Resnik, P. (1995). [http://citeseer.ist.psu.edu/resnik95using.html Using information content to evaluate semantic similarity]. ''Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95)'', Montreal, pp. 448-453.

Ruiz-Casado, M., Alfonseca, E. and Castells, P. (2005) [http://alfonseca.org/pubs/2005-ranlp1.pdf Using context-window overlapping in Synonym Discovery and Ontology Extension]. ''Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP-2005)'', Borovets, Bulgaria.

Terra, E., and Clarke, C.L.A. (2003). [http://acl.ldc.upenn.edu/N/N03/N03-1032.pdf Frequency estimates for statistical word similarity measures]. ''Proceedings of the Human Language Technology and North American Chapter of Association of Computational Linguistics Conference 2003 (HLT/NAACL 2003)'', pp. 244–251.

Turney, P.D. (2001). [http://arxiv.org/abs/cs.LG/0212033 Mining the Web for synonyms: PMI-IR versus LSA on TOEFL]. ''Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001)'', Freiburg, Germany, pp. 491-502.

Turney, P.D., Littman, M.L., Bigham, J., and Shnayder, V. (2003). [http://arxiv.org/abs/cs.CL/0309035 Combining independent modules to solve multiple-choice synonym and analogy problems]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03)'', Borovets, Bulgaria, pp. 482-489.

Turney, P.D. (2008). [http://arxiv.org/abs/0809.0124 A uniform approach to analogies, synonyms, antonyms, and associations]. ''Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)'', Manchester, UK, pp. 905-912.

[[Category:State of the art]]

Knowledge collections and datasets (English)

2013-10-16T22:14:15Z

Taher: Undo revision 10389 by Taher (talk)

Knowledge collections and datasets for Computational Linguistics and Natural Language Processing.

For languages other than English, see [[List of resources by language]].


* [[Clustering by Committee]] - terms clustered and organized using the [[Distributional Hypothesis]]
* [[DIRT Paraphrase Collection]] - Discovery of Inference Rules from Text
* [http://www.eat.rl.ac.uk/ Edinburgh Associative Thesaurus (EAT)]
* [http://framenet.icsi.berkeley.edu/ FrameNet]
* [http://www.psych.rl.ac.uk/ MRC Psycholinguistic Database]
* [http://www.clres.com/prepositions.html Preposition Project]
* [[Noun compound repository|Noun Compound Repository]]
* [http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html Reuters-21578 Text Categorization Collection]
* [[SAT Analogy Questions]] - a way of evaluating algorithms for measuring relational similarity
* [[Spam filtering datasets]]
* [[TEASE]] - Acquisition of Entailment Relations from the Web
* [[TOEFL Synonym Questions]] - a way of evaluating algorithms for measuring degree of similarity between 2 words
* [[RG-65 Test Collection (State of the art)]] - suitable for correlation-based evaluation of algorithms for measuring semantic similarity of word pairs
* [http://w3.usf.edu/FreeAssociation/ University of South Florida Free Association Norms]
* [[VerbOcean]] - verbs organized by semantic relation, including temporal precedence and strength
* [[WordNet]]
* [http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/wordsim353.html WordSimilarity-353 Test Collection]

See also [[NLG:Data sets]] for a collection of data sets used for building natural language generation systems.

== Additional Dataset Collections ==
* [http://www.ldc.upenn.edu/ Linguistic Data Consortium (LDC)]

[[Category:Knowledge Collections and Datasets|*]]

RG-65 Test Collection (State of the art)

2013-10-16T17:04:50Z

Taher:

* state of the art on the Rubenstein & Goodenough (RG-65) dataset
* 65 word pairs
* the similarity of each pair is scored on a scale from 0 to 4 (the higher the "similarity of meaning", the higher the number)
* the similarity values in the dataset are the means of judgments made by 51 subjects (Rubenstein and Goodenough, 1965)
* see also: [[Similarity (State of the art)]]

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for reported results
! Type
! Spearman correlation (ρ)
! Pearson correlation (r)
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| Knowledge-based
| 0.868
| 0.810
|-
| PPR
| Hughes and Ramage (2007)
| Hughes and Ramage (2007)
| Knowledge-based
| 0.838
| -
|-
| SSA
| Hassan and Mihalcea (2011)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.833
| 0.861
|-
| PPR
| Agirre et al. (2009)
| Agirre et al. (2009)
| Knowledge-based
| 0.830
| -
|-
| H&S
| Hirst and St-Onge (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.813
| 0.732
|-
| Roget
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.818
|-
| J&C
| Jiang and Conrath (1997)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.731
|-
| WNE
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.801
| 0.787
|-
| L&C
| Leacock and Chodorow (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.797
| 0.852
|-
| Lin
| Lin (1998)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.788
| 0.834
|-
| ESA*
| Gabrilovich and Markovitch (2007)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.749
| 0.716
|-
| SOCPMI*
| Islam and Inkpen (2006)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.741
| 0.729
|-
| Resnik
| Resnik (1995)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.731
| 0.800
|-
| WLM
| Milne and Witten (2008)
| Milne and Witten (2008)
| Knowledge-based
| 0.640
| -
|-
| LSA*
| Landauer et al. (1997)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.609
| 0.644
|-
| WikiRelate
| Strube and Ponzetto (2006)
| Strube and Ponzetto (2006)
| Knowledge-based
| -
| 0.530
|}

Note: values reported by Hassan and Mihalcea (2011) are "based on the collected raw data from the respective authors"; algorithms marked with an asterisk (*) are re-implementations.
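
The two correlation columns can be computed with a short script. The following is a minimal sketch, not the evaluation code used by any of the cited papers: the function names are hypothetical, the sample scores are invented for illustration, and ties are assumed to receive average ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation (r) between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed scores."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented scores for four word pairs (NOT taken from RG-65).
human = [0.1, 1.0, 2.0, 3.9]       # mean human judgments on the 0-4 scale
system = [0.05, 0.20, 0.90, 0.95]  # output of some similarity measure

print("rho =", spearman(human, system))
print("r   =", pearson(human, system))
```

Spearman's rho depends only on the ordering of the pairs, while Pearson's r is also sensitive to how the scores are spaced, which is one reason the two columns can rank systems differently.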

== References ==

* Herbert Rubenstein and John B. Goodenough. Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633, 1965.

* Samer Hassan, Rada Mihalcea: [http://www.cse.unt.edu/~rada/papers/hassan.aaai11.pdf Semantic Relatedness Using Salient Semantic Analysis]. AAAI 2011

* M. T. Pilehvar, D. Jurgens and R. Navigli. [http://wwwusers.di.uniroma1.it/~navigli/pubs/ACL_2013_Pilehvar_Jurgens_Navigli.pdf Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity]. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 4-9, 2013, pp. 1341-1351.

* Thad Hughes, Daniel Ramage: Lexical Semantic Relatedness with Random Graph Walks. EMNLP-CoNLL 2007: 581-589.

* Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca, Aitor Soroa: [http://www.aclweb.org/anthology/N09-1003 A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches]. HLT-NAACL 2009: 19-27

* Hirst, Graeme and David St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 305–332, 1998.

* Jarmasz, M. 2003. [http://www.arxiv.org/pdf/1204.0140 Roget’s thesaurus as a Lexical Resource for Natural Language Processing]. Ph.D. Dissertation, Ottawa Carleton Institute for Computer Science, School of Information Technology and Engineering, University of Ottawa.

* Jiang, Jay J. and David W. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics (ROCLING X), Taiwan, pages 19–33, 1997.

* Leacock, Claudia and Martin Chodorow. Combining local context and WordNet similarity for word sense identification. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 265–283, 1998.

* Lin, Dekang. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, pages 296–304, 1998.

* Evgeniy Gabrilovich and Shaul Markovitch, [http://www.cs.technion.ac.il/~gabr/papers/ijcai-2007-sim.pdf Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis], Proceedings of The 20th International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India, 2007.

* Islam, A., and Inkpen, D. 2006. [http://www.site.uottawa.ca/~mdislam/publications/LREC_06_242.pdf Second order co-occurrence pmi for determining the semantic similarity of words]. Proceedings of the International Conference on Language Resources and Evaluation (LREC 2006) 1033–1038.

* Resnik, Philip. Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal, Canada, 1995.

* David Milne and Ian H. Witten. An Effective, Low-Cost Measure of Semantic Relatedness Obtained from Wikipedia Links. In Proceedings of AAAI 2008.

* Landauer, T. K.; Laham, D.; Rehder, B.; and Schreiner, M. E. 1997. How well can passage meaning be derived without using word order? A comparison of latent semantic analysis and humans.

* Michael Strube, Simone Paolo Ponzetto: WikiRelate! Computing Semantic Relatedness Using Wikipedia. AAAI 2006: 1419-1424

[[Category:State of the art]]

Knowledge collections and datasets (English)

2013-10-16T16:53:31Z

Taher:

RG-65 Test Collection (State of the art)

2013-10-16T16:38:49Z

Taher:

State of the art in Rubenstein & Goodenough (RG-65) dataset

* 65 word pairs;
* Similarity of each pair is scored according to a scale from 0 to 4 (the higher the "similarity of meaning," the higher the number);
* The similarity values in the dataset are the means of judgments made by 51 subjects [Rubenstein and Goodenough, 1965].

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for reported results
! Type
! Spearman correlation (ρ)
! Pearson correlation (r)
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| Knowledge-based
| 0.868
| 0.810
|-
| PPR
| Hughes and Ramage (2007)
| Hughes and Ramage (2007)
| Knowledge-based
| 0.838
| -
|-
| PPR
| Agirre et al. (2009)
| Agirre et al. (2009)
| Knowledge-based
| 0.830
| -
|-
| H&S
| Hirst and St-Onge (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.813
| 0.732
|-
| Roget
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.818
|-
| J&C
| Jiang and Conrath (1997)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.731
|-
| WNE
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.801
| 0.787
|-
| L&C
| Leacock and Chodorow (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.797
| 0.852
|-
| Lin
| Lin (1998)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.788
| 0.834
|-
| ESA*
| Gabrilovich and Markovitch (2007)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.749
| 0.716
|-
| SOCPMI*
| Islam and Inkpen (2006)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.741
| 0.729
|-
| Resnik
| Resnik (1995)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.731
| 0.800
|-
| WLM
| Milne and Witten (2008)
| Milne and Witten (2008)
| Knowledge-based
| 0.640
| -
|-
| LSA*
| Landauer et al. (1997)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.609
| 0.644
|-
| WikiRelate
| Strube and Ponzetto (2006)
| Strube and Ponzetto (2006)
| Knowledge-based
| -
| 0.530
|}

Note: values reported by Hassan and Mihalcea (2011) are "based on the collected raw data from the respective authors"; algorithms marked with an asterisk (*) are re-implementations.

== References ==

* Herbert Rubenstein and John B. Goodenough. Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633, 1965.

* Samer Hassan, Rada Mihalcea: [http://www.cse.unt.edu/~rada/papers/hassan.aaai11.pdf Semantic Relatedness Using Salient Semantic Analysis]. AAAI 2011

* M. T. Pilehvar, D. Jurgens and R. Navigli. [http://wwwusers.di.uniroma1.it/~navigli/pubs/ACL_2013_Pilehvar_Jurgens_Navigli.pdf Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity]. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 4-9, 2013, pp. 1341-1351.

* Thad Hughes, Daniel Ramage: Lexical Semantic Relatedness with Random Graph Walks. EMNLP-CoNLL 2007: 581-589.

* Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca, Aitor Soroa: [http://www.aclweb.org/anthology/N09-1003 A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches]. HLT-NAACL 2009: 19-27

* Hirst, Graeme and David St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 305–332, 1998.

* Jarmasz, M. 2003. [http://www.arxiv.org/pdf/1204.0140 Roget’s thesaurus as a Lexical Resource for Natural Language Processing]. Ph.D. Dissertation, Ottawa Carleton Institute for Computer Science, School of Information Technology and Engineering, University of Ottawa.

* Jiang, Jay J. and David W. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics (ROCLING X), Taiwan, pages 19–33, 1997.

* Leacock, Claudia and Martin Chodorow. Combining local context and WordNet similarity for word sense identification. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 265–283, 1998.

* Lin, Dekang. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, pages 296–304, 1998.

* Evgeniy Gabrilovich and Shaul Markovitch, [http://www.cs.technion.ac.il/~gabr/papers/ijcai-2007-sim.pdf Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis], Proceedings of The 20th International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India, 2007.

* Islam, A., and Inkpen, D. 2006. [http://www.site.uottawa.ca/~mdislam/publications/LREC_06_242.pdf Second order co-occurrence pmi for determining the semantic similarity of words]. Proceedings of the International Conference on Language Resources and Evaluation (LREC 2006) 1033–1038.

* Resnik, Philip. Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal, Canada, 1995.

* David Milne and Ian H. Witten. An Effective, Low-Cost Measure of Semantic Relatedness Obtained from Wikipedia Links. In Proceedings of AAAI 2008.

* Landauer, T. K.; Laham, D.; Rehder, B.; and Schreiner, M. E. 1997. How well can passage meaning be derived without using word order? A comparison of latent semantic analysis and humans.

* Michael Strube, Simone Paolo Ponzetto: WikiRelate! Computing Semantic Relatedness Using Wikipedia. AAAI 2006: 1419-1424

RG-65 Test Collection (State of the art)

2013-10-16T16:22:18Z

Taher: /* Table of results */

State of the art in Rubenstein & Goodenough (RG-65) dataset

* 65 word pairs;
* Similarity of each pair is scored according to a scale from 0 to 4 (the higher the "similarity of meaning," the higher the number);
* The similarity values in the dataset are the means of judgments made by 51 subjects [Rubenstein and Goodenough, 1965].

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for reported results
! Type
! Spearman correlation (ρ)
! Pearson correlation (r)
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| Knowledge-based
| 0.868
| 0.810
|-
| PPR
| Hughes and Ramage (2007)
| Hughes and Ramage (2007)
| Knowledge-based
| 0.838
| -
|-
| PPR
| Agirre et al. (2009)
| Agirre et al. (2009)
| Knowledge-based
| 0.830
| -
|-
| H&S
| Hirst and St-Onge (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.813
| 0.732
|-
| Roget
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.818
|-
| J&C
| Jiang and Conrath (1997)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.731
|-
| WNE
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.801
| 0.787
|-
| L&C
| Leacock and Chodorow (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.797
| 0.852
|-
| Lin
| Lin (1998)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.788
| 0.834
|-
| ESA*
| Gabrilovich and Markovitch (2007)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.749
| 0.716
|-
| SOCPMI*
| Islam and Inkpen (2006)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.741
| 0.729
|-
| Resnik
| Resnik (1995)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.731
| 0.800
|-
| WLM
| Milne and Witten (2008)
| Milne and Witten (2008)
| Knowledge-based
| 0.640
| -
|-
| LSA*
| Landauer et al. (1997)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.609
| 0.644
|-
| WikiRelate
| Strube and Ponzetto (2006)
| Strube and Ponzetto (2006)
| Knowledge-based
| -
| 0.530
|}

Note: values reported by Hassan and Mihalcea (2011) are "based on the collected raw data from the respective authors"; algorithms marked with an asterisk (*) are re-implementations.

== References ==

* Herbert Rubenstein and John B. Goodenough. Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633, 1965.

* Samer Hassan, Rada Mihalcea: Semantic Relatedness Using Salient Semantic Analysis. AAAI 2011

* Lin, Dekang. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, pages 296–304, 1998.

* Lin, Dekang. Automatic retrieval and clustering of similar words. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and the 17th International Conference on Computational Linguistics (COLING–ACL ’98), Montreal, Canada, pages 768–774, 1998.

* Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca, Aitor Soroa: A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches. HLT-NAACL 2009: 19-27

* Hirst, Graeme and David St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 305–332, 1998.

* Thad Hughes, Daniel Ramage: Lexical Semantic Relatedness with Random Graph Walks. EMNLP-CoNLL 2007: 581-589.

* Jiang, Jay J. and David W. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics (ROCLING X), Taiwan, pages 19–33, 1997.

* Leacock, Claudia and Martin Chodorow. Combining local context and WordNet similarity for word sense identification. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 265–283, 1998.

* Resnik, Philip. Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal, Canada, 1995.

* Jarmasz, M. 2003. Roget’s thesaurus as a Lexical Resource for Natural Language Processing. Ph.D. Dissertation, Ottawa Carleton Institute for Computer Science, School of Information Technology and Engineering, University of Ottawa.

* Landauer, T. K.; Laham, D.; Rehder, B.; and Schreiner, M. E. 1997. How well can passage meaning be derived without using word order? A comparison of latent semantic analysis and humans.

* Islam, A., and Inkpen, D. 2006. Second order co-occurrence pmi for determining the semantic similarity of words. Proceedings of the International Conference on Language Resources and Evaluation (LREC 2006) 1033–1038.

* M. T. Pilehvar, D. Jurgens and R. Navigli. Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity. Proc. of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 4-9, 2013, pp. 1341-1351.

* Michael Strube, Simone Paolo Ponzetto: WikiRelate! Computing Semantic Relatedness Using Wikipedia. AAAI 2006: 1419-1424

Knowledge collections and datasets (English)

2013-10-16T16:01:53Z

Taher:

Knowledge collections and datasets for Computational Linguistics and Natural Language Processing.

For languages other than English, see [[List of resources by language]].


* [[Clustering by Committee]] - terms clustered and organized using the [[Distributional Hypothesis]]
* [[DIRT Paraphrase Collection]] - Discovery of Inference Rules from Text
* [http://www.eat.rl.ac.uk/ Edinburgh Associative Thesaurus (EAT)]
* [http://framenet.icsi.berkeley.edu/ FrameNet]
* [http://www.psych.rl.ac.uk/ MRC Psycholinguistic Database]
* [http://www.clres.com/prepositions.html Preposition Project]
* [[Noun compound repository|Noun Compound Repository]]
* [http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html Reuters-21578 Text Categorization Collection]
* [[SAT Analogy Questions]] - a way of evaluating algorithms for measuring relational similarity
* [[Spam filtering datasets]]
* [[TEASE]] - Acquisition of Entailment Relations from the Web
* [[TOEFL Synonym Questions]] - a way of evaluating algorithms for measuring degree of similarity between 2 words
* [[Rubenstein and Goodenough (RG-65) dataset]] - suitable for correlation-based evaluation of algorithms for measuring semantic similarity of word pairs
* [http://w3.usf.edu/FreeAssociation/ University of South Florida Free Association Norms]
* [[VerbOcean]] - verbs organized by semantic relation, including temporal precedence and strength
* [[WordNet]]
* [http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/wordsim353.html WordSimilarity-353 Test Collection]

See also [[NLG:Data sets]] for a collection of data sets used for building natural language generation systems.

== Additional Dataset Collections ==
* [http://www.ldc.upenn.edu/ Linguistic Data Consortium (LDC)]

[[Category:Knowledge Collections and Datasets|*]]

Rubenstein and Goodenough (RG-65) dataset

2013-10-16T16:00:24Z

Taher: Redirected page to RG-65

#REDIRECT [[RG-65]]

RG-65 Test Collection (State of the art)

2013-10-16T15:56:57Z

Taher:

State of the art in Rubenstein & Goodenough (RG-65) dataset

* 65 word pairs;
* Similarity of each pair is scored according to a scale from 0 to 4 (the higher the "similarity of meaning," the higher the number);
* The similarity values in the dataset are the means of judgments made by 51 subjects [Rubenstein and Goodenough, 1965].

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for reported results
! Type
! Spearman correlation (ρ)
! Pearson correlation (r)
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| Knowledge-based
| 0.868
| 0.810
|-
| Roget
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.818
|-
| WNE
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.801
| 0.787
|-
| ESA*
| Gabrilovich and Markovitch (2007)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.749
| 0.716
|-
| LSA*
| Landauer et al. (1997)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.609
| 0.644
|-
| SOCPMI*
| Islam and Inkpen (2006)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.741
| 0.729
|-
| H&S
| Hirst and St-Onge (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.813
| 0.732
|-
| J&C
| Jiang and Conrath (1997)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.731
|-
| L&C
| Leacock and Chodorow (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.797
| 0.852
|-
| Lin
| Lin (1998)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.788
| 0.834
|-
| Resnik
| Resnik (1995)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.731
| 0.800
|-
| WikiRelate
| Strube and Ponzetto (2006)
| Strube and Ponzetto (2006)
| Knowledge-based
| -
| 0.530
|-
| PPR
| Agirre et al. (2009)
| Agirre et al. (2009)
| Knowledge-based
| 0.830
| -
|-
| WLM
| Milne and Witten (2008)
| Milne and Witten (2008)
| Knowledge-based
| 0.640
| -
|-
| PPR
| Hughes and Ramage (2007)
| Hughes and Ramage (2007)
| Knowledge-based
| 0.838
| -
|}

Note: values reported by Hassan and Mihalcea (2011) are "based on the collected raw data from the respective authors"; algorithms marked with an asterisk (*) are re-implementations.

== References ==

* Herbert Rubenstein and John B. Goodenough. Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633, 1965.

* Samer Hassan, Rada Mihalcea: Semantic Relatedness Using Salient Semantic Analysis. AAAI 2011

* Lin, Dekang. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, pages 296–304, 1998.

* Lin, Dekang. Automatic retrieval and clustering of similar words. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and the 17th International Conference on Computational Linguistics (COLING–ACL ’98), Montreal, Canada, pages 768–774, 1998.

* Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca, Aitor Soroa: A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches. HLT-NAACL 2009: 19-27

* Hirst, Graeme and David St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 305–332, 1998.

* Thad Hughes, Daniel Ramage: Lexical Semantic Relatedness with Random Graph Walks. EMNLP-CoNLL 2007: 581-589.

* Jiang, Jay J. and David W. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics (ROCLING X), Taiwan, pages 19–33, 1997.

* Leacock, Claudia and Martin Chodorow. Combining local context and WordNet similarity for word sense identification. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 265–283, 1998.

* Resnik, Philip. Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal, Canada, 1995.

* Jarmasz, M. 2003. Roget’s thesaurus as a Lexical Resource for Natural Language Processing. Ph.D. Dissertation, Ottawa Carleton Institute for Computer Science, School of Information Technology and Engineering, University of Ottawa.

* Landauer, T. K.; Laham, D.; Rehder, B.; and Schreiner, M. E. 1997. How well can passage meaning be derived without using word order? A comparison of latent semantic analysis and humans.

* Islam, A., and Inkpen, D. 2006. Second order co-occurrence pmi for determining the semantic similarity of words. Proceedings of the International Conference on Language Resources and Evaluation (LREC 2006) 1033–1038.

* M. T. Pilehvar, D. Jurgens and R. Navigli. Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity. Proc. of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 4-9, 2013, pp. 1341-1351.

* Michael Strube, Simone Paolo Ponzetto: WikiRelate! Computing Semantic Relatedness Using Wikipedia. AAAI 2006: 1419-1424

RG-65 Test Collection (State of the art)

2013-10-16T15:54:35Z

Taher: Created page with "State of the art in Rubenstein & Goodenough (RG-65) dataset * 65 word pairs; * Similarity of each pair is scored according to a scale from 0 to 4 (the higher the "similarity..."

State of the art in Rubenstein & Goodenough (RG-65) dataset

* 65 word pairs;
* Similarity of each pair is scored according to a scale from 0 to 4 (the higher the "similarity of meaning," the higher the number);
* The similarity values in the dataset are the means of judgments made by 51 subjects [Rubenstein and Goodenough, 1965].

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for reported results
! Type
! Spearman correlation (ρ)
! Pearson correlation (r)
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| Knowledge-based
| 0.868
| 0.810
|-
| Roget
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.818
|-
| WNE
| Jarmasz (2003)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.801
| 0.787
|-
| ESA*
| Gabrilovich and Markovitch (2007)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.749
| 0.716
|-
| LSA*
| Landauer et al. (1997)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.609
| 0.644
|-
| SOCPMI*
| Islam and Inkpen (2006)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.741
| 0.729
|-
| H&S
| Hirst and St-Onge (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.813
| 0.732
|-
| J&C
| Jiang and Conrath (1997)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.804
| 0.731
|-
| L&C
| Leacock and Chodorow (1998)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.797
| 0.852
|-
| Lin
| Lin (1998)
| Hassan and Mihalcea (2011)
| Corpus-based
| 0.788
| 0.834
|-
| Resnik
| Resnik (1995)
| Hassan and Mihalcea (2011)
| Knowledge-based
| 0.731
| 0.800
|-
| WikiRelate
| Strube and Ponzetto (2006)
| Strube and Ponzetto (2006)
| Knowledge-based
| -
| 0.530
|-
| PPR
| Agirre et al. (2009)
| Agirre et al. (2009)
| Knowledge-based
| 0.830
| -
|-
| WLM
| Milne and Witten (2008)
| Milne and Witten (2008)
| Knowledge-based
| 0.640
| -
|-
| PPR
| Hughes and Ramage (2007)
| Hughes and Ramage (2007)
| Knowledge-based
| 0.838
| -
|}

Note: values reported by Hassan and Mihalcea (2011) are "based on the collected raw data from the respective authors"; algorithms marked with an asterisk (*) are re-implementations.

== References ==

* Herbert Rubenstein and John B. Goodenough. Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633, 1965.

* Lin, Dekang. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, pages 296–304, 1998.

* Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca, Aitor Soroa: A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches. HLT-NAACL 2009: 19-27

* Hirst, Graeme and David St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 305–332, 1998.

* Thad Hughes, Daniel Ramage: Lexical Semantic Relatedness with Random Graph Walks. EMNLP-CoNLL 2007: 581-589.

* Jiang, Jay J. and David W. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics (ROCLING X), Taiwan, pages 19–33, 1997.

* Leacock, Claudia and Martin Chodorow. Combining local context and WordNet similarity for word sense identification. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 265–283, 1998.

* Lin, Dekang. Automatic retrieval and clustering of similar words. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and the 17th International Conference on Computational Linguistics (COLING–ACL ’98), Montreal, Canada, pages 768–774, 1998.

* Resnik, Philip. Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal, Canada, 1995.

* Jarmasz, M. 2003. Roget’s thesaurus as a Lexical Resource for Natural Language Processing. Ph.D. Dissertation, Ottawa Carleton Institute for Computer Science, School of Information Technology and Engineering, University of Ottawa.

* Landauer, T. K.; Laham, D.; Rehder, B.; and Schreiner, M. E. 1997. How well can passage meaning be derived without using word order? A comparison of latent semantic analysis and humans.

* Islam, A., and Inkpen, D. 2006. Second order co-occurrence pmi for determining the semantic similarity of words. Proceedings of the International Conference on Language Resources and Evaluation (LREC 2006) 1033–1038.

* M. T. Pilehvar, D. Jurgens and R. Navigli. Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity. Proc. of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 4-9, 2013, pp. 1341-1351.

* Michael Strube, Simone Paolo Ponzetto: WikiRelate! Computing Semantic Relatedness Using Wikipedia. AAAI 2006: 1419-1424

TOEFL Synonym Questions (State of the art)

2013-07-03T15:53:07Z

Taher:

* TOEFL = Test of English as a Foreign Language
* 80 multiple-choice synonym questions; 4 choices per question
* the TOEFL questions are available on request by contacting [http://lsa.colorado.edu/mail_sub.html LSA Support at CU Boulder], the people who manage the [http://lsa.colorado.edu/ LSA web site at Colorado]
* introduced in Landauer and Dumais (1997) as a way of evaluating algorithms for measuring degree of similarity between words
* subsequently used by many other researchers

== Sample question ==

::{| border="0" cellpadding="1" cellspacing="1"
|-
! Stem:
|
| levied
|-
! Choices:
| (a)
| imposed
|-
|
| (b)
| believed
|-
|
| (c)
| requested
|-
|
| (d)
| correlated
|-
! Solution:
| (a)
| imposed
|-
|}
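
A similarity measure is typically applied to a question like this by scoring each choice against the stem and picking the highest-scoring one. The sketch below illustrates that idea only; the function names are hypothetical and the similarity scores are invented for the example, not produced by any algorithm in the table.

```python
def answer_synonym_question(stem, choices, similarity):
    """Return the choice that the similarity measure scores highest against the stem."""
    return max(choices, key=lambda choice: similarity(stem, choice))

# Hypothetical similarity scores, invented only for this illustration.
TOY_SCORES = {
    ("levied", "imposed"): 0.82,
    ("levied", "believed"): 0.10,
    ("levied", "requested"): 0.35,
    ("levied", "correlated"): 0.05,
}

def toy_similarity(word_a, word_b):
    """Stand-in for a real similarity measure; unknown pairs score 0."""
    return TOY_SCORES.get((word_a, word_b), 0.0)

choices = ["imposed", "believed", "requested", "correlated"]
predicted = answer_synonym_question("levied", choices, toy_similarity)
print(predicted)
```

Repeating this over all 80 questions and counting the matches with the solutions gives the percent correct reported in the table of results.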

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! Algorithm
! Reference for algorithm
! Reference for experiment
! Type
! Correct
! 95% confidence
|-
| RES
| Resnik (1995)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 20.31%
| 12.89–31.83%
|-
| LC
| Leacock and Chodorow (1998)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 21.88%
| 13.91–33.21%
|-
| LIN
| Lin (1998)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 24.06%
| 15.99–35.94%
|-
| Random
| Random guessing
| 1 / 4 = 25.00%
| Random
| 25.00%
| 15.99–35.94%
|-
| JC
| Jiang and Conrath (1997)
| Jarmasz and Szpakowicz (2003)
| Hybrid
| 25.00%
| 15.99–35.94%
|-
| LSA
| Landauer and Dumais (1997)
| Landauer and Dumais (1997)
| Corpus-based
| 64.38%
| 52.90–74.80%
|-
| Human
| Average non-English US college applicant
| Landauer and Dumais (1997)
| Human
| 64.50%
| 53.01–74.88%
|-
| DS
| Pado and Lapata (2007)
| Pado and Lapata (2007)
| Corpus-based
| 73.00%
| 62.72–82.96%
|-
| PMI-IR
| Turney (2001)
| Turney (2001)
| Corpus-based
| 73.75%
| 62.72–82.96%
|-
| PairClass
| Turney (2008)
| Turney (2008)
| Corpus-based
| 76.25%
| 65.42–85.06%
|-
| HSO
| Hirst and St-Onge (1998)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 77.91%
| 68.17–87.11%
|-
| JS
| Jarmasz and Szpakowicz (2003)
| Jarmasz and Szpakowicz (2003)
| Lexicon-based
| 78.75%
| 68.17–87.11%
|-
| PMI-IR
| Terra and Clarke (2003)
| Terra and Clarke (2003)
| Corpus-based
| 81.25%
| 70.97–89.11%
|-
| CWO
| Ruiz-Casado et al. (2005)
| Ruiz-Casado et al. (2005)
| Web-based
| 82.55%
| 72.38–90.09%
|-
| PPMIC
| Bullinaria and Levy (2007)
| Bullinaria and Levy (2007)
| Corpus-based
| 85.00%
| 75.26–92.00%
|-
| GLSA
| Matveeva et al. (2005)
| Matveeva et al. (2005)
| Corpus-based
| 86.25%
| 76.73–92.93%
|-
| LSA
| Rapp (2003)
| Rapp (2003)
| Corpus-based
| 92.50%
| 84.39–97.20%
|-
| ADW
| Pilehvar et al. (2013)
| Pilehvar et al. (2013)
| WordNet graph-based (unsupervised)
| 96.25%
| 89.43–99.22%
|-
| PR
| Turney et al. (2003)
| Turney et al. (2003)
| Hybrid
| 97.50%
| 91.26–99.70%
|-
| PCCP
| Bullinaria and Levy (2012)
| Bullinaria and Levy (2012)
| Corpus-based
| 100.00%
| 96.32–100.00%
|}

== Explanation of table ==

* '''Algorithm''' = name of algorithm
* '''Reference for algorithm''' = where to find out more about given algorithm
* '''Reference for experiment''' = where to find out more about evaluation of given algorithm with TOEFL questions
* '''Type''' = general type of algorithm, e.g. corpus-based, lexicon-based, web-based, hybrid
* '''Correct''' = percent of 80 questions that given algorithm answered correctly
* '''95% confidence''' = confidence interval calculated using the [[Statistical calculators|Binomial Exact Test]]
* table rows sorted in order of increasing percent correct
* several WordNet-based similarity measures are implemented in [http://www.d.umn.edu/~tpederse/ Ted Pedersen]'s [http://www.d.umn.edu/~tpederse/similarity.html WordNet::Similarity] package
* LSA = Latent Semantic Analysis
* PCCP = Principal Component vectors with Caron P
* PMI-IR = Pointwise Mutual Information - Information Retrieval
* PR = Product Rule
* PPMIC = Positive Pointwise Mutual Information with Cosine
* GLSA = Generalized Latent Semantic Analysis
* CWO = Context Window Overlapping
* DS = Dependency Space
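
The 95% confidence column can be approximated with an exact (Clopper-Pearson) binomial interval. The sketch below is one way to compute such an interval using only the Python standard library. It is not necessarily the calculator used to produce the table, so small differences are possible, particularly for scores at or near 0% and 100%. The function names are illustrative, not taken from any package.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(f, iters=100):
    """Bisection on [0, 1] for a function that rises from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(correct, total, confidence=0.95):
    """Two-sided Clopper-Pearson interval for a proportion, as fractions in [0, 1]."""
    tail = (1 - confidence) / 2
    if correct == 0:
        lower = 0.0
    else:
        # Lower bound: the p at which P(X >= correct) equals the tail probability.
        lower = _root_of_increasing(
            lambda p: (1 - binom_cdf(correct - 1, total, p)) - tail
        )
    if correct == total:
        upper = 1.0
    else:
        # Upper bound: the p at which P(X <= correct) equals the tail probability.
        upper = _root_of_increasing(
            lambda p: tail - binom_cdf(correct, total, p)
        )
    return lower, upper


# Example: 20 of 80 questions correct (25.00%), for comparison with the table.
low, high = exact_binomial_ci(20, 80)
print(f"{100 * low:.2f}-{100 * high:.2f}%")
```

With only 80 questions the intervals are wide, which is why many adjacent rows in the table have overlapping confidence ranges.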

== Notes ==

* the performance of a corpus-based algorithm depends on the corpus, so the difference in performance between two corpus-based systems may be due to the different corpora, rather than the different algorithms
* the TOEFL questions include nouns, verbs, and adjectives, but some of the WordNet-based algorithms were only designed to work with nouns; this explains some of the lower scores
* some of the algorithms may have been tuned on the TOEFL questions; read the references for details
* Landauer and Dumais (1997) report scores that were corrected for guessing by subtracting a penalty of 1/3 for each incorrect answer. With this penalty applied, they report a score of 52.5%; with the penalty removed, their performance is 64.4% correct
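
The relation between the corrected and uncorrected LSA scores can be checked with a little arithmetic. The sketch below assumes that every question not answered correctly counts as incorrect, and that the 64.38% in the table corresponds to 51.5 of 80 questions (an inference from the percentage, not a figure stated above). The function name is illustrative.

```python
def corrected_score(correct, total=80, choices=4):
    """Percent score after subtracting 1/(choices - 1) per incorrect answer."""
    incorrect = total - correct  # assumption: no omitted answers
    penalised = correct - incorrect / (choices - 1)
    return 100 * penalised / total


# 51.5 / 80 = 64.375% uncorrected; the penalty removes 28.5 / 3 = 9.5 points.
print(corrected_score(51.5))

# Pure random guessing (20 of 80 correct) is corrected to zero.
print(corrected_score(20))
```

The penalty of 1/3 is chosen so that random guessing among four choices has an expected corrected score of zero.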

== References ==

Bullinaria, J.A., and Levy, J.P. (2007). [http://www.cs.bham.ac.uk/~jxb/PUBS/BRM.pdf Extracting semantic representations from word co-occurrence statistics: A computational study]. ''Behavior Research Methods'', 39(3), 510-526.

Bullinaria, J.A., and Levy, J.P. (2012). [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.228.9582&rep=rep1&type=pdf Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD]. ''Behavior Research Methods'', 44(3):890-907.

Hirst, G., and St-Onge, D. (1998). [http://mirror.eacoss.org/documentation/ITLibrary/IRIS/Data/1997/Hirst/Lexical/1997-Hirst-Lexical.pdf Lexical chains as representation of context for the detection and correction of malapropisms]. In C. Fellbaum (ed.), ''WordNet: An Electronic Lexical Database''. Cambridge: MIT Press, 305-332.

Jarmasz, M., and Szpakowicz, S. (2003). [http://www.csi.uottawa.ca/~szpak/recent_papers/TR-2003-01.pdf Roget’s thesaurus and semantic similarity]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03)'', Borovets, Bulgaria, September, pp. 212-219.

Jiang, J.J., and Conrath, D.W. (1997). [http://wortschatz.uni-leipzig.de/~sbordag/aalw05/Referate/03_Assoziationen_BudanitskyResnik/Jiang_Conrath_97.pdf Semantic similarity based on corpus statistics and lexical taxonomy]. ''Proceedings of the International Conference on Research in Computational Linguistics'', Taiwan.

Landauer, T.K., and Dumais, S.T. (1997). [http://lsa.colorado.edu/papers/plato/plato.annote.html A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge]. ''Psychological Review'', 104(2):211–240.

Leacock, C., and Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In C. Fellbaum (ed.), ''WordNet: An Electronic Lexical Database''. Cambridge: MIT Press, pp. 265-283.

Lin, D. (1998). [http://www.cs.ualberta.ca/~lindek/papers/sim.pdf An information-theoretic definition of similarity]. ''Proceedings of the 15th International Conference on Machine Learning (ICML-98)'', Madison, WI, pp. 296-304.

Matveeva, I., Levow, G., Farahat, A., and Royer, C. (2005). [http://people.cs.uchicago.edu/~matveeva/SynGLSA_ranlp_final.pdf Generalized latent semantic analysis for term representation]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-05)'', Borovets, Bulgaria.

Pado, S., and Lapata, M. (2007). [http://www.coli.uni-saarland.de/~pado/pub/papers/cl07_pado.pdf Dependency-based construction of semantic space models]. ''Computational Linguistics'', 33(2), 161-199.

Pilehvar, M.T., Jurgens, D., and Navigli, R. (2013). Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity. ''Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013)'', Sofia, Bulgaria.

Rapp, R. (2003). [http://www.amtaweb.org/summit/MTSummit/FinalPapers/19-Rapp-final.pdf Word sense discovery based on sense descriptor dissimilarity]. ''Proceedings of the Ninth Machine Translation Summit'', pp. 315-322.

Resnik, P. (1995). [http://citeseer.ist.psu.edu/resnik95using.html Using information content to evaluate semantic similarity]. ''Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95)'', Montreal, pp. 448-453.

Ruiz-Casado, M., Alfonseca, E., and Castells, P. (2005). [http://alfonseca.org/pubs/2005-ranlp1.pdf Using context-window overlapping in Synonym Discovery and Ontology Extension]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-2005)'', Borovets, Bulgaria.

Terra, E., and Clarke, C.L.A. (2003). [http://acl.ldc.upenn.edu/N/N03/N03-1032.pdf Frequency estimates for statistical word similarity measures]. ''Proceedings of the Human Language Technology and North American Chapter of Association of Computational Linguistics Conference 2003 (HLT/NAACL 2003)'', pp. 244–251.

Turney, P.D. (2001). [http://arxiv.org/abs/cs.LG/0212033 Mining the Web for synonyms: PMI-IR versus LSA on TOEFL]. ''Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001)'', Freiburg, Germany, pp. 491-502.

Turney, P.D., Littman, M.L., Bigham, J., and Shnayder, V. (2003). [http://arxiv.org/abs/cs.CL/0309035 Combining independent modules to solve multiple-choice synonym and analogy problems]. ''Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03)'', Borovets, Bulgaria, pp. 482-489.

Turney, P.D. (2008). [http://arxiv.org/abs/0809.0124 A uniform approach to analogies, synonyms, antonyms, and associations]. ''Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)'', Manchester, UK, pp. 905-912.

== See also ==

* [[Attributional and Relational Similarity (State of the art)]]
* [[ESL Synonym Questions (State of the art)|ESL Synonym Questions]]
* [[SAT Analogy Questions]]
* [[State of the art]]

[[Category:State of the art]]