A Graph-Based Approach for Computing Free Word Associations

Gemma Bel Enguix, Reinhard Rapp, Michael Zock


Abstract
A graph-based algorithm is used to analyze the co-occurrences of words in the British National Corpus. It is shown that the statistical regularities detected can be exploited to predict human word associations. The corpus-derived associations are evaluated using a large test set comprising several thousand stimulus/response pairs collected from human subjects. We find high agreement between the two types of data. The considerable size of the test set allows us to split the stimulus words into a number of classes relating to particular word properties. For example, we construct six saliency classes, and for the words in each of these classes we compare the simulation results with the human data. For each class, there is a close relationship between the performance of our system and human performance. This is also the case for classes based on two other properties of words, namely syntactic and semantic word ambiguity. We interpret these findings as evidence for the claim that human association acquisition must be based on the statistical analysis of perceived language and that, when producing associations, the detected statistical regularities are replicated.
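The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea: build a weighted co-occurrence graph from text and take the most strongly connected neighbor of a stimulus word as its predicted association. Everything in it is an assumption made for illustration: the function names `build_cooccurrence_graph` and `predict_association`, the window size, the use of raw co-occurrence counts as edge weights, and the three-sentence toy corpus standing in for the British National Corpus. It should not be read as the authors' method.

```python
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=2):
    """Return an undirected weighted graph as {word: {neighbor: count}}.

    Hypothetical helper: edge weights are raw co-occurrence counts within
    a fixed window, which is an assumption and not taken from the paper.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Link each word to the words that follow it within the window.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word][other] += 1
                    graph[other][word] += 1
    return graph


def predict_association(graph, stimulus):
    """Return the neighbor with the highest edge weight, or None if unseen."""
    neighbors = graph.get(stimulus)
    if not neighbors:
        return None
    # Break ties alphabetically so the result is deterministic.
    return min(neighbors, key=lambda w: (-neighbors[w], w))


# Toy corpus (invented for illustration only).
toy_corpus = [
    "bread and butter".split(),
    "butter on bread".split(),
    "bread with butter and jam".split(),
]
graph = build_cooccurrence_graph(toy_corpus)
response = predict_association(graph, "bread")
```

On a realistic corpus, raw counts would tend to favor frequent function words, so a practical system would likely replace them with an association measure or filter such words first; that choice is left open here.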
Anthology ID:
L14-1105
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3027–3033
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/1150_Paper.pdf
Cite (ACL):
Gemma Bel Enguix, Reinhard Rapp, and Michael Zock. 2014. A Graph-Based Approach for Computing Free Word Associations. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3027–3033, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
A Graph-Based Approach for Computing Free Word Associations (Enguix et al., LREC 2014)