Turkish Resources for Visual Word Recognition

Begüm Erten, Cem Bozsahin, Deniz Zeyrek


Abstract
We report two tools for conducting psycholinguistic experiments on Turkish words. KelimetriK lets experimenters select words by desired orthographic scores: word frequency, bigram and trigram frequency, ON, OLD20, ATL, and subset/superset similarity. The Turkish version of Wuggy generates pseudowords from one or more template words using an efficient method. Its input is the syllabified form of the words, which is decomposed into sub-syllabic components. Bigram frequency chains are constructed from the onset, nucleus, and coda patterns of entire words. We compiled the lexical statistics of stems and their syllabification from the BOUN corpus of 490 million words. We show the use of these tools in some experiments.
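Two of the orthographic scores named in the abstract have widely used definitions that a short sketch can make concrete: ON is commonly taken as Coltheart's N (the number of same-length words that differ from the target in exactly one letter), and OLD20 as the mean Levenshtein distance from the target to its 20 closest words in the lexicon. The Python below illustrates those standard definitions only. It is not KelimetriK's code, the paper's exact definitions may differ, and the function names and the toy lexicon are hypothetical.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def orthographic_neighbours(word: str, lexicon: Iterable[str]) -> int:
    """Coltheart's N: same-length words differing in exactly one letter."""
    return sum(
        1
        for w in lexicon
        if len(w) == len(word)
        and sum(x != y for x, y in zip(w, word)) == 1
    )


def old_n(word: str, lexicon: Iterable[str], n: int = 20) -> float:
    """Mean Levenshtein distance to the n closest other words (OLD20 for n=20)."""
    dists: List[int] = sorted(levenshtein(word, w) for w in lexicon if w != word)
    closest = dists[:n]
    if not closest:
        raise ValueError("lexicon contains no word other than the target")
    return sum(closest) / len(closest)


# Hypothetical toy lexicon, for illustration only (not drawn from the BOUN corpus).
TOY_LEXICON = ["kal", "kol", "kul", "bal", "dal", "kale", "kalem"]

if __name__ == "__main__":
    print(orthographic_neighbours("kal", TOY_LEXICON))
    print(old_n("kal", TOY_LEXICON, n=3))
```

With a real lexicon, `old_n` would be called with the default `n=20`. When the lexicon has fewer than `n` other words, this sketch averages over all of them, which is an assumption of the sketch rather than part of the standard definition.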
Anthology ID:
L14-1279
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2106–2110
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/316_Paper.pdf
Cite (ACL):
Begüm Erten, Cem Bozsahin, and Deniz Zeyrek. 2014. Turkish Resources for Visual Word Recognition. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 2106–2110, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Turkish Resources for Visual Word Recognition (Erten et al., LREC 2014)