Light Verb Constructions in the SzegedParalellFX English–Hungarian Parallel Corpus

Veronika Vincze


Abstract
In this paper, we describe the first English–Hungarian parallel corpus annotated for light verb constructions, which contains 14,261 sentence alignment units. Annotation principles and statistical data on the corpus are also provided, and English and Hungarian data are contrasted. On the basis of corpus data, a database containing pairs of English–Hungarian light verb constructions has been created as well. The corpus and the database can contribute to the automatic detection of light verb constructions, and we also show how they can enhance performance in several fields of NLP (e.g. parsing, information extraction/retrieval and machine translation).
Anthology ID:
L12-1041
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2381–2388
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/177_Paper.pdf
Cite (ACL):
Veronika Vincze. 2012. Light Verb Constructions in the SzegedParalellFX English–Hungarian Parallel Corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 2381–2388, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Light Verb Constructions in the SzegedParalellFX English–Hungarian Parallel Corpus (Vincze, LREC 2012)