Discarding Noise in an Automatically Acquired Lexicon of Support verb Constructions

M. Begoña Villada Moirón


Abstract
We applied data-driven methods to carry out automatic acquisition of Dutch prepositional support verb constructions (SVCs) in corpora (e.g., iets in de gaten houden (``keep an eye on something'')). This paper addresses the question whether linguistic diagnostics help to discard noise from the nbest lists and how to (semi-)automatically apply such linguistic diagnostics to parsed corpora. We show that some of the linguistic diagnostics proposed in Hollebrandse (1993) effectively identify SVCs and contribute a modest error rate decrease.
Anthology ID:
L04-1261
Volume:
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Month:
May
Year:
2004
Address:
Lisbon, Portugal
Editors:
Maria Teresa Lino, Maria Francisca Xavier, Fátima Ferreira, Rute Costa, Raquel Silva
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2004/pdf/442.pdf
DOI:
Bibkey:
Cite (ACL):
M. Begoña Villada Moirón. 2004. Discarding Noise in an Automatically Acquired Lexicon of Support verb Constructions. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04), Lisbon, Portugal. European Language Resources Association (ELRA).
Cite (Informal):
Discarding Noise in an Automatically Acquired Lexicon of Support verb Constructions (Moirón, LREC 2004)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2004/pdf/442.pdf