Acquiring Reliable Predicate-argument Structures from Raw Corpora for Case Frame Compilation

Daisuke Kawahara, Sadao Kurohashi


Abstract
We present a method for acquiring reliable predicate-argument structures from raw corpora for automatic compilation of case frames. Such lexicon compilation requires highly reliable predicate-argument structures to practically contribute to Natural Language Processing (NLP) applications, such as paraphrasing, text entailment, and machine translation. However, to precisely identify predicate-argument structures, case frames are required. This issue is similar to the question ""what came first: the chicken or the egg?"" In this paper, we propose the first step in the extraction of reliable predicate-argument structures without using case frames. We first apply chunking to raw corpora and then extract reliable chunks to ensure that high-quality predicate-argument structures are obtained from the chunks. We conducted experiments to confirm the effectiveness of our approach. We successfully extracted reliable chunks of an accuracy of 98% and high-quality predicate-argument structures of an accuracy of 97%. Our experiments confirmed that we succeeded in acquiring highly reliable predicate-argument structures that can be used to compile case frames.
Anthology ID:
L10-1507
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/733_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Daisuke Kawahara and Sadao Kurohashi. 2010. Acquiring Reliable Predicate-argument Structures from Raw Corpora for Case Frame Compilation. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Acquiring Reliable Predicate-argument Structures from Raw Corpora for Case Frame Compilation (Kawahara & Kurohashi, LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/733_Paper.pdf