Example-Based Treebank Querying

Liesbeth Augustinus, Vincent Vandeghinste, Frank Van Eynde


Abstract
The recent construction of large linguistic treebanks for spoken and written Dutch (e.g. CGN, LASSY, Alpino) has created new and exciting opportunities for the empirical investigation of Dutch syntax and semantics. However, exploiting those treebanks requires knowledge of specific data structures and query languages such as XPath. Linguists who are unfamiliar with formal languages are often reluctant to learn such a language. To make treebank querying more attractive to non-technical users, we developed GrETEL (Greedy Extraction of Trees for Empirical Linguistics), a query engine in which linguists can use natural language examples as a starting point for searching the LASSY treebank without knowledge of tree representations or formal query languages. By allowing linguists to search for constructions similar to the example they provide, we hope to bridge the gap between traditional and computational linguistics. Two case studies provide a concrete demonstration of the tool. The architecture of the tool is optimised for searching the LASSY treebank, but the approach can be adapted to other treebank layouts.
Anthology ID:
L12-1442
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3161–3167
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/756_Paper.pdf
Cite (ACL):
Liesbeth Augustinus, Vincent Vandeghinste, and Frank Van Eynde. 2012. Example-Based Treebank Querying. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3161–3167, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Example-Based Treebank Querying (Augustinus et al., LREC 2012)
PDF:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/756_Paper.pdf