The Collection of Distributionally Idiosyncratic Items: A Multilingual Resource for Linguistic Research

Manfred Sailer, Beata Trawiński


Abstract
We present two collections of lexical items with idiosyncratic distribution. The collections document the behavior of German and English bound words (BWs, such as English “headway”), i.e., words that can occur in only one expression (“make headway”). BWs are a problem for both general and idiomatic dictionaries since it is unclear whether they have an independent lexical status and to what extent the expressions in which they occur are typical idiomatic expressions. We propose a system which allows us to document the information about BWs from dictionaries and linguistic literature, together with corpus data and example queries for major text corpora. We present our data structure and point to other phraseologically oriented collections. We will also show differences between the German and the English collections.
Anthology ID:
L06-1220
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/375_pdf.pdf
Cite (ACL):
Manfred Sailer and Beata Trawiński. 2006. The Collection of Distributionally Idiosyncratic Items: A Multilingual Resource for Linguistic Research. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
The Collection of Distributionally Idiosyncratic Items: A Multilingual Resource for Linguistic Research (Sailer & Trawiński, LREC 2006)