Term and Collocation Extraction by Means of Complex Linguistic Web Services

Ulrich Heid, Fabienne Fritzinger, Erhard Hinrichs, Marie Hinrichs, Thomas Zastrow


Abstract
We present a web service-based environment for the use of linguistic resources and tools to address issues of terminology and language varieties. We discuss the architecture, corpus representation formats, components and a chainer supporting the combination of tools into task-specific services. Integrated into this environment, single web services also become part of complex scenarios for web service use. Our web services take for example corpora of several million words as an input on which they perform preprocessing, such as tokenisation, tagging, lemmatisation and parsing, and corpus exploration, such as collocation extraction and corpus comparison. Here we present an example on extraction of single and multiword items typical of a specific domain or typical of a regional variety of German. We also give a critical review on needs and available functions from a user's point of view. The work presented here is part of ongoing experimentation in the D-SPIN project, the German national counterpart of CLARIN.
Anthology ID:
L10-1251
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/363_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Ulrich Heid, Fabienne Fritzinger, Erhard Hinrichs, Marie Hinrichs, and Thomas Zastrow. 2010. Term and Collocation Extraction by Means of Complex Linguistic Web Services. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Term and Collocation Extraction by Means of Complex Linguistic Web Services (Heid et al., LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/363_Paper.pdf