Linguagrid: a network of Linguistic and Semantic Services for the Italian Language.

Alessio Bosca, Luca Dini, Milen Kouylekov, Marco Trevisan


Abstract
In order to handle the increasing amount of textual information available on the web today and to exploit the knowledge latent in this mass of unstructured data, a wide variety of linguistic knowledge and resources (Language Identification, Morphological Analysis, Entity Extraction, etc.) is crucial. In the last decade, LRaaS (Language Resource as a Service) emerged as a novel paradigm for publishing and sharing these heterogeneous software resources over the Web. In this paper we present an overview of Linguagrid, a recent initiative that implements an open network of linguistic and semantic Web Services for the Italian language, as well as a new approach for enabling customizable corpus-based linguistic services on the Linguagrid LRaaS infrastructure. A corpus ingestion service allows users to upload corpora of documents and to generate classification/clustering models tailored to their needs by means of standard machine learning techniques applied to the textual contents and metadata of the corpora. The models so generated can then be accessed through dedicated Web Services and used to process and classify new textual contents.
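The corpus-to-model workflow described in the abstract can be illustrated with a minimal sketch. This is not Linguagrid's implementation or API: the abstract only says "standard machine learning techniques", so a multinomial Naive Bayes classifier with add-one smoothing is used here as one plausible stand-in, and the names `tokenize`, `train`, `classify`, and `sample_corpus` are hypothetical. The upload step and the Web Service layer are omitted; the corpus is a small in-memory list of labelled Italian sentences.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and extract runs of letters (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def train(corpus):
    """Build a multinomial Naive Bayes model from (text, label) pairs.

    Hypothetical stand-in for the model-generation step; the real service
    may use different techniques and also exploit document metadata.
    """
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # token frequencies per label
    vocabulary = set()
    for text, label in corpus:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return {"labels": label_counts, "words": word_counts, "vocab": vocabulary}


def classify(model, text):
    """Return the label with the highest log-probability for the given text."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    tokens = tokenize(text)
    best_label, best_score = None, -math.inf
    for label, doc_count in model["labels"].items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(model["words"][label].values())
        for token in tokens:
            count = model["words"][label][token]
            # Add-one (Laplace) smoothing so unseen tokens do not zero the score.
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Assumed toy corpus standing in for a user-uploaded collection of documents.
sample_corpus = [
    ("La squadra ha vinto la partita di calcio", "sport"),
    ("Il giocatore ha segnato un gol", "sport"),
    ("Il governo ha approvato la nuova legge", "politica"),
    ("Il parlamento vota la riforma", "politica"),
]

model = train(sample_corpus)
print(classify(model, "Il gol decisivo della partita"))
```

In a service-oriented setting, `train` would correspond to the model-generation step triggered after corpus ingestion and `classify` to the operation exposed to clients, but how Linguagrid actually maps these onto Web Services is not specified in the abstract.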
Anthology ID:
L12-1516
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3304–3307
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/867_Paper.pdf
Cite (ACL):
Alessio Bosca, Luca Dini, Milen Kouylekov, and Marco Trevisan. 2012. Linguagrid: a network of Linguistic and Semantic Services for the Italian Language. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3304–3307, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Linguagrid: a network of Linguistic and Semantic Services for the Italian Language. (Bosca et al., LREC 2012)