A Survey of Text Mining Architectures and the UIMA Standard

Mathias Bank, Martin Schierle


Abstract
With the rising amount of digitally available text, the need for efficient processing algorithms is growing fast. Although many libraries are readily available, their modularity and interchangeability are very limited, which forces many reimplementations and modifications, not only in research but also in real-world application scenarios. In recent years, different NLP frameworks have been proposed to provide an efficient, robust and convenient architecture for information processing tasks. This paper presents an overview of the most common approaches, with their advantages and shortcomings, and discusses them with respect to the first standardized architecture, the Unstructured Information Management Architecture (UIMA).
Anthology ID:
L12-1047
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3479–3486
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/183_Paper.pdf
Cite (ACL):
Mathias Bank and Martin Schierle. 2012. A Survey of Text Mining Architectures and the UIMA Standard. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3479–3486, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
A Survey of Text Mining Architectures and the UIMA Standard (Bank & Schierle, LREC 2012)