EVALIEX — A Proposal for an Extended Evaluation Methodology for Information Extraction Systems

Christina Feilmayr, Birgit Pröll, Elisabeth Linsmayr


Abstract
Assessing the correctness of extracted data requires performance evaluation, which is accomplished by calculating quality metrics. The evaluation process must cope with the challenges posed by information extraction and natural language processing. Previous work has shown that most existing methodologies support only traditional scoring metrics. Our research addresses requirements that arose during the development of three productive rule-based information extraction systems. The main contribution is twofold: First, we developed a proposal for an evaluation methodology that provides the flexibility and effectiveness needed for comprehensive performance measurement. The proposal extends state-of-the-art scoring metrics by measuring string and semantic similarities and by parameterizing metric scoring, thus simulating human judgment. Second, we implemented an IE evaluation tool named EVALIEX, which integrates these measurement concepts and provides an efficient user interface that supports evaluation control and the visualization of IE results. To guarantee domain independence, the tool additionally provides a Generic Mapper for XML Instances (GeMap) that maps domain-dependent XML files containing IE results to generic ones. Compared to other tools, EVALIEX provides more flexible testing and better visualization of extraction results for the comparison of different (versions of) information extraction systems.
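The idea of extending traditional scoring with string similarity and a tunable parameter can be illustrated with a minimal sketch. It is not the EVALIEX implementation: the function names `similarity` and `score`, the slot dictionaries, the sample values, and the choice of Python's `difflib` ratio as the string measure are all assumptions made for illustration, and semantic similarity is not covered.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Normalized, case-insensitive string similarity in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(extracted: dict, gold: dict, threshold: float = 0.8) -> dict:
    """Precision, recall and F1 over slots (hypothetical helper).

    A slot counts as correct when its extracted value is at least
    `threshold`-similar to the gold value. threshold=1.0 reduces to
    exact (case-insensitive) matching, i.e. traditional scoring.
    """
    correct = sum(
        1
        for slot, value in extracted.items()
        if slot in gold and similarity(value, gold[slot]) >= threshold
    )
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented sample data: one near-miss spelling, one wrong value.
gold = {"name": "Jonathan Smith", "city": "Linz"}
extracted = {"name": "Jonathon Smith", "city": "Vienna"}

strict = score(extracted, gold, threshold=1.0)   # exact matching only
lenient = score(extracted, gold, threshold=0.8)  # tolerates small spelling differences
```

With exact matching the near-miss name is scored as an error, whereas the lenient threshold accepts it, which is closer to how a human judge would likely rate the result. The threshold is the kind of parameter an evaluator might tune per slot type.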
Anthology ID:
L12-1060
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2303–2310
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/204_Paper.pdf
Cite (ACL):
Christina Feilmayr, Birgit Pröll, and Elisabeth Linsmayr. 2012. EVALIEX — A Proposal for an Extended Evaluation Methodology for Information Extraction Systems. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 2303–2310, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
EVALIEX — A Proposal for an Extended Evaluation Methodology for Information Extraction Systems (Feilmayr et al., LREC 2012)
PDF:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/204_Paper.pdf