Evaluating Machine Translation Utility via Semantic Role Labels

Chi-kiu Lo, Dekai Wu


Abstract
We present the methodology that underlies new metrics for semantic machine translation evaluation we are developing. Unlike widely used lexical and n-gram based MT evaluation metrics, the aim of semantic MT evaluation is to measure the utility of translations. We discuss the design of empirical studies to evaluate the utility of machine translation output by assessing the accuracy for key semantic roles. These roles are from the English 5W templates (who, what, when, where, why) used in recent GALE distillation evaluations. Recent work by Wu and Fung (2009) introduced semantic role labeling into statistical machine translation to enhance the quality of MT output. However, this approach has so far only been evaluated using lexical and n-gram based SMT evaluation metrics like BLEU, which are not aimed at evaluating the utility of MT output. Direct data analyses are still needed to understand how semantic models can be leveraged to evaluate the utility of MT output. In this paper, we discuss a new methodology for evaluating the utility of machine translation output by assessing the accuracy with which human readers are able to complete the English 5W templates.
Anthology ID:
L10-1521
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/752_Paper.pdf
Cite (ACL):
Chi-kiu Lo and Dekai Wu. 2010. Evaluating Machine Translation Utility via Semantic Role Labels. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Evaluating Machine Translation Utility via Semantic Role Labels (Lo & Wu, LREC 2010)
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/752_Paper.pdf