Evaluating the Morphosyntactic Well-formedness of Generated Texts

Adithya Pratapa, Antonios Anastasopoulos, Shruti Rijhwani, Aditi Chaudhary, David R. Mortensen, Graham Neubig, Yulia Tsvetkov


Abstract
Text generation systems are ubiquitous in natural language processing applications. However, evaluation of these systems remains a challenge, especially in multilingual settings. In this paper, we propose L’AMBRE – a metric to evaluate the morphosyntactic well-formedness of text using its dependency parse and morphosyntactic rules of the language. We present a way to automatically extract various rules governing morphosyntax directly from dependency treebanks. To tackle the noisy outputs from text generation systems, we propose a simple methodology to train robust parsers. We show the effectiveness of our metric on the task of machine translation through a diachronic study of systems translating into morphologically-rich languages.
Anthology ID:
2021.emnlp-main.570
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7131–7150
URL:
https://aclanthology.org/2021.emnlp-main.570
DOI:
10.18653/v1/2021.emnlp-main.570
Cite (ACL):
Adithya Pratapa, Antonios Anastasopoulos, Shruti Rijhwani, Aditi Chaudhary, David R. Mortensen, Graham Neubig, and Yulia Tsvetkov. 2021. Evaluating the Morphosyntactic Well-formedness of Generated Texts. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7131–7150, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Evaluating the Morphosyntactic Well-formedness of Generated Texts (Pratapa et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.570.pdf
Video:
https://aclanthology.org/2021.emnlp-main.570.mp4
Code:
adithya7/lambre