Revisiting Pivot-Based Paraphrase Generation: Language Is Not the Only Optional Pivot

Yitao Cai, Yue Cao, Xiaojun Wan


Abstract
Paraphrases are texts that convey the same meaning in different forms of expression. Pivot-based methods, also known as round-trip translation, have shown promising results in generating high-quality paraphrases. However, existing pivot-based methods all rely on language as the pivot, which requires large-scale, high-quality parallel bilingual text. In this paper, we explore the feasibility of using semantic and syntactic representations as the pivot for paraphrase generation. Concretely, we transform a sentence into a variety of semantic or syntactic representations (including AMR, UD, and a latent semantic representation) and then decode the sentence back from these representations. We further explore a pretraining-based approach that compresses the pipeline into an end-to-end framework. We conduct experiments comparing different approaches with different kinds of pivots. Experimental results show that using AMR as the pivot yields paraphrases of better quality than using language as the pivot. The end-to-end framework can reduce semantic shift when language is used as the pivot. In addition, several unsupervised pivot-based methods generate paraphrases of similar quality to those of a supervised sequence-to-sequence model, which indicates that parallel paraphrase data may not be necessary for paraphrase generation.
Anthology ID:
2021.emnlp-main.350
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
4255–4268
URL:
https://aclanthology.org/2021.emnlp-main.350
DOI:
10.18653/v1/2021.emnlp-main.350
Cite (ACL):
Yitao Cai, Yue Cao, and Xiaojun Wan. 2021. Revisiting Pivot-Based Paraphrase Generation: Language Is Not the Only Optional Pivot. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4255–4268, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Revisiting Pivot-Based Paraphrase Generation: Language Is Not the Only Optional Pivot (Cai et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.350.pdf
Video:
https://aclanthology.org/2021.emnlp-main.350.mp4