SAPPHIRE: Simple Aligner for Phrasal Paraphrase with Hierarchical Representation

Masato Yoshinaka, Tomoyuki Kajiwara, Yuki Arase


Abstract
We present SAPPHIRE, a Simple Aligner for Phrasal Paraphrase with HIerarchical REpresentation. Monolingual phrase alignment is a fundamental problem in natural language understanding and a crucial technique in applications such as natural language inference and semantic textual similarity assessment. Previous methods for monolingual phrase alignment are language-resource intensive: they require large-scale synonym/paraphrase lexica and high-quality parsers. In contrast, SAPPHIRE depends only on a monolingual corpus for training word embeddings, so it is easily transferable to specific domains and other languages. Specifically, SAPPHIRE first obtains word alignments using pre-trained word embeddings and then expands them to phrase alignments using phrase extraction methods developed for bilingual alignment. To estimate the likelihood of phrase alignments, SAPPHIRE uses phrase embeddings that are hierarchically composed from word embeddings. Finally, SAPPHIRE searches for a set of consistent phrase alignments on a lattice of phrase alignment candidates. It achieves search efficiency by constraining the lattice so that all paths go through the phrase alignment pair with the highest alignment score. Experimental results on the standard dataset for phrase alignment evaluation show that SAPPHIRE outperforms the previous method and establishes a new state of the art.
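To make the first step of the pipeline concrete, here is a minimal Python sketch of embedding-based word alignment followed by a simple phrase embedding. It is an illustration under stated assumptions, not the authors' implementation: the mutual-best-match rule, the `threshold` parameter, the mean-pooling composition in `phrase_vector`, and the toy 2-dimensional vectors are all hypothetical choices, since the abstract does not specify these details.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def align_words(src_vecs, tgt_vecs, threshold=0.5):
    """Return (source index, target index) pairs that are mutual best matches.

    Assumption: a pair is kept only if each word is the other's most similar
    word and the similarity reaches `threshold`. This mirrors the intersection
    heuristic common in bilingual alignment; the paper may use another rule.
    """
    sim = [[cosine(s, t) for t in tgt_vecs] for s in src_vecs]
    pairs = []
    for i, row in enumerate(sim):
        # Best target word for source word i.
        j = max(range(len(row)), key=row.__getitem__)
        # Best source word for that target word.
        best_i = max(range(len(sim)), key=lambda k: sim[k][j])
        if best_i == i and row[j] >= threshold:
            pairs.append((i, j))
    return pairs


def phrase_vector(word_vecs):
    """Compose a phrase embedding as the mean of its word embeddings.

    Assumption: mean pooling stands in for the hierarchical composition
    mentioned in the abstract.
    """
    n = len(word_vecs)
    return [sum(column) / n for column in zip(*word_vecs)]


# Toy embeddings (hypothetical values) for a 3-word and a 2-word sentence.
src = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
tgt = [[0.1, 0.9], [0.9, 0.1]]

alignments = align_words(src, tgt)
print(alignments)
print(phrase_vector(src[:2]))
```

In this toy input the third source word is equally similar to both target words and is nobody's best match, so it stays unaligned. In the pipeline described above, such word alignments would then be expanded into phrase alignment candidates and scored with phrase embeddings before the lattice search.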
Anthology ID:
2020.lrec-1.847
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6861–6867
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.847
Cite (ACL):
Masato Yoshinaka, Tomoyuki Kajiwara, and Yuki Arase. 2020. SAPPHIRE: Simple Aligner for Phrasal Paraphrase with Hierarchical Representation. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 6861–6867, Marseille, France. European Language Resources Association.
Cite (Informal):
SAPPHIRE: Simple Aligner for Phrasal Paraphrase with Hierarchical Representation (Yoshinaka et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.847.pdf