Unsupervised Rewriter for Multi-Sentence Compression

Yang Zhao, Xiaoyu Shen, Wei Bi, Akiko Aizawa


Abstract
Multi-sentence compression (MSC) aims to generate a grammatical but reduced compression from multiple input sentences while retaining their key information. The previously dominant approach to MSC is the extraction-based word-graph approach. A few variants further leveraged lexical substitution to yield more abstractive compressions. However, two limitations exist. First, the word-graph approach, which simply concatenates fragments from multiple sentences, may yield non-fluent or ungrammatical compressions. Second, lexical substitution is often inappropriate without considering context. To tackle these issues, we present a neural rewriter for multi-sentence compression that requires no parallel corpus. Empirical studies show that our approach achieves comparable results under automatic evaluation and improves the grammaticality of compressions under human evaluation. A parallel corpus with more than 140,000 (sentence group, compression) pairs is also constructed as a by-product for future research.
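The word-graph baseline the abstract contrasts against can be sketched as follows. This is a simplified, illustrative take on word-graph compression, not the paper's method: identical surface forms across sentences are merged into one node (real systems also check POS tags and context), and the edge cost 1/frequency used here to favor words shared by many inputs is an assumption for illustration.

```python
import heapq
from collections import defaultdict

def word_graph_compress(sentences):
    """Merge identical words across sentences into shared graph nodes,
    then return the lowest-cost start->end path as the compression.
    Words that occur in more input sentences are cheaper to traverse."""
    START, END = "<s>", "</s>"
    freq = defaultdict(int)   # occurrence count per merged word node
    edges = defaultdict(set)  # adjacency between merged word nodes
    for sent in sentences:
        tokens = [START] + sent.lower().split() + [END]
        for a, b in zip(tokens, tokens[1:]):
            edges[a].add(b)
        for tok in tokens:
            freq[tok] += 1
    # Dijkstra's algorithm; edge cost 1/freq(target) prefers shared words
    dist, prev = {START: 0.0}, {}
    heap = [(0.0, START)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == END:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nxt in edges[node]:
            nd = d + 1.0 / freq[nxt]
            if nd < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = nd, node
                heapq.heappush(heap, (nd, nxt))
    # Reconstruct the path, dropping the sentinel nodes
    path, node = [], END
    while node != START:
        path.append(node)
        node = prev[node]
    return " ".join(reversed(path[1:]))

compression = word_graph_compress([
    "obama met putin in moscow yesterday",
    "president obama met putin in moscow",
])
print(compression)  # → obama met putin in moscow
```

Because the output is a concatenation of fragments stitched together at merged nodes, nothing enforces fluency across the join points; that is the first limitation the abstract identifies, and the motivation for rewriting the extracted path with a neural model.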
Anthology ID:
P19-1216
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2235–2240
URL:
https://www.aclweb.org/anthology/P19-1216.pdf
DOI:
10.18653/v1/P19-1216
Supplementary:
P19-1216.Supplementary.pdf