An Unsupervised Multi-Document Summarization Framework Based on Neural Document Model

Shulei Ma, Zhi-Hong Deng, Yunlun Yang


Abstract
In the age of information explosion, multi-document summarization is attracting particular attention for its ability to help people grasp the main ideas in a short time. Traditional extractive methods simply treat the document set as a group of sentences while ignoring the global semantics of the documents. Meanwhile, neural document models are effective at representing the semantic content of documents as low-dimensional vectors. In this paper, we propose a document-level reconstruction framework named DocRebuild, which reconstructs the documents from summary sentences through a neural document model and selects summary sentences to minimize the reconstruction error. We also apply two strategies, sentence filtering and beam search, to improve the performance of our method. Experimental results on the benchmark datasets DUC 2006 and DUC 2007 show that DocRebuild is effective and outperforms state-of-the-art unsupervised algorithms.
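The selection idea in the abstract (choose summary sentences whose encoding best reconstructs the document representation, searching with a beam) can be illustrated with a minimal sketch. This is not the authors' implementation: the neural document model is replaced by a plain average of sentence vectors, the reconstruction error is assumed to be squared Euclidean distance, and the names `mean_vector`, `squared_error`, and `beam_search_summary` are hypothetical. Sentence filtering is left out because the abstract does not state its criterion.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Stand-in for a neural document model (assumption): average the vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def squared_error(a: Vector, b: Vector) -> float:
    """Assumed reconstruction error: squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def beam_search_summary(
    sentence_vecs: Sequence[Vector],
    doc_vec: Vector,
    summary_size: int,
    beam_width: int = 3,
    encode: Callable[[Sequence[Vector]], Vector] = mean_vector,
) -> Tuple[List[int], float]:
    """Pick sentence indices whose encoding is closest to doc_vec.

    Each round extends every kept partial summary by one unused sentence,
    scores the result, and keeps the beam_width lowest-error candidates.
    """
    beams: List[Tuple[Tuple[int, ...], float]] = [((), float("inf"))]
    for _ in range(summary_size):
        candidates = {}
        for chosen, _ in beams:
            for i in range(len(sentence_vecs)):
                if i in chosen:
                    continue
                extended = tuple(sorted(chosen + (i,)))
                if extended in candidates:
                    continue  # same sentence set reached via another beam
                encoded = encode([sentence_vecs[j] for j in extended])
                candidates[extended] = squared_error(encoded, doc_vec)
        if not candidates:
            break  # fewer sentences than summary_size
        beams = sorted(candidates.items(), key=lambda kv: kv[1])[:beam_width]
    best, error = beams[0]
    return list(best), error
```

With `beam_width=1` this reduces to greedy selection; a wider beam keeps several partial summaries alive, so a sentence that looks poor on its own can still be chosen when it pairs well with another.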
Anthology ID:
C16-1143
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
1514–1523
URL:
https://aclanthology.org/C16-1143
Cite (ACL):
Shulei Ma, Zhi-Hong Deng, and Yunlun Yang. 2016. An Unsupervised Multi-Document Summarization Framework Based on Neural Document Model. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1514–1523, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
An Unsupervised Multi-Document Summarization Framework Based on Neural Document Model (Ma et al., COLING 2016)
PDF:
https://aclanthology.org/C16-1143.pdf