AttentionRank: Unsupervised Keyphrase Extraction using Self and Cross Attentions

Haoran Ding, Xiao Luo


Abstract
Keyword or keyphrase extraction aims to identify words or phrases that present the main topics of a document. This paper proposes AttentionRank, a hybrid attention model that identifies keyphrases from a document in an unsupervised manner. AttentionRank calculates self-attention and cross-attention using a pre-trained language model. The self-attention determines the importance of a candidate within the context of a sentence. The cross-attention identifies the semantic relevance between a candidate and the sentences of a document. We evaluate AttentionRank on three publicly available datasets against seven baselines. The results show that AttentionRank is an effective and robust unsupervised keyphrase extraction model on both long and short documents. Source code is available on GitHub.
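The two scoring signals described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the real model reads self-attention weights and hidden states from a pre-trained language model such as BERT, whereas here random vectors stand in for token and sentence representations, and all function names are hypothetical. The sketch only shows the shape of the idea: a candidate's self-attention importance within a sentence, combined with its cross-attention (here, cosine) relevance to the document's sentences.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_importance(token_emb):
    """Toy single-head self-attention with Q = K = token embeddings.
    A token's importance is the total attention it receives."""
    att = softmax(token_emb @ token_emb.T / np.sqrt(token_emb.shape[1]))
    return att.sum(axis=0)

def cross_attention_relevance(cand_emb, sent_embs):
    """Mean cosine similarity between a candidate phrase embedding
    and each sentence embedding of the document (stand-in for the
    paper's cross-attention)."""
    cand = cand_emb / np.linalg.norm(cand_emb)
    sents = sent_embs / np.linalg.norm(sent_embs, axis=1, keepdims=True)
    return float((sents @ cand).mean())

rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 8))  # 6 tokens, dim 8 (stand-in for LM states)
sents = rng.normal(size=(3, 8))   # 3 sentence embeddings (stand-ins)

self_imp = self_attention_importance(tokens)
cand = tokens[1:3].mean(axis=0)   # a two-token candidate phrase
score = self_imp[1:3].sum() * cross_attention_relevance(cand, sents)
print(f"candidate score: {score:.4f}")
```

In the actual model, candidates are ranked by such combined scores and the top-ranked phrases are returned as keyphrases; the exact combination of the two signals follows the paper, not this sketch.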
Anthology ID:
2021.emnlp-main.146
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1919–1928
URL:
https://aclanthology.org/2021.emnlp-main.146
DOI:
10.18653/v1/2021.emnlp-main.146
Cite (ACL):
Haoran Ding and Xiao Luo. 2021. AttentionRank: Unsupervised Keyphrase Extraction using Self and Cross Attentions. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 1919–1928, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
AttentionRank: Unsupervised Keyphrase Extraction using Self and Cross Attentions (Ding & Luo, EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.146.pdf
Video:
https://aclanthology.org/2021.emnlp-main.146.mp4
Code:
hd10-iupui/attentionrank