Effects of Graph Generation for Unsupervised Non-Contextual Single Document Keyword Extraction

Natalie Schluter


Abstract
This paper presents an exhaustive study on the generation of graph input to unsupervised graph-based non-contextual single document keyword extraction systems. A concrete hypothesis on concept coordination for documents that are scientific articles is put forward, consistent with two separate graph models: one based on word adjacency in the linear text (an approach forming the foundation of all previous graph-based keyword extraction methods), and a novel one based on word adjacency modulo their modifiers. In doing so, we achieve the best NDCG score reported to date (0.431) for any system on the same data. In terms of best-parameter f-score, we achieve the highest reported to date (0.714) at a reasonable ranked list cut-off of n = 6, which is also the best reported f-score for any keyword extraction or generation system in the literature on the same data. The best-parameter f-score corresponds, conservatively, to a reduction in error of 12.6%.
Anthology ID:
2015.jeptalnrecital-court.10
Volume:
Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts
Month:
June
Year:
2015
Address:
Caen, France
Editors:
Jean-Marc Lecarpentier, Nadine Lucas
Venue:
JEP/TALN/RECITAL
Publisher:
ATALA
Pages:
61–67
URL:
https://aclanthology.org/2015.jeptalnrecital-court.10
Cite (ACL):
Natalie Schluter. 2015. Effects of Graph Generation for Unsupervised Non-Contextual Single Document Keyword Extraction. In Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, pages 61–67, Caen, France. ATALA.
Cite (Informal):
Effects of Graph Generation for Unsupervised Non-Contextual Single Document Keyword Extraction (Schluter, JEP/TALN/RECITAL 2015)
PDF:
https://aclanthology.org/2015.jeptalnrecital-court.10.pdf