Diachronic Analysis of Entities by Exploiting Wikipedia Page revisions

Pierpaolo Basile, Annalina Caputo, Seamus Lawless, Giovanni Semeraro


Abstract
In the last few years, the increasing availability of large corpora spanning several time periods has opened new opportunities for the diachronic analysis of language. This type of analysis can not only bring to light linguistic phenomena related to the shift of word meanings over time, but can also be used to study the impact that societal and cultural trends have on language change. This paper introduces a new resource, built upon Wikipedia page revisions, for performing the diachronic analysis of named entities. By analysing the whole history of Wikipedia internal links, the resource enables the analysis over time of changes in the relations between entities (concepts), surface forms (words), and the contexts surrounding entities and surface forms. We provide use cases that demonstrate the impact of this resource on diachronic studies and outline some possible future uses.
Anthology ID:
R19-1011
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Month:
September
Year:
2019
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
84–91
URL:
https://aclanthology.org/R19-1011
DOI:
10.26615/978-954-452-056-4_011
Cite (ACL):
Pierpaolo Basile, Annalina Caputo, Seamus Lawless, and Giovanni Semeraro. 2019. Diachronic Analysis of Entities by Exploiting Wikipedia Page revisions. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 84–91, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Diachronic Analysis of Entities by Exploiting Wikipedia Page revisions (Basile et al., RANLP 2019)
PDF:
https://aclanthology.org/R19-1011.pdf