The EDGeS Diachronic Bible Corpus

Gerlof Bouma, Evie Coussé, Trude Dijkstra, Nicoline van der Sijs


Abstract
We present the EDGeS Diachronic Bible Corpus: a diachronically and synchronically parallel corpus of Bible translations in Dutch, English, German and Swedish, with texts from the 14th century until today. It is compiled in the context of an intended longitudinal and contrastive study of complex verb constructions in Germanic. The paper discusses the corpus design principles, its selection of 36 Bibles, and the information and metadata encoded for the corpus texts. The EDGeS corpus will be available in two forms: the whole corpus will be accessible for researchers behind a login in the well-known OPUS search infrastructure, and the open subpart of the corpus will be available for download.
Anthology ID:
2020.lrec-1.644
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5232–5239
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.644
Cite (ACL):
Gerlof Bouma, Evie Coussé, Trude Dijkstra, and Nicoline van der Sijs. 2020. The EDGeS Diachronic Bible Corpus. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5232–5239, Marseille, France. European Language Resources Association.
Cite (Informal):
The EDGeS Diachronic Bible Corpus (Bouma et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.644.pdf