Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque

Gorka Urbizu; Ander Soraluze; Olatz Arregi

doi:10.18653/v1/W19-2806

Abstract

In this paper, we present a cross-lingual neural coreference resolution system for a less-resourced language, Basque. First, we build the first neural coreference resolution system for Basque, training it on the relatively small EPEC-KORREF corpus (45,000 words). Next, we design a cross-lingual coreference resolution system. With this approach, the system learns from a larger English corpus, using cross-lingual embeddings, to perform coreference resolution for Basque. The cross-lingual system obtains slightly better results (40.93 CoNLL F1) than the monolingual system (39.12 CoNLL F1), without using any Basque corpus for training.

Anthology ID: W19-2806
Volume: Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference
Month: June
Year: 2019
Address: Minneapolis, USA
Editors: Maciej Ogrodniczuk, Sameer Pradhan, Yulia Grishina, Vincent Ng
Venue: CRAC
Publisher: Association for Computational Linguistics
Pages: 35–41
URL: https://aclanthology.org/W19-2806
DOI: 10.18653/v1/W19-2806
Cite (ACL): Gorka Urbizu, Ander Soraluze, and Olatz Arregi. 2019. Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque. In Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference, pages 35–41, Minneapolis, USA. Association for Computational Linguistics.
Cite (Informal): Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque (Urbizu et al., CRAC 2019)
PDF: https://aclanthology.org/W19-2806.pdf
