PAWS: A Multi-lingual Parallel Treebank with Anaphoric Relations

Anna Nedoluzhko, Michal Novák, Maciej Ogrodniczuk


Abstract
We present PAWS, a multi-lingual parallel treebank with coreference annotation. It consists of English texts from the Wall Street Journal translated into Czech, Russian and Polish. In addition, the texts are syntactically parsed and word-aligned. PAWS is based on PCEDT 2.0 and continues the tradition of multilingual treebanks with coreference annotation. The paper focuses on the coreference annotation in PAWS and its language-specific differences. PAWS offers linguistic material that can be further leveraged in cross-lingual studies, especially on coreference.
Anthology ID:
W18-0708
Volume:
Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Massimo Poesio, Vincent Ng, Maciej Ogrodniczuk
Venue:
CRAC
Publisher:
Association for Computational Linguistics
Pages:
68–76
URL:
https://aclanthology.org/W18-0708
DOI:
10.18653/v1/W18-0708
Cite (ACL):
Anna Nedoluzhko, Michal Novák, and Maciej Ogrodniczuk. 2018. PAWS: A Multi-lingual Parallel Treebank with Anaphoric Relations. In Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference, pages 68–76, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
PAWS: A Multi-lingual Parallel Treebank with Anaphoric Relations (Nedoluzhko et al., CRAC 2018)
PDF:
https://aclanthology.org/W18-0708.pdf