Building a Hebrew Semantic Role Labeling Lexical Resource from Parallel Movie Subtitles

Ben Eyal, Michael Elhadad


Abstract
We present a semantic role labeling (SRL) resource for Hebrew built semi-automatically through annotation projection from English. This corpus is derived from the multilingual OpenSubtitles dataset and includes short, informal sentences for which reliable linguistic annotations have been computed. We provide a fully annotated version of the data, including morphological analysis, dependency syntax, and semantic role labeling in both FrameNet and PropBank styles. Sentences are aligned between English and Hebrew; both sides include full annotations, together with an explicit mapping from the English arguments to the Hebrew ones. We train a neural SRL model on this Hebrew resource using the pre-trained multilingual BERT transformer model, and provide the first available baseline model for Hebrew SRL as a reference point. The code we provide is generic and can be adapted to other languages to bootstrap SRL resources.
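To make the idea of annotation projection concrete, the following is a minimal sketch, not the authors' pipeline. It assumes word alignments between the English and Hebrew tokens are already available, and it maps each English argument span to the smallest Hebrew span covering its aligned tokens. The function names (`project_span`, `project_roles`), the toy sentence pair, the alignment, and the role labels are all hypothetical and chosen only for illustration; the Hebrew side is transliterated.

```python
from typing import Dict, Optional, Set, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def project_span(span: Span, alignment: Dict[int, Set[int]]) -> Optional[Span]:
    """Map a source span to the smallest target span covering its aligned tokens."""
    start, end = span
    targets = sorted(
        t for s in range(start, end + 1) for t in alignment.get(s, ())
    )
    if not targets:
        return None  # no aligned token: this argument cannot be projected
    return targets[0], targets[-1]


def project_roles(
    roles: Dict[str, Span], alignment: Dict[int, Set[int]]
) -> Dict[str, Span]:
    """Project every labeled source span, dropping the ones with no alignment."""
    projected: Dict[str, Span] = {}
    for label, span in roles.items():
        target = project_span(span, alignment)
        if target is not None:
            projected[label] = target
    return projected


# Toy example (invented for illustration, not taken from the corpus).
english = ["I", "saw", "the", "movie"]
hebrew = ["ra'iti", "et", "ha-seret"]  # transliterated "I-saw ACC the-movie"

# English token index -> set of aligned Hebrew token indices (assumed given).
alignment = {0: {0}, 1: {0}, 2: {2}, 3: {2}}

# PropBank-style labels over English token spans (hypothetical annotation).
roles = {"V": (1, 1), "ARG0": (0, 0), "ARG1": (2, 3)}

result = project_roles(roles, alignment)
for label, (start, end) in result.items():
    print(label, hebrew[start : end + 1])
```

In this toy pair the English subject and verb both align to the single inflected Hebrew verb, so `ARG0` and `V` project onto the same token. Cases like this are one reason an explicit argument mapping between the two sides is useful.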
Anthology ID:
2020.lrec-1.727
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5934–5942
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.727
Cite (ACL):
Ben Eyal and Michael Elhadad. 2020. Building a Hebrew Semantic Role Labeling Lexical Resource from Parallel Movie Subtitles. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5934–5942, Marseille, France. European Language Resources Association.
Cite (Informal):
Building a Hebrew Semantic Role Labeling Lexical Resource from Parallel Movie Subtitles (Eyal & Elhadad, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.727.pdf
Code
 bgunlp/hebrew_srl
Data
FrameNet, OpenSubtitles