Automatic semantic role labeling in Ancient Greek using distributional semantic modeling

Alek Keersmaekers


Abstract
This paper describes a first attempt to automatic semantic role labeling in Ancient Greek, using a supervised machine learning approach. A Random Forest classifier is trained on a small semantically annotated corpus of Ancient Greek, annotated with a large amount of linguistic features, including form of the construction, morphology, part-of-speech, lemmas, animacy, syntax and distributional vectors of Greek words. These vectors turned out to be more important in the model than any other features, likely because they are well suited to handle a low amount of training examples. Overall labeling accuracy was 0.757, with large differences with respect to the specific role that was labeled and with respect to text genre. Some ways to further improve these results include expanding the amount of training examples, improving the quality of the distributional vectors and increasing the consistency of the syntactic annotation.
Anthology ID:
2020.lt4hala-1.9
Volume:
Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Rachele Sprugnoli, Marco Passarotti
Venue:
LT4HALA
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
59–67
Language:
English
URL:
https://aclanthology.org/2020.lt4hala-1.9
DOI:
Bibkey:
Cite (ACL):
Alek Keersmaekers. 2020. Automatic semantic role labeling in Ancient Greek using distributional semantic modeling. In Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages, pages 59–67, Marseille, France. European Language Resources Association (ELRA).
Cite (Informal):
Automatic semantic role labeling in Ancient Greek using distributional semantic modeling (Keersmaekers, LT4HALA 2020)
Copy Citation:
PDF:
https://aclanthology.org/2020.lt4hala-1.9.pdf