Crowdsourced Hedge Term Disambiguation

Morgan Ulinski, Julia Hirschberg


Abstract
We address the issue of acquiring quality annotations of hedging words and phrases, a linguistic phenomenon in which words, sounds, or other constructions are used to express ambiguity or uncertainty. Due to the limited availability of existing corpora annotated for hedging, linguists and other language scientists have been constrained in the extent to which they can study this phenomenon. In this paper, we introduce a new method of acquiring hedging annotations via crowdsourcing, based on reformulating the task of labeling hedges as a simple word sense disambiguation task. We also introduce a new hedging corpus we have constructed by applying this method: a collection of forum posts annotated using Amazon Mechanical Turk. We found that the crowdsourced judgments we obtained had an inter-annotator agreement of 92.89% (Fleiss’ Kappa = 0.751) and, when comparing a subset of these annotations to an expert-annotated gold standard, an accuracy of 96.65%.
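The abstract reports agreement both as raw percentage agreement and as Fleiss' Kappa. As a quick reference for the latter, the sketch below computes Fleiss' Kappa from a per-item category-count matrix; the toy ratings are purely illustrative and are not the paper's data.

import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for a matrix of shape (n_items, n_categories),
    where counts[i, j] is the number of raters who assigned item i
    to category j. Every item must be rated by the same number of raters."""
    counts = np.asarray(counts, dtype=float)
    n_items, _ = counts.shape
    n_raters = counts.sum(axis=1)[0]
    assert np.all(counts.sum(axis=1) == n_raters), "each item needs the same number of ratings"

    # Proportion of all assignments that fall into each category.
    p_j = counts.sum(axis=0) / (n_items * n_raters)
    # Per-item agreement: fraction of rater pairs that agree on that item.
    P_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))

    P_bar = P_i.mean()            # observed agreement
    P_e = np.square(p_j).sum()    # agreement expected by chance
    return (P_bar - P_e) / (1 - P_e)

# Toy example (hypothetical data): 5 hedge-candidate tokens,
# each judged by 3 crowd workers as "hedge" vs. "non-hedge".
ratings = [
    [3, 0],  # all three workers label the token a hedge
    [0, 3],
    [2, 1],
    [3, 0],
    [1, 2],
]
print(round(fleiss_kappa(ratings), 3))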
Anthology ID:
W19-4001
Volume:
Proceedings of the 13th Linguistic Annotation Workshop
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Annemarie Friedrich, Deniz Zeyrek, Jet Hoek
Venue:
LAW
SIG:
SIGANN
Publisher:
Association for Computational Linguistics
Pages:
1–5
URL:
https://aclanthology.org/W19-4001
DOI:
10.18653/v1/W19-4001
Cite (ACL):
Morgan Ulinski and Julia Hirschberg. 2019. Crowdsourced Hedge Term Disambiguation. In Proceedings of the 13th Linguistic Annotation Workshop, pages 1–5, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Crowdsourced Hedge Term Disambiguation (Ulinski & Hirschberg, LAW 2019)
PDF:
https://aclanthology.org/W19-4001.pdf