Disambiguating False-Alarm Hashtag Usages in Tweets for Irony Detection

Hen-Hsen Huang, Chiao-Chen Chen, Hsin-Hsi Chen


Abstract
The reliability of self-labeled data is an important issue when the data are regarded as ground truth for training and testing learning-based models. This paper addresses the issue of false-alarm hashtags in the self-labeled data for irony detection. We analyze the ambiguity of hashtag usages and propose a novel neural network-based model, which incorporates linguistic information from different aspects, to disambiguate the usage of three hashtags that are widely used to collect the training data for irony detection. Furthermore, we apply our model to prune the self-labeled training data. Experimental results show that the irony detection model trained on fewer but cleaner training instances outperforms the models trained on all data.
Anthology ID:
P18-2122
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
771–777
URL:
https://aclanthology.org/P18-2122
DOI:
10.18653/v1/P18-2122
Cite (ACL):
Hen-Hsen Huang, Chiao-Chen Chen, and Hsin-Hsi Chen. 2018. Disambiguating False-Alarm Hashtag Usages in Tweets for Irony Detection. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 771–777, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Disambiguating False-Alarm Hashtag Usages in Tweets for Irony Detection (Huang et al., ACL 2018)
PDF:
https://aclanthology.org/P18-2122.pdf
Presentation:
 P18-2122.Presentation.pdf
Video:
 https://aclanthology.org/P18-2122.mp4