Exploiting Noisy Data in Distant Supervision Relation Classification

Kaijia Yang, Liang He, Xin-yu Dai, Shujian Huang, Jiajun Chen


Abstract
Distant supervision has achieved great progress on the relation classification task. However, it still suffers from the noisy labeling problem. Unlike previous works, which underutilize noisy data even though such data inherently characterize the properties of the classification task, in this paper we propose RCEND, a novel framework to enhance Relation Classification by Exploiting Noisy Data. First, an instance discriminator trained with reinforcement learning is designed to split the noisy data into correctly labeled data and incorrectly labeled data. Second, we learn a robust relation classifier in a semi-supervised way, whereby the correctly and incorrectly labeled data are treated as labeled and unlabeled data, respectively. Experimental results show that our method outperforms state-of-the-art models.
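To make the two-stage framework described in the abstract concrete, below is a minimal PyTorch sketch (not the authors' code): an instance discriminator trained with REINFORCE decides which distantly labeled instances to treat as correctly labeled, and a relation classifier is then updated in a semi-supervised fashion on the resulting split. The sentence encoder is abstracted away as fixed-size feature vectors, and the reward (the classifier's log-likelihood of the distant label, negated for flagged instances) and the unsupervised objective (confidence-thresholded pseudo-labeling) are illustrative assumptions, not the components used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InstanceDiscriminator(nn.Module):
    """Policy network: scores how likely each instance's distant label is correct."""
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)

    def forward(self, x):
        return torch.sigmoid(self.scorer(x)).squeeze(-1)


class RelationClassifier(nn.Module):
    """Simple relation classifier over pre-encoded instance features."""
    def __init__(self, dim, n_rel):
        super().__init__()
        self.out = nn.Linear(dim, n_rel)

    def forward(self, x):
        return self.out(x)


def discriminator_step(disc, clf, x, y, disc_opt):
    """REINFORCE update for the discriminator. The reward is a simple proxy:
    the classifier's log-likelihood of the distant label for kept instances,
    and its negation for flagged ones (an assumption, not the paper's reward)."""
    probs = disc(x)                               # P(label is correct) per instance
    actions = torch.bernoulli(probs)              # 1 = keep as correctly labeled
    with torch.no_grad():
        logp_y = F.log_softmax(clf(x), dim=-1).gather(1, y.unsqueeze(1)).squeeze(1)
        reward = torch.where(actions.bool(), logp_y, -logp_y)
        reward = reward - reward.mean()           # mean baseline for variance reduction
    log_pi = torch.where(actions.bool(),
                         probs.clamp_min(1e-8).log(),
                         (1.0 - probs).clamp_min(1e-8).log())
    loss = -(reward * log_pi).mean()
    disc_opt.zero_grad(); loss.backward(); disc_opt.step()
    return actions.bool()


def classifier_step(clf, x, y, keep, clf_opt, tau=0.95):
    """Semi-supervised update: cross-entropy on kept (treated-as-labeled) instances,
    plus confidence-thresholded pseudo-labeling on flagged (treated-as-unlabeled)
    instances -- a stand-in for the paper's semi-supervised objective."""
    logits = clf(x)
    loss = logits.sum() * 0.0                     # scalar tensor so backward always works
    if keep.any():
        loss = loss + F.cross_entropy(logits[keep], y[keep])
    if (~keep).any():
        with torch.no_grad():
            conf, pseudo = F.softmax(logits[~keep], dim=-1).max(dim=-1)
            sel = conf > tau
        if sel.any():
            loss = loss + F.cross_entropy(logits[~keep][sel], pseudo[sel])
    clf_opt.zero_grad(); loss.backward(); clf_opt.step()
    return float(loss)


if __name__ == "__main__":
    dim, n_rel, batch = 64, 10, 32
    disc, clf = InstanceDiscriminator(dim), RelationClassifier(dim, n_rel)
    disc_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
    clf_opt = torch.optim.Adam(clf.parameters(), lr=1e-3)
    x = torch.randn(batch, dim)                   # stand-in for encoded sentences
    y = torch.randint(0, n_rel, (batch,))         # noisy distant-supervision labels
    for _ in range(5):                            # alternate the two stages
        keep = discriminator_step(disc, clf, x, y, disc_opt)
        classifier_step(clf, x, y, keep, clf_opt)
```

The toy loop only illustrates the control flow of alternating the two updates; in a real setting the discriminator's reward would be tied to the classifier's performance on held-out data and both stages would run over many batches.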
Anthology ID:
N19-1325
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3216–3225
URL:
https://aclanthology.org/N19-1325
DOI:
10.18653/v1/N19-1325
Cite (ACL):
Kaijia Yang, Liang He, Xin-yu Dai, Shujian Huang, and Jiajun Chen. 2019. Exploiting Noisy Data in Distant Supervision Relation Classification. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3216–3225, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Exploiting Noisy Data in Distant Supervision Relation Classification (Yang et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-1325.pdf