Reinforced Counterfactual Data Augmentation for Dual Sentiment Classification

Hao Chen, Rui Xia, Jianfei Yu


Abstract
Data augmentation and adversarial perturbation approaches have recently achieved promising results in solving the over-fitting problem in many natural language processing (NLP) tasks including sentiment classification. However, existing studies aimed to improve the generalization ability by augmenting the training data with synonymous examples or adding random noise to word embeddings, which cannot address the spurious association problem. In this work, we propose an end-to-end reinforcement learning framework, which jointly performs counterfactual data generation and dual sentiment classification. Our approach has three characteristics: 1) the generator automatically generates massive and diverse antonymous sentences; 2) the discriminator contains an original-side sentiment predictor and an antonymous-side sentiment predictor, which jointly evaluate the quality of the generated sample and help the generator iteratively generate higher-quality antonymous samples; 3) the discriminator is directly used as the final sentiment classifier without the need to build an extra one. Extensive experiments show that our approach outperforms strong data augmentation baselines on several benchmark sentiment classification datasets. Further analysis confirms our approach’s advantages in generating more diverse training samples and solving the spurious association problem in sentiment classification.
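As a rough illustration of the dual prediction idea in the abstract, the sketch below combines an original-side score with an antonymous-side score into one decision. It is not the authors' implementation: the function names, the linear interpolation, and the `weight` parameter are all assumptions made for this example, and the abstract does not specify how the two predictors are actually combined.

```python
def dual_predict(p_pos_original: float, p_pos_antonymous: float,
                 weight: float = 0.5) -> float:
    """Return a combined probability that the ORIGINAL sentence is positive.

    p_pos_original:   P(positive | original sentence), original-side predictor.
    p_pos_antonymous: P(positive | antonymous sentence), antonymous-side predictor.
    weight:           hypothetical interpolation weight (an assumption, not from the paper).
    """
    for name, value in (("p_pos_original", p_pos_original),
                        ("p_pos_antonymous", p_pos_antonymous),
                        ("weight", weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    # A negative-looking antonymous sentence is evidence that the
    # original sentence is positive, hence the (1 - p) term.
    return weight * p_pos_original + (1.0 - weight) * (1.0 - p_pos_antonymous)


def classify(p_pos_original: float, p_pos_antonymous: float,
             weight: float = 0.5) -> str:
    """Map the combined probability to a binary sentiment label."""
    score = dual_predict(p_pos_original, p_pos_antonymous, weight)
    return "positive" if score >= 0.5 else "negative"
```

In this sketch, an original-side predictor that is unsure about a sentence (say 0.4) can still be overruled when the antonymous-side predictor is confident that the reversed sentence is negative (say 0.1), which is one way two views of the same example might reduce reliance on a single spurious cue.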
Anthology ID:
2021.emnlp-main.24
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
269–278
URL:
https://aclanthology.org/2021.emnlp-main.24
DOI:
10.18653/v1/2021.emnlp-main.24
Cite (ACL):
Hao Chen, Rui Xia, and Jianfei Yu. 2021. Reinforced Counterfactual Data Augmentation for Dual Sentiment Classification. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 269–278, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Reinforced Counterfactual Data Augmentation for Dual Sentiment Classification (Chen et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.24.pdf
Video:
https://aclanthology.org/2021.emnlp-main.24.mp4
Code:
nustm/rcda
Data:
SST, SST-2, SST-5