Posterior Differential Regularization with f-divergence for Improving Model Robustness

Hao Cheng, Xiaodong Liu, Lis Pereira, Yaoliang Yu, Jianfeng Gao


Abstract
We address the problem of enhancing model robustness through regularization. Specifically, we focus on methods that regularize the model posterior difference between clean and noisy inputs. Theoretically, we establish a connection between two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework. Additionally, we generalize the posterior differential regularization to the family of f-divergences and characterize the overall framework in terms of the Jacobian matrix. Empirically, we compare these regularizers and standard BERT training on a diverse set of tasks to provide a comprehensive profile of their effect on model generalization. For both fully supervised and semi-supervised settings, we show that regularizing the posterior difference with an f-divergence can substantially improve model robustness. In particular, with a proper f-divergence, a BERT-base model can achieve generalization comparable to its BERT-large counterpart in in-domain, adversarial, and domain-shift scenarios, indicating the great potential of the proposed framework for enhancing NLP model robustness.
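To make the abstract's core idea concrete, here is a minimal sketch (not the authors' released implementation) of posterior differential regularization in PyTorch, using KL divergence as one instance of the f-divergence family. The ToyClassifier, the noise_std parameter, and the random Gaussian perturbation are illustrative assumptions; the paper also considers adversarially chosen noise as in Virtual Adversarial Training.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyClassifier(nn.Module):
        # Hypothetical stand-in for a BERT-style encoder plus classification head.
        def __init__(self, dim=16, num_classes=3):
            super().__init__()
            self.encoder = nn.Linear(dim, dim)
            self.head = nn.Linear(dim, num_classes)

        def forward(self, embeds):
            return self.head(torch.tanh(self.encoder(embeds)))

    def posterior_differential_loss(model, embeds, noise_std=1e-3):
        # Model posterior on clean inputs.
        log_p_clean = F.log_softmax(model(embeds), dim=-1)
        # Model posterior on noise-perturbed inputs (random Gaussian noise
        # here; VAT-style adversarial noise is another choice).
        noise = noise_std * torch.randn_like(embeds)
        log_p_noisy = F.log_softmax(model(embeds + noise), dim=-1)
        # KL(p_clean || p_noisy): one member of the f-divergence family
        # used to penalize the posterior difference.
        return F.kl_div(log_p_noisy, log_p_clean, log_target=True,
                        reduction="batchmean")

    model = ToyClassifier()
    embeds = torch.randn(8, 16)  # a batch of (hypothetical) input embeddings
    reg = posterior_differential_loss(model, embeds)

In training, this penalty would typically be added to the supervised task loss with a tunable weight, e.g. loss = task_loss + lam * reg, so that the model is encouraged to produce stable posteriors under input perturbation.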
Anthology ID:
2021.naacl-main.85
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1078–1089
URL:
https://aclanthology.org/2021.naacl-main.85
DOI:
10.18653/v1/2021.naacl-main.85
Cite (ACL):
Hao Cheng, Xiaodong Liu, Lis Pereira, Yaoliang Yu, and Jianfeng Gao. 2021. Posterior Differential Regularization with f-divergence for Improving Model Robustness. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1078–1089, Online. Association for Computational Linguistics.
Cite (Informal):
Posterior Differential Regularization with f-divergence for Improving Model Robustness (Cheng et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.85.pdf
Video:
https://aclanthology.org/2021.naacl-main.85.mp4
Code:
additional community code
Data:
BioASQ, IMDb Movie Reviews, MRQA, MultiNLI, SQuAD, SST, SST-2