Few-shot Pseudo-Labeling for Intent Detection

Thomas Dopierre, Christophe Gravier, Julien Subercaze, Wilfried Logerais


Abstract
In this paper, we introduce a state-of-the-art pseudo-labeling technique for few-shot intent detection. We devise a folding/unfolding hierarchical clustering algorithm which assigns weighted pseudo-labels to unlabeled user utterances. We show that our two-step method yields significant improvement over existing solutions. This performance is achieved on multiple intent detection datasets, even in more challenging situations where the number of classes is large or when the dataset is highly imbalanced. Moreover, we confirm these results on the more general text classification task. We also demonstrate that our approach nicely complements existing solutions, thereby providing an even stronger state-of-the-art ensemble method.
Anthology ID:
2020.coling-main.438
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
4993–5003
URL:
https://aclanthology.org/2020.coling-main.438
DOI:
10.18653/v1/2020.coling-main.438
Cite (ACL):
Thomas Dopierre, Christophe Gravier, Julien Subercaze, and Wilfried Logerais. 2020. Few-shot Pseudo-Labeling for Intent Detection. In Proceedings of the 28th International Conference on Computational Linguistics, pages 4993–5003, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Few-shot Pseudo-Labeling for Intent Detection (Dopierre et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.438.pdf
Data:
SNIPS