Character and Subword-Based Word Representation for Neural Language Modeling Prediction

Matthieu Labeau, Alexandre Allauzen


Abstract
Most neural language models use different kinds of embeddings for word prediction. While word embeddings can be associated with each word in the vocabulary, or derived from characters or a factored morphological decomposition, these representations are mainly used to parametrize the input, i.e. the context of the prediction. This work investigates the effect of using subword units (characters and factored morphological decomposition) to build output representations for neural language modeling. We present a case study on Czech, a morphologically rich language, experimenting with different input and output representations. When working with the full training vocabulary, despite unstable training, our experiments show that augmenting the output word representations with character-based embeddings can significantly improve the performance of the model. Moreover, reducing the size of the output look-up table so that character-based embeddings represent rare words brings further improvement.
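The idea described in the abstract, composing output-side word representations from characters and combining them with a (possibly reduced) output look-up table, can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch illustration, not the authors' implementation: the character-CNN composition, the additive combination with the look-up table, and all dimensions and names are assumptions for exposition.

    # Sketch (assumed, not the paper's code): character-CNN word representations
    # used on the OUTPUT side of a neural language model.
    import torch
    import torch.nn as nn

    class CharOutputEmbedding(nn.Module):
        """Compose a word representation from its characters with a small CNN."""
        def __init__(self, n_chars, char_dim=16, word_dim=128, kernel=3, pad_id=0):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=pad_id)
            self.conv = nn.Conv1d(char_dim, word_dim, kernel_size=kernel, padding=1)

        def forward(self, char_ids):            # char_ids: (vocab, max_word_len)
            x = self.char_emb(char_ids)         # (vocab, max_word_len, char_dim)
            x = self.conv(x.transpose(1, 2))    # (vocab, word_dim, max_word_len)
            return torch.tanh(x).max(dim=2).values  # max-pool over character positions

    # Score a batch of context vectors h against the whole vocabulary:
    # logits = h @ (E_word + E_char)^T, where E_char is character-derived and
    # E_word is a standard (possibly reduced) output look-up table.
    vocab, max_len, n_chars, d = 1000, 12, 50, 128
    char_ids = torch.randint(1, n_chars, (vocab, max_len))       # toy character ids
    E_char = CharOutputEmbedding(n_chars, word_dim=d)(char_ids)  # (vocab, d)
    E_word = nn.Embedding(vocab, d).weight                       # output look-up table
    h = torch.randn(4, d)                                        # batch of contexts
    logits = h @ (E_word + E_char).t()                           # (4, vocab)

Under this sketch, shrinking E_word to cover only frequent words while keeping E_char for all words corresponds to the paper's reduced output look-up table, where rare words are represented by their character-based embeddings alone.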
Anthology ID:
W17-4101
Volume:
Proceedings of the First Workshop on Subword and Character Level Models in NLP
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Manaal Faruqui, Hinrich Schuetze, Isabel Trancoso, Yadollah Yaghoobzadeh
Venue:
SCLeM
Publisher:
Association for Computational Linguistics
Pages:
1–13
URL:
https://aclanthology.org/W17-4101
DOI:
10.18653/v1/W17-4101
Cite (ACL):
Matthieu Labeau and Alexandre Allauzen. 2017. Character and Subword-Based Word Representation for Neural Language Modeling Prediction. In Proceedings of the First Workshop on Subword and Character Level Models in NLP, pages 1–13, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Character and Subword-Based Word Representation for Neural Language Modeling Prediction (Labeau & Allauzen, SCLeM 2017)
PDF:
https://aclanthology.org/W17-4101.pdf