Learning to Embed Words in Context for Syntactic Tasks

Lifu Tu, Kevin Gimpel, Karen Livescu


Abstract
We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic category, and semantic role. We explore simple, efficient token embedding models based on standard neural network architectures. We learn token embeddings on a large amount of unannotated text and evaluate them as features for part-of-speech taggers and dependency parsers trained on much smaller amounts of annotated data. We find that predictors endowed with token embeddings consistently outperform baseline predictors across a range of context window and training set sizes.
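
The abstract describes the approach only at a high level ("standard neural network architectures"); as a rough illustration of what a token embedding model can look like, the sketch below encodes a fixed window of context words with a bidirectional LSTM and reads off the vector at the target position. The architecture, the class name TokenEmbedder, and all hyperparameters are illustrative assumptions, not the model from the paper.

# Minimal sketch (assumed architecture, not the authors' model): a
# bidirectional LSTM reads a small window of words centered on a target
# token and returns a context-sensitive vector for that token.
import torch
import torch.nn as nn

class TokenEmbedder(nn.Module):
    def __init__(self, vocab_size, word_dim=100, hidden_dim=100):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.encoder = nn.LSTM(word_dim, hidden_dim,
                               batch_first=True, bidirectional=True)

    def forward(self, window_ids):
        # window_ids: (batch, window_len) word indices, target token in the middle
        vectors = self.word_emb(window_ids)    # (batch, window_len, word_dim)
        outputs, _ = self.encoder(vectors)     # (batch, window_len, 2 * hidden_dim)
        center = window_ids.size(1) // 2
        return outputs[:, center, :]           # embedding of the target token in context

# Example: embed the middle word of a 5-word context window.
embedder = TokenEmbedder(vocab_size=10000)
window = torch.randint(0, 10000, (1, 5))      # one window of 5 word indices
token_vec = embedder(window)                  # shape (1, 200)

In the setting the abstract describes, such a context-sensitive vector would be supplied as an additional feature to a part-of-speech tagger or dependency parser trained on a smaller annotated corpus.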
Anthology ID: W17-2632
Volume: Proceedings of the 2nd Workshop on Representation Learning for NLP
Month: August
Year: 2017
Address: Vancouver, Canada
Editors: Phil Blunsom, Antoine Bordes, Kyunghyun Cho, Shay Cohen, Chris Dyer, Edward Grefenstette, Karl Moritz Hermann, Laura Rimell, Jason Weston, Scott Yih
Venue: RepL4NLP
SIG: SIGREP
Publisher: Association for Computational Linguistics
Pages: 265–275
URL: https://aclanthology.org/W17-2632
DOI: 10.18653/v1/W17-2632
Cite (ACL): Lifu Tu, Kevin Gimpel, and Karen Livescu. 2017. Learning to Embed Words in Context for Syntactic Tasks. In Proceedings of the 2nd Workshop on Representation Learning for NLP, pages 265–275, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal): Learning to Embed Words in Context for Syntactic Tasks (Tu et al., RepL4NLP 2017)
PDF: https://aclanthology.org/W17-2632.pdf
Data: Penn Treebank