Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling

Gakuto Kurata¹, Bing Xiang², Bowen Zhou², Mo Yu²
¹IBM Research, ²IBM Watson


Abstract

Recurrent Neural Networks (RNNs), and in particular the Long Short-Term Memory (LSTM) architecture, have been widely used for sequence labeling. Explicitly modeling output label dependencies on top of an RNN/LSTM is a widely studied and effective extension. We propose another extension that incorporates global information spanning the whole input sequence. The proposed method, the encoder-labeler LSTM, first encodes the whole input sequence into a fixed-length vector with an encoder LSTM and then uses this encoded vector as the initial state of another LSTM for sequence labeling. With this method, we can predict the label sequence while taking information from the whole input sequence into consideration. In experiments on slot filling, an essential component of natural language understanding, using the standard ATIS corpus, we achieved a state-of-the-art F1-score of 95.66%.
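
The two-stage data flow described above can be illustrated with a minimal, untrained sketch in plain Python. It is not the authors' implementation. The class `LSTMCell`, the function `encode_and_label`, the toy vocabulary, the label set, the layer sizes, and the random weights are all hypothetical. The sketch also assumes that the encoder reads the sentence left to right and that both the hidden state and the cell state are handed to the labeler; the abstract specifies neither detail.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """Hypothetical minimal LSTM cell: input, forget, output gates plus candidate."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix and bias per gate, applied to the concatenation [x; h].
        self.W = {g: rand_matrix(hidden_size, input_size + hidden_size, rng) for g in "ifog"}
        self.b = {g: [0.0] * hidden_size for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])] for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


def encode_and_label(tokens, embed, encoder, labeler, W_out, labels):
    n = encoder.hidden_size
    h, c = [0.0] * n, [0.0] * n
    # Encoder LSTM: read the whole sentence and keep only the final state,
    # which serves as the fixed-length encoding of the input.
    for tok in tokens:
        h, c = encoder.step(embed[tok], h, c)
    # Labeler LSTM: start from the encoded state (assumption: both h and c)
    # and emit one label per input token.
    predicted = []
    for tok in tokens:
        h, c = labeler.step(embed[tok], h, c)
        scores = matvec(W_out, h)
        predicted.append(labels[scores.index(max(scores))])
    return predicted


rng = random.Random(0)
vocab = ["show", "flights", "from", "boston", "to", "denver"]  # toy vocabulary
labels = ["O", "B-fromloc", "B-toloc"]  # hypothetical slot labels
embed = {w: [rng.uniform(-0.1, 0.1) for _ in range(4)] for w in vocab}
encoder = LSTMCell(4, 8, rng)
labeler = LSTMCell(4, 8, rng)
W_out = rand_matrix(len(labels), 8, rng)

sentence = ["flights", "from", "boston", "to", "denver"]
predicted = encode_and_label(sentence, embed, encoder, labeler, W_out, labels)
print(list(zip(sentence, predicted)))
```

Because the weights are random, the printed labels carry no meaning. The point is only that the labeler's first step already depends on every token in the sentence through its initial state, which is the global information the method is designed to exploit.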