An Analysis of Deep Contextual Word Embeddings and Neural Architectures for Toponym Mention Detection in Scientific Publications

Matthew Magnusson, Laura Dietz


Abstract
Toponym detection in scientific papers is an open task and a key first step in place-entity enrichment of documents. We examine three common neural architectures in NLP: 1) a convolutional neural network, 2) a multi-layer perceptron (both applied in a sliding-window context), and 3) a bidirectional LSTM, and we apply contextual and non-contextual word embedding layers to these models. We find that deep contextual word embeddings improve the performance of the bi-LSTM with CRF architecture, which achieves the best results when multiple layers of deep contextual embeddings are concatenated. Our best-performing model achieves an average F1 of 0.910 under overlap-macro evaluation, exceeding previous state-of-the-art models on the toponym detection task.
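As a rough illustration of the best-performing configuration the abstract describes, the sketch below feeds concatenated contextual embedding layers into a bi-LSTM with a CRF output layer. This is not the authors' code: the layer count, dimensions, tag set size, and the third-party pytorch-crf package are all assumptions made for the example.

# Minimal sketch, assuming precomputed contextual embeddings (e.g., one
# tensor per ELMo-style layer) and the pytorch-crf package
# (pip install pytorch-crf). Sizes are illustrative, not from the paper.
import torch
import torch.nn as nn
from torchcrf import CRF  # assumed third-party CRF layer

class BiLSTMCRFTagger(nn.Module):
    def __init__(self, emb_dim=1024, num_layers_concat=3, hidden=256, num_tags=3):
        super().__init__()
        # Concatenating multiple contextual layers widens the input accordingly.
        input_dim = emb_dim * num_layers_concat
        self.lstm = nn.LSTM(input_dim, hidden, batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(2 * hidden, num_tags)  # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, layer_embs, tags=None, mask=None):
        # layer_embs: list of (batch, seq, emb_dim) tensors, one per embedding layer
        x = torch.cat(layer_embs, dim=-1)   # concatenate contextual layers
        h, _ = self.lstm(x)
        scores = self.emissions(h)
        if tags is not None:                # training: negative log-likelihood
            return -self.crf(scores, tags, mask=mask)
        return self.crf.decode(scores, mask=mask)  # inference: best tag sequence

The CRF on top of the bi-LSTM scores whole tag sequences rather than independent tokens, which is the standard way to enforce valid mention-boundary transitions in span tagging.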
Anthology ID:
W19-2607
Volume:
Proceedings of the Workshop on Extracting Structured Knowledge from Scientific Publications
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Vivi Nastase, Benjamin Roth, Laura Dietz, Andrew McCallum
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
48–56
URL:
https://aclanthology.org/W19-2607
DOI:
10.18653/v1/W19-2607
Cite (ACL):
Matthew Magnusson and Laura Dietz. 2019. An Analysis of Deep Contextual Word Embeddings and Neural Architectures for Toponym Mention Detection in Scientific Publications. In Proceedings of the Workshop on Extracting Structured Knowledge from Scientific Publications, pages 48–56, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
An Analysis of Deep Contextual Word Embeddings and Neural Architectures for Toponym Mention Detection in Scientific Publications (Magnusson & Dietz, NAACL 2019)
PDF:
https://aclanthology.org/W19-2607.pdf