Learning Orthographic Features in Bi-directional LSTM for Biomedical Named Entity Recognition

Nut Limsopatham, Nigel Collier


Abstract
End-to-end neural network models for named entity recognition (NER) have been shown to achieve effective performance on general-domain datasets (e.g. newswire) without requiring additional hand-crafted features. However, in the biomedical domain, recent studies have shown that hand-engineered features (e.g. orthographic features) should be used to attain effective performance, due to the complexity of biomedical terminology (e.g. the use of acronyms and complex gene names). In this work, we propose a novel approach that allows a neural network model based on long short-term memory (LSTM) to automatically learn orthographic features and incorporate them into a model for biomedical NER. Importantly, our bi-directional LSTM model learns and leverages orthographic features on an end-to-end basis. We evaluate our approach by comparing it against existing neural network models for NER on three well-established biomedical datasets. Our experimental results show that the proposed approach consistently outperforms these strong baselines across all three datasets.
Anthology ID:
W16-5102
Volume:
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Sophia Ananiadou, Riza Batista-Navarro, Kevin Bretonnel Cohen, Dina Demner-Fushman, Paul Thompson
Venue:
WS
Publisher:
The COLING 2016 Organizing Committee
Pages:
10–19
URL:
https://aclanthology.org/W16-5102
Cite (ACL):
Nut Limsopatham and Nigel Collier. 2016. Learning Orthographic Features in Bi-directional LSTM for Biomedical Named Entity Recognition. In Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016), pages 10–19, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Learning Orthographic Features in Bi-directional LSTM for Biomedical Named Entity Recognition (Limsopatham & Collier, 2016)
PDF:
https://aclanthology.org/W16-5102.pdf
Data:
NCBI Disease