Target Concept Guided Medical Concept Normalization in Noisy User-Generated Texts

Katikapalli Subramanyam Kalyan, Sivanesan Sangeetha


Abstract
Medical concept normalization (MCN), i.e., the mapping of colloquial medical phrases to standard concepts, is an essential step in the analysis of medical social media text. The main drawback of the existing state-of-the-art approach (Kalyan and Sangeetha, 2020b) is that it learns target concept vector representations from scratch, which requires a larger number of training instances. Our model is based on RoBERTa and target concept embeddings. In our model, we integrate a) target concept information, in the form of target concept vectors generated by encoding target concept descriptions with SRoBERTa, a state-of-the-art RoBERTa-based sentence embedding model, and b) domain lexicon knowledge, by enriching the target concept vectors with synonym relationship knowledge using the retrofitting algorithm. This is the first attempt in MCN to exploit both target concept information and domain lexicon knowledge in the form of retrofitted target concept vectors. Our model outperforms all existing models, with an accuracy improvement of up to 1.36% on three standard datasets. Further, when trained only on mapping lexicon synonyms, our model achieves an accuracy improvement of up to 4.87%.
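The retrofitting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used iterative update in which each vector is pulled toward the average of its synonyms while staying anchored to its original value; the abstract does not state the exact variant or hyperparameters the authors used. The function name `retrofit`, the weights `alpha` and `beta`, and the toy two-dimensional vectors are all hypothetical and stand in for SRoBERTa encodings of concept descriptions.

```python
import numpy as np


def retrofit(vectors, synonyms, iterations=10, alpha=1.0):
    """Pull each concept vector toward its synonyms' vectors.

    vectors:  dict mapping concept name -> 1-D numpy array (left unmodified)
    synonyms: dict mapping concept name -> list of synonym names
    alpha:    weight on the original vector (assumed value, not from the paper)
    """
    original = {name: np.asarray(vec, dtype=float) for name, vec in vectors.items()}
    updated = {name: vec.copy() for name, vec in original.items()}

    for _ in range(iterations):
        for concept, neighbours in synonyms.items():
            neighbours = [n for n in neighbours if n in updated]
            if concept not in updated or not neighbours:
                continue  # concepts without known synonyms keep their vector
            beta = 1.0 / len(neighbours)  # assumed: equal weight per synonym
            neighbour_sum = sum(beta * updated[n] for n in neighbours)
            updated[concept] = (alpha * original[concept] + neighbour_sum) / (
                alpha + beta * len(neighbours)
            )
    return updated


# Toy data: stand-ins for encoded target concept descriptions.
vectors = {
    "headache": np.array([1.0, 0.0]),
    "cephalalgia": np.array([0.0, 1.0]),
    "nausea": np.array([5.0, 5.0]),
}
synonyms = {
    "headache": ["cephalalgia"],
    "cephalalgia": ["headache"],
}

retrofitted = retrofit(vectors, synonyms)
for name, vec in retrofitted.items():
    print(name, np.round(vec, 3))
```

In this toy setting the two synonymous concepts move closer together, while the concept with no listed synonym keeps its original vector.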
Anthology ID:
2020.deelio-1.8
Volume:
Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures
Month:
November
Year:
2020
Address:
Online
Editors:
Eneko Agirre, Marianna Apidianaki, Ivan Vulić
Venue:
DeeLIO
Publisher:
Association for Computational Linguistics
Pages:
64–73
URL:
https://aclanthology.org/2020.deelio-1.8
DOI:
10.18653/v1/2020.deelio-1.8
Cite (ACL):
Katikapalli Subramanyam Kalyan and Sivanesan Sangeetha. 2020. Target Concept Guided Medical Concept Normalization in Noisy User-Generated Texts. In Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, pages 64–73, Online. Association for Computational Linguistics.
Cite (Informal):
Target Concept Guided Medical Concept Normalization in Noisy User-Generated Texts (Kalyan & Sangeetha, DeeLIO 2020)
PDF:
https://aclanthology.org/2020.deelio-1.8.pdf
Video:
https://slideslive.com/38939731