Multilingualization of Medical Terminology: Semantic and Structural Embedding Approaches

Long-Huei Chen, Kyo Kageura


Abstract
The multilingualization of terminology is an essential step in the translation pipeline, ensuring the correct transfer of domain-specific concepts. Many institutions and language service providers construct and maintain multilingual terminologies, which constitute important assets. However, curating such multilingual resources requires significant human effort. Although automatic multilingual term extraction methods have been proposed, they have had limited success: term translation is not accomplished by simply conveying meaning, but requires the knowledge of terminologists and domain experts to fit the term within the existing terminology. Here we propose a method that encodes the structural properties of terms by aligning their embeddings using graph convolutional networks trained on separate languages. We observe that the structural information can augment the semantic methods also explored in this work, and that the unique nature of terminologies allows our method to take full advantage of that information and produce superior results.
Anthology ID:
2020.lrec-1.512
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
4157–4166
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.512
Cite (ACL):
Long-Huei Chen and Kyo Kageura. 2020. Multilingualization of Medical Terminology: Semantic and Structural Embedding Approaches. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 4157–4166, Marseille, France. European Language Resources Association.
Cite (Informal):
Multilingualization of Medical Terminology: Semantic and Structural Embedding Approaches (Chen & Kageura, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.512.pdf