Multilingual enrichment of disease biomedical ontologies

Léo Bouscarrat, Antoine Bonnefoy, Cécile Capponi, Carlos Ramisch


Abstract
Translating biomedical ontologies is an important challenge, but doing it manually is costly in both time and money. We study the possibility of using open-source knowledge bases to translate biomedical ontologies, focusing on two aspects: coverage and quality. We measure the coverage in Wikidata of two disease-focused biomedical ontologies for 9 European languages (Czech, Dutch, English, French, German, Italian, Polish, Portuguese and Spanish), plus Arabic, Chinese and Russian for the second ontology. We first use direct links between Wikidata and the studied ontologies, and then second-order links that go through other intermediate ontologies. Finally, we compare the quality of the translations obtained from Wikidata with that of a commercial machine translation tool, here Google Cloud Translation.
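The two linking strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory data, the identifiers (Q_A, T1, O2, ...), the ontology names TARGET and OTHER, and all function names are hypothetical placeholders, and the reading of "second-order link" as "a shared identifier in an intermediate ontology" is an assumption based on the abstract alone.

```python
from typing import Dict, List, Optional

# Toy stand-in for knowledge-base entities. All ids and labels are made up.
# Each entity has labels per language and cross-references per ontology.
ENTITIES: Dict[str, dict] = {
    "Q_A": {"labels": {"en": "disease A", "fr": "maladie A"},
            "xrefs": {"TARGET": {"T1"}}},
    "Q_B": {"labels": {"en": "disease B", "fr": "maladie B"},
            "xrefs": {"OTHER": {"O2"}}},
}

# Cross-references stored in the (hypothetical) target ontology itself:
# target id -> {intermediate ontology name: set of ids in that ontology}.
TARGET_XREFS: Dict[str, dict] = {
    "T1": {},
    "T2": {"OTHER": {"O2"}},
    "T3": {},
}


def first_order(target_id: str) -> List[str]:
    """Entities that reference the target ontology id directly."""
    return sorted(
        q for q, e in ENTITIES.items()
        if target_id in e["xrefs"].get("TARGET", set())
    )


def second_order(target_id: str) -> List[str]:
    """Entities that share an intermediate-ontology id with the target id."""
    hits = set()
    for onto, ids in TARGET_XREFS.get(target_id, {}).items():
        for q, e in ENTITIES.items():
            if ids & e["xrefs"].get(onto, set()):
                hits.add(q)
    return sorted(hits)


def translate(target_id: str, lang: str) -> Optional[str]:
    """Return a label in `lang`, trying direct links before second-order ones."""
    for q in first_order(target_id) + second_order(target_id):
        label = ENTITIES[q]["labels"].get(lang)
        if label:
            return label
    return None


def coverage(lang: str) -> float:
    """Fraction of target ontology ids for which a label in `lang` was found."""
    found = sum(translate(t, lang) is not None for t in TARGET_XREFS)
    return found / len(TARGET_XREFS)
```

In this toy setup, T1 is reached through a direct link, T2 only through the intermediate ontology, and T3 not at all, so coverage for French is two out of three. A real pipeline would replace the dictionaries with queries against the knowledge base and the ontology files.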
Anthology ID:
2020.multilingualbio-1.4
Volume:
Proceedings of the LREC 2020 Workshop on Multilingual Biomedical Text Processing (MultilingualBIO 2020)
Month:
May
Year:
2020
Address:
Marseille, France
Editor:
Maite Melero
Venue:
MultilingualBIO
Publisher:
European Language Resources Association
Pages:
21–28
Language:
English
URL:
https://aclanthology.org/2020.multilingualbio-1.4
Cite (ACL):
Léo Bouscarrat, Antoine Bonnefoy, Cécile Capponi, and Carlos Ramisch. 2020. Multilingual enrichment of disease biomedical ontologies. In Proceedings of the LREC 2020 Workshop on Multilingual Biomedical Text Processing (MultilingualBIO 2020), pages 21–28, Marseille, France. European Language Resources Association.
Cite (Informal):
Multilingual enrichment of disease biomedical ontologies (Bouscarrat et al., MultilingualBIO 2020)
PDF:
https://aclanthology.org/2020.multilingualbio-1.4.pdf
Code
 euranova/orphanet_translation