French Biomedical Text Simplification: When Small and Precise Helps

Rémi Cardon, Natalia Grabar


Abstract
We present experiments on biomedical text simplification in French. We use two kinds of corpora: parallel sentences extracted from existing comparable health corpora in French, and the WikiLarge corpus translated from English into French. We also use a lexicon that associates medical terms with paraphrases. We then train neural models on these parallel corpora using different ratios of general and specialized sentences, and we evaluate the results with the BLEU, SARI and Kandel scores. The results indicate that a small amount of specialized data significantly helps the simplification.
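
A minimal sketch of the evaluation step described in the abstract, scoring simplification output with BLEU and SARI. The paper names the metrics only; the use of the sacrebleu and EASSE libraries and the file names below are illustrative assumptions, not the authors' setup.

# Hedged sketch: BLEU via sacrebleu and SARI via EASSE (assumed tooling).
import sacrebleu
from easse.sari import corpus_sari

def read_lines(path):
    # Read one sentence per line from a UTF-8 text file.
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f]

# Hypothetical file names for the test split.
source = read_lines("test.source.fr")        # original specialized sentences
system = read_lines("test.system.fr")        # simplifications produced by the model
reference = read_lines("test.reference.fr")  # reference simplifications

# BLEU compares the system output against the reference only;
# SARI also takes the source sentences into account.
bleu = sacrebleu.corpus_bleu(system, [reference])
sari = corpus_sari(orig_sents=source, sys_sents=system, refs_sents=[reference])

print(f"BLEU: {bleu.score:.2f}  SARI: {sari:.2f}")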
Anthology ID:
2020.coling-main.62
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
710–716
URL:
https://aclanthology.org/2020.coling-main.62
DOI:
10.18653/v1/2020.coling-main.62
Cite (ACL):
Rémi Cardon and Natalia Grabar. 2020. French Biomedical Text Simplification: When Small and Precise Helps. In Proceedings of the 28th International Conference on Computational Linguistics, pages 710–716, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
French Biomedical Text Simplification: When Small and Precise Helps (Cardon & Grabar, COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.62.pdf
Data
WikiLarge