The Biomaterials Annotator: a system for ontology-based concept annotation of biomaterials text

Javier Corvi, Carla Fuenteslópez, José Fernández, Josep Gelpi, Maria-Pau Ginebra, Salvador Capella-Guitierrez, Osnat Hakimi


Abstract
Biomaterials are synthetic or natural materials used for constructing artificial organs, fabricating prostheses, or replacing tissues. The last century saw the development of thousands of novel biomaterials and, as a result, an exponential increase in scientific publications in the field. Large-scale analysis of biomaterials and their performance could enable data-driven material selection and implant design. However, such analysis requires identification and organization of concepts, such as materials and structures, from published texts. To facilitate future information extraction and the application of machine-learning techniques, we developed a semantic annotator specifically tailored for the biomaterials literature. The Biomaterials Annotator follows a modular design: its components are packaged in software containers and orchestrated with Nextflow as the workflow manager. Natural language processing (NLP) components are mainly developed in Java. This set-up has allowed named entity recognition of seventeen classes relevant to the biomaterials domain. Here we detail the development, evaluation and performance of the system, as well as the release of the first collection of annotated biomaterials abstracts. We make both the corpus and system available to the community to promote future efforts in the field and contribute towards its sustainability.
Anthology ID:
2021.sdp-1.5
Volume:
Proceedings of the Second Workshop on Scholarly Document Processing
Month:
June
Year:
2021
Address:
Online
Editors:
Iz Beltagy, Arman Cohan, Guy Feigenblat, Dayne Freitag, Tirthankar Ghosal, Keith Hall, Drahomira Herrmannova, Petr Knoth, Kyle Lo, Philipp Mayr, Robert M. Patton, Michal Shmueli-Scheuer, Anita de Waard, Kuansan Wang, Lucy Lu Wang
Venue:
sdp
Publisher:
Association for Computational Linguistics
Pages:
36–48
URL:
https://aclanthology.org/2021.sdp-1.5
DOI:
10.18653/v1/2021.sdp-1.5
Cite (ACL):
Javier Corvi, Carla Fuenteslópez, José Fernández, Josep Gelpi, Maria-Pau Ginebra, Salvador Capella-Guitierrez, and Osnat Hakimi. 2021. The Biomaterials Annotator: a system for ontology-based concept annotation of biomaterials text. In Proceedings of the Second Workshop on Scholarly Document Processing, pages 36–48, Online. Association for Computational Linguistics.
Cite (Informal):
The Biomaterials Annotator: a system for ontology-based concept annotation of biomaterials text (Corvi et al., sdp 2021)
PDF:
https://aclanthology.org/2021.sdp-1.5.pdf
Code:
projectdebbie/biomaterials_annotator