The Impact of Annotation on the Performance of Protein Tagging in Biomedical Text

Beatrice Alex, Malvina Nissim, Claire Grover


Abstract
In this paper we discuss five different corpora annotated for protein names. We present several within- and cross-dataset protein tagging experiments showing that different annotation schemes severely affect the portability of statistical protein taggers. By means of a detailed error analysis we identify crucial annotation issues that future annotation projects should take into careful consideration.
Anthology ID:
L06-1235
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/398_pdf.pdf
Cite (ACL):
Beatrice Alex, Malvina Nissim, and Claire Grover. 2006. The Impact of Annotation on the Performance of Protein Tagging in Biomedical Text. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
The Impact of Annotation on the Performance of Protein Tagging in Biomedical Text (Alex et al., LREC 2006)