CONLL-2003 (State of the art)


Performance measure: F = 2 * Precision * Recall / (Recall + Precision)
Precision: percentage of named entities found by the system that are correct
Recall: percentage of named entities defined in the corpus that were found by the system
Exact match (for all words of a chunk) is used in the calculation of precision and recall (see CONLL scoring software)
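
The exact-match scoring described above can be sketched as follows. This is a minimal illustration, not the official CONLL scoring software: it assumes the tags use the IOB2 scheme (B-PER, I-PER, O), and the helper names extract_spans and precision_recall_f are hypothetical.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); an illustrative representation
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity chunks from IOB2 tags (assumed scheme: B-X, I-X, O)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing chunk
        prefix, _, kind = tag.partition("-")
        continues = prefix == "I" and kind == label
        if label is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        # A stray I-X with no open chunk is leniently treated as a chunk start.
        if prefix == "B" or (prefix == "I" and label is None):
            start, label = i, kind
    return spans


def precision_recall_f(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Score predicted chunks against gold chunks using exact match only."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    correct = len(gold_spans & pred_spans)  # same type and same boundaries
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


gold = ["B-ORG", "I-ORG", "O", "B-PER", "O", "B-LOC"]
pred = ["B-ORG", "O", "O", "B-PER", "O", "B-LOC"]
p, r, f = precision_recall_f(gold, pred)
print(f"precision={p:.4f} recall={r:.4f} F={f:.4f}")
```

In this example the predicted ORG chunk covers only the first of its two words, so it counts as an error for both precision and recall. Two of the three chunks match exactly, giving a precision, recall and F of 2/3 each.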

Training data: Train split of CONLL-2003 corpus
Dry-run (development) data: Testa split of CONLL-2003 corpus
Testing data: Testb split of CONLL-2003 corpus
The corpus contains a high proportion of metonymic references (for example, city names standing for sports teams)

Table of results

System name	Short description	System type (1)	Main publications	Software	Results (F)
FIJZ	Best CONLL-2003 participant	S	Florian, Ittycheriah, Jing and Zhang (2003)	-	88.76%
Baseline	Vocabulary transfer from training to testing	S	Tjong Kim Sang and De Meulder (2003)	-	59.61%
Balie	Unsupervised approach: no prior training	U	Nadeau, Turney and Matwin (2006)	sourceforge.net	55.98%
BI-LSTM-CRF	Bidirectional LSTM-CRF Model	S	Huang, Xu and Yu (2015)	-	90.10%
BI-LSTM-CRF	Bidirectional LSTM-CRF Model with contextual string embeddings	S	Akbik, Blythe and Vollgraf (2018)	https://github.com/zalandoresearch/flair	93.09%

(1) System type: R = hand-crafted rules, S = supervised learning, U = unsupervised learning, H = hybrid

References

Florian, R., Ittycheriah, A., Jing, H. and Zhang, T. (2003) Named Entity Recognition through Classifier Combination. Proceedings of CoNLL-2003. Edmonton, Canada.

Nadeau, D., Turney, P. D. and Matwin, S. (2006) Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity. Proceedings 19th Canadian Conference on Artificial Intelligence. Québec, Canada.

Tjong Kim Sang, E. F. and De Meulder, F. (2003) Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition. Proceedings of CoNLL-2003. Edmonton, Canada.

Huang, Z., Xu, W. and Yu, K. (2015) Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv:1508.01991.

Akbik, A., Blythe, D. and Vollgraf, R. (2018) Contextual String Embeddings for Sequence Labeling. Proceedings of the 27th International Conference on Computational Linguistics, pp. 1638-1649.
