CONLL-2003 (State of the art)

2019-07-12T13:29:02Z

Simzalabim: /* References */

* '''Performance measure:''' F = 2 * Precision * Recall / (Precision + Recall)
* '''Precision:''' percentage of named entities found by the system that are correct
* '''Recall:''' percentage of named entities annotated in the corpus that were found by the system
* Precision and recall are computed on exact matches: a named entity counts as correct only if all words of its chunk match (see [http://www.cnts.ua.ac.be/conll2000/chunking/output.html CoNLL scoring software])
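
The measures above can be illustrated with a minimal sketch. It is not the official scoring script: the span representation <code>(start, end, type)</code>, the function name <code>precision_recall_f1</code>, and the toy data are assumptions made for illustration. It also assumes that the entity type must match in addition to the chunk boundaries.

```python
from typing import Set, Tuple

# Hypothetical representation of one named entity:
# (index of first token, index of last token, entity type).
Span = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F over sets of entity spans."""
    # An entity is correct only if boundaries and type are identical,
    # so set intersection implements the exact-match criterion.
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy example (invented data, not taken from the corpus).
gold = {(0, 1, "PER"), (4, 4, "LOC"), (7, 8, "ORG")}
predicted = {(0, 1, "PER"), (4, 4, "LOC"), (7, 7, "ORG"), (10, 10, "MISC")}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.4f} recall={r:.4f} F={f:.4f}")
```

In this toy example, the prediction <code>(7, 7, "ORG")</code> covers only part of the gold chunk <code>(7, 8, "ORG")</code>, so under exact matching it earns no credit and counts against both precision and recall.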

* '''Training data:''' Train split of the CoNLL-2003 corpus
* '''Dry-run (development) data:''' Testa split of the CoNLL-2003 corpus
* '''Testing data:''' Testb split of the CoNLL-2003 corpus
* The corpus contains a high proportion of metonymic references (for example, city names standing for sports teams)

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! System name
! Short description
! System type (1)
! Main publications
! Software
! Results (F-measure)
|-
| FIJZ
| Best CoNLL-2003 shared task participant
| S
| Florian, Ittycheriah, Jing and Zhang (2003)
| -
| 88.76%
|-
| Baseline
| Vocabulary transfer from training to testing
| S
| Tjong Kim Sang and De Meulder (2003)
| -
| 59.61%
|-
| Balie
| Unsupervised approach: no prior training
| U
| Nadeau, Turney and Matwin (2006)
| [http://balie.sourceforge.net sourceforge.net]
| 55.98%
|-
| BI-LSTM-CRF
| Bidirectional LSTM-CRF Model
| S
| Huang et al. (2015)
| -
| 90.10%
|-
| BI-LSTM-CRF
| Bidirectional LSTM-CRF model with contextual string embeddings
| S
| Akbik, Blythe and Vollgraf (2018)
| [https://github.com/zalandoresearch/flair github.com]
| 93.09%
|}

* (1) '''System type''': R = hand-crafted rules, S = supervised learning, U = unsupervised learning, H = hybrid

== References ==

Florian, R., Ittycheriah, A., Jing, H. and Zhang, T. (2003) [http://www.cnts.ua.ac.be/conll2003/pdf/16871flo.pdf Named Entity Recognition through Classifier Combination]. ''Proceedings of CoNLL-2003''. Edmonton, Canada.

Nadeau, D., Turney, P. D. and Matwin, S. (2006) [http://iit-iti.nrc-cnrc.gc.ca/publications/nrc-48727_e.html Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity]. ''Proceedings of the 19th Canadian Conference on Artificial Intelligence''. Québec, Canada.

Tjong Kim Sang, E. F. and De Meulder, F. (2003) [http://www.cnts.ua.ac.be/conll2003/pdf/14247tjo.pdf Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition]. ''Proceedings of CoNLL-2003''. Edmonton, Canada.

Huang, Z., Xu, W. and Yu, K. (2015) [http://arxiv.org/abs/1508.01991 Bidirectional LSTM-CRF Models for Sequence Tagging]. ''arXiv:1508.01991''.

Akbik, A., Blythe, D. and Vollgraf, R. (2018) Contextual String Embeddings for Sequence Labeling. ''Proceedings of the 27th International Conference on Computational Linguistics'', pp. 1638-1649.

== See also ==

* [[Named Entity Recognition (State of the art)|Named Entity Recognition]]
* [[State of the art]]

[[Category:State of the art]]
