ACL Wiki - User contributions [en]

Resources for German

2009-02-03T09:48:37Z

Yversley: /* Lexicons */

==Corpora==


* [http://www.phonetik.uni-muenchen.de/Bas/BasKorporaeng.html Bavarian Archive for Speech Signals Corpora]
* [http://corpora.ids-mannheim.de/~cosmas/ COSMAS II]
* [http://www.ims.uni-stuttgart.de/projekte/tc/CQP.html Experimental Corpus Query System (University of Stuttgart, Germany)]
* [http://www.wortschatz.uni-leipzig.de/ German plain text and Co-occurrences at LCC]
* [http://www.coli.uni-sb.de/sfb378/negra-corpus/negra-corpus.html NEGRA Corpus]
* [http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/ TIGER treebank]
* [http://www.sfs.uni-tuebingen.de/en_tuebadz.shtml Tübingen Treebank of Written German (TüBa-D/Z)]
* [http://www.sfs.uni-tuebingen.de/en_tuebads.shtml Tübingen Treebank of Spoken German (TüBa-D/S, aka Verbmobil treebank)]
* [http://www.sfs.uni-tuebingen.de/en_tuepp.shtml Tübingen Partially Parsed Corpus of Written German (TüPP-D/Z)]

==Evaluation datasets==
* [http://www.ukp.tu-darmstadt.de/data/semRelDatasets Semantic relatedness evaluation]

==Lexicons==
* [http://www.ims.uni-stuttgart.de/projekte/IMSLex/ IMSLex German Lexicon]
* [http://www.ims.uni-stuttgart.de/tcl/RESOURCES/German-Lexicon-en.html Lexical information for German]
* [http://www.cl.uzh.ch/CL/siclemat/sprachanalyse/molif/ mOlif morphological analyzer]

==Resource Access==
* [http://wortschatz.uni-leipzig.de/Webservices/ Web service access to German language statistics]

==Timeline Analysis==
* [http://wortschatz.uni-leipzig.de/wort-des-tages/ German Words of the Day]
* [http://www.sfs.uni-tuebingen.de/~lothar/nw/ Wortwarte (selection of German neologisms for each day)]

[[Category:Resources by language|German]]

NP Chunking (State of the art)

2009-02-03T09:35:41Z

Yversley:

* '''Performance measure:''' F = 2 * Precision * Recall / (Recall + Precision)
* '''Precision:''' percentage of NPs found by the algorithm that are correct
* '''Recall:''' percentage of NPs defined in the corpus that were found by the chunking program
* '''Training data:''' sections 15-18 of the Wall Street Journal corpus (Ramshaw and Marcus)
* '''Testing data:''' section 20 of the Wall Street Journal corpus (Ramshaw and Marcus)
* '''Data source:''' the original data of the NP chunking experiments by Lance Ramshaw and Mitch Marcus
* '''Data format:''' one word per line; each line contains six fields, of which only the first three are relevant: the word, the part-of-speech tag assigned by the Brill tagger, and the correct IOB tag
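
The measures above can be illustrated with a minimal Python sketch. It is not the official evaluation script. It assumes that fields are separated by whitespace, that the IOB tag begins with <code>I</code>, <code>O</code> or <code>B</code>, and that <code>B</code> marks an NP that directly follows another NP. The function names and the sample sentence (including the <code>x</code> placeholders for the three unused fields) are hypothetical.

```python
def parse_line(line):
    """Return (word, pos_tag, iob_tag) from one data line.

    Assumption: fields are whitespace-separated; only the first three are used.
    """
    fields = line.split()
    return fields[0], fields[1], fields[2]


def extract_chunks(tags):
    """Return the set of (start, end) NP spans in a tag sequence (end exclusive).

    Assumption: an NP starts at a B tag, or at an I tag that follows an O tag
    or the start of the sentence.
    """
    chunks = set()
    start = None
    for i, tag in enumerate(tags):
        kind = tag[0]  # "I", "O" or "B"
        if kind in ("B", "O") and start is not None:
            chunks.add((start, i))  # close the NP that was open
            start = None
        if kind == "B" or (kind == "I" and start is None):
            start = i  # open a new NP
    if start is not None:
        chunks.add((start, len(tags)))
    return chunks


def precision_recall_f(gold_tags, predicted_tags):
    """Chunk-level precision, recall and F for one tag sequence."""
    gold = extract_chunks(gold_tags)
    predicted = extract_chunks(predicted_tags)
    correct = len(gold & predicted)  # an NP counts only if both boundaries match
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


# Hypothetical sample: six fields per line, the last three are placeholders.
sample = [
    "He PRP I x x x",
    "saw VBD O x x x",
    "the DT I x x x",
    "dog NN I x x x",
    "'s POS B x x x",
    "bone NN I x x x",
    ". . O x x x",
]
gold_tags = [parse_line(line)[2] for line in sample]
predicted_tags = ["I", "O", "I", "I", "I", "I", "O"]  # misses the B boundary

p, r, f = precision_recall_f(gold_tags, predicted_tags)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this sample the gold annotation contains three NPs and the prediction contains two, of which one matches exactly. To score a whole test set, the correct, predicted and gold counts would be summed over all sentences before precision and recall are computed.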

== Table of results ==

{| border="1" cellpadding="5" cellspacing="1" width="100%"
|-
! System name
! Short description
! Main publications
! Software
! Reported F-score
|-
| KM00
| B-I-O tagging using SVM classifiers with polynomial kernel
| Kudo and Matsumoto (2000), CoNLL
| [http://chasen.org/~taku/software/yamcha/ YAMCHA Toolkit] (but models are not provided)
| 93.79%
|-
| KM01
| learning as in KM00, but voting between different representations
| Kudo and Matsumoto (2001), NAACL
| No
| 94.22%
|-
| SP03
| Second order conditional random fields
| Fei Sha and Fernando Pereira (2003), HLT/NAACL
| No
| 94.3%
|-
| SS05
| specialized HMM + voting between different representations
| Shen and Sarkar (2005)
| No
| 95.23%
|-
| M05
| Second order conditional random fields + multi-label classification
| Ryan McDonald, Koby Crammer and Fernando Pereira (2005), HLT/EMNLP
| No
| 94.29%
|-
| V06
| Conditional random fields + Stochastic Meta-Descent (SMD)
| S. V. N. Vishwanathan, Nicol N. Schraudolph, Mark Schmidt, and Kevin Murphy (2006), ICML
| No
| 93.6%
|-
| S08
| Second order latent-dynamic conditional random fields + an improved inference method based on A* search
| Xu Sun, Louis-Philippe Morency, Daisuke Okanohara and Jun'ichi Tsujii (2008), COLING
| HCRF Library
| 94.34%
|-
| C00
| Chunks from the Charniak Parser
| Hollingshead, Fisher and Roark (2005), Charniak (2000)
| ?
| 94.20%
|}

== References ==

E. Charniak (2000). [http://aclweb.org/anthology-new/A/A00/A00-2018.pdf A Maximum-Entropy-Inspired Parser]. ''Proceedings of NAACL 2000''.

K. Hollingshead, S. Fisher and B. Roark (2005). [http://www.aclweb.org/anthology-new/H/H05/H05-1099.pdf Comparing and combining finite-state and context-free parsers.] HLT/EMNLP 2005.

T. Kudo and Y. Matsumoto (2000). [http://acl.ldc.upenn.edu/W/W00/W00-0730.pdf Use of support vector learning for chunk identification]. ''Proceedings of the 4th Conference on CoNLL-2000 and LLL-2000'', pages 142-144, Lisbon, Portugal.

T. Kudo and Y. Matsumoto (2001). [http://acl.ldc.upenn.edu/N/N01/N01-1025.pdf Chunking with support vector machines]. ''Proceedings of NAACL-2001''.

F. Sha and F. Pereira (2003). [http://www-rcf.usc.edu/~feisha/htmls/Papers.html Shallow Parsing with Conditional Random Fields]. ''Proceedings of HLT-NAACL 2003'', pages 213-220. Edmonton, Canada.

H. Shen and A. Sarkar (2005). [http://www.cs.sfu.ca/~anoop/papers/pdf/ai05.pdf Voting between multiple data representations for text chunking]. ''Proceedings of the Eighteenth Meeting of the Canadian Society for Computational Intelligence, Canadian AI 2005''.

R. McDonald, K. Crammer and F. Pereira (2005). [http://ryanmcd.googlepages.com/segmentationHLT-EMNLP2005.pdf Flexible Text Segmentation with Structured Multilabel Classification]. ''Human Language Technologies and Empirical Methods in Natural Language Processing (HLT-EMNLP), 2005''.

S. V. N. Vishwanathan, N. Schraudolph, M. Schmidt and K. Murphy (2006). Accelerated Training of Conditional Random Fields with Stochastic Gradient Methods. ''Proceedings of the International Conference on Machine Learning (ICML 2006)'', pages 969-976. ACM Press, New York, NY, USA.

X. Sun, L.P. Morency, D. Okanohara and J. Tsujii (2008). [http://www.aclweb.org/anthology-new/C/C08/C08-1106.pdf Modeling Latent-Dynamic in Shallow Parsing: A Latent Conditional Model with Improved Inference]. ''Proceedings of the 22nd International Conference on Computational Linguistics (COLING 2008)'', pages 841-848. Manchester, UK.

== See also ==

* [[State of the art]]

== External links ==

* The dataset is available from [ftp://ftp.cis.upenn.edu/pub/chunker/ ftp://ftp.cis.upenn.edu/pub/chunker/]
* More information is available from [http://ifarm.nl/erikt/research/np-chunking.html NP Chunking]

[[Category:State of the art]]

Resources for German

2009-02-03T09:22:41Z

Yversley: /* Corpora */

==Corpora==


* [http://www.phonetik.uni-muenchen.de/Bas/BasKorporaeng.html Bavarian Archive for Speech Signals Corpora]
* [http://corpora.ids-mannheim.de/~cosmas/ COSMAS II]
* [http://www.ims.uni-stuttgart.de/projekte/tc/CQP.html Experimental Corpus Query System (University of Stuttgart, Germany)]
* [http://www.wortschatz.uni-leipzig.de/ German plain text and Co-occurrences at LCC]
* [http://www.coli.uni-sb.de/sfb378/negra-corpus/negra-corpus.html NEGRA Corpus]
* [http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/ TIGER treebank]
* [http://www.sfs.uni-tuebingen.de/en_tuebadz.shtml Tübingen Treebank of Written German (TüBa-D/Z)]
* [http://www.sfs.uni-tuebingen.de/en_tuebads.shtml Tübingen Treebank of Spoken German (TüBa-D/S, aka Verbmobil treebank)]
* [http://www.sfs.uni-tuebingen.de/en_tuepp.shtml Tübingen Partially Parsed Corpus of Written German (TüPP-D/Z)]

==Evaluation datasets==
* [http://www.ukp.tu-darmstadt.de/data/semRelDatasets Semantic relatedness evaluation]

==Lexicons==
* [http://www.ims.uni-stuttgart.de/tcl/RESOURCES/German-Lexicon-en.html Lexical information for German]

==Resource Access==
* [http://wortschatz.uni-leipzig.de/Webservices/ Web service access to German language statistics]

==Timeline Analysis==
* [http://wortschatz.uni-leipzig.de/wort-des-tages/ German Words of the Day]
* [http://www.sfs.uni-tuebingen.de/~lothar/nw/ Wortwarte (selection of German neologisms for each day)]

[[Category:Resources by language|German]]

Resources for German

2009-02-03T09:20:34Z

Yversley: /* Timeline Analysis */

==Corpora==


* [http://www.phonetik.uni-muenchen.de/Bas/BasKorporaeng.html Bavarian Archive for Speech Signals Corpora]
* [http://corpora.ids-mannheim.de/~cosmas/ COSMAS II]
* [http://www.ims.uni-stuttgart.de/projekte/tc/CQP.html Experimental Corpus Query System (University of Stuttgart, Germany)]
* [http://www.wortschatz.uni-leipzig.de/ German plain text and Co-occurrences at LCC]
* [http://www.coli.uni-sb.de/sfb378/negra-corpus/negra-corpus.html NEGRA Corpus]
* [http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/ TIGER treebank]
* [http://www.sfs.uni-tuebingen.de/en_tuebadz.shtml Tübingen Treebank of Written German (TüBa-D/Z)]

==Evaluation datasets==
* [http://www.ukp.tu-darmstadt.de/data/semRelDatasets Semantic relatedness evaluation]

==Lexicons==
* [http://www.ims.uni-stuttgart.de/tcl/RESOURCES/German-Lexicon-en.html Lexical information for German]

==Resource Access==
* [http://wortschatz.uni-leipzig.de/Webservices/ Web service access to German language statistics]

==Timeline Analysis==
* [http://wortschatz.uni-leipzig.de/wort-des-tages/ German Words of the Day]
* [http://www.sfs.uni-tuebingen.de/~lothar/nw/ Wortwarte (selection of German neologisms for each day)]

[[Category:Resources by language|German]]

Resources for German

2009-02-03T09:19:17Z

Yversley: added Tiger and Negra

==Corpora==


* [http://www.phonetik.uni-muenchen.de/Bas/BasKorporaeng.html Bavarian Archive for Speech Signals Corpora]
* [http://corpora.ids-mannheim.de/~cosmas/ COSMAS II]
* [http://www.ims.uni-stuttgart.de/projekte/tc/CQP.html Experimental Corpus Query System (University of Stuttgart, Germany)]
* [http://www.wortschatz.uni-leipzig.de/ German plain text and Co-occurrences at LCC]
* [http://www.coli.uni-sb.de/sfb378/negra-corpus/negra-corpus.html NEGRA Corpus]
* [http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/ TIGER treebank]
* [http://www.sfs.uni-tuebingen.de/en_tuebadz.shtml Tübingen Treebank of Written German (TüBa-D/Z)]

==Evaluation datasets==
* [http://www.ukp.tu-darmstadt.de/data/semRelDatasets Semantic relatedness evaluation]

==Lexicons==
* [http://www.ims.uni-stuttgart.de/tcl/RESOURCES/German-Lexicon-en.html Lexical information for German]

==Resource Access==
* [http://wortschatz.uni-leipzig.de/Webservices/ Web service access to German language statistics]

==Timeline Analysis==
* [http://wortschatz.uni-leipzig.de/wort-des-tages/ German Words of the Day]

[[Category:Resources by language|German]]