Corpora for English: Difference between revisions

From ACL Wiki
No edit summary
Sven (talk | contribs)
m English: misclassified
Line 30:
*[http://www.cs.fit.edu/~mmahoney/compression/text.html Large Text Compression Benchmark's 1G sample of Wikipedia]
*[http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/naive-bayes/bow-0.8/stopwords.c List of English stopwords]
*[http://www.lsi.upc.es/~nlp/tools/mapping.html Mapping WordNet Versions 1.6 and 2.0]
*[http://www.cs.cornell.edu/People/pabo/movie-review-data/ Movie Review Data]
*[http://mwe.stanford.edu/resources/ Multiword Expression Resources]

Revision as of 10:39, 2 March 2007


English

German

Multilingual

Russian

Slovak

Italian

Link collections

Corpora tools

Uncategorized

Arabic

Bosnian

Bulgarian

Czech

Danish

English

Finnish

French

German

Haitian Creole

Italian

Japanese

Polish

Romanian

Sanskrit

Slovenian

Spanish

Swahili