Corpora for English: Difference between revisions

From ACL Wiki
m Jonsafari moved page Corpora (English) to Corpora for English: align with other related articles
Added GUM corpus
Line 8:
*[http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/naive-bayes/bow-0.8/stopwords.c English stop words (from SMART)]
*[http://gmb.let.rug.nl Groningen Meaning Bank] semantically annotated corpus
*[https://corpling.uis.georgetown.edu/gum/ GUM - Georgetown University Multilayer corpus], annotated with multiple parses, coreference, entities, sentence types and RST (Rhetorical Structure Theory) discourse structure
*[https://www.gutenberg.org Project Gutenberg]
*[http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks of many languages in a common annotation style

Revision as of 13:50, 10 June 2016

For languages other than English, see List of resources by language.

Free and Downloadable

Proprietary or Require Prior Permission


Link collections

Corpora tools