Difference between revisions of "Corpora, datasets, lexicons"

From ACL Wiki
Line 34:
 (alphabetical order)
 
 * [http://devoted.to/corpora David Lee's Bookmarks for Corpus-based Linguists]
+* [[Resources]]
 
 == Datasets ==

Revision as of 07:52, 2 November 2006

Miscellaneous

Corpora

English

(alphabetical order)

Multilingual

(alphabetical order)

Other lists of corpora

(alphabetical order)

Datasets

Lexicons

(alphabetical order)

WordNet and enhancements

(alphabetical order)

  • eXtended WordNet - glosses are syntactically parsed, transformed into logic forms, and content words are semantically disambiguated
  • SentiWordNet - assigns three sentiment scores to each WordNet synset: positivity, negativity, and objectivity
  • WordNet - the original
  • WordNet Domains - WordNet augmented with domain labels, such as POLITICS, ECONOMY, SPORT