| * [[Corpora]]
| * [[Datasets]]
| * [http://devoted.to/corpora David Lee's Bookmarks for Corpus-based Linguists]
| * [[Resources]]
|
| |
| == Datasets ==
| |
|
| |
| * [http://www.eat.rl.ac.uk/ Edinburgh Associative Thesaurus (EAT)]
| |
| * [http://www.ldc.upenn.edu/ Linguistic Data Consortium (LDC)]
| |
| * [http://www.psych.rl.ac.uk/ MRC Psycholinguistic Database]
| |
| * [http://www.cs.utexas.edu/~mfkb/nn/ Noun Compound Repository]
| |
| * [http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html Reuters-21578 Text Categorization Collection]
| |
| * [http://w3.usf.edu/FreeAssociation/ University of South Florida Free Association Norms]
| |
| * [http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/wordsim353.html WordSimilarity-353 Test Collection]
| |
|
| |
| == Lexicons ==
| |
| (alphabetical order)
| |
| * [http://clipdemos.umiacs.umd.edu/catvar/ Catvar 2.0: The Categorial Variation Database] - for example, the ''developing'' cluster: {''develop'' (V), ''developer'' (N), ''developed'' (AJ), ''developing'' (N), ''developing'' (AJ), ''development'' (N)}
| |
| * [http://www.wjh.harvard.edu/%7Einquirer/spreadsheet_guide.htm General Inquirer]
| |
| * [http://www.csse.monash.edu.au/~jwb/edict_doc.html JMdict: Japanese-Multilingual Dictionary file]
| |
| * [http://www.umiacs.umd.edu/~bonnie/LCS_Database_Documentation.html LCS Database: Lexical Conceptual Structures]
| |
| * [http://www.dcs.shef.ac.uk/research/ilash/Moby/ Moby lexicon project]
| |
| * [http://www.signiform.com/tt/htm/tt.htm ThoughtTreasure]
| |
|
| |
| === WordNet and enhancements ===
| |
| (alphabetical order)
| |
| * [http://xwn.hlt.utdallas.edu/ eXtended WordNet] - glosses are syntactically parsed and transformed into logic forms, and their content words are semantically disambiguated
| |
| * [http://patty.isti.cnr.it/~esuli/software/SentiWordNet/ SentiWordNet] - assigns each WordNet synset three sentiment scores: positivity, negativity, and objectivity
| |
| * [http://wordnet.princeton.edu/ WordNet] - the original
| |
| * [http://tcc.itc.it/research/textec/topics/disambiguation/wordnetdomains.html WordNet Domains] - WordNet synsets augmented with domain labels such as POLITICS, ECONOMY, and SPORT
| |