Knowledge collections and datasets (English): Difference between revisions
Previous revision: minor edit, no edit summary
This revision: Added spam filtering datasets.
Line 8:
  * [http://www.cs.utexas.edu/~mfkb/nn/ Noun Compound Repository]
  * [http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html Reuters-21578 Text Categorization Collection]
+ * [[Spam filtering datasets]]
  * [http://w3.usf.edu/FreeAssociation/ University of South Florida Free Association Norms]
  * [[VerbOcean|VerbOcean - verbs organized by semantic relation, including temporal precedence, strength, etc.]]
Revision as of 15:00, 19 November 2006
Datasets for Computational Linguistics and Natural Language Processing.
- Clustering by Committee - terms clustered and organized using the Distributional Hypothesis
- DIRT Paraphrase Collection
- Edinburgh Associative Thesaurus (EAT)
- FrameNet
- MRC Psycholinguistic Database
- Noun Compound Repository
- Reuters-21578 Text Categorization Collection
- Spam filtering datasets
- University of South Florida Free Association Norms
- VerbOcean - verbs organized by semantic relation, including temporal precedence, strength, etc.
- WordNet
- WordSimilarity-353 Test Collection