Knowledge collections and datasets (English)
Revision as of 16:07, 21 November 2006 by Jonsafari (talk | contribs) (Knowledge Collections and Datasets moved to Knowledge collections and datasets: lc)
This page lists knowledge collections and datasets for computational linguistics and natural language processing.
- Clustering by Committee - terms clustered and organized according to the distributional hypothesis
- DIRT Paraphrase Collection - Discovery of Inference Rules from Text
- Edinburgh Associative Thesaurus (EAT)
- FrameNet
- MRC Psycholinguistic Database
- Noun Compound Repository
- Reuters-21578 Text Categorization Collection
- Spam filtering datasets
- University of South Florida Free Association Norms
- VerbOcean - verbs organized by semantic relation, including temporal precedence and strength
- WordNet
- WordSimilarity-353 Test Collection