Knowledge collections and datasets (English)

From ACL Wiki

Revision as of 10:26, 24 August 2007 by Ioan (added this from semantics software -- fits better here)


Datasets for Computational Linguistics and Natural Language Processing.

Clustering by Committee - terms clustered and organized using the Distributional Hypothesis
DIRT Paraphrase Collection - Discovery of Inference Rules from Text
Edinburgh Associative Thesaurus (EAT)
FrameNet
MRC Psycholinguistic Database
Preposition Project
Noun Compound Repository
Reuters-21578 Text Categorization Collection
SAT Analogy Questions - a way of evaluating algorithms for measuring relational similarity
Spam filtering datasets
TEASE - Acquisition of Entailment Relations from the Web
TOEFL Synonym Questions - a way of evaluating algorithms for measuring degree of similarity between two words
University of South Florida Free Association Norms
VerbOcean - verbs organized by semantic relation, including temporal precedence and strength
WordNet
WordSimilarity-353 Test Collection

Additional Dataset Collections

Linguistic Data Consortium (LDC)

