Spam filtering datasets

From ACL Wiki
Revision as of 09:07, 19 November 2006 by Ionandr (talk | contribs) (Added three datasets related to spam filtering.)
  • Enron-Spam: A collection of datasets that contain spam messages and ham (legitimate) messages from the Enron corpus. See this article for further details.
  • Ling-Spam: A dataset that contains spam messages and ham messages from the Linguist list. See this article for further details.
  • PU datasets: A collection of encrypted datasets that contain spam messages and ham messages from real users. See this paper and this report for further details.