Spam filtering datasets

From ACL Wiki
  • Enron-Spam: A collection of datasets containing spam messages and ham (legitimate) messages from the Enron corpus. See this article for further details.
  • Ling-Spam: A dataset containing spam messages and legitimate messages from the Linguist list. See this article for further details.
  • PU datasets: A collection of encrypted datasets containing spam and ham messages from real users. See this paper and this report for further details.