nlp662-spamdetection

Training and Test Datasets

  • We downloaded all datasets from the SpamAssassin public corpus: https://spamassassin.apache.org/old/publiccorpus/

  • Suggested commands to unpack files:

    • `tar xvjf 20030228_easy_ham_2.tar.bz2`
    • `tar xvjf 20030228_easy_ham.tar.bz2`
    • `tar xvjf 20030228_hard_ham.tar.bz2`
    • `tar xvjf 20030228_spam.tar.bz2`
    • `tar xvjf 20050311_spam_2.tar.bz2`
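
The unpack commands above can also be run as a single loop. The following is a minimal sketch, not part of the original project: the function name `unpack_all` and its two optional arguments (archive directory and destination directory) are hypothetical, and it assumes a `tar` with bzip2 support that accepts `-C`.

```shell
#!/bin/sh
# Hypothetical helper (not from the original project):
# unpack every .tar.bz2 archive found in a directory.
# Usage: unpack_all [archive_dir] [dest_dir]   (both default to ".")
unpack_all() {
    src="${1:-.}"
    dest="${2:-.}"
    mkdir -p "$dest" || return 1
    for archive in "$src"/*.tar.bz2; do
        # If the glob matched nothing it stays literal, so skip it.
        [ -e "$archive" ] || continue
        echo "Unpacking $archive"
        # x = extract, j = bzip2, f = archive file, -C = target directory
        tar xjf "$archive" -C "$dest" || return 1
    done
}
```

Calling `unpack_all . data` from the directory that holds the five downloaded archives would extract them all into `data/`. The sketch drops the `v` flag to keep the output short; add it back to list every extracted file.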