ML360_Final_Project

This is a machine learning course project on spam email classification, using several of the algorithms we covered in the course.

The dataset we used is available at: http://www2.aueb.gr/users/ion/data/enron-spam/

You will need to download the .tar files and then extract them to read all the emails in .txt format. The preprocessed version contains about 33K emails.
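The extraction step can be scripted. The following is only a minimal sketch, not part of the project code: the function name extract_archives and the two directory arguments are hypothetical, and it assumes the downloaded archives all sit in one folder and have ".tar" somewhere in their file names.

```python
import tarfile
from pathlib import Path


def extract_archives(archive_dir, out_dir):
    """Extract every tar archive found in archive_dir into out_dir.

    Returns the names of the archives that were extracted.
    Only use this on archives you trust, since extractall() writes
    whatever paths the archive contains.
    """
    archive_dir, out_dir = Path(archive_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    # Assumption: archive names contain ".tar" (e.g. .tar or .tar.gz).
    for archive in sorted(archive_dir.glob("*.tar*")):
        # The default read mode detects the compression automatically.
        with tarfile.open(archive) as tar:
            tar.extractall(out_dir)
        extracted.append(archive.name)
    return extracted
```

For example, extract_archives("downloads", "data") would unpack everything from a hypothetical downloads folder into a data folder.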

The code for the non-neural-network models, including hyperparameter tuning, is in the .py file. The code for the LSTM model is in the .ipynb file.

For the non-neural-network part, I downloaded the whole dataset to my home computer, merged the spam and non-spam emails of each of the six subsets into one folder per subset, and then combined those six folders into one big folder. You can follow the same layout if you want to run the code on your own computer, or simply change the dataset path in the code to point to your own copy.
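If you would rather not merge the folders by hand, the emails can also be read in place. This is a rough sketch rather than the loader used in the .py file: the function name load_emails is hypothetical, and it assumes each extracted subset keeps its emails in subfolders named "ham" and "spam". Adjust those names if your copy is laid out differently.

```python
from pathlib import Path


def load_emails(root):
    """Collect email texts and labels (1 = spam, 0 = ham) under root.

    Assumption: every .txt file lives in a folder named "ham" or
    "spam"; files in any other folder are skipped.
    """
    texts, labels = [], []
    for path in sorted(Path(root).rglob("*.txt")):
        folder = path.parent.name
        if folder not in ("ham", "spam"):
            continue
        # latin-1 maps every byte to a character, so decoding never
        # fails on emails with unusual encodings.
        texts.append(path.read_text(encoding="latin-1"))
        labels.append(1 if folder == "spam" else 0)
    return texts, labels
```

The returned lists can then be passed to whatever vectorizer and classifier you want to try.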