Phishing is the fraudulent attempt to obtain sensitive information or data, such as usernames, passwords and credit card details, by disguising oneself as a trustworthy entity in an electronic communication. Typically carried out by email spoofing, instant messaging, and text messaging, phishing often directs users to enter personal information at a fake website which matches the look and feel of the legitimate site.
Phishing is a social engineering technique used to deceive users. Users are lured by communications purporting to be from trusted parties such as social networking sites, auction sites, banks, colleagues or executives, online payment processors, or IT administrators.
In this Kaggle challenge, my goal was to train a classifier that detects phishing emails.
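To make the task concrete, here is a minimal sketch of an email text classifier. It assumes a bag-of-words multinomial Naive Bayes model with add-one smoothing, which is an illustrative choice and not necessarily the model used in the notebook. The class name `NaiveBayesEmailClassifier`, the `tokenize` helper, the label strings, and the four training emails are all hypothetical and invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesEmailClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration only; not the notebook's actual model.
    """

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.n_docs = len(labels)
        self.doc_counts = Counter(labels)
        # Per-class word frequencies.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is the union of words seen in any class.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        self.total_words = {
            c: sum(self.word_counts[c].values()) for c in self.classes
        }
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / self.n_docs)
            denom = self.total_words[c] + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[c][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Invented toy data; a real run would load the labeled email corpora instead.
train_texts = [
    "Verify your account now or it will be suspended",
    "Urgent: confirm your password and credit card details",
    "Meeting moved to Thursday, agenda attached",
    "Please review the quarterly gas trading report",
]
train_labels = ["phishing", "phishing", "legitimate", "legitimate"]

model = NaiveBayesEmailClassifier().fit(train_texts, train_labels)
print(model.predict("Please confirm your account password"))
print(model.predict("Agenda for the Thursday meeting"))
```

On real data, the same fit and predict steps would be applied to a held-out split to estimate accuracy before producing predictions for submission.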
Acknowledgements
J. Nazario. phishingcorpus homepage, Apr. 2006. http://monkey.org/
Enron Email Dataset. https://www.cs.cmu.edu/~enron/
The whole project, including the final predictions uploaded to Kaggle, was done in a Jupyter Notebook.