
Spam Email Classifier

Primary language: Jupyter Notebook

WiDS-Project

This repository contains the code for a simple project: a spam email classifier.

Email Spam Detection

The project aims to familiarize people with classification problems in machine learning and with the ML algorithms used to tackle them, chosen according to the problem at hand and the user's needs.

Key Details

  • Basic exploratory data analysis (EDA) using t-SNE to visualize high-dimensional data, along with an understanding of t-SNE's limitations.
  • A general idea of how data pre-processing turns raw data into a feature vector that serves as input to the ML model.
  • ML practices such as training, cross-validation, and hyperparameter tuning, plus analysis of model fit and of which metrics best suit model evaluation (especially for imbalanced data).
  • Implementation of well-known classification algorithms on an email spam detection task (viz. logistic regression, SVMs, ensemble methods, etc.).
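
Two of the ideas above, turning raw email text into a feature vector and choosing metrics that stay meaningful on imbalanced data, can be sketched in plain Python. This is a minimal illustration and not the code from the notebooks: the vocabulary `VOCAB` and the helpers `tokenize`, `to_feature_vector`, `accuracy`, and `precision_recall_f1` are hypothetical names made up for this example, and the pre-processing is deliberately much simpler than a real pipeline would be.

```python
import re

# Hypothetical toy vocabulary (an assumption for this sketch);
# a real vocabulary would be built from the training emails.
VOCAB = ["free", "money", "click", "meeting", "report", "httpaddr"]


def tokenize(email_text):
    """Lowercase the text, replace URLs with a placeholder token, split into words."""
    text = email_text.lower()
    text = re.sub(r"(http|https)://\S+", "httpaddr", text)
    return re.findall(r"[a-z]+", text)


def to_feature_vector(email_text, vocab=VOCAB):
    """Binary bag-of-words: 1 if the vocabulary word occurs in the email, else 0."""
    tokens = set(tokenize(email_text))
    return [1 if word in tokens else 0 for word in vocab]


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    print(to_feature_vector("Click here for FREE money: http://example.com/win"))

    # Imbalanced toy labels: 8 ham (0), 2 spam (1).
    # A classifier that always predicts "ham" still looks accurate.
    y_true = [0] * 8 + [1] * 2
    y_pred = [0] * 10
    print(accuracy(y_true, y_pred))
    print(precision_recall_f1(y_true, y_pred))
```

In the toy labels at the end, a classifier that never predicts spam is right on most emails, yet its recall on the spam class is zero. That gap is why precision, recall, and F1 tend to be more informative than accuracy when one class is rare.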

Acknowledgements

  • Many thanks to Gerges Dib for his Python version of the programming assignments from the Coursera course 'Machine Learning', taught by Andrew Ng. A large part of the data pre-processing pipeline is based on his code.
  • Indebted to Prof. Andrew Ng for creating such thoughtful, structured, and relevant programming assignments for an introductory machine learning course aimed at beginners. His assignments served as the inspiration for this project offering.
  • Special thanks to Kunind Sahu for his guidance throughout the project and beyond!