Email Spam Classifier

Overview

This project aims to build an Email Spam Classifier using Natural Language Processing (NLP) techniques. The classifier distinguishes between spam and non-spam (ham) emails, assisting users in managing their email inboxes effectively.

Features

Cleans the dataset by removing duplicates and handling missing values.
Utilizes Exploratory Data Analysis (EDA) to understand data distribution and patterns.
Preprocesses text data through tokenization, stopword removal, and normalization.
Implements various machine learning models for classification, including SVM, Naive Bayes, Decision Tree, Random Forest, etc.
Improves model performance through hyperparameter tuning and ensemble learning techniques.
Deploys the model as a web application using Flask, allowing users to input email content and receive real-time predictions on spam likelihood.

Repository Structure

README.md: Overview of the project and instructions.
app.py: Flask application for deploying the model.
model.pkl: Pickle file containing the trained model.
templates/: HTML templates for the web interface.
static/: Static files such as CSS and JavaScript.

Usage

Clone the repository: git clone <repository-url>
Install dependencies: pip install -r requirements.txt
Run the Flask application: python app.py
Access the web interface in your browser and input email content for prediction.

Credits

Dataset: SMS Spam Collection Dataset from Kaggle.
Libraries: Utilizes various Python libraries including Flask, scikit-learn, etc.

harshilaneja/email-spam-classifier

Email Spam Classifier

Overview

Features

Repository Structure

Usage

Credits