Web Page Phishing Detection

Introduction

This project develops and evaluates machine learning models for detecting phishing URLs. The dataset contains 11,430 URLs with 87 extracted features, which makes it suitable for benchmarking phishing detection systems.

Dataset

Dataset Name: Phishing Detection Dataset
Dataset Description: The dataset is balanced between phishing and legitimate URLs.

Project Overview

In this project, we use machine learning techniques to build and evaluate phishing detection models. Here's a brief overview of the project's components:

Data Preprocessing: Data cleaning, feature selection, and outlier handling.
Model Building: Training various machine learning models (e.g., Logistic Regression, Decision Trees, Random Forest, SVM) on the dataset.
Model Evaluation: Assessing model performance using metrics like accuracy, precision, recall, and F1-score.
Hyperparameter Tuning: Optimizing model hyperparameters for improved performance.
Dimensionality Reduction: Applying techniques like PCA and t-SNE for dimensionality reduction.
Cross-Validation: Assessing model generalization using k-fold cross-validation.
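
The components above can be sketched as a single scikit-learn workflow. This is a minimal illustration, not the project's actual code: it uses synthetic data from `make_classification` as a stand-in for the phishing feature table, and the choice of Logistic Regression, the number of PCA components, and the parameter grid are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real feature table (assumption: numeric
# features, binary label). Replace with the project's own data loading.
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=10, random_state=0
)

# Hold out a stratified test set so both classes stay balanced.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Preprocessing, dimensionality reduction, and the classifier in one
# pipeline, so each cross-validation fold is scaled and projected using
# only its own training portion.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# 5-fold cross-validation to estimate generalization.
cv_scores = cross_val_score(pipeline, X_train, y_train, cv=5, scoring="accuracy")
print("CV accuracy per fold:", cv_scores)

# Hyperparameter tuning over an illustrative (hypothetical) grid.
param_grid = {"pca__n_components": [5, 10], "clf__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# With scoring="f1", search.score reports the F1-score on the test set.
test_f1 = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print("Test F1:", test_f1)
```

Swapping `LogisticRegression` for a Decision Tree, Random Forest, or SVM only requires changing the `clf` step and the matching keys in `param_grid`.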

Model Evaluation

The models are evaluated using the following metrics:

Accuracy
Precision
Recall
F1 Score
Confusion Matrix
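
As a small worked example of these metrics, the sketch below computes each one with `sklearn.metrics`. The labels are made up for illustration (1 = phishing, 0 = legitimate is an assumed encoding) and are not results from this project.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels: 1 = phishing, 0 = legitimate (assumed encoding).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / total
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])

print(accuracy, precision, recall, f1)
print(cm)
```

Here there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four scores come out to 0.75. On a balanced dataset accuracy is a reasonable headline number, but recall shows how many phishing URLs slip through and precision shows how many legitimate URLs get flagged.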
