PayGuard is a machine learning solution for detecting fraudulent activity in bank payment systems. It aims to improve financial security by flagging suspicious transactions in real time. The project covers data preprocessing, exploratory data analysis (EDA), model building, and deployment as an interactive Streamlit web application.
- Data Analysis & Visualization
- Comprehensive EDA using heatmaps, correlation matrices, histograms, and box plots.
- Visualization of transaction distributions and identification of anomalies.
- Machine Learning Models
- Implementation of various classification algorithms, including:
- Random Forest Classifier
- XGBoost Classifier
- K-Nearest Neighbors (KNN) Classifier
- Comparison of model performance metrics to select the optimal model.
- Model Optimization
- Hyperparameter tuning to improve model accuracy, precision, recall, F1-score, and AUC-score.
- Evaluation of models before and after tuning for performance enhancement.
- Web Application Interface
- Integration of models into a Streamlit web app for user-friendly interaction.
- Features include user authentication (signup and login), data uploading, EDA, model training, and fraud prediction.
- Prediction and Deployment
- Real-time fraud prediction on new transaction data.
- Visualization of prediction results and performance metrics.
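The model comparison step listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training code: it uses a synthetic, imbalanced dataset from scikit-learn in place of the real transaction data, compares only Random Forest and KNN (XGBoost is left out so the sketch depends on scikit-learn alone), and ranks models by F1-score on a held-out split. All variable names are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, imbalanced stand-in for transaction data (about 5% positives).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=42
)

# Stratify so both splits keep the same share of the rare class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Candidate models (hypothetical settings, not the project's tuned values).
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# Fit each candidate and record its F1-score on the held-out data.
f1_by_model = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    f1_by_model[name] = f1_score(y_test, predictions, zero_division=0)

best_model = max(f1_by_model, key=f1_by_model.get)
print(f1_by_model, "->", best_model)
```

F1-score is used here because accuracy alone can look high on imbalanced data; any of the other listed metrics could be swapped in as the selection criterion.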
- dashboard/app.py: Streamlit application code handling the web interface, user authentication, data uploading, EDA, model building, and prediction functionalities.
- notebook/code.ipynb: Jupyter Notebook containing the detailed steps of data analysis, preprocessing, model training, evaluation, and visualization.
- script/simulate.py: Python script for simulating and testing the fraud detection models on synthetic data.
- data: Dataset for training and testing.
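To give a feel for what simulating synthetic data might involve, here is a small standard-library sketch. It is not the contents of script/simulate.py: the function name, the column names (step, customer, category, amount, fraud), and the assumption that fraudulent payments skew toward larger amounts are all hypothetical.

```python
import random


def simulate_transactions(n, fraud_rate=0.02, seed=0):
    """Return n synthetic transactions as dicts (hypothetical schema)."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    rows = []
    for i in range(n):
        is_fraud = rng.random() < fraud_rate
        # Assumption: fraudulent payments are drawn from a larger-amount distribution.
        mean_log_amount = 6.0 if is_fraud else 3.5
        rows.append({
            "step": i,
            "customer": f"C{rng.randrange(1000):04d}",
            "category": rng.choice(["food", "transport", "health", "travel"]),
            "amount": round(rng.lognormvariate(mean_log_amount, 1.0), 2),
            "fraud": int(is_fraud),
        })
    return rows


transactions = simulate_transactions(1000, fraud_rate=0.02, seed=42)
fraud_count = sum(t["fraud"] for t in transactions)
print(f"{fraud_count} fraudulent out of {len(transactions)}")
```

Fixing the seed makes a simulated batch repeatable, which helps when comparing model behavior across runs.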
- Programming Language
- Python 3.x
- Python Libraries
- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
- streamlit
- xgboost
- imbalanced-learn
- Tools
- Jupyter Notebook or JupyterLab
- Clone the Repository
git clone https://github.com/shxu7788/PayGuard.git
- Navigate to the Project Directory
cd PayGuard
- Create a Virtual Environment (Optional but Recommended)
python -m venv venv
- Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On macOS/Linux:
source venv/bin/activate
- Install Dependencies
pip install -r requirements.txt
- If requirements.txt is not provided, install dependencies manually:
pip install pandas numpy scikit-learn matplotlib seaborn streamlit xgboost imbalanced-learn
- Start the Streamlit App
streamlit run dashboard/app.py
- Access the Web Interface
- Open your web browser and navigate to http://localhost:8501.
- Interact with the Application
- Signup: Create a new user account to access the app features.
- Login: Access the application using your credentials.
- File Upload: Upload transaction data for analysis (ensure the data is in the correct format).
- EDA: Perform exploratory data analysis to understand data patterns and anomalies.
- Model Building: Train machine learning models and compare their performance.
- Prediction: Use the trained model to predict fraudulent transactions on new data.
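The README does not spell out what "the correct format" means for uploaded data, so the following is only a sketch of the kind of check that could run before analysis. The required columns (amount, category, fraud) are hypothetical placeholders; substitute the columns your dataset actually uses. Data uploaded purely for prediction would presumably not carry a fraud label, so that check would be dropped in that case.

```python
import csv
import io

# Hypothetical schema; replace with the columns your dataset actually uses.
REQUIRED_COLUMNS = {"amount", "category", "fraud"}


def validate_transactions_csv(text):
    """Return a list of problems found in CSV text (an empty list means OK)."""
    reader = csv.DictReader(io.StringIO(text))
    header = set(reader.fieldnames or [])
    missing = REQUIRED_COLUMNS - header
    if missing:
        return [f"missing columns: {', '.join(sorted(missing))}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            if float(row["amount"]) < 0:
                problems.append(f"line {line_no}: negative amount")
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: amount is not a number")
        if row["fraud"] not in {"0", "1"}:
            problems.append(f"line {line_no}: fraud must be 0 or 1")
    return problems


sample = "amount,category,fraud\n12.50,food,0\nabc,travel,1\n"
print(validate_transactions_csv(sample))
```

In a Streamlit app, the text could come from the object returned by st.file_uploader, for example by decoding its getvalue() bytes as UTF-8 before passing it to a function like this one.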
- Open the Notebook
- Navigate to the notebook directory.
- Open code.ipynb using Jupyter Notebook or JupyterLab.
- Execute the Cells
- Run the notebook cells sequentially to perform data analysis, model training, and evaluation.
- Modify code and parameters as needed for experimentation.
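One common way to experiment with parameters is a small grid search. The sketch below is an assumption about how such an experiment might look, not a copy of the notebook: it tunes two Random Forest parameters with scikit-learn's GridSearchCV on synthetic data, and the grid values are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic, imbalanced stand-in for the real dataset (about 10% positives).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Hypothetical search space; widen or narrow it as needed.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1",  # optimize F1 rather than accuracy on imbalanced data
    cv=3,          # 3-fold cross-validation per parameter combination
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Comparing search.best_score_ with the score of an untuned model gives the before-and-after view of tuning.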
- Data Preprocessing
- Handling of imbalanced datasets using Synthetic Minority Over-sampling Technique (SMOTE).
- Encoding of categorical variables and feature scaling for optimal model performance.
- Model Evaluation Metrics
- Assessment using accuracy, precision, recall, F1-score, and ROC-AUC curves.
- Visualization of model performance before and after hyperparameter tuning.
- User-Friendly Interface
- Streamlit web app provides an accessible platform for users without programming knowledge.
- Step-by-step guidance through data upload, analysis, and fraud prediction processes.
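To make the evaluation metrics above concrete, here is a small from-scratch sketch that computes accuracy, precision, recall, and F1-score for binary labels. It is illustrative only (the project most likely relies on scikit-learn's metric functions), the function name is hypothetical, and ROC-AUC is omitted because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, passed

    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 100 transactions, 5 of them fraudulent.
y_true = [1] * 5 + [0] * 95
always_legit = [0] * 100                        # never flags anything
detector = [1, 1, 1, 0, 0] + [1, 1] + [0] * 93  # 3 hits, 2 misses, 2 false alarms

print(classification_metrics(y_true, always_legit))
print(classification_metrics(y_true, detector))
```

The baseline that never flags fraud still reaches 0.95 accuracy while its recall is 0.0, which is why precision, recall, and F1-score matter on imbalanced data and why resampling techniques such as SMOTE are applied before training.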
Contributions to PayGuard are welcome! If you have ideas for enhancements or encounter any issues, please open an issue or submit a pull request on the GitHub repository.
This project is licensed under the Apache License - see the LICENSE file for details.
This project is inspired by various open-source projects and research papers on fraud detection and machine learning.
Thank you for exploring PayGuard. We hope this project helps in making financial transactions more secure and reliable. If you have any questions or need further assistance, feel free to reach out through the GitHub repository.
Empowering financial security through machine learning.