This project uses data analytics and machine learning to build a framework for credit risk assessment, credit scoring, and loan optimization. The framework is designed to support a Buy-Now-Pay-Later (BNPL) service offered by a financial service provider in partnership with an eCommerce company. Its primary objectives are customer segmentation, credit risk modeling, and loan optimization.
- Objective: Conduct in-depth customer segmentation using RFMS (Recency, Frequency, Monetary Value, and Standard Deviation of Amount Spent) scores.
- Purpose: Classify customers into high-risk and low-risk segments to tailor the BNPL or loan service offerings.
- Objective: Develop machine learning models to predict credit risk and default probabilities.
- Outcome: Provide a risk probability for each customer, aiding in the assessment of creditworthiness and default risk.
- Credit Score Model: Create a credit score model based on risk probabilities, aligned with FICO standards.
- Objective: Develop a model to determine optimal loan amounts, repayment periods, and other terms.
- Outcome: Offer personalized financing options to customers, enhancing the BNPL service's value proposition.
The project integrates supervised and unsupervised machine learning techniques, including logistic regression, decision trees, random forests, and clustering algorithms. These models are trained and validated using historical BNPL data and external credit bureau information. The ultimate goal is to embed these models into the BNPL platform, improving credit decision-making, customer satisfaction, and business performance.
- Data Collection and Preprocessing
- Exploratory Data Analysis (EDA)
- Feature Engineering
- Weight of Evidence (WoE) Binning
- Feature Selection
- Model Development
- Model Evaluation and Selection
- Model Deployment and Integration
- Monitoring and Continuous Improvement
- Installation
- Usage
- Contributing
- License
- Acknowledgments
Gather and preprocess historical BNPL application and repayment data. This includes data cleaning, handling missing values, and normalization.
- Notebook: Data Cleaning
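As a rough illustration of the cleaning, missing-value handling, and normalization steps, the sketch below uses pandas. The column names (`customer_id`, `amount`, `channel`) are hypothetical, and median imputation and min-max scaling are assumed choices here, not necessarily what the notebook does.

```python
import numpy as np
import pandas as pd


def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, fill missing values, and min-max scale numeric columns."""
    out = df.drop_duplicates().copy()
    numeric_cols = out.select_dtypes(include="number").columns
    other_cols = out.columns.difference(numeric_cols)

    # Numeric gaps -> column median; non-numeric gaps -> explicit "unknown" label.
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())
    out[other_cols] = out[other_cols].fillna("unknown")

    # Min-max normalization to [0, 1]; constant columns are mapped to 0.
    col_min = out[numeric_cols].min()
    col_range = (out[numeric_cols].max() - col_min).replace(0, 1)
    out[numeric_cols] = (out[numeric_cols] - col_min) / col_range
    return out


# Hypothetical toy data: one duplicated row and one row with missing values.
raw = pd.DataFrame(
    {
        "customer_id": ["c1", "c1", "c2", "c3"],
        "amount": [100.0, 100.0, np.nan, 300.0],
        "channel": ["web", "web", None, "app"],
    }
)
clean = clean_transactions(raw)
```

Scaling parameters would normally be learned on training data only and reused at scoring time; that split is left out to keep the sketch short.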
Analyze customer characteristics, behaviors, and credit profiles to identify patterns influencing credit risk and repayment.
- Notebook: EDA
Create new features, including RFMS scores, based on insights from EDA to enhance the models' predictive power.
A default estimator (proxy) is constructed by visualizing all transactions in the RFMS space and establishing a boundary that separates customers with high RFMS scores from those with low RFMS scores.
- Visualizing Transactions in RFMS space & Establishing boundaries.
- Notebook: Feature Engineering
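The RFMS features and the proxy label described above could be computed roughly as follows. This is a minimal sketch under stated assumptions: the column names (`customer_id`, `timestamp`, `amount`) are hypothetical, the equal-weight average of scaled components is an illustrative scoring rule, and the median boundary stands in for the boundary the project derives visually. Treating a higher standard deviation as favorable is also an assumption.

```python
import pandas as pd


def rfms_table(tx: pd.DataFrame, snapshot: pd.Timestamp) -> pd.DataFrame:
    """Aggregate transactions into per-customer R, F, M, and S features."""
    grouped = tx.groupby("customer_id")
    return pd.DataFrame(
        {
            # Days since the customer's most recent transaction.
            "recency": (snapshot - grouped["timestamp"].max()).dt.days,
            "frequency": grouped["amount"].count(),
            "monetary": grouped["amount"].sum(),
            # Std is undefined for a single transaction; treat it as 0.
            "std_amount": grouped["amount"].std().fillna(0.0),
        }
    )


def add_proxy_label(rfms: pd.DataFrame, boundary=None) -> pd.DataFrame:
    """Add an RFMS score in [0, 1] and a high-risk flag below the boundary."""
    scaled = (rfms - rfms.min()) / (rfms.max() - rfms.min()).replace(0, 1)
    scaled["recency"] = 1 - scaled["recency"]  # a more recent purchase scores higher
    out = rfms.copy()
    out["rfms_score"] = scaled.mean(axis=1)
    # Assumed default boundary: the median score (hypothetical choice).
    cut = out["rfms_score"].median() if boundary is None else boundary
    out["high_risk"] = (out["rfms_score"] < cut).astype(int)
    return out


# Hypothetical toy transactions.
tx = pd.DataFrame(
    {
        "customer_id": ["c1", "c1", "c2"],
        "timestamp": pd.to_datetime(["2024-01-01", "2024-01-29", "2024-01-10"]),
        "amount": [100.0, 300.0, 50.0],
    }
)
rfms = rfms_table(tx, snapshot=pd.Timestamp("2024-01-31"))
labeled = add_proxy_label(rfms)
```

The resulting `high_risk` column can then serve as the proxy target for the supervised models.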
Apply WoE binning to transform features into a suitable format for machine learning models.
- Notebook: WoE Binning
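For reference, a hand-rolled WoE calculation for a single numeric feature might look like the sketch below. It assumes quantile bins, a binary target where 1 means default ("bad"), the convention WoE = ln(share of goods / share of bads), and a 0.5 smoothing count to avoid division by zero. The notebook may use a dedicated binning library and a different convention.

```python
import numpy as np
import pandas as pd


def woe_table(feature: pd.Series, target: pd.Series, bins: int = 5) -> pd.DataFrame:
    """Per-bin Weight of Evidence and Information Value contributions."""
    binned = pd.qcut(feature, q=bins, duplicates="drop")
    counts = pd.crosstab(binned, target).reindex(columns=[0, 1], fill_value=0)

    # Smoothed counts so that empty cells do not produce log(0).
    good = counts[0] + 0.5
    bad = counts[1] + 0.5
    dist_good = good / good.sum()
    dist_bad = bad / bad.sum()

    table = pd.DataFrame({"good": counts[0], "bad": counts[1]})
    table["woe"] = np.log(dist_good / dist_bad)
    table["iv"] = (dist_good - dist_bad) * table["woe"]
    return table


# Hypothetical toy data: low values default more often than high values.
amount = pd.Series([1, 2, 3, 4, 5, 6, 7, 8], name="amount")
default = pd.Series([1, 1, 1, 0, 0, 0, 0, 0], name="default")
table = woe_table(amount, default, bins=2)
total_iv = table["iv"].sum()
```

Under this convention a negative WoE marks a bin with more than its share of defaults, and the summed IV gives a rough measure of how well the feature separates the two classes.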
Select the most relevant features using techniques like correlation analysis and recursive feature elimination.
- Correlation Analysis
- Selected Features
- Notebook: Feature Selection
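The two techniques named above can be combined as in this sketch using scikit-learn. The 0.9 correlation threshold, the logistic regression estimator inside RFE, and the synthetic feature names are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression


def drop_correlated(X: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop the later column of every pair whose |correlation| exceeds threshold."""
    corr = X.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)


def select_with_rfe(X: pd.DataFrame, y, n_features: int = 2) -> list:
    """Rank features by recursive feature elimination and keep the top n."""
    selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=n_features)
    selector.fit(X, y)
    return list(X.columns[selector.support_])


# Hypothetical synthetic data: "a_copy" duplicates "a", "noise" is irrelevant.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = pd.DataFrame({"a": a, "b": b, "a_copy": 2 * a, "noise": rng.normal(size=200)})
y = (a + b > 0).astype(int)

reduced = drop_correlated(X)
selected = select_with_rfe(reduced, y, n_features=2)
```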
Develop machine learning models that predict credit risk, produce credit scores, and estimate loan optimization metrics.
- Notebook: Model Development
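To make the link between risk probability and credit score concrete, here is a minimal sketch: a logistic regression trained on synthetic data, followed by a linear mapping of the predicted default probability onto a 300 to 850 range. The synthetic data, the train/test split settings, and the helper `probability_to_score` with its linear mapping are illustrative assumptions, not the project's actual scoring formula.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def probability_to_score(p_default, low: int = 300, high: int = 850) -> np.ndarray:
    """Map default probability to a score: p=0 -> high, p=1 -> low (assumed linear)."""
    p = np.clip(np.asarray(p_default, dtype=float), 0.0, 1.0)
    return np.rint(low + (high - low) * (1.0 - p)).astype(int)


# Synthetic stand-in for engineered features and the proxy default label.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
logit = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(500) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p_default = model.predict_proba(X_test)[:, 1]  # probability of class 1 (default)
auc = roc_auc_score(y_test, p_default)
scores = probability_to_score(p_default)
```

The same pattern applies to the decision tree and random forest models, since scikit-learn classifiers share the `fit` and `predict_proba` interface.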
Deploy the selected models into the BNPL platform to enhance decision-making processes.
Continuously monitor and refine the models to ensure they maintain high accuracy and effectiveness over time.
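One common way to monitor a deployed scoring model is to track drift in its inputs or scores with the Population Stability Index (PSI). The README does not say which monitoring metric the project uses, so the sketch below is only one possible approach; the function name, bin count, and synthetic data are assumptions.

```python
import numpy as np


def population_stability_index(expected, actual, bins: int = 10, eps: float = 1e-6) -> float:
    """PSI between a reference sample and a recent sample of one variable."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Interior bin edges come from quantiles of the reference distribution.
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))[1:-1]
    exp_counts = np.bincount(np.searchsorted(edges, expected, side="right"), minlength=bins)
    act_counts = np.bincount(np.searchsorted(edges, actual, side="right"), minlength=bins)

    # Clip shares away from zero so the logarithm stays finite.
    exp_pct = np.clip(exp_counts / len(expected), eps, None)
    act_pct = np.clip(act_counts / len(actual), eps, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))


# Hypothetical samples: training-time scores versus two later batches.
rng = np.random.default_rng(7)
reference = rng.normal(0.0, 1.0, size=5000)
psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, rng.normal(1.0, 1.0, size=5000))
```

A PSI near zero suggests the population is stable, while a large value is a signal to investigate and possibly retrain. Any alert threshold should be chosen for the specific portfolio.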
- Python 3.x
- Virtual environment (e.g., virtualenv, conda)
- Clone the repository:
    git clone https://github.com/Daniel-Andarge/AiML-credit-risk---loan-optimization-ml.git
- Navigate to the project directory:
    cd AiML-credit-risk---loan-optimization-ml
- Create and activate a virtual environment:
    # Using virtualenv
    virtualenv venv
    source venv/bin/activate
    # Using conda
    conda create -n your-env python=3.x
    conda activate your-env
- Install the required dependencies:
    pip install -r requirements.txt
Open the Jupyter notebooks in your preferred environment and follow the instructions. Customize the code based on your dataset and requirements.
While this project is primarily a portfolio piece and not open for external contributions, feedback and suggestions are always welcome. If you have any thoughts or comments, reach out via email or LinkedIn.
This project is licensed under the MIT License.
- Email: Send Message
- LinkedIn: Daniel Andarge
👤 Daniel Andarge