- Develop a model to predict loan approval category (P1-P4 representing risk levels) for new customers.
- Leverage the model to aid loan approval decisions.
- Over 50,000 customer records with features including demographics, financial information, and loan history.
- Target variable: loan approval category (P1-P4), a categorical label.
- Identified and addressed missing values using appropriate techniques.
- Handled data inconsistencies.
- Performed exploratory data analysis (EDA) to understand data distribution and relationships.
- Conducted Chi-square tests to identify significant relationships between categorical features and the target variable.
- Used variance inflation factor (VIF) analysis to remove multicollinear numerical features.
- Applied domain knowledge to create new features where needed.
- Built an XGBoost model to predict loan approval category.
- Tuned hyperparameters to optimize model performance.
- Evaluated the model on unseen data using accuracy as the metric.
- Achieved 85% accuracy on unseen data.
- Developed a model effective in assessing credit risk and aiding loan approval decisions.
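
The Chi-square and VIF steps listed above can be sketched as follows. This is a minimal, standard-library-only illustration, not the code from the notebooks: the function names (`chi_square_stat`, `vif`) are hypothetical, and in practice library routines such as `scipy.stats.chi2_contingency` and statsmodels' `variance_inflation_factor` would be the more usual choice. VIF is computed here as the diagonal of the inverse correlation matrix, which assumes no feature is an exact linear combination of the others.

```python
from collections import Counter
from math import sqrt


def chi_square_stat(feature, target):
    """Return (chi-square statistic, degrees of freedom) for two categorical lists."""
    n = len(feature)
    observed = Counter(zip(feature, target))  # missing cells count as 0
    row_totals = Counter(feature)
    col_totals = Counter(target)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n  # count expected under independence
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [sqrt(sum(x * x for x in col)) for col in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]  # zero here means the features are exactly collinear
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF per feature: the diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[i][i] for i in range(len(columns))]
```

A categorical feature would be kept when its statistic is large relative to the Chi-square distribution with the returned degrees of freedom (equivalently, when its p-value is small), and numerical features with a large VIF would be dropped one at a time. The cutoffs used in this project are not stated here, so any threshold is a judgment call.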
This repository includes:
- Jupyter notebooks for data cleaning, feature engineering, model building, and evaluation.
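
As a rough illustration of the evaluation step covered in the notebooks, the sketch below computes overall accuracy and per-class recall for the P1-P4 labels. The function names are hypothetical, and per-class recall is an addition not mentioned above; it is included because overall accuracy alone can mask weak performance on a less frequent risk category.

```python
from collections import Counter

LABELS = ["P1", "P2", "P3", "P4"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred, labels=LABELS):
    """Recall for each label that appears at least once in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {
        label: hits[label] / support[label]
        for label in labels
        if support[label]  # skip labels absent from y_true
    }
```

Both functions take plain lists of labels, so they can be applied to the held-out predictions of any classifier, including the XGBoost model described above.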