Classification models, including logistic regression and random forest, are introduced along with their diagnostic methods and tuning processes. Variable importance analysis is highlighted, followed by a comparison of the two models. Imputation methods for handling missing data are then discussed, and finally applications of both models to building a behavioural scorecard in banking are explored.
Data and code for constructing the internal behaviour scorecard are not available in this repository due to copyright restrictions. Rights reserved @AtomBank.