Credit-Scoring-Data-Sets

These commonly used credit scoring data sets are collected for empirical evaluation, and I will update the list over time.

  1. UCI Repository:

    (1.1) German: http://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29

    or Kaggle URL: https://www.kaggle.com/uciml/german-credit

    (1.2) Australian: http://archive.ics.uci.edu/ml/datasets/Statlog+%28Australian+Credit+Approval%29

    (1.3) Taiwan: http://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients

    or Kaggle URL: https://www.kaggle.com/uciml/default-of-credit-card-clients-dataset

    (1.4) Japan: http://archive.ics.uci.edu/ml/datasets/Japanese+Credit+Screening

    (1.5) Polish: http://archive.ics.uci.edu/ml/datasets/Polish+companies+bankruptcy+data

Reference:

(1.1; 1.2; 1.4; 1.5): M. Lichman, UCI Machine Learning Repository, School of Information and Computer Science, University of California, Irvine, CA, http://archive.ics.uci.edu/ml/ (2013), accessed 1 September 2018.

(1.3): I.C. Yeh, C.H. Lien, The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients, Expert Syst. Appl. 36 (2009) 2473–2480.

  2. PAKDD 2009 Data Mining Competition:

    (2.1) pakdd 2009:

     link1: http://sede.neurotech.com.br/PAKDD2009 (temporarily inaccessible)
     
     link2: https://pakdd.org/archive/pakdd2009/front/show/competition.html
    

Reference: PAKDD data mining competition 2009, Credit risk assessment on a private label credit card application (2009), http://sede.neurotech.com.br/PAKDD2009

  3. Kaggle:

    (3.1) Give Me Some Credit (gmsc): https://www.kaggle.com/c/GiveMeSomeCredit

    (3.2) Home Credit Default Risk: https://www.kaggle.com/c/home-credit-default-risk/data

    (3.3) Credit Card Data from book "Econometric Analysis": https://www.kaggle.com/dansbecker/aer-credit-card-data

  4. Financial institutions in the Benelux (Belgium, the Netherlands and Luxembourg) and the UK:

    (4.1) bene1

    (4.2) bene2

    (4.3) uk

Reference: B. Baesens, T. Van Gestel, S. Viaene, M. Stepanova, J. Suykens, J. Vanthienen, Benchmarking state-of-the-art classification algorithms for credit scoring, Journal of the Operational Research Society 54 (6) (2003) 627–635.

  5. Thomas: for more information, see the reference [Thomas 2002]

    (5.1) thomas

Reference: L.C. Thomas, D.B. Edelman, J.N. Crook, Credit Scoring and its Applications, SIAM, Philadelphia, 2002.

  6. Credit Risk Analytics: http://www.creditriskanalytics.net

    (6.1) hmeq: http://www.creditriskanalytics.net/uploads/1/9/5/1/19511601/hmeq.csv

    (6.2) Mortgage: http://www.creditriskanalytics.net/uploads/1/9/5/1/19511601/mortgage_csv.rar

    (6.3) LGD: http://www.creditriskanalytics.net/uploads/1/9/5/1/19511601/lgd.csv

    (6.4) Ratings: http://www.creditriskanalytics.net/uploads/1/9/5/1/19511601/ratings.csv

Reference: B. Baesens, D. Roesch, H. Scheule, Credit Risk Analytics: Measurement Techniques, Applications, and Examples in SAS, John Wiley & Sons, 2016.
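
Once one of the CSV files above has been downloaded, a quick class-balance check is a useful first step before any empirical evaluation. The snippet below is only a minimal sketch: the function name `label_counts`, the made-up sample rows, and the column names `BAD` and `LOAN` are assumptions for illustration, so check the header of the file you actually download.

```python
import csv
import io
from collections import Counter


def label_counts(csv_text, target_column):
    """Count how often each value of target_column occurs in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Fail early if the assumed target column is not in the header.
    if target_column not in (reader.fieldnames or []):
        raise KeyError(f"column {target_column!r} not found")
    return Counter(row[target_column] for row in reader)


# Made-up rows standing in for a downloaded file; the column names
# are assumptions, not taken from the real data.
sample = "BAD,LOAN\n1,1100\n0,1300\n0,1500\n"
counts = label_counts(sample, "BAD")
print(counts)
```

For a real file, read it with `open(path, newline="").read()` and pass the text and the name of the target column to `label_counts`.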

  7. Lending Club:

    (7.1) Lending Club: https://www.lendingclub.com/info/download-data.action

    (7.2) Many LendingClub datasets on Kaggle: https://www.kaggle.com/datasets?sortBy=relevance&group=public&search=lending%20club&page=1&pageSize=20&size=all&filetype=all&license=all