Bigcon

Bigcontest code

Primary language: Jupyter Notebook · License: MIT

Loan Application Prediction Analysis Using App Usability Data

Overview

image

Results

  • The results for the forecast after June can be found at the following Google Drive link: Google Drive

  • test.csv == 데이터분석분야_퓨처스부분_이용재와아이들_평가데이터.csv

  • cluster_user.csv

  • To generate the final result file (test.csv) with the same application id and product id as the test set for submission, run the following Jupyter notebook: 9_select_the_submit_data.ipynb

Notice

You MUST create the following folders in advance.

  • data
  • prepro_data
  • DL_dataset
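The three required folders can be created with a short snippet (folder names taken from the list above):

```python
import os

# create the folders the notebooks expect to exist
for folder in ["data", "prepro_data", "DL_dataset"]:
    os.makedirs(folder, exist_ok=True)
```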

The data folder must contain the loan_result.csv, log_data.csv, and user_spec.csv files. Google Drive
When you run 1_preprocessing_real.ipynb, the preprocessed data is stored in the prepro_data folder as the following two files:

  • full_data.csv
  • submit_test.csv

full_data.csv is the training dataset covering the period before June, and submit_test.csv is the test dataset for June onwards.
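The June cutoff can be sketched as a simple date-based split; this is a minimal illustration with hypothetical column names (the repo's actual columns may differ):

```python
import pandas as pd

# stand-in for the merged raw data; column names are assumptions
df = pd.DataFrame({
    "application_id": [1, 2, 3, 4],
    "insert_time": pd.to_datetime(
        ["2022-03-01", "2022-05-15", "2022-06-02", "2022-06-20"]),
    "is_applied": [0, 1, None, None],
})

# rows before June form the training set; June onwards is the submission test set
full_data = df[df["insert_time"].dt.month < 6]
submit_test = df[df["insert_time"].dt.month >= 6]
print(len(full_data), len(submit_test))  # -> 2 2
```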

Next, when 2_Preprocessing_2.ipynb and 3_Preprocessing_3.ipynb are run, the data on the user's behavior is merged into loan_result.csv. Ray is used here for parallel processing. The resulting files are again stored in the prepro_data folder as full_data.csv and submit_test.csv.

Third, by running 4_Preprocessing_4.ipynb, continuous variables are converted into categorical variables and stored in the dataset folder. The stored datasets are as follows.

  • full_data.csv
  • submit_test.csv
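Converting a continuous variable into a categorical one is commonly done with quantile binning; a minimal sketch, where the column name, number of bins, and labels are assumptions rather than the repo's actual choices:

```python
import pandas as pd

# hypothetical continuous column; real features and bin counts may differ
df = pd.DataFrame({"credit_score": [300, 520, 640, 710, 880]})

# quantile-based binning into 4 roughly equal-sized categories
df["credit_score_cat"] = pd.qcut(
    df["credit_score"], q=4,
    labels=["low", "mid_low", "mid_high", "high"])
print(df["credit_score_cat"].tolist())
```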

Finally, a dataset for deep learning is built by executing 6_DL_models_inputs.ipynb. The following files are stored in the DL_dataset folder.

  • fold_0.csv
  • fold_1.csv
  • fold_2.csv
  • fold_3.csv
  • fold_4.csv
  • train.csv
  • test.csv
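The fold files above suggest a 5-fold split of the training data; a minimal sketch of such a split (the repo's exact fold assignment, e.g. whether it is stratified, is not specified, so this random split is an assumption):

```python
import os
import tempfile

import numpy as np
import pandas as pd

# stand-in for full_data.csv; real columns differ
df = pd.DataFrame({"x": range(10), "is_applied": [0, 1] * 5})

# assign each row to one of 5 folds at random
rng = np.random.default_rng(42)
fold = rng.permutation(len(df)) % 5

out_dir = tempfile.mkdtemp()  # stands in for the DL_dataset folder
for k in range(5):
    df[fold == k].to_csv(os.path.join(out_dir, f"fold_{k}.csv"), index=False)
```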

All deep learning results are likewise stored in the DL_dataset folder.

Get Started

  1. Install Python 3.8 (e.g., create a conda environment: conda create -n bigcon python=3.8).
  2. Download the data (it must first be placed in the ./data folder).
  3. Install the required packages:
  • pip install -r requirements.txt
  • pip3 install autogluon (for GPU mode, see the link)
  • pip install ray (used for preprocessing)
  4. For preprocessing, run the five Jupyter notebooks as follows.
  • 1_preprocessing_real.ipynb
  • 2_Preprocessing_2.ipynb
  • 3_Preprocessing_3.ipynb
  • 4_Preprocessing_4.ipynb
  • 6_DL_models_inputs.ipynb
  5. To run the ML model, run the following Jupyter notebook. The weights of all models can be downloaded from the following Google Drive link: Google Drive
  • 5_test_modeling-ACC-ALL.ipynb
  6. Train the deep learning models. We provide the experiment scripts of all benchmarks under the folder ./runfile. The deep learning weights can be downloaded from checkpoints.zip. You can reproduce the experiment results by:
bash ./runfile/big_1.sh
  7. For the machine learning and deep learning models, run the voting ensemble 7_ML_DL_model_output.ipynb. All results are stored in the submit folder.

  8. Run 8_Clustering.ipynb for clustering results. All results are stored in the submit folder.


Models

  • We construct the final model as an ensemble of machine learning models and deep learning models.
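The combination step can be sketched as soft voting over predicted probabilities; the probabilities and the 0.5 threshold below are illustrative assumptions, not values from the repo:

```python
import numpy as np

# hypothetical predicted probabilities from the ML and DL models
ml_prob = np.array([0.2, 0.8, 0.6])
dl_prob = np.array([0.4, 0.7, 0.3])

# soft voting: average the probabilities, then threshold
ensemble_prob = (ml_prob + dl_prob) / 2
pred = (ensemble_prob >= 0.5).astype(int)
print(pred.tolist())  # -> [0, 1, 0]
```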

image

image

Clustering

  • Clustering was performed on the embedding vectors extracted from the deep learning model, and an evaluation index based on the cumulative probability distribution was devised to find the optimal clusters in the high-dimensional embedding space.
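A minimal sketch of selecting a cluster count from embeddings: here synthetic vectors stand in for the model's embeddings, and the silhouette score is used as a stand-in metric, since the repo's own CDF-based evaluation index is not reproduced here:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# synthetic stand-in for the deep learning embedding vectors:
# two well-separated groups of 8-dimensional points
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 0.1, (50, 8)),
                 rng.normal(3, 0.1, (50, 8))])

# pick the cluster count that maximizes the (stand-in) score
best_k, best_score = None, -1.0
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(emb)
    score = silhouette_score(emb, labels)
    if score > best_score:
        best_k, best_score = k, score
print(best_k)  # -> 2
```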

image

Contact

If you have any questions or want to use the code, please contact yoontae@unist.ac.kr.