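Development of an oil-condition classification model for real-time monitoring of the operating oil in construction equipment (binary classification: normal vs. abnormal)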
Competition Link
- Host: Hyundai Genuine (현대제뉴인) [Link]
- Sponsor: AWS
- Organizer: Dacon
- Private 12th, Score 0.577764
The train/test data and the sample submission file must be placed under the dataset folder.
repo.
|——dataset
|——train.csv
|——test.csv
|——sample_submission.csv
|——data_info.xlsx
|——models
|——teacher
|——Teacher_model1.pickle
|—— ...
|——Teacher_model5.pickle
|——student
|——Student_model1.pickle
|—— ...
|——Student_model5.pickle
|——result
|——submission_CatBoost.csv
|——ipynotebook
|——학습코드.ipynb
|——requirements.txt
|——teacher_train.py
|——student_train.py
|——inference.py
Windows 10
11th Gen Intel(R) Core(TM) i7-11850H
NVIDIA RTX A2000 Laptop GPU
If you want to run it on Colab or AWS SageMaker, run the ipynb notebook below. It also includes the EDA and the result analysis, and it outputs the same result.
- Google Colab
- AWS Sagemaker Studio Lab
repo.
|——ipynotebook
|——학습코드.ipynb
Package versions are listed in requirements.txt.
> python --version
Python 3.9.15
> pip install -r requirements.txt
> pip list
...
numpy 1.23.5
pandas 1.5.2
scikit-learn 1.2.0
catboost 1.1.1
optuna 3.0.4
...
Because the number of variables differs between the training environment and the inference environment, we used knowledge distillation. Training therefore has two stages, which produce a Teacher model and a Student model.
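The data flow behind this two-stage setup can be sketched as follows. This is a minimal illustration, not the code in teacher_train.py or student_train.py: the column names are hypothetical placeholders (the real ones are described in data_info.xlsx), and averaging the teacher's per-fold probabilities into one soft label is an assumption about how the teacher output is combined.

```python
from statistics import mean

# Hypothetical column names, used only for illustration.
TRAIN_COLUMNS = ["sensor_a", "sensor_b", "lab_only_c"]  # available during training
TEST_COLUMNS = ["sensor_a", "sensor_b"]                 # subset available at inference


def build_student_dataset(rows, teacher_fold_probs):
    """Pair inference-time features with the teacher's averaged probability.

    rows: list of dicts keyed by TRAIN_COLUMNS.
    teacher_fold_probs: one list of P(abnormal) per teacher fold model,
        each aligned with rows.
    """
    # Assumption: the soft label is the mean of the teacher folds' probabilities.
    soft_labels = [mean(probs) for probs in zip(*teacher_fold_probs)]
    # The student only ever sees the columns that exist at inference time.
    features = [[row[col] for col in TEST_COLUMNS] for row in rows]
    return features, soft_labels


rows = [
    {"sensor_a": 1, "sensor_b": 2, "lab_only_c": 3},
    {"sensor_a": 4, "sensor_b": 5, "lab_only_c": 6},
]
features, soft_labels = build_student_dataset(rows, [[0.25, 1.0], [0.75, 0.5]])
```

The student is then fit as a regressor on `features` against `soft_labels`, so it can imitate the teacher without the columns that are missing at inference time.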
- learning method: CatBoostClassifier
- 5 StratifiedKFold ensemble
- 5 model weights are saved after training
- includes a hyper-parameter tuning process with Optuna
> python teacher_train.py
models
|——teacher
|——Teacher_model1.pickle
|—— ...
|——Teacher_model5.pickle
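The file layout above can be produced and read back with a few lines of standard-library code. The sketch below is an assumption about how the fold models might be persisted, not an excerpt from the training scripts; `save_fold_models` and `load_fold_models` are hypothetical helper names.

```python
import pickle
from pathlib import Path


def save_fold_models(models, directory, prefix):
    """Pickle each fold model as <prefix>_model<i>.pickle, numbering from 1."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, model in enumerate(models, start=1):
        path = directory / f"{prefix}_model{i}.pickle"
        with path.open("wb") as f:
            pickle.dump(model, f)
        paths.append(path)
    return paths


def load_fold_models(directory, prefix, n_folds=5):
    """Load the fold models back in fold order."""
    models = []
    for i in range(1, n_folds + 1):
        path = Path(directory) / f"{prefix}_model{i}.pickle"
        with path.open("rb") as f:
            models.append(pickle.load(f))
    return models
```

For example, `save_fold_models(teacher_models, "models/teacher", "Teacher")` would write Teacher_model1.pickle through Teacher_model5.pickle for a list of five fitted models. Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.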
- train with the teacher's output as the label
- learning method: CatBoostRegressor
- 5 StratifiedKFold ensemble
- 5 model weights are saved after training
- includes a hyper-parameter tuning process with Optuna
> python student_train.py
models
|——student
|——Student_model1.pickle
|—— ...
|——Student_model5.pickle
- 5 fold ensemble (soft-voting) inference
- metric: F1 score
- threshold: 0.15
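The soft-voting step listed above can be sketched as below. It is a minimal illustration rather than the code in inference.py: `soft_vote` is a hypothetical name, and treating a score equal to the threshold as abnormal (`>=`) is an assumption.

```python
def soft_vote(fold_predictions, threshold=0.15):
    """Average the per-fold scores and label a sample abnormal (1) or normal (0).

    fold_predictions: one list of scores per fold model, all the same length.
    """
    n_folds = len(fold_predictions)
    # Soft voting: average the continuous scores instead of counting hard votes.
    averaged = [sum(scores) / n_folds for scores in zip(*fold_predictions)]
    # Assumption: scores at or above the threshold count as abnormal.
    return [1 if score >= threshold else 0 for score in averaged]


labels = soft_vote([[0.0, 0.5, 0.125], [0.25, 0.5, 0.125]])
```

Here the averaged scores are 0.125, 0.5 and 0.125, so only the second sample crosses the 0.15 threshold.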
python inference.py
check submission_CatBoost.csv
result
|——submission_CatBoost.csv
You can download our best model weights from Google Drive.
The best model must be unzipped under the models folder.
Unzip and move the best model to the models directory
python inference.py
check submission_CatBoost.csv
models # (the best model)
|——teacher
|——Teacher_model1.pickle
|—— ...
|——Teacher_model5.pickle
|——student
|——Student_model1.pickle
|—— ...
|——Student_model5.pickle
result # (the best result)
|——submission_CatBoost.csv
inference.py