OTUS Machine Learning Advanced
AutoML - try out automatic feature generation/selection and modelling:
- Compare AutoML performance in the ATOM library (provides a TPOT wrapper)
https://tvdboom.github.io/ATOM/about/
with a baseline and two out-of-the-box models.
- Compare AutoML performance in the mljar-supervised library
https://github.com/mljar/mljar-supervised
with two out-of-the-box models. In addition, ensembling of the AutoML models will be tried.
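The ensembling scheme is not specified here. As an illustration only, one common way to combine several fitted classifiers is soft voting: average their predicted class probabilities and take the most probable class. The sketch below uses plain Python lists. The helper `soft_vote` and the toy probability tables are hypothetical and are not part of ATOM or mljar-supervised.

```python
def soft_vote(prob_sets, weights=None):
    """Blend per-model class probabilities and return the winning class index per sample.

    prob_sets: list of models, each a list of per-sample probability rows.
    weights:   optional per-model weights (defaults to equal weights).
    """
    if not prob_sets:
        raise ValueError("need at least one model's probabilities")
    weights = weights or [1.0] * len(prob_sets)
    weight_sum = sum(weights)
    predictions = []
    for i in range(len(prob_sets[0])):
        n_classes = len(prob_sets[0][i])
        # Weighted average of each class probability across models
        avg = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / weight_sum
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


# Toy probabilities for two samples and three classes (made-up numbers)
probs_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
probs_b = [[0.3, 0.4, 0.3], [0.1, 0.3, 0.6]]

blended = soft_vote([probs_a, probs_b])                       # equal weights
blended_weighted = soft_vote([probs_a, probs_b], [3.0, 1.0])  # trust model A more
```

With equal weights the second sample follows model B. Weighting model A three times as heavily flips it back to model A's choice.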
Means:
AutoML tasks will be handed to ATOM and mljar-supervised, respectively. All preprocessing and pipeline management will be done in ATOM.
Dataset:
- Richter's Predictor: Modeling Earthquake Damage
https://www.drivendata.org/competitions/57/nepal-earthquake/data/
Choice of models:
- Random Forest and CatBoost classifiers will compete with the AutoML solutions. LogisticRegression is added as a baseline in the ATOM case.
Methodology:
- The hyperparameters of the out-of-the-box (OOB) models will be tuned with Bayesian optimization (BO), primarily to get some CV statistics and to level the playing field.
- Weighted F1 score will be used as the main performance metric, following the suggestion of the
competition organizers. Other metrics are collected where possible.
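To make the metric concrete, here is a minimal pure-Python sketch of a weighted F1 score: the F1 of each class is computed separately and then averaged with weights proportional to the number of true samples in that class. This is the same definition scikit-learn uses for `f1_score(..., average="weighted")`. The function name `weighted_f1` and the toy labels are illustrative assumptions, not code from the notebooks.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    support = Counter(y_true)            # true samples per class
    labels = set(y_true) | set(y_pred)
    total = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += support[label] * f1     # classes absent from y_true get weight 0
    return total / len(y_true)


# Toy three-class labels (made-up values, not taken from the dataset)
y_true = [1, 2, 2, 2, 3, 3]
y_pred = [1, 2, 2, 3, 3, 2]
score = weighted_f1(y_true, y_pred)
```

Because the weights follow class frequency, a frequent class dominates the score. That is worth keeping in mind when the target classes are imbalanced.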
Colab notebooks:
ATOM autoML
mljar-supervised AutoML