Model, preprocessed input data, and output predictions:
Output prediction CSV path: data/output/rubert-tiny2_v3_tag_1668632322_predict.csv
EDA notebook
Model training and inference notebook
The EDA summary is in output/EDA/Sber interview task.pdf
The document contains all necessary information about the EDA, experiment logic, metrics, and model selection.
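For a quick look at the prediction file listed above, here is a minimal sketch that prints nothing and assumes nothing about the column names. It does assume the file is comma-delimited, UTF-8 encoded, and starts with a header row; none of that is stated in this README. The helper name `preview_csv` is hypothetical, and the file is only present after the data has been downloaded.

```python
import csv
from itertools import islice
from pathlib import Path

# Path copied from this README; the file exists only once the data is downloaded.
PREDICT_CSV = Path("data/output/rubert-tiny2_v3_tag_1668632322_predict.csv")


def preview_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) of a CSV file.

    Assumptions (not confirmed by the README): comma delimiter,
    UTF-8 encoding, and a header in the first row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])           # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # read at most n_rows data rows
    return header, rows
```

Calling `preview_csv(PREDICT_CSV)` from the repository root returns the header and the first five rows, which is enough to see what columns the file actually contains before loading it in the notebook.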
1 Clone this repo and download the data from Google Drive
2 Check docker-compose.yml and select an available port for the notebook
3 Run docker-compose up --build
4 Open JupyterLab
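Step 2 asks for an available port. One way to find one before editing docker-compose.yml is sketched below, using only the Python standard library. The helper names `is_port_free` and `first_free_port` are hypothetical, and the starting value 8888 is just Jupyter's usual default, not something read from this repo's compose file.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can be bound right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
    return True


def first_free_port(start: int = 8888, attempts: int = 20) -> int:
    """Return the first bindable port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")
```

`first_free_port()` returns the first bindable port at or above 8888. Assuming the compose file maps a host port to the notebook port inside the container, that number would go on the host side of the mapping. The check is only a snapshot: another process could take the port between the check and `docker-compose up --build`.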
information about the data
data preprocessing
naive baseline
model training code
model testing code