NeuroWood2022

Nuclear Foxes Team solution to the NeuroWood 2022 Hackathon.


WoodHack solution by Nuclear Foxes Team

  • Sergey Zakharov — @f3ss1

  • Evgeny Plyushch — @zhekuson

If you find this repo useful, please consider smashing the ⭐ button!

Dataset

The dataset can be found on Kaggle; click the dataset title to open it.

Warnings

We did not save all the models, as their files are quite large and we had 100+ test runs. Thus, by default, only the models trained on the full dataset are saved. Of course, you can change that yourself using the save_model function from source.py.
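
For reference, here is a minimal sketch of what saving and restoring weights looks like in plain PyTorch; save_model in source.py presumably wraps something similar, so check its actual signature before relying on this:

```python
import torch
import torchvision

# Sketch only (not the repo's exact code): persist weights after a test run.
model = torchvision.models.vgg19_bn(num_classes=2)  # 2 classes is an assumption

torch.save(model.state_dict(), "Models/vgg19bn_test_run.pth")

# Later: rebuild the same architecture and load the weights back.
restored = torchvision.models.vgg19_bn(num_classes=2)
restored.load_state_dict(torch.load("Models/vgg19bn_test_run.pth"))
```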

We are using WandB to log every single run. If you are not, please remove its mentions from all the files, not forgetting source.py, which contains the train loops.
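
If you would rather not hunt down every call, one alternative (a sketch, not the repo's code) is to gate the logging behind a single flag:

```python
# Optional-WandB sketch: logging becomes a no-op when wandb is unavailable.
try:
    import wandb
except ImportError:
    wandb = None

USE_WANDB = wandb is not None

if USE_WANDB:
    wandb.init(project="neurowood2022")  # project name is an assumption

def log_metrics(metrics: dict) -> None:
    # Call this from the train loops instead of wandb.log directly.
    if USE_WANDB:
        wandb.log(metrics)

log_metrics({"train_loss": 0.42})  # example call
```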

Project structure

We did not include the Data folder in this repo, so please make sure to download it on your own. We also changed the storage layout: the Train folder should now contain all the images directly, with no per-label subfolders (i.e. Data/Train/IMG... instead of Data/Train/1/IMG...). All the labels are stored in train.csv, so nothing is lost.
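
For illustration, a minimal PyTorch Dataset matching that flat layout; the column names "id" and "label" are assumptions, so check train.csv and the actual dataset class in source.py:

```python
import pandas as pd
from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset

class WoodTrainDataset(Dataset):
    """Images live directly in Data/Train/, labels come from train.csv."""

    def __init__(self, root="Data/Train", csv_path="train.csv", transform=None):
        self.root = Path(root)
        self.labels = pd.read_csv(csv_path)
        self.transform = transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        row = self.labels.iloc[idx]
        image = Image.open(self.root / str(row["id"])).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, row["label"]
```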

Since we do not provide any pretrained models, the Models folder does not exist either; you need to create an empty one on your own.
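
One way to do that from a notebook cell:

```python
from pathlib import Path

# Create the empty Models folder so that model saving has somewhere to write.
Path("Models").mkdir(exist_ok=True)
```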

We also did not add any special handling for the 120.JPG file: we just renamed it to 120.png. Still, feel free to adjust the dataset in source.py if you think it should be handled differently.
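
The rename itself is a one-off, for example (paths assume the layout above):

```python
from pathlib import Path

# One-off fix: rename the odd 120.JPG so all filenames are consistent.
src = Path("Data/Train/120.JPG")
if src.exists():
    src.rename(src.with_name("120.png"))
```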

```
.
├── Data
│   ├── Train
│   │   └── ...
│   └── Test
│       └── ...
├── Models                          # Your future models will be saved here
│   └── ...
├── LICENSE
├── ReadMe.md
├── Train and create submit.ipynb   # Use this to train a model on the whole
│                                   # train dataset and create a submission
├── config.py                       # Configure common settings for the
│                                   # training process, e.g. batch size,
│                                   # image size, etc.
├── main.ipynb                      # Use this to experiment with your models
│                                   # and validate them
├── models.py                       # Store your models here
├── prediction.csv                  # Generated by "Train and create submit.ipynb"
├── sample_submission.csv
├── source.py                       # Core functions used in both .ipynb files,
│                                   # including the dataset, train loops, etc.
└── train.csv                       # .csv with labels for the train data
```

We achieved our best results using the VGG19-BN network and the following setup (a minimal sketch follows the list):

  • Batch size ~ 4;
  • SGD optimizer;
  • Image size 512x512;
  • No augmentations;
  • LR ~ 1e-3, no scheduler.
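
Here is a rough sketch of that configuration (the class count, pretrained weights, and head replacement are assumptions; the real values live in config.py and source.py):

```python
import torch
import torchvision
from torchvision import transforms

# Best reported setup: VGG19-BN, SGD, LR 1e-3, 512x512 inputs, batch size 4.
transform = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),  # no augmentations, per the list above
])

model = torchvision.models.vgg19_bn(weights="IMAGENET1K_V1")  # pretrained=True on older torchvision
model.classifier[6] = torch.nn.Linear(4096, 2)  # 2 classes is an assumption

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # no scheduler
criterion = torch.nn.CrossEntropyLoss()

# loader = torch.utils.data.DataLoader(train_dataset, batch_size=4, shuffle=True)
```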

UPD: on the private test set, EfficientNet-B3 turned out to be the best.
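
Swapping in EfficientNet-B3 (here via torchvision; the two-class head is again an assumption) is a small change:

```python
import torch
import torchvision

model = torchvision.models.efficientnet_b3(weights="IMAGENET1K_V1")
model.classifier[1] = torch.nn.Linear(model.classifier[1].in_features, 2)  # 2 classes assumed
```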