Nat-D/varuna-hackathon

Jupyter Notebook

Varuna Hackathon

Team: Noname

Instructions

Prepare environment

conda create -n varuna python=3.8; conda activate varuna
conda install -c conda-forge geopandas
pip install rasterio
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
pip install tensorboard
pip install -U albumentations
pip install tqdm

Train your model

put the raw data into the ./data_raw/ folder
Run python generate_mask.py to generate raster masks from polygons
Run python generate_data.py to process the raw data into numpy arrays. The process involved date selection, band selections and random crop the big image into multiple smaller data samples.
Run python train.py --name "my_model" --save to train and save your model.
Run tensorbaord --logdir=runs to visualize the training.

Use pretrained model to predict the test shape

Prepare your data into appropriate folders
Run python test_v2.py

Reproducing the CSV results

Our pretrained model can be downloaded here: https://drive.google.com/drive/folders/1drKdEeyS_zlUwh0ncXA4hyCx7ZHXhVx7?usp=sharing
Run python reproducing_first_submission.py to reproduce the first submission

Run python reproducing_second_submission.py to reproduce the 2nd submission

Model v1

A standard UNET segmentation
standardise input with per-sample statistics to avoid losing relative magnitudes between bands
extensive augmentations
a lot of hyper-param tuning
reduce the training image size to avoid model remebering exact location

Model v2

incorporate temporal data through computing maximum values of NDVI and EVI across time (only from 2021). We use that as additional channels in the inputs.
BatchNorm between convolutions
extensive augmentations

Final submission

We will use the 2nd model for the final submission. If time permitted, training multiple models to perform ensemble estimation might help.