/kaggle-lyft-motion-pred

Training code for kaggle competition, Lyft Motion Prediction for Autonomous Vehicles

Primary LanguagePythonApache License 2.0Apache-2.0

lint Imports: isort Code style: black deepcode

Description

Training and Prediction code for Kaggle competition, Lyft Motion Prediction for Autonomous Vehicles. The target is motion predicion over 5 sec for each vehicle, the pink track below. Each prediction is evaluated with "multi-modal negative log-likelihood loss".

input image 0 input image 1

How to run

First, download the data, here, and full training data, here. You will get the followings.

/your/dataset/root_path
    │
    ├ meta.json
    ├ single_mode_sample_submission.csv
    ├ multi_mode_sample_submission.csv
    ├ aerial_map/
    ├ semantic_map/
    └ scenes/
        ├ mask.npz
        ├ sample.zarr
        ├ test.zarr
        ├ train.zarr
        ├ train_full.zarr
        └ validate.zarr

Install dependencies,

# clone project
git clone https://github.com/Fkaneko/kaggle-lyft-motion-pred

# install project
cd kaggle-lyft-motion-pred
pip install -r requirements.txt

Run training and testing it,

# run training
python run_lyft_mpred.py  \
          --l5kit_data_folder \
          /your/dataset/root_path \
          --epochs \
          1 \
          --lr \
          5.2e-4 \
          --batch_size \
          128 \
          --num_workers \
          4
# run test
python run_lyft_mpred.py  \
          --l5kit_data_folder \
          /your/dataset/root_path \
          --is_test \
          --ckpt_path \
          /your/trained/ckeckpoint_path
          --batch_size \
          128 \
          --num_workers \
          4

After testing you will find ./submission.csv, and you can check the test score throguh this notebook.

Competition summary

The baseline CNN regression approach, just replacing the 1st and final layers of Imagenet pretrained model, was strong. Segmentaion or RNN approeches are not good. And following tips we can get top-10 equivalent performance(11.238) using the baseline approach.

Actually I got the following result. The history_frames was 10 at the baseline so if you can use 10 instead of 2 you may get better result than top-10 score, 11.283 with this single model. You can check 11.377 result with my full test pipeline and trained weight at kaggle notebook.

model backbone scenes iteration x batch_size loss history_frames MIN_FRAME_HISTORY / FUTURE test score
baseline resnet50 11314 100k x 64 single mode 10 10/1 104.195
this study seresnext26 134622 451k x 440 multi-modal 2 (10->2 due to time constraint) 0/10 11.377

[Note] The backbone difference is not a matter, within top-10 solution a single resnet18 reaches score < 10.0. But smaller model tends to be better for this task.

Prediction Visualization

Prediction visualization with this study model. The left most figure is ground truth track is displayed and remaining three images are 3-mode predictions. The loss is over three modes. A intersecetion scene with a trafic light is hard as expected. pred image 0 pred image 1

License

Code

Apache 2.0

Dataset

Please check, https://self-driving.lyft.com/level5/prediction/

Reference