# AICity 2022 Challenge: Self-Supervised (Contrastive) Deep Learning Models
This repo provides a reference implementation of the following papers in PyTorch, using the AICity 2022 Challenge Track 3 dataset as an illustrative example (a minimal loss sketch follows the paper list):
(1) Supervised Contrastive Learning. Paper
(2) A Simple Framework for Contrastive Learning of Visual Representations. Paper
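For orientation, here is a minimal sketch of the NT-Xent (normalized temperature-scaled cross-entropy) loss that SimCLR-style contrastive training uses. This is an illustrative re-implementation, not the exact loss code in this repo, and the `temperature` default is an assumption:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for a batch of positive pairs (z1[i], z2[i]).

    z1, z2: (N, D) projection-head outputs for two augmented views.
    Illustrative sketch only; the repo's own loss lives in its source tree.
    """
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, D), unit norm
    sim = z @ z.t() / temperature                       # scaled cosine similarities
    # Mask out self-similarity so a sample never counts as its own candidate.
    sim.fill_diagonal_(float('-inf'))
    # The positive of sample i is its other augmented view: (i + N) mod 2N.
    targets = torch.arange(2 * n, device=z.device)
    targets = (targets + n) % (2 * n)
    return F.cross_entropy(sim, targets)
```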
The AICity 2022 dataset can be downloaded from its official website.
Results on AICity 2022:

| Model | Arch | Setting | Loss | Accuracy (%) |
|---|---|---|---|---|
| CrossEntropy | ResNet50-C3D | Contrastive | Cross Entropy | 00.0 |
| NCELoss | ResNet50 | Contrastive | Contrastive | 00.0 |
Use `config.py` to adjust the settings, and switch between `datasets/train_dataset.py` and `datasets/test_dataset.py` as needed.
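As a rough guide to which knobs matter, a config along these lines is typical for this kind of pipeline. The field names below are hypothetical; check `config.py` for the actual options:

```python
# Hypothetical excerpt; the real option names live in config.py.
CONFIG = {
    'mode': 'train',            # 'train' or 'test', mirrors the --mode flag
    'dataset_root': './data',   # AICity 2022 Track 3 location (assumed layout)
    'model_depth': 50,          # ResNet depth: 18/34/50/101/152/200
    'batch_size': 32,
    'lr': 1e-3,
    'loss': 'contrastive',      # 'contrastive' (NCE) or 'cross_entropy'
}
```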
## Train

```bash
python main.py --mode train
```
## Test

```bash
python main.py --mode test
```
Linear evaluation stage:

```bash
python main_classifier.py --ckpt /checkpoints/model.pth
```
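For context, linear evaluation conventionally freezes the pre-trained encoder and trains only a linear head on its features. A minimal PyTorch sketch of that protocol, with hypothetical names (see `main_classifier.py` for the actual pipeline):

```python
import torch
import torch.nn as nn

def linear_eval_step(encoder, head, optimizer, clips, labels):
    """One linear-evaluation step: encoder frozen, only the head trains.

    Hypothetical helper; main_classifier.py wires up the real pipeline.
    """
    encoder.eval()
    with torch.no_grad():              # no gradients through the backbone
        feats = encoder(clips)         # (N, feat_dim) frozen representations
    logits = head(feats)               # linear probe on top of frozen features
    loss = nn.functional.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()                    # updates the linear head only
    optimizer.step()
    return loss.item()

# Example wiring with assumed shapes (feat_dim=2048, 16 action classes):
# head = nn.Linear(2048, 16)
# optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
```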
The pre-trained 3D ResNet models used here come from Hara et al.:

```bibtex
@inproceedings{hara3dcnns,
  author={Kensho Hara and Hirokatsu Kataoka and Yutaka Satoh},
  title={Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={6546--6555},
  year={2018},
}
```
Pre-trained models are available here. All models are trained on Kinetics-700 (K), Moments in Time (M), STAIR-Actions (S), or merged datasets of them (KM, KS, MS, KMS).
```
r3d18_K_200ep.pth:    --model resnet --model_depth 18  --n_pretrain_classes 700
r3d18_KM_200ep.pth:   --model resnet --model_depth 18  --n_pretrain_classes 1039
r3d34_K_200ep.pth:    --model resnet --model_depth 34  --n_pretrain_classes 700
r3d34_KM_200ep.pth:   --model resnet --model_depth 34  --n_pretrain_classes 1039
r3d50_K_200ep.pth:    --model resnet --model_depth 50  --n_pretrain_classes 700
r3d50_KM_200ep.pth:   --model resnet --model_depth 50  --n_pretrain_classes 1039
r3d50_KMS_200ep.pth:  --model resnet --model_depth 50  --n_pretrain_classes 1139
r3d50_KS_200ep.pth:   --model resnet --model_depth 50  --n_pretrain_classes 800
r3d50_M_200ep.pth:    --model resnet --model_depth 50  --n_pretrain_classes 339
r3d50_MS_200ep.pth:   --model resnet --model_depth 50  --n_pretrain_classes 439
r3d50_S_200ep.pth:    --model resnet --model_depth 50  --n_pretrain_classes 100
r3d101_K_200ep.pth:   --model resnet --model_depth 101 --n_pretrain_classes 700
r3d101_KM_200ep.pth:  --model resnet --model_depth 101 --n_pretrain_classes 1039
r3d152_K_200ep.pth:   --model resnet --model_depth 152 --n_pretrain_classes 700
r3d152_KM_200ep.pth:  --model resnet --model_depth 152 --n_pretrain_classes 1039
r3d200_K_200ep.pth:   --model resnet --model_depth 200 --n_pretrain_classes 700
r3d200_KM_200ep.pth:  --model resnet --model_depth 200 --n_pretrain_classes 1039
```
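To sanity-check a downloaded checkpoint before training, something like the following works. The `'state_dict'` key is an assumption based on how the 3D-ResNets-PyTorch codebase saves checkpoints; verify it against your file:

```python
import torch

# Inspect a downloaded checkpoint (the path is an example; any of the
# files listed above works the same way).
ckpt = torch.load('r3d50_KM_200ep.pth', map_location='cpu')
print(ckpt.keys())                   # typically includes 'arch' and 'state_dict'

state_dict = ckpt['state_dict']      # assumed key, per 3D-ResNets-PyTorch
# model = ...                        # build a matching ResNet-50 backbone first
# model.load_state_dict(state_dict)  # then load the pre-trained weights
print(len(state_dict), 'weight tensors in checkpoint')
```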
Please cite the following article if you use this code or pre-trained models:
```bibtex
@misc{seonwoolee2022naturalisticdrivingationrecognition,
  title = {TAL-CoLR: Temporal Action Localization using Contrastive Learning Representation},
  author = {Seon-Woo Lee and Jeong-Gu Kim and Heo-June Kim and Mun-Hyung Lee and So-Hee Yong and Jang-Woo Kwon},
  booktitle = {IEEE Xplore Digital Library and CVF Open Access},
  howpublished = {\url{https://github.com/LEE-SEON-WOO/Naturalistic-Driving-Action-Recognition/}},
  year = {2022},
  note = {commit xxxxxxx}
}
```
Our work is built on top of the following codebases: codebase 1 and codebase 2.
- This project could not have happened without the advice given by our contributors (HCI-LAB members).
- This project also borrows some code from the repositories listed below.
- Some README/docs formatting was borrowed from Cheng-Bin Jin's style.