@inproceedings{
li2021ctnet,
title={{\{}CT{\}}-Net: Channel Tensorization Network for Video Classification},
author={Kunchang Li and Xianhang Li and Yali Wang and Jun Wang and Yu Qiao},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=UoaQUQREMOs}
}
[2021/6/3] We release the PyTorch code of CT-Net. More details and models will be available.
All models can be trained on a single machine (e.g., with 8 1080Ti GPUs). Some tricks can help you save GPU memory, such as mixed precision or torch.utils.checkpoint.
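The two memory-saving tricks mentioned above can be sketched generically as follows. This is a minimal illustration, not code from this repository: `CheckpointedBlock`, `train_step`, and the toy model are hypothetical names, and the snippet assumes a PyTorch version that provides `torch.autocast` and the `use_reentrant` argument of `torch.utils.checkpoint.checkpoint`.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedBlock(nn.Module):
    """Hypothetical wrapper: drop the activations of `block` in the forward
    pass and recompute them during backward, trading compute for memory."""

    def __init__(self, block):
        super().__init__()
        self.block = block

    def forward(self, x):
        if self.training:
            # use_reentrant=False is the mode recommended by recent PyTorch releases.
            return checkpoint(self.block, x, use_reentrant=False)
        return self.block(x)


def train_step(model, inputs, targets, optimizer, scaler, device):
    """One optimization step; mixed precision is enabled only on CUDA."""
    use_amp = device.type == "cuda"
    optimizer.zero_grad()
    with torch.autocast(device_type=device.type, enabled=use_amp):
        loss = nn.functional.cross_entropy(model(inputs), targets)
    # With the scaler disabled (CPU), these calls reduce to plain backward/step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Toy stand-in for a real network; this is NOT the CT-Net architecture.
model = nn.Sequential(
    CheckpointedBlock(nn.Sequential(nn.Linear(16, 32), nn.ReLU())),
    nn.Linear(32, 4),
).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.02)
scaler = torch.cuda.amp.GradScaler(enabled=(device.type == "cuda"))

inputs = torch.randn(8, 16, device=device)
targets = torch.randint(0, 4, (8,), device=device)
loss_value = train_step(model, inputs, targets, optimizer, scaler, device)
```

In a real network, checkpointing is usually applied to the most memory-hungry stages (for example, individual residual stages), since every wrapped block costs one extra forward pass during backward.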
Some models were lost after our machine was compromised by mining malware. If you run into any problems training the models, please create an issue or send me an email.
We have now released the model for visualization (Something-Something V1). Please download it from here and put it in ./model (password: t3to).
pip install -r requirements.txt
In our paper, we conduct experiments on Kinetics-400, Something-Something V1&V2, UCF101, and HMDB51. Please refer to the TSM repo for a detailed guide to data pre-processing.
Please refer to scripts/train.sh and scripts/test.sh. More details can be found in the appendix of our paper.
source ./init.sh
We use dense sampling for Kinetics and uniform sampling for Something-Something.
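To make the difference between the two strategies concrete, here is a minimal sketch of how frame indices are commonly chosen in TSN/TSM-style pipelines. The function names, the `stride` default, and the deterministic (test-time style) offsets are assumptions for illustration. The exact logic in this repository's data loader may differ.

```python
def uniform_sample(num_frames, num_segments):
    """Uniform sampling: split the video into `num_segments` equal chunks
    and take the frame at the centre of each chunk."""
    seg_len = num_frames / num_segments
    return [int(seg_len * i + seg_len / 2) for i in range(num_segments)]


def dense_sample(num_frames, num_segments, stride=8, start=0):
    """Dense sampling: take `num_segments` frames spaced `stride` frames
    apart from a starting position, wrapping around at the end of the video."""
    return [(start + i * stride) % num_frames for i in range(num_segments)]


uniform_idx = uniform_sample(num_frames=80, num_segments=8)
dense_idx = dense_sample(num_frames=80, num_segments=8, stride=8, start=0)
```

Uniform sampling covers the whole video with few frames, which tends to suit datasets where temporal order across the full clip matters. Dense sampling looks at a shorter, contiguous window at a higher frame rate.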
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
python3 main.py something RGB \
--root-log ./log \
--root-model ./model \
--arch resnet50 --model CT_Net --num-segments 8 \
--gd 20 --lr 0.02 --unfrozen-epoch 0 --lr-type cos \
--warmup 10 --tune-epoch 10 --tune-lr 0.02 --epochs 45 \
--batch-size 8 -j 24 --dropout 0.3 --consensus-type=avg \
--npb --num-total 7 --full-res --gpus 0 1 2 3 4 5 6 7 --suffix 2021
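The learning-rate flags in the training command can be read as a warmup followed by cosine decay. The sketch below shows one common way to compute such a schedule. It assumes that `--warmup 10` means ten epochs of linear warmup and that `--lr-type cos` means cosine decay over the remaining epochs. These are assumptions, as are the function name and the per-epoch granularity, so check main.py for the actual behaviour.

```python
import math


def lr_at_epoch(epoch, base_lr=0.02, warmup=10, epochs=45):
    """Hypothetical schedule: linear warmup to `base_lr`, then cosine decay to 0."""
    if epoch < warmup:
        # Ramp up linearly so that the last warmup epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup
    progress = (epoch - warmup) / max(1, epochs - warmup)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at_epoch(e) for e in range(45)]
```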
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
python3 test_acc.py something RGB \
--arch resnet50 --model CT_Net --num-segments 8 \
--batch-size 64 -j 8 --consensus-type=avg \
--resume ./model/ct_net_8f_r50.pth.tar \
--npb --num-total 7 --evaluate --test-crops 1 --full-res --gpus 0 1 2 3 4 5 6 7
See demo/show_cam.ipynb:
source ./init.sh
cd demo
jupyter notebook