Fail to finetune from the provided pretrained model checkpoint on UCF101

Question

Yisen-Feng opened this issue a year ago · 5 comments

Hi!
I have tried to reproduce the results on UCF101. I succeeded in testing the finetuned checkpoint but failed to finetune from the pretrained checkpoint. I am using this script and this checkpoint. Is there something wrong with my settings? @yztongzhan
This is my finetune_log.
The dataset is split following the trainlist01 and testlist01 files released with UCF101, and the validation set is the same as the test set.
The following is my script:

OUTPUT_DIR='./ucf101/finetune1'
DATA_PATH='../data/ucf101/UCF-101'
MODEL_PATH='../model/videomae/ucf101/finetune_checkpoint.pth'

OMP_NUM_THREADS=1 python3 -m torch.distributed.launch --nproc_per_node=1 \
    --master_port 12320 run_class_finetuning.py \
    --model vit_base_patch16_224 \
    --data_path ${DATA_PATH} \
    --finetune ${MODEL_PATH} \
    --log_dir ${OUTPUT_DIR} \
    --output_dir ${OUTPUT_DIR} \
    --data_set UCF101 \
    --nb_classes 101 \
    --batch_size 16 \
    --input_size 224 \
    --short_side_size 224 \
    --save_ckpt_freq 50 \
    --num_frames 16 \
    --sampling_rate 4 \
    --num_sample 2 \
    --opt adamw \
    --lr 5e-4 \
    --warmup_lr 1e-8 \
    --min_lr 1e-5 \
    --layer_decay 0.7 \
    --opt_betas 0.9 0.999 \
    --weight_decay 0.05 \
    --epochs 100 \
    --test_num_segment 5 \
    --test_num_crop 3 \
    --fc_drop_rate 0.5 \
    --drop_path 0.2 \
    --use_checkpoint \
    --dist_eval \
    --enable_deepspeed

Answer 1 · 2023-04-19T06:48:10.000Z

I recommend increasing the batch size as we default to --nproc_per_node=8.
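
To make the gap concrete, here is a rough sketch of the arithmetic behind this advice. It compares the total batch size of the posted single-GPU run with the 8-GPU default, and works out how many gradient-accumulation steps would close the gap if the training script supports accumulation. The linear learning-rate scaling with a reference batch of 256 is a common convention and is only an assumption here, not something confirmed in this thread; the helper names are hypothetical.

```python
import math

def effective_batch_size(per_gpu_batch, num_gpus, accum_steps=1):
    """Total samples contributing to one optimizer step."""
    return per_gpu_batch * num_gpus * accum_steps

def accum_steps_to_match(target_batch, per_gpu_batch, num_gpus):
    """Smallest number of accumulation steps that reaches target_batch."""
    return math.ceil(target_batch / (per_gpu_batch * num_gpus))

def linearly_scaled_lr(base_lr, total_batch, reference_batch=256):
    """ASSUMPTION: lr is scaled linearly with total batch size."""
    return base_lr * total_batch / reference_batch

# Values taken from the posted script: --batch_size 16, --lr 5e-4.
single_gpu_batch = effective_batch_size(16, num_gpus=1)
default_batch = effective_batch_size(16, num_gpus=8)
steps_needed = accum_steps_to_match(default_batch, 16, num_gpus=1)

single_gpu_lr = linearly_scaled_lr(5e-4, single_gpu_batch)
default_lr = linearly_scaled_lr(5e-4, default_batch)

print(single_gpu_batch, default_batch, steps_needed)
print(single_gpu_lr, default_lr)
```

Under these assumptions, the single-GPU run takes optimizer steps on one eighth of the samples that the default setup uses, and any batch-size-dependent learning rate shrinks by the same factor.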

Answer 2 · 2023-04-19T07:20:03.000Z

I recommend increasing the batch size as we default to --nproc_per_node=8.

Thanks for your reply.

Answer 3 · 2023-05-06T13:10:30.000Z

Hello, can you share the csv file of ucf101? I'm having some problems reading the video. Looking forward to your reply.

Answer 4 · 2023-05-06T13:33:30.000Z

I recommend following the Data Preparation guide to make the CSV files yourself. Mine cannot be used directly.
test.csv
train.csv
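
For anyone building the files themselves, the following is a minimal sketch of turning the official UCF101 split lists into annotation rows. It relies on the standard UCF101 list formats (classInd.txt lines such as "1 ApplyEyeMakeup", split lines such as "ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1"). The output layout, one "video_path label" pair per line separated by a space with zero-based labels, is an assumption; check it against the Data Preparation guide before use. The function names and the sample rows are illustrative only.

```python
def parse_class_index(lines):
    """Map class name -> zero-based label from classInd.txt-style lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        index, name = line.split()
        mapping[name] = int(index) - 1  # classInd.txt is 1-based
    return mapping

def build_rows(split_lines, class_to_label, video_root):
    """Build 'path label' rows; the class is read from the folder name,
    so this works for test lists, which carry no label column."""
    rows = []
    for line in split_lines:
        line = line.strip()
        if not line:
            continue
        rel_path = line.split()[0]
        class_name = rel_path.split("/")[0]
        rows.append(f"{video_root}/{rel_path} {class_to_label[class_name]}")
    return rows

# Tiny in-memory stand-ins for classInd.txt, trainlist01.txt, testlist01.txt.
class_lines = ["1 ApplyEyeMakeup", "2 ApplyLipstick"]
train_lines = ["ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1",
               "ApplyLipstick/v_ApplyLipstick_g08_c01.avi 2"]
test_lines = ["ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi"]

labels = parse_class_index(class_lines)
train_rows = build_rows(train_lines, labels, "../data/ucf101/UCF-101")
test_rows = build_rows(test_lines, labels, "../data/ucf101/UCF-101")

print("\n".join(train_rows + test_rows))
```

To produce real files, read the three list files from the UCF101 split archive and write each list of rows to train.csv and test.csv. Since the original post uses the test split for validation, the validation file can be a copy of the test file.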

Answer 5 · 2023-05-07T04:17:12.000Z

train.csv

OK, thanks!