How to train SwinBert on the UCA dataset?

Question

How to train SwinBert on the UCA dataset?

pixelieee opened this issue 9 months ago · 1 comments

Hi,
I am trying to reproduce the reported performance on the MAD task using SwinBert, but I ran into difficulties while training it on the UCA dataset.

Unfortunately, there doesn't seem to be any guidance on training SwinBert on a custom dataset within the official repository.

Could you update the relevant code or give me some advice? Thank you in advance for your assistance!

Answer 1 · 2024-04-30T09:10:48.000Z

Sorry for the late response.

Training SwinBert follows the process described in the original README. For the UCA dataset, set num-frames to 32 when generating the frame TSV. The main training parameters are as follows:

--per_gpu_train_batch_size 4
--per_gpu_eval_batch_size 4
--num_train_epochs 20
--learning_rate 3e-05
--max_num_frames 32
--pretrained_2d 0
--backbone_coef_lr 0.01
--mask_prob 0.5
--max_masked_token 45
--zero_opt_stage -1
--mixed_precision_method deepspeed
--deepspeed_fp16
--gradient_accumulation_steps 1
--learn_mask_enabled
--loss_sparse_w 0.5
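
For convenience, the flags above can be assembled into a single command line. The following is only a minimal sketch: the flag names and values are copied from the list above, but `TRAIN_SCRIPT` is a hypothetical placeholder, and the dataset, config, and output arguments required by the README are not included here and must be added from the README.

```python
import shlex

# Flags copied from the list above. A value of None marks a switch
# that is passed without an argument.
UCA_TRAIN_FLAGS = {
    "per_gpu_train_batch_size": 4,
    "per_gpu_eval_batch_size": 4,
    "num_train_epochs": 20,
    "learning_rate": "3e-05",
    "max_num_frames": 32,
    "pretrained_2d": 0,
    "backbone_coef_lr": 0.01,
    "mask_prob": 0.5,
    "max_masked_token": 45,
    "zero_opt_stage": -1,
    "mixed_precision_method": "deepspeed",
    "deepspeed_fp16": None,
    "gradient_accumulation_steps": 1,
    "learn_mask_enabled": None,
    "loss_sparse_w": 0.5,
}

# Hypothetical placeholder: replace with the training script named in the README.
TRAIN_SCRIPT = "path/to/train_script.py"


def build_args(flags):
    """Turn a {name: value} mapping into a flat list of CLI tokens."""
    args = []
    for name, value in flags.items():
        args.append(f"--{name}")
        if value is not None:
            args.append(str(value))
    return args


def build_command(script, flags, launcher=("python",)):
    """Prefix the flag tokens with a launcher and the script path."""
    return [*launcher, script, *build_args(flags)]


if __name__ == "__main__":
    # Print a shell-quoted command that can be pasted into a terminal
    # after the README's dataset and output arguments have been appended.
    print(shlex.join(build_command(TRAIN_SCRIPT, UCA_TRAIN_FLAGS)))
```

Keeping the flags in one mapping makes it easy to compare a run against the values listed above before launching it.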

Once you have the trained weights, follow the Dense caption generation section in the README to generate captions for the original UCF videos; these captions can then be used in the MAD task. We have uploaded the generated captions file to Experimental Files/MAD/ucf_captions_swin.txt. If you have any questions, feel free to reach out at any time.