facebookresearch/VMZ

[E video_decoder.cc:751] The video seems faulty and we could not decode enough frames: 31 VS 32

ThatIndianCoder opened this issue · 1 comment

I have been trying to fine-tune the R(2+1)D-34 model with clip length 32 on my own dataset. While pre-processing my data, I extracted clips of exactly 32 frames from each of my videos and organized them into my training data (see the sketch after the list below for the kind of extraction I mean). My training/testing data looks like this:

Number of labels: 5
Training data - number of videos per label: 650
Training data - length of each video: 32 frames (encoded at 30 fps)
Testing data - number of videos per label: 150
Testing data - length of each video: 32 frames (encoded at 30 fps)
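For context, here is a minimal sketch of the kind of fixed-length extraction I mean, wrapping ffmpeg from Python. The directories, the .mp4 extension, and the output naming are placeholders for illustration, not my actual augmentation script:

# Sketch only: cut a fixed 32-frame clip from each source video with ffmpeg.
# SRC_DIR / DST_DIR and the .mp4 filter are assumptions, not the real script.
import os
import subprocess

SRC_DIR = "/home/data/raw_videos"   # hypothetical input directory
DST_DIR = "/home/data/clips_32f"    # hypothetical output directory
CLIP_LEN = 32                       # frames per clip, matches --clip_length_rgb

os.makedirs(DST_DIR, exist_ok=True)
for name in os.listdir(SRC_DIR):
    if not name.endswith(".mp4"):
        continue
    src = os.path.join(SRC_DIR, name)
    dst = os.path.join(DST_DIR, name)
    # -frames:v limits the number of video frames written to the output;
    # -r 30 re-times the output at 30 fps.
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-frames:v", str(CLIP_LEN), "-r", "30", dst],
        check=True,
    )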

When I try to fine-tune the model with the following parameters,

python tools/train_net.py \
--train_data=/home/data/performance_baseline/training_input/ \
--test_data=/home/data/performance_baseline/testing_input/ \
--model_name=r2plus1d --model_depth=34 \
--clip_length_rgb=32 --batch_size=5 \
--load_model_path=/home/320061298/Desktop/r2plus1d_34_clip32_ft_kinetics_from_ig65m.pkl \
--db_type='pickle' --is_checkpoint=1 \
--gpus=0 --num_gpus=1 --base_learning_rate=0.001 \
--epoch_size=10000 --num_epochs=24 --step_epoch=5 \
--weight_decay=0.005 --num_labels=5 --use_local_file=0 \
--file_store_path="/home/320061298/model_checkpoints/" \
--save_model_name="performance_baseline_model" \
--pred_layer_name="prediction_Layer_baseline"

I get the above-mentioned error. I am sure that all of my videos have exactly 32 frames, so why is the video decoder unable to extract enough frames? Any light on this issue is much appreciated.

Thank you.

My apologies. It was a bug in my augmentation script: some videos had been trimmed to fewer than 32 frames. I cleaned my dataset and it works fine now.
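In case it helps anyone who hits the same error, below is a minimal sanity-check sketch that decodes every clip and flags any whose frame count is not 32. The directory layout, the .mp4 extension, and the use of OpenCV are assumptions for illustration, not part of the VMZ tooling:

# Sketch only: verify that every clip really contains 32 decodable frames
# before training. DATA_DIR and the .mp4 filter are assumptions.
import os
import cv2

DATA_DIR = "/home/data/performance_baseline/training_input/"
EXPECTED = 32

for root, _, files in os.walk(DATA_DIR):
    for name in files:
        if not name.endswith(".mp4"):
            continue
        path = os.path.join(root, name)
        cap = cv2.VideoCapture(path)
        # Count frames by actually decoding them rather than trusting the
        # container metadata, which can report more frames than are decodable.
        count = 0
        while True:
            ok, _ = cap.read()
            if not ok:
                break
            count += 1
        cap.release()
        if count != EXPECTED:
            print(f"{path}: {count} frames (expected {EXPECTED})")

Running this over the training and testing directories is a quick way to catch trimmed clips like the ones my augmentation script produced.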