Pretrained model config file
wwnbbd opened this issue · 5 comments
Hi, I am trying to test the model-zoo TIN model trained on the MMit dataset with a ResNet-50 backbone. I changed default.yaml in the ./experiments/tin folder to the following:
version: 1.0

config:
  gpus: 4
  seed: 2020
  dataset:
    workers: 4
    num_class: 313
    num_segments: 16
    batch_size: 8
    img_prefix: '{:05d}.jpg'
    video_source: True
    dense_sample: False
    modality: RGB
    flow_prefix: ''
    root_dir: ""
    flip: False
    input_mean: [0.485, 0.456, 0.406]
    input_std: [0.229, 0.224, 0.225]
    crop_size: 224
    scale_size: 256
    train:
      meta_file: /path
    val:
      meta_file: /workdir/wwn/Multi_Moments_in_Time/mit-val.txt
    test:
      meta_file: /workdir/wwn/Multi_Moments_in_Time/mit-val.txt
    multi_class: True
  net:
    arch: resnet50
    model_type: 2D
    tin: True
    shift_div: 4
    consensus_type: avg
    dropout: 0.8
    img_feature_dim: 256
    pretrain: True  # ImageNet pretrain for 2D network
  trainer:
    print_freq: 20
    eval_freq: 1
    epochs: 35
    start_epoch: 0
    loss_type: bce
    no_partial_bn: True
    clip_gradient: 20
    lr_scheduler:
      warmup_epochs: 1
      warmup_type: linear
      type: CosineAnnealingLR
      kwargs:
        T_max: 30
    optimizer:
      type: SGD
      kwargs:
        lr: 0.02
        momentum: 0.9
        weight_decay: 0.0005
        nesterov: True
  evaluate:
    spatial_crops: 1
    temporal_samples: 1
  saver:
    #save_dir: 'checkpoint/'
    #pretrain_model: '/path'
    resume_model: /home/hadoop-mtcv/cephfs/data/wangwanneng/X-Temporal-master/X-Temporal-master/pretrained/tin_mit_16.pth.tar
but the testing result is only 14.4 mAP.
I think something may be wrong in the model configuration, because the following keys are reported missing when the checkpoint is loaded:
module.base_model.layer3.4.bn1.num_batches_tracked
module.base_model.layer2.1.bn2.num_batches_tracked
module.base_model.layer3.2.bn3.num_batches_tracked
module.base_model.layer3.5.bn1.num_batches_tracked
module.base_model.bn1.num_batches_tracked
module.base_model.layer4.2.bn3.num_batches_tracked
module.base_model.layer4.1.bn2.num_batches_tracked
module.base_model.layer1.2.bn2.num_batches_tracked
module.base_model.layer2.2.bn1.num_batches_tracked
module.base_model.layer3.5.bn2.num_batches_tracked
module.base_model.layer4.2.bn2.num_batches_tracked
module.base_model.layer4.0.downsample.1.num_batches_tracked
module.base_model.layer1.0.bn3.num_batches_tracked
module.base_model.layer3.0.downsample.1.num_batches_tracked
module.base_model.layer3.3.bn3.num_batches_tracked
module.base_model.layer3.3.bn2.num_batches_tracked
module.base_model.layer4.0.bn1.num_batches_tracked
module.base_model.layer3.2.bn1.num_batches_tracked
module.base_model.layer2.3.bn2.num_batches_tracked
module.base_model.layer1.0.bn2.num_batches_tracked
module.base_model.layer4.1.bn1.num_batches_tracked
module.base_model.layer2.1.bn3.num_batches_tracked
module.base_model.layer2.0.downsample.1.num_batches_tracked
module.base_model.layer3.4.bn3.num_batches_tracked
module.base_model.layer1.0.downsample.1.num_batches_tracked
module.base_model.layer1.2.bn1.num_batches_tracked
module.base_model.layer4.1.bn3.num_batches_tracked
module.base_model.layer4.0.bn3.num_batches_tracked
module.base_model.layer3.1.bn1.num_batches_tracked
module.base_model.layer3.3.bn1.num_batches_tracked
module.base_model.layer1.0.bn1.num_batches_tracked
module.base_model.layer1.1.bn3.num_batches_tracked
module.base_model.layer3.0.bn2.num_batches_tracked
module.base_model.layer3.0.bn3.num_batches_tracked
module.base_model.layer2.1.bn1.num_batches_tracked
module.base_model.layer1.2.bn3.num_batches_tracked
module.base_model.layer2.3.bn1.num_batches_tracked
module.base_model.layer3.1.bn2.num_batches_tracked
module.base_model.layer1.1.bn1.num_batches_tracked
module.base_model.layer2.0.bn3.num_batches_tracked
module.base_model.layer2.0.bn2.num_batches_tracked
module.base_model.layer1.1.bn2.num_batches_tracked
module.base_model.layer3.4.bn2.num_batches_tracked
module.base_model.layer4.0.bn2.num_batches_tracked
module.base_model.layer3.5.bn3.num_batches_tracked
module.base_model.layer2.2.bn2.num_batches_tracked
module.base_model.layer3.1.bn3.num_batches_tracked
module.base_model.layer3.2.bn2.num_batches_tracked
module.base_model.layer2.3.bn3.num_batches_tracked
module.base_model.layer3.0.bn1.num_batches_tracked
module.base_model.layer4.2.bn1.num_batches_tracked
module.base_model.layer2.2.bn3.num_batches_tracked
module.base_model.layer2.0.bn1.num_batches_tracked
So could you share the config file you used when testing on the MMit dataset?
Hi, 35 epochs may be short for MMit (a big dataset); you can train it again for longer. These warnings can be ignored: the missing keys come from multi-GPU parallel training and won't affect the final performance.
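For reference, `num_batches_tracked` entries are BatchNorm bookkeeping buffers, and loading with `strict=False` is what lets them be skipped harmlessly. A minimal pure-Python stand-in for that behaviour (hypothetical helper, not the real torch `load_state_dict`):

```python
# Sketch of strict vs. non-strict state-dict loading. Pure Python
# stand-in for torch's Module.load_state_dict; key names are examples.

def load_state_dict(model_keys, checkpoint, strict=True):
    """Compare the model's expected keys against a checkpoint dict."""
    missing = [k for k in model_keys if k not in checkpoint]
    unexpected = [k for k in checkpoint if k not in model_keys]
    if strict and (missing or unexpected):
        raise RuntimeError(f"missing: {missing}, unexpected: {unexpected}")
    return missing, unexpected

model_keys = ["base_model.conv1.weight",
              "base_model.bn1.num_batches_tracked"]
ckpt = {"base_model.conv1.weight": "..."}  # buffer absent, as in the log

missing, unexpected = load_state_dict(model_keys, ckpt, strict=False)
print(missing)  # the skipped num_batches_tracked buffer
```

Since `num_batches_tracked` only counts batches for BatchNorm's running statistics, a zero-initialized buffer is fine at inference time.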
I am not trying to train a new model; I just used the model-zoo model TIN ResNet-50 | MMit | 16 | 224*224. It seems to be a bug in the evaluation code: mAP should not be divided by the number of GPUs.
I changed
all_reduce(mAP_sum)
to
all_reduce(mAP_sum, False)
The result is 56.592 mAP, but there is still a gap compared to the model-zoo result of 62.5.
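The pitfall can be sketched without a distributed backend (a hypothetical `all_reduce` whose default mirrors an averaging reduction; the per-GPU values are made up for illustration):

```python
# Sketch of the averaged-vs-summed reduction pitfall. No real
# distributed backend: per-GPU partial results are a plain list.

def all_reduce(per_rank_values, average=True):
    """Sum values across simulated ranks; optionally divide by world size."""
    total = sum(per_rank_values)
    if average:
        total /= len(per_rank_values)  # this division shrinks the mAP
    return total

# Each of the 4 simulated GPUs scores a disjoint shard of the test set,
# so the partial mAP contributions must be SUMMED to get the global mAP.
partial_map = [14.1, 14.3, 14.2, 14.0]  # made-up per-rank contributions

low = all_reduce(partial_map)              # averaged: roughly 4x too small
correct = all_reduce(partial_map, False)   # summed: the intended total
print(low, correct)
```

This matches the symptom above: with 4 GPUs, the averaged result (14.4 mAP) is roughly a quarter of the summed one (56.6 mAP).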
You can use multi-crops and a bigger input to get higher performance, like below:
config:
  dataset:
    crop_size: 256
  evaluate:
    spatial_crops: 3
    temporal_samples: 5
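With those settings, each test video is typically scored from spatial_crops × temporal_samples views whose predictions are averaged. Back-of-the-envelope (plain arithmetic; the names just mirror the YAML keys, not a real API):

```python
# View count per test video under the multi-crop config above.
spatial_crops = 3      # e.g. left / center / right spatial crops
temporal_samples = 5   # clips sampled along the video's duration
views_per_video = spatial_crops * temporal_samples
print(views_per_video)  # 15 forward passes per test video
```

So test-time cost grows 15x versus the single-view config, in exchange for a more robust score.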
I used multi-crops and a bigger input, but I still cannot reproduce the results on Multi-Moments in Time with the TIN and SlowFast models. However, I can get a slightly better result with the TSN model (59.7 vs 58.9).