lxtGH/CAE

about modeling_finetune.py

lywang76 opened this issue · 1 comment

In your code, you have the following:

@register_model
def cae_large_patch16_384(pretrained=False, **kwargs):
    model = VisionTransformer(
        img_size=384, patch_size=16, embed_dim=1024, depth=24, num_heads=16, mlp_ratio=4, qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6), **kwargs)
    model.default_cfg = _cfg()
    return model

def _cfg(url='', **kwargs):
    return {
        'url': url,
        'input_size': (3, 224, 224), 'pool_size': None,
        'crop_pct': .9, 'interpolation': 'bicubic',
        'mean': (0.5, 0.5, 0.5), 'std': (0.5, 0.5, 0.5),
        **kwargs
    }

Therefore, even though cae_large_patch16_384 builds the model with img_size=384, the call to _cfg() sets default_cfg['input_size'] back to (3, 224, 224).

Could this be a potential problem?
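As a sketch of what I mean (not the repo's actual code): because _cfg merges **kwargs after the defaults, passing the 384 resolution explicitly at registration time would override the 224 fallback:

def cae_large_patch16_384(pretrained=False, **kwargs):
    model = VisionTransformer(
        img_size=384, patch_size=16, embed_dim=1024, depth=24, num_heads=16,
        mlp_ratio=4, qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6), **kwargs)
    # input_size passed here lands in _cfg's **kwargs, which are spread
    # last in the dict literal and therefore override (3, 224, 224).
    model.default_cfg = _cfg(input_size=(3, 384, 384))
    return model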

Hi, this can be solved by passing the argument --input_size 384 to the program.
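For example (the script name below is only a placeholder for whichever fine-tuning entry point you run; the relevant parts are the model name and --input_size 384):

python finetune_script.py \
    --model cae_large_patch16_384 \
    --input_size 384 \
    [other fine-tuning arguments]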