Parameter setting problem
hero-White opened this issue · 1 comments
hero-White commented
In Part D.6 of the article, what should the sr_ratios list be set to when the input size is 256×256, the pool size is 4, and aggregated attention is used in all four stages?
DaiShiResearch commented
Taking transnext_micro as an example, you can use the following configuration to implement a model with a pool_size of 4, an input resolution of 256×256, and aggregated attention in all four stages:
@register_model
def transnext_micro(pretrained=False, **kwargs):
    model = TransNeXt(window_size=[3, 3, 3, 3],
                      patch_size=4, embed_dims=[48, 96, 192, 384], num_heads=[2, 4, 8, 16],
                      mlp_ratios=[8, 8, 4, 4], qkv_bias=True,
                      norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[2, 2, 15, 2], sr_ratios=[16, 8, 4, 2],
                      **kwargs)
    model.default_cfg = _cfg()
    return model
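The sr_ratios values in this configuration can be checked with a few lines of arithmetic. The sketch below is not part of the TransNeXt code base: derive_sr_ratios is a hypothetical helper, and it assumes that stage i has a feature map of side img_size // (patch_size * 2 ** i) and that the pooled side should equal pool_size at every stage.

```python
def derive_sr_ratios(img_size, pool_size, num_stages=4, patch_size=4):
    """Hypothetical helper: choose sr_ratios so every stage pools to pool_size."""
    sr_ratios = []
    for i in range(num_stages):
        # Assumed feature-map side at stage i (total stride patch_size * 2**i).
        feat = img_size // (patch_size * 2 ** i)
        if feat % pool_size != 0:
            raise ValueError(
                f"stage {i}: feature size {feat} is not divisible by pool_size {pool_size}"
            )
        sr_ratios.append(feat // pool_size)
    return sr_ratios


print(derive_sr_ratios(256, 4))  # [16, 8, 4, 2]
```

For a 256×256 input and a pool size of 4, the feature maps have sides 64, 32, 16 and 8, which gives the list used above.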
Additionally, you’ll need to adjust the calculation of relative_pos_index and relative_coords_table in the model as follows:
relative_pos_index, relative_coords_table = get_relative_position_cpb(
    query_size=to_2tuple(img_size // (2 ** (i + 2))),
    key_size=to_2tuple(img_size // ((2 ** (i + 2)) * sr_ratios[i])),
    pretrain_size=to_2tuple(pretrain_size // (2 ** (i + 2))))
This change is necessary because the previously released version fixes the pool_size at 1/32 of the input image size, whereas this configuration sets it to 1/64 (256 / 64 = 4).
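To make the difference concrete, the following sketch evaluates the adjusted key_size expression for each stage and compares it with a fixed fraction of the input size. Both function names are made up for illustration, and modelling the earlier behaviour as img_size // 32 is an assumption based only on the description above, not on the released source.

```python
def per_stage_key_sizes(img_size, sr_ratios):
    """Key/value side per stage, using the adjusted key_size expression."""
    return [img_size // ((2 ** (i + 2)) * sr) for i, sr in enumerate(sr_ratios)]


def fixed_key_size(img_size, divisor=32):
    """Assumed earlier behaviour: one pooled size for all stages, img_size // divisor."""
    return img_size // divisor


print(per_stage_key_sizes(256, [16, 8, 4, 2]))  # [4, 4, 4, 4]
print(fixed_key_size(256))                      # 8
print(fixed_key_size(256, divisor=64))          # 4
```

With sr_ratios=[16, 8, 4, 2], every stage ends up with a 4×4 pooled key/value map, whereas a fixed 1/32 ratio would give 8×8 for the same input.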