MLP block weights for mask_token 0 in mask_decoder are almost all zeros
leondgarse opened this issue · 2 comments
leondgarse commented
Just noticed that the MLP block weights for mask_token 0 in mask_decoder are almost all zeros. Is this intended?
import torch

# Load the released TinySAM checkpoint on CPU
ss = torch.load("tinysam.pth", map_location=torch.device('cpu'))
# First linear layer of the hypernetwork MLP for mask_token 0: only a handful of non-zero entries
ww = ss["mask_decoder.output_hypernetworks_mlps.0.layers.0.weight"]
print(ww[ww.abs() > 1e-6])
# tensor([-0.1436, -0.0390, 0.3668, 0.2065, 0.1118, -0.0201, 0.1688])
# Second linear layer: same picture
ww = ss["mask_decoder.output_hypernetworks_mlps.0.layers.1.weight"]
print(ww[ww.abs() > 1e-6])
# tensor([0.1090, 0.0203, 0.8415, 0.0125, 0.2405, 0.1774])
# Final linear layer: no non-zero entries at all
ww = ss["mask_decoder.output_hypernetworks_mlps.0.layers.2.weight"]
print(ww[ww.abs() > 1e-6])
# tensor([])
shuh15 commented
Hi, the released model weights are trained in multi-output mode, which corresponds to mask_tokens 1, 2, and 3, so the branch for mask_token 0 is not trained.
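To double-check this, here is a minimal sketch that sums the absolute weight magnitude of each hypernetwork MLP. It reuses the checkpoint key layout from the snippet above and assumes the decoder follows the original SAM design, i.e. four mask tokens and three linear layers per hypernetwork MLP; only index 0 should come out near zero.

import torch

ss = torch.load("tinysam.pth", map_location=torch.device('cpu'))
# Assumption: 4 mask tokens, each with a 3-layer hypernetwork MLP, as in the original SAM decoder
for i in range(4):
    total = sum(
        ss[f"mask_decoder.output_hypernetworks_mlps.{i}.layers.{j}.weight"].abs().sum()
        for j in range(3)
    )
    print(f"mask_token {i}: total |weight| = {float(total):.4f}")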
leondgarse commented
Ok, that makes sense. Thanks for the clarification.