amazon-science/QA-ViT

Gate_value is all 0 in the released ViT-T5-base model

Closed this issue · 1 comments

Hi, Dear Authors:

Not sure if this is an issue on my end. I downloaded your ViT-T5-base model from your GitHub and tested it with the eval script.
But when I check the gate values from layer 12 to layer 23, all of them are zero. This means the new (question-aware) features are totally ignored by the QA-transformer, making it almost no different from the vanilla CLIP transformer weights.
Could you double-check this? If you see meaningful values on your end, then it is probably my problem.

Thanks for your time! Keep building!

Best,
Zhiyuan

Closing the issue, as this was a silly mistake on my end.
For anyone who wants to inspect the gate_value weights from layer 12 to layer 23 of the base version, I print them here:

  1. -0.21439486742019653 (layer index 12)
  2. 0.2283252328634262
  3. 0.2884502112865448
  4. -0.2394029051065445
  5. 0.3377605080604553
  6. 0.3446634113788605
  7. 0.37189891934394836
  8. -0.36175721883773804
  9. -0.37525999546051025
  10. -0.3238457441329956
  11. 0.29202741384506226
  12. 0.07214009016752243 (layer index 23)

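In case it helps others, the values above can be dumped straight from the checkpoint without running the eval script. Below is a minimal sketch; the key pattern "gate_value" and the parameter paths are assumptions based on this thread, so adapt them to the actual state dict (e.g. one loaded with torch.load):

```python
def collect_gate_values(state_dict, key_pattern="gate_value"):
    """Return {param_name: value} for every gate parameter found."""
    return {k: v for k, v in state_dict.items() if key_pattern in k}

# Toy stand-in for a real checkpoint dict; the key names here are
# hypothetical, and the values are the first and last gates listed above.
dummy_state_dict = {
    "vit.layers.12.gate_value": -0.21439486742019653,
    "vit.layers.23.gate_value": 0.07214009016752243,
    "vit.layers.12.attn.qkv.weight": 0.0,  # non-gate entry, filtered out
}

gates = collect_gate_values(dummy_state_dict)
for name, value in sorted(gates.items()):
    print(f"{name}: {value:.6f}")

# Mean absolute gate magnitude, as discussed below.
mean_abs = sum(abs(v) for v in gates.values()) / len(gates)
print(f"mean |gate|: {mean_abs:.4f}")
```

If the gates really were all zero, mean_abs would print 0.0, which is the quick sanity check I should have done first.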
So the mean absolute value is around 0.29, which means the original features are still important for representing the image.
It's fun to see that the last layer's gate is very low, which may suggest something, but I haven't fully figured it out (hope the authors can give some insight here :)