Why use torch.clip in Q-MLP?
Pang-Yatian opened this issue · 2 comments
Pang-Yatian commented
Hi, thanks for your great work.
May I ask why this line is added to Q_MLP?
x = torch.clip(x, -10., 10.)
Is there a specific reason? Is it to make training stable, or does this trick improve performance?
Q-ViT/quant_vision_transformer.py
Lines 138 to 147 in 0cee463
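For context, here is a minimal sketch of where such a clip typically sits in a quantized MLP forward pass. This is an illustration only: the module and layer names below are placeholders, not the actual classes in Q-ViT/quant_vision_transformer.py.

```python
import torch
import torch.nn as nn

class Q_MLP(nn.Module):
    """Hypothetical sketch of a quantized MLP block.
    nn.Linear stands in for the repo's quantized linear layers."""
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden_dim)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden_dim, dim)

    def forward(self, x):
        x = self.fc1(x)
        x = self.act(x)
        # The line in question: bound activations to [-10, 10] so that
        # extreme outliers do not inflate the activation quantizer's range.
        x = torch.clip(x, -10., 10.)
        x = self.fc2(x)
        return x
```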
YanjingLi0202 commented
We use torch.clip() here to limit outliers, but the specific bounds, i.e., "-10., 10.", have not yet been explored sufficiently. We also ran experiments with the torch.clip() removed, and found that performance on the DeiT-tiny model decreased by about 0.1%.
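As a toy illustration (not from the repo) of why bounding outliers can help, consider a symmetric uniform quantizer whose scale is set by the maximum magnitude of the tensor. A single outlier inflates the scale, so typical values are quantized coarsely; clipping first keeps the resolution fine:

```python
import torch

def uniform_quant(x, n_bits=4):
    # Toy symmetric uniform quantizer: scale set by the max magnitude.
    scale = x.abs().max() / (2 ** (n_bits - 1) - 1)
    return torch.round(x / scale) * scale

x = torch.randn(1000)
x[0] = 100.0  # a single large outlier

x_clipped = torch.clip(x, -10., 10.)
err_raw = (uniform_quant(x) - x).pow(2).mean()
err_clipped = (uniform_quant(x_clipped) - x_clipped).pow(2).mean()
# Clipping shrinks the scale, so typical values quantize much more finely.
print(err_raw, err_clipped)
```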
Pang-Yatian commented
Thanks.