About the hyperparameters for training
yimingsh opened this issue · 0 comments
yimingsh commented
Hi~ @MengLcool, thanks for the interesting work.
When I run your code, training uses the default hyperparameters. Are these the values you used in your experiments? I noticed that some of them, such as `head_diverse_ratio` and `head_minimal_weight`, are set to 0 by default. Doesn't AdaViT need these parameters?