Are the parameters given in config_X_mira.yaml the same as the training parameters?
alpercanberk opened this issue · 3 comments
alpercanberk commented
If not, could you share the training parameters (e.g. effective batch size, learning rate, clipping, etc.)?
zzyfd commented
Hello, the settings specified in the config_X_mira.yaml file are ready to be used for training Mira on an A100 40G GPU.
alpercanberk commented
Wait, so the model was trained with 1 A100 and no gradient accumulation?
mira-space commented
No, the Mira-v0 model was trained on 32 A100 GPUs for approximately two days.
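For anyone trying to relate the config to the 32-GPU run: under standard data-parallel training, the effective batch size is the per-GPU batch size times the number of GPUs times the gradient accumulation steps. The sketch below only illustrates that arithmetic. The function name and the per-GPU batch size of 4 are hypothetical, since this thread does not state the actual values used for Mira-v0.

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int, grad_accum_steps: int = 1) -> int:
    """Number of samples contributing to one optimizer step in data-parallel training."""
    if min(per_gpu_batch, num_gpus, grad_accum_steps) < 1:
        raise ValueError("all arguments must be positive integers")
    return per_gpu_batch * num_gpus * grad_accum_steps


# Hypothetical per-GPU batch size; replace with the value from your own config.
PER_GPU_BATCH = 4

single_gpu = effective_batch_size(PER_GPU_BATCH, num_gpus=1)
multi_gpu = effective_batch_size(PER_GPU_BATCH, num_gpus=32)
print(f"1 GPU: {single_gpu}, 32 GPUs: {multi_gpu}")
```

If the per-GPU settings in the config were kept unchanged across 32 GPUs, a single-GPU run would need 32 gradient accumulation steps to match the same effective batch size; whether the learning rate in the config assumes that larger batch is not confirmed here.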