mira-space/Mira

Are the parameters given in config_X_mira.yaml the same as the training parameters?

alpercanberk opened this issue · 3 comments

If not, could you share the training parameters (e.g. effective batch size, learning rate, gradient clipping)?

Hello, the settings specified in the config_X_mira.yaml file are ready to be used for training Mira on an A100 40G GPU.

Wait, so the model was trained with 1 A100 and no gradient accumulation?

No, the Mira-v0 model was trained on 32 A100 GPUs for approximately two days.
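
For anyone trying to approximate the 32-GPU setup on fewer GPUs, here is a rough sketch of how effective batch size relates to GPU count and gradient accumulation. Note that `PER_GPU_BATCH` is a placeholder and is not taken from config_X_mira.yaml, and the thread does not say whether gradient accumulation was used in the original run, so the sketch assumes one accumulation step for the 32-GPU reference. The function names are hypothetical helpers, not part of the Mira codebase.

```python
import math


def effective_batch_size(per_gpu_batch, num_gpus, grad_accum_steps=1):
    """Samples contributing to one optimizer step in data-parallel training."""
    return per_gpu_batch * num_gpus * grad_accum_steps


def accum_steps_to_match(target_effective_batch, per_gpu_batch, num_gpus):
    """Smallest number of accumulation steps that reaches the target."""
    return math.ceil(target_effective_batch / (per_gpu_batch * num_gpus))


# Placeholder value: substitute the per-GPU batch size from your own config.
PER_GPU_BATCH = 1

# Reference run: 32 GPUs, assuming no gradient accumulation (not confirmed in this thread).
reference = effective_batch_size(PER_GPU_BATCH, num_gpus=32)

# Accumulation steps needed to match that reference on a single GPU.
steps_single_gpu = accum_steps_to_match(reference, PER_GPU_BATCH, num_gpus=1)

print(f"reference effective batch size: {reference}")
print(f"accumulation steps on 1 GPU: {steps_single_gpu}")
```

Matching the effective batch size this way keeps the number of samples per optimizer step the same, but it does not reproduce wall-clock time, and the learning rate may still need tuning.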