mlcommons/logging

New logging keys for LLM training benchmark

Closed this issue · 2 comments

sgpyc commented

Hi, we would like to add a few new logging keys for the LLM training benchmark:

| No. | Key | What it is | Expected value |
| --- | --- | --- | --- |
| 1 | `opt_adam_weight_decay` | Adam weight decay | 0.1 |
| 2 | `opt_learning_rate_decay_schedule` | learning-rate decay schedule | cosine with linear warmup |
| 3 | `opt_init_checkpoint_step` | the step the checkpoint is loaded at, within the LR schedule | 4000 * 1536 / batch_size |
| 4 | `opt_adam_beta_1` | Adam beta1 | 0.9 |
| 5 | `opt_adam_beta_2` | Adam beta2 | 0.995 |
| 6 | `opt_adam_epsilon` | Adam epsilon | 0.1 |
| 7 | `sequence_length` | input sequence length | 2048 |
| 8 | `trained_samples` | number of training samples used to reach the target accuracy | |

Please advise on how to proceed or whether a full list of keys used by the LLM benchmark is required.
Thanks a lot.
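
For context, a minimal sketch of how a submission might emit these keys with the `mlperf_logging` package. The literal key strings mirror the table above and are assumptions until matching constants land in `mllog/constants.py`; the filename and `batch_size` value are hypothetical.

```python
# Sketch: emitting the proposed LLM keys via mlperf_logging's mllog API.
from mlperf_logging import mllog

mllog.config(filename="gpt3_result_0.txt")  # hypothetical output file
mllogger = mllog.get_mllogger()

batch_size = 1536  # hypothetical global batch size for this run

# Key strings below are the proposed names, not yet published constants.
mllogger.event(key="opt_adam_weight_decay", value=0.1)
mllogger.event(key="opt_learning_rate_decay_schedule",
               value="cosine with linear warmup")
mllogger.event(key="opt_init_checkpoint_step",
               value=int(4000 * 1536 / batch_size))
mllogger.event(key="opt_adam_beta_1", value=0.9)
mllogger.event(key="opt_adam_beta_2", value=0.995)
mllogger.event(key="opt_adam_epsilon", value=0.1)
mllogger.event(key="sequence_length", value=2048)
```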

@sgpyc Should there be a new pair of .yaml files for the LLM benchmark? If so, how should they be named? closed_LLM.yaml and open_LLM.yaml?
And is this new model supposed to replace BERT?

@sgpyc's PR adds the .yaml files. The model name is gpt3, and it is a new benchmark that does not replace anything.
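
For reference, rule entries in such a file might look like the following sketch, which follows the format of the existing closed_*.yaml compliance-checker files in this repo. The file name and the REQ/CHECK expressions are illustrative assumptions, not the values agreed for the benchmark; only 0.9 and 2048 come from the table above.

```yaml
# Hypothetical entries for a closed_gpt3.yaml rule file.
- KEY:
    NAME:  opt_adam_beta_1
    REQ:   EXACTLY_ONE
    CHECK: " v['value'] == 0.9 "

- KEY:
    NAME:  sequence_length
    REQ:   EXACTLY_ONE
    CHECK: " v['value'] == 2048 "
```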