New logging keys for LLM training benchmark
sgpyc commented
Hi, we would like to add a few new logging keys for the LLM training benchmark:
No. | Key | Description | Expected value
---|---|---|---
1 | `opt_adam_weight_decay` | Adam weight decay | 0.1
2 | `opt_learning_rate_decay_schedule` | learning-rate decay schedule | cosine with linear warmup
3 | `opt_init_checkpoint_step` | the step in the LR schedule that the loaded checkpoint corresponds to | 4000 * 1536 / batch_size
4 | `opt_adam_beta_1` | Adam beta1 | 0.9
5 | `opt_adam_beta_2` | Adam beta2 | 0.995
6 | `opt_adam_epsilon` | Adam epsilon | 0.1
7 | `sequence_length` | input sequence length | 2048
8 | `trained_samples` | number of training samples used to reach the target accuracy | (run-dependent)
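For context, here is a minimal sketch of how a submission could emit these keys through the existing `mllog` API. The key strings are the proposed names from the table and are not yet defined in `mllog.constants`; `batch_size` and the `trained_samples` value are illustrative placeholders.

```python
from mlperf_logging import mllog

mllogger = mllog.get_mllogger()

# Assumed global batch size; 1536 is the reference batch size that
# appears in the opt_init_checkpoint_step formula above.
batch_size = 1536

mllogger.event(key="opt_adam_weight_decay", value=0.1)
mllogger.event(key="opt_learning_rate_decay_schedule",
               value="cosine with linear warmup")
# The checkpoint's position in the LR schedule scales inversely
# with the batch size: 4000 * 1536 / batch_size.
mllogger.event(key="opt_init_checkpoint_step",
               value=int(4000 * 1536 / batch_size))
mllogger.event(key="opt_adam_beta_1", value=0.9)
mllogger.event(key="opt_adam_beta_2", value=0.995)
mllogger.event(key="opt_adam_epsilon", value=0.1)
mllogger.event(key="sequence_length", value=2048)

# trained_samples is only known once the run reaches the target
# accuracy; the value here is a placeholder.
trained_samples = 0
mllogger.event(key="trained_samples", value=trained_samples)
```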
Please advise on how to proceed, and whether a full list of the keys used by the LLM benchmark is required.
Thanks a lot.
pgmpablo157321 commented
@sgpyc Should there be a new pair of `.yaml` files for the LLM benchmark? If so, how should they be named? `closed_LLM.yaml` and `open_LLM.yaml`?
And is this new model supposed to replace `bert`?
ShriyaPalsamudram commented