
Scaling-Law-from-Scratch

We reproduce the experiments of the paper "Scaling Laws for Neural Language Models" (Kaplan et al., 2020) from scratch, and explore scaling laws and data mixture ratios in the LLM pretrain & post-train stages.
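
As a quick orientation, below is a minimal sketch of the kind of fit the project aims to reproduce: the parameter-count scaling law L(N) = (N_c / N)^α_N from Kaplan et al., fitted as a straight line in log-log space. The (N, loss) pairs here are synthetic placeholders generated only so the example runs end to end; they are not measured results from this repo.

```python
# Minimal sketch: fitting the parameter-count scaling law
# L(N) = (N_c / N)^alpha_N in log-log space.
# The (N, loss) data below is synthetic and illustrative only.

import numpy as np

def power_law(n, n_c, alpha):
    """Loss as a power law in non-embedding parameter count N."""
    return (n_c / n) ** alpha

# Synthetic model sizes and losses (constants loosely follow the
# N_c ~ 8.8e13, alpha_N ~ 0.076 fit reported by Kaplan et al.).
n = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
rng = np.random.default_rng(0)
loss = power_law(n, 8.8e13, 0.076) * rng.normal(1.0, 0.01, size=n.shape)

# log L = alpha * log(N_c) - alpha * log(N): a line in log-log space.
slope, intercept = np.polyfit(np.log(n), np.log(loss), 1)
alpha_hat = -slope
n_c_hat = np.exp(intercept / alpha_hat)
print(f"alpha_N ~ {alpha_hat:.4f}, N_c ~ {n_c_hat:.3e}")
```

In the actual experiments, the synthetic pairs above would be replaced by (model size, converged loss) measurements from training runs at different scales.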