
Scaling-Law-from-Scratch

We reproduce the experiments of the paper "Scaling Laws for Neural Language Models" (Kaplan et al., 2020) from scratch, and explore scaling laws and data mixture ratios in the LLM pretrain & post-train stages.
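
As a quick orientation, below is a minimal sketch of the kind of fit the project aims to reproduce: the parameter-count scaling law L(N) = (N_c / N)^α_N from Kaplan et al., fitted as a straight line in log-log space. The (N, loss) pairs here are synthetic placeholders generated only so the example runs end to end; they are not measured results from this repo.

```python
# Minimal sketch: fitting the parameter-count scaling law
# L(N) = (N_c / N)^alpha_N in log-log space.
# The (N, loss) data below is synthetic and illustrative only.

import numpy as np

def power_law(n, n_c, alpha):
    """Loss as a power law in non-embedding parameter count N."""
    return (n_c / n) ** alpha

# Synthetic model sizes and losses (constants loosely follow the
# N_c ~ 8.8e13, alpha_N ~ 0.076 fit reported by Kaplan et al.).
n = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
rng = np.random.default_rng(0)
loss = power_law(n, 8.8e13, 0.076) * rng.normal(1.0, 0.01, size=n.shape)

# log L = alpha * log(N_c) - alpha * log(N): a line in log-log space.
slope, intercept = np.polyfit(np.log(n), np.log(loss), 1)
alpha_hat = -slope
n_c_hat = np.exp(intercept / alpha_hat)
print(f"alpha_N ~ {alpha_hat:.4f}, N_c ~ {n_c_hat:.3e}")
```

In the actual experiments, the synthetic pairs above would be replaced by (model size, converged loss) measurements from training runs at different scales.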