mixinglaws

Code and data for "Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance"

Citation

@article{ye2024datamixinglaws,
  title={Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance},
  author={Ye, Jiasheng and Liu, Peiju and Sun, Tianxiang and Zhou, Yunhua and Zhan, Jun and Qiu, Xipeng},
  journal={arXiv preprint arXiv:2403.16952},
  year={2024}
}