/RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Primary language: Python
License: Apache License 2.0 (Apache-2.0)
