togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Python · Apache-2.0
Stargazers
- Alexays (@Screebapp)
- antocodes
- basicv8vcearth
- bimri (Bimri, Inc)
- bit-mind
- cbqinsuzhou
- cuiwenyaonull
- DanielFloresDiaz
- delveintodetail (Ping An Property & Casualty Insurance)
- dhrubasumatary
- donggyukimc (Seoul, Korea)
- eware-godaddy (GoDaddy)
- gradetwo
- gradjitta (University of Helsinki, HIIT)
- headless-giraffe
- HuangLKsysu
- jamiedeguerre (Together)
- Kakulukian (Hugging Face)
- lampts (Saigonapps)
- LEFTeyex
- licongguan (Beijing Jiaotong University)
- MChoi-git (Vector Institute)
- peiyong-addwater
- peteriz (@IntelLabs)
- rentainhe (IDEA)
- RLuke22 (Mila)
- sarfrazkhan18riyadh
- sdvfh
- Somoku (Peking University)
- SysuCharon
- TheReluctantHeroes (Tokyo)
- torish14 (Sukimakakumei, inc.)
- wendyjnwang
- xiaj1011
- Youhichka
- ys112