michaelnny/InstructLLaMA
Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructGPT or ChatGPT but on a much smaller scale.
Jupyter Notebook · MIT
Stargazers
- 18166035475 (JNU)
- alsm168
- alusang
- amritregmi26
- bharathc346
- bingo619 (Australia)
- BMDACMER (SCUT)
- Crownzz
- debbo2011
- dx-dtran
- eenzeenee
- elifbeyzatok00 (OUTLIER)
- flazerain
- flower-kyo
- GitWhilebear
- haohao200609
- holarissun (University of Cambridge)
- kid-gorgeous (TradeAI)
- Kyle-HK
- loadingyy
- metterian (42dot)
- mistlike
- MQahawish
- nampdn (Somewhere on Earth)
- Nimra-Amir-tcs
- rongzhou
- sori424
- TonyStark042
- wajihullahbaig (Islamabad, Pakistan)
- Wei-Cheng881221
- Xinran-He
- xuanxuanxuanxuan
- Yb-Z (@uscnlp-lime)
- yuefanhao (Beijing)
- yuiseki (Yuiseki Inc.)
- zxq-0058