michaelnny/InstructLLaMA

Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructGPT or ChatGPT but on a much smaller scale.

Jupyter Notebook · MIT

Issues

bug report
#10 opened a year ago by loadingyy
0
bug report
#9 opened a year ago by loadingyy
0
tokenizer.model?
#8 opened a year ago by loadingyy
1
how to do the inference?
#7 opened a year ago by chowkamlee81
1
Wrong implementation of gradient accumulation
#6 opened a year ago by michaelnny
0
how to run InstructLLaMA on cpu
#5 opened a year ago by superclocks
1
how to
#4 opened a year ago by superclocks
0
https://github.com/Metaresearch/llama
#3 opened a year ago by superclocks
0