samiranrl/DHS2023_RLHF
Bare-bones implementation of RLHF for fine-tuning a language model, intended to demonstrate the key concepts
Jupyter Notebook | MIT License