GPT4animal's Stars
huggingface/trl
Train transformer language models with reinforcement learning.
OpenLLMAI/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
facebookresearch/SEAL
Search Engines with Autoregressive Language models
Lichang-Chen/InstructZero
Official implementation of InstructZero, the first framework to optimize bad prompts for ChatGPT (API LLMs) and finally obtain good prompts!
gpt4life/alpagasus
Unofficial implementation of AlpaGasus
gauss5930/AlpaGasus2-QLoRA
This is AlpaGasus2-QLoRA, based on LLaMA2 with the AlpaGasus mechanism, using QLoRA!