Pinned Repositories
peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
awesome-rlhf
An index of algorithms for reinforcement learning from human feedback (RLHF)
jaxrl
JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.
louieworth.github.io
OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (supports 70B+ full tuning, LoRA, Mixtral, and KTO)
rm_lmsys
trl
Train transformer language models with reinforcement learning.
llm2vec
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
912_project
Guide to the postgraduate entrance examination for the Department of Computer Science and Technology, Tsinghua University
louieworth's Repositories
louieworth/awesome-rlhf
An index of algorithms for reinforcement learning from human feedback (RLHF)
louieworth/jaxrl
JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.
louieworth/louieworth.github.io
louieworth/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (supports 70B+ full tuning, LoRA, Mixtral, and KTO)
louieworth/rm_lmsys
louieworth/trl
Train transformer language models with reinforcement learning.