louieworth

stay hungry, stay healthy.

Tsinghua University, Shenzhen, China

Pinned Repositories

peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Language: Python 16.8k 111 1.1k 1.7k
awesome-rlhf
An index of algorithms for reinforcement learning from human feedback (RLHF)
88 9 0 2
jaxrl
JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.
Language: Jupyter Notebook 0 0 0 0
louieworth.github.io
Language: HTML 0 1 0 0
OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (Support 70B+ full tuning & LoRA & Mixtral & KTO)
Language: Python 0 0 0 0
rm_lmsys
0 1 0 0
trl
Train transformer language models with reinforcement learning.
Language: Python 0 0 0 0
llm2vec
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
Language: Python 1.4k 23 130106
OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Language: Python 3.4k 28 369318
912_project
Guide to the postgraduate entrance examination for the Department of Computer Science and Technology, Tsinghua University
Language: HTML 2.6k 71 9527
