zsychina's Stars
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
ShiArthur03/ShiArthur03
huggingface/trl
Train transformer language models with reinforcement learning.
lucidrains/PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch
PyTorch implementations of deep reinforcement learning algorithms and environments
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
CarperAI/trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
sweetice/Deep-reinforcement-learning-with-pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3, and more.
pointfeev/CreamInstaller
Automatically finds all installed Steam, Epic and Ubisoft games with their respective DLC-related DLL locations on the user's computer, parses SteamCMD, Steam Store and Epic Games Store for user-selected games' DLCs, then provides a very simple graphical interface utilizing the gathered information for the maintenance of DLC unlockers.
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
wangshusen/RecommenderSystem
Doragd/Algorithm-Practice-in-Industry
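A collection of industry practice articles on search, recommendation, advertising, user growth, and related topics (sources: Zhihu, DataFunTalk, technical WeChat official accounts)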
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Curt-Park/rainbow-is-all-you-need
Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow
ysyisyourbrother/SYSU_Notebook
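This project shares course materials, notes, final exam papers, and other useful resources from the undergraduate and graduate programs of the School of Computer Science at Sun Yat-sen University. We hope it helps with your studies ❤️. If you like it, remember to give it a star 🌟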
PKU-Alignment/safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
wnlen/clash-for-linux
clash-for-linux
pranz24/pytorch-soft-actor-critic
PyTorch implementation of soft actor critic
RLHFlow/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
BlackSamorez/tensor_parallel
Automatically split your PyTorch models on multiple GPUs for training & inference
MorvanZhou/pytorch-A3C
Simple A3C implementation with pytorch + multiprocessing
WangRongsheng/MedQA-ChatGLM
🛰️ Fine-tuning ChatGLM with LoRA, P-Tuning V2, Freeze, RLHF, and other methods on real medical dialogue data; our vision goes beyond medical Q&A
heleidsn/UAV_Navigation_DRL_AirSim
This is a new repo used for training UAV navigation (local path planning) policy using DRL methods.
seuyh/stellaris-dlc-unlocker
Stellaris DLC Unlocker: a tool to automatically unlock all DLC in Stellaris, completely free
sunghoonhong/AirsimDRL
Autonomous UAV Navigation without Collision using Visual Information in Airsim
chrisliu298/awesome-llm-unlearning
A resource repository for machine unlearning in large language models
liziniu/ReMax
Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)
jjccero/pbrl
A Population Based Reinforcement Learning Library based on PyTorch
fomorians/forward-models
A tutorial on forward models for model-based reinforcement learning.