hanshuo123i's Stars
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)
debitCrossBlockchain/renzhengfei
https://github.com/benmahr/RenZhengfei
WeThinkIn/Interview-for-Algorithm-Engineer
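[Three Years of Interviews, Five Years of Mock Exams] An algorithm engineer's handbook. Covers interview and written-test experience and practical knowledge from across the AI industry, including AIGC, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, SLAM, embodied intelligence, the metaverse, and AGI.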
qd-today/qd
QD [v20240210]: a framework for automatically executing scheduled HTTP request tasks, based on HAR Editor and Tornado Server
sunkafei/xcpc-algorithm-templates
XCPC/ICPC/CCPC algorithm templates
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
meta-llama/llama
Inference code for Llama models
opendilab/GoBigger
[ICLR 2023] Come & try Decision-Intelligence version of "Agar"! GoBigger can also help you with multi-agent decision intelligence study.
opendilab/DI-engine
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
liuruoze/mini-AlphaStar
(JAIR'2022) A mini-scale reproduction of the AlphaStar program. Note: the original AlphaStar is the AI proposed by DeepMind to play StarCraft II. JAIR = Journal of Artificial Intelligence Research.
Changhe160/cplusplus2020-2021-2
practical-tutorials/project-based-learning
Curated list of project-based tutorials
codecrafters-io/build-your-own-x
Master programming by recreating your favorite technologies from scratch.
vwxyzjn/ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
PKUFlyingPig/cs-self-learning
A self-study guide to computer science
datawhalechina/easy-rl
A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online at https://datawhalechina.github.io/easy-rl/
starry-sky6688/MARL-Algorithms
Implementations of IQL, QMIX, VDN, COMA, QTRAN, MAVEN, CommNet, DyMA-CL, and G2ANet on SMAC, the decentralised micromanagement scenario of StarCraft II
HumanCompatibleAI/imitation
Clean PyTorch implementations of imitation and reward learning algorithms
openai/spinningup
An educational resource to help anyone learn deep reinforcement learning.
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
DLR-RM/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
tencent-ailab/hok_env
Honor of Kings AI Open Environment of Tencent