hanshuo123i

hanshuo123i's Stars

OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)
Language: Python · 2.5k · 238
debitCrossBlockchain/renzhengfei
https://github.com/benmahr/RenZhengfei
704297
WeThinkIn/Interview-for-Algorithm-Engineer
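[Three Years of Interviews, Five Years of Mock Exams] A handbook for algorithm engineers. Covers interview and written-test experience and practical knowledge across the AI industry, including AIGC, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, SLAM, embodied intelligence, the metaverse, and AGI.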
773116
qd-today/qd
QD [v20240210]: a framework for automatically running scheduled HTTP request tasks, based on HAR Editor and Tornado Server
Language: JavaScript · 4.4k · 562
sunkafei/xcpc-algorithm-templates
XCPC/ICPC/CCPC algorithm templates
Language:C++55835
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
Language: Python · 5.3k · 508
meta-llama/llama
Inference code for Llama models
Language: Python · 56.3k · 9.6k
opendilab/GoBigger
[ICLR 2023] Come & try Decision-Intelligence version of "Agar"! Gobigger could also help you with multi-agent decision intelligence study.
Language:Python46034
opendilab/DI-engine
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
Language: Python · 3.1k · 371
liuruoze/mini-AlphaStar
(JAIR'2022) A mini-scale reproduction code of the AlphaStar program. Note: the original AlphaStar is the AI proposed by DeepMind to play StarCraft II. JAIR = Journal of Artificial Intelligence Research.
Language:Python31257
Changhe160/cplusplus2020-2021-2
Language:TeX948
practical-tutorials/project-based-learning
Curated list of project-based tutorials
203k · 26.5k
codecrafters-io/build-your-own-x
Master programming by recreating your favorite technologies from scratch.
Language: Markdown · 307k · 28.7k
vwxyzjn/ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
Language:Python64199
PKUFlyingPig/cs-self-learning
A self-study guide to computer science
Language: HTML · 57.6k · 6.9k
datawhalechina/easy-rl
A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄); read online at https://datawhalechina.github.io/easy-rl/
Language: Jupyter Notebook · 9.4k · 1.9k
starry-sky6688/MARL-Algorithms
Implementations of IQL, QMIX, VDN, COMA, QTRAN, MAVEN, CommNet, DyMA-CL, and G2ANet on SMAC, the decentralised micromanagement scenario of StarCraft II
Language: Python · 1.5k · 283
HumanCompatibleAI/imitation
Clean PyTorch implementations of imitation and reward learning algorithms
Language: Python · 1.3k · 247
openai/spinningup
An educational resource to help anyone learn deep reinforcement learning.
Language: Python · 10.1k · 2.2k
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Language: Python · 33.8k · 5.7k
DLR-RM/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Language: Python · 9.1k · 1.7k
tencent-ailab/hok_env
Honor of Kings AI Open Environment of Tencent
Language:Python64272