wwxFromTju's Stars
ray-project/llm-numbers
Numbers every LLM developer should know
alpa-projects/alpa
Training and serving large-scale neural networks with auto parallelization.
eureka-research/Eureka
Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
google-deepmind/mujoco_menagerie
A collection of high-quality models for the MuJoCo physics engine, curated by Google DeepMind.
openai/Video-Pre-Training
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
google-deepmind/rlax
facebookresearch/shumai
Fast Differentiable Tensor Library in JavaScript and TypeScript with Bun + Flashlight
d4nj1/TLPUI
A GTK user interface for TLP written in Python
tinkoff-ai/CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
pytorch/torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
instadeepai/jumanji
🕹️ A diverse suite of scalable reinforcement learning environments in JAX
Azure/MS-AMP
Microsoft Automatic Mixed Precision Library
salesforce/CodeRL
Official code for the paper "CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning" (NeurIPS 2022).
FLAIROx/JaxMARL
Multi-Agent Reinforcement Learning with JAX
google-deepmind/alphastar
wwxFromTju/awesome-reinforcement-learning-lib
GitHub's code repository is all you need
MineDojo/MineCLIP
Foundation Model for MineDojo
floodsung/LLM-with-RL-papers
A collection of LLM with RL papers
OrigamiDream/gato
Unofficial Gato: A Generalist Agent
sotopia-lab/sotopia
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
microsoft/MoCapAct
A Multi-Task Dataset for Simulated Humanoid Control
google-deepmind/s6
chandar-lab/RLHive
architsharma97/dpo-rlaif
taichi-dev/faster-python-with-taichi
gilzamir18/AI4U
AI4U is a plugin that lets you use the Godot Game Engine to specify agents with reinforcement learning. Non-player characters (NPCs) in games can be designed using ready-made components.
kvfrans/powderworld
Code for Powderworld: A Platform for Understanding Generalization via Rich Task Distributions
manantomar/Mirror-Descent-Policy-Optimization
Mirror Descent Policy Optimization
NVlabs/easysim
A library for creating Gym environments with unified API to various physics simulators
perrin-isir/xpag
A modular reinforcement learning library with JAX agents