wKIDw's Stars
facebookresearch/drqv2
DrQ-v2: Improved Data-Augmented Reinforcement Learning
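At the heart of DrQ-v2 is a simple random-shift augmentation of replayed pixel observations; below is a minimal PyTorch sketch of that pad-and-crop shift (function name and loop form are mine, not the repo's API):

```python
import torch
import torch.nn.functional as F

def random_shift(obs, pad=4):
    # obs: (B, C, H, W) float tensor; replicate-pad, then crop back to the
    # original size at a random offset per sample (hypothetical helper).
    b, _, h, w = obs.shape
    padded = F.pad(obs, (pad, pad, pad, pad), mode="replicate")
    shifted = []
    for i in range(b):
        top = torch.randint(0, 2 * pad + 1, (1,)).item()
        left = torch.randint(0, 2 * pad + 1, (1,)).item()
        shifted.append(padded[i, :, top:top + h, left:left + w])
    return torch.stack(shifted)
```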
openai/procgen
Procgen Benchmark: Procedurally-Generated Game-Like Gym Environments
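Procgen environments are exposed through Gym, so a random-agent rollout is a few lines; a sketch assuming the procgen package is installed, using the coinrun id from the README:

```python
import gym

# The "procgen:" prefix registers the envs with Gym; coinrun is one of the 16 games.
env = gym.make("procgen:procgen-coinrun-v0")
obs = env.reset()
for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
```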
facebookresearch/minihack
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
dunnolab/awesome-in-context-rl
Awesome In-Context RL: A curated list of In-Context Reinforcement Learning resources
withinmiaov/A-Survey-on-Mixture-of-Experts
pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
files-community/Files
A modern file manager that helps users organize their files and folders.
schrodingercatss/tuning_playbook_zh_cn
A playbook that systematically teaches you how to maximize the performance of deep learning models.
labmlai/annotated_deep_learning_paper_implementations
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
OpenSparseLLMs/CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
corl-team/ad-eps
Official Implementation for "In-Context Reinforcement Learning from Noise Distillation"
xu-ye/PSBL-MetaRL
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
jon--lee/decision-pretrained-transformer
Implementation of the Decision-Pretrained Transformer (DPT) from the paper "Supervised Pretraining Can Learn In-Context Reinforcement Learning".
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
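The package exposes the fused kernel as flash_attn_func, which takes (batch, seqlen, nheads, headdim) tensors; a minimal sketch assuming a CUDA GPU and fp16 inputs:

```python
import torch
from flash_attn import flash_attn_func

# Shapes are (batch, seqlen, nheads, headdim); fp16/bf16 on a CUDA device is required.
q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
out = flash_attn_func(q, k, v, causal=True)  # same shape as q
```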
UT-Austin-RPL/amago
A simple and scalable agent for training adaptive policies with sequence-based RL.
EcoPasteHub/EcoPaste
🎉 Cross-platform clipboard management tool
lucidrains/mixture-of-experts
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
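Usage roughly follows the repo's README: the MoE layer is a drop-in module that returns its output together with a load-balancing auxiliary loss (constructor arguments beyond dim and num_experts are left at their defaults here):

```python
import torch
from mixture_of_experts import MoE

moe = MoE(dim=512, num_experts=16)  # top-k routed experts over the last dim
x = torch.randn(4, 1024, 512)       # (batch, seq, dim)
out, aux_loss = moe(x)              # out: (4, 1024, 512); aux_loss balances expert load
```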
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
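Training goes through a DeepSpeed engine that wraps the model and optimizer; a minimal single-GPU sketch with an illustrative config (values are placeholders, and fp16 assumes a CUDA device):

```python
import torch
import deepspeed

model = torch.nn.Linear(10, 1)  # stand-in model

ds_config = {  # illustrative values only
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config)

x = torch.randn(8, 10, device=engine.device, dtype=torch.half)
loss = engine(x).pow(2).mean()  # dummy loss
engine.backward(loss)           # DeepSpeed-managed backward
engine.step()                   # optimizer (+ scheduler) step
```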
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Ekoda/SoftMoE
Soft Mixture of Experts Vision Transformer, addressing MoE limitations as highlighted by Puigcerver et al., 2023.
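Soft MoE replaces hard top-k routing with soft dispatch and combine weights, so every token contributes to every expert slot; a minimal one-slot-per-expert sketch of that idea (names are mine, not this repo's API):

```python
import torch
import torch.nn as nn

def soft_moe(x, phi, experts):
    # x: (n_tokens, d); phi: (d, n_experts) learnable slot embeddings.
    logits = x @ phi                         # (n, e) token-slot affinities
    dispatch = torch.softmax(logits, dim=0)  # each slot: soft average over tokens
    combine = torch.softmax(logits, dim=1)   # each token: soft average over slots
    slots = dispatch.T @ x                   # (e, d) slot inputs
    outs = torch.stack([f(s) for f, s in zip(experts, slots)])
    return combine @ outs                    # (n, d)

experts = [nn.Linear(64, 64) for _ in range(4)]
y = soft_moe(torch.randn(16, 64), torch.randn(64, 4), experts)  # (16, 64)
```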
reinforcement-learning-kr/pg_travel
Policy Gradient algorithms (REINFORCE, NPG, TRPO, PPO)
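Several of the starred repos implement PPO, whose core is the clipped surrogate objective that replaces TRPO's explicit trust region; a small sketch (variable names are mine):

```python
import torch

def ppo_clip_loss(logp_new, logp_old, adv, clip_eps=0.2):
    # Clipped surrogate objective from PPO (Schulman et al., 2017).
    ratio = torch.exp(logp_new - logp_old)
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps)
    # Pessimistic bound: take the worse of the clipped/unclipped surrogates.
    return -torch.min(ratio * adv, clipped * adv).mean()
```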
Khrylx/PyTorch-RL
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Includes a fast Fisher-vector-product implementation of TRPO.
NJU-RL/ACORM
kzl/decision-transformer
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
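Decision Transformer conditions each timestep on the return-to-go, the sum of rewards from that step onward, so generation can be steered toward a target return; a small sketch of that conditioning signal (not the repo's code):

```python
import numpy as np

def returns_to_go(rewards, gamma=1.0):
    # Suffix sums R_t = r_t + gamma * R_{t+1}; gamma=1 matches the paper's setup.
    rtg = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

returns_to_go([0.0, 0.0, 1.0])  # -> array([1., 1., 1.])
```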
2019ChenGong/RL-Paper-notes
jannerm/diffuser
Code for the paper "Planning with Diffusion for Flexible Behavior Synthesis"
opendilab/awesome-diffusion-model-in-rl
A curated list of resources on diffusion models in RL (continually updated)
TJU-DRL-LAB/self-supervised-rl
chathub-dev/chathub
All-in-one chatbot client