muzero

There are 40 repositories under the muzero topic.

werner-duvaud/muzero-general
MuZero
Language: Python 2.5k 74 175611
opendilab/LightZero
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
Language: Python 1.1k 13 106120
huawei-noah/xingtian
xingtian is a componentized library for the development and verification of reinforcement learning algorithms
Language: Python 307 14 1289
johan-gras/MuZero
A structured implementation of MuZero
Language: Python 206 10 254
kaesve/muzero
A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and Pit both algorithms against each other, and investigate reliability of learned MuZero MDP models.
Language: Jupyter Notebook 156 8 725
yenw/computer-go-dataset
Datasets for computer Go
Language: C++ 147 18 538
Zeta36/muzero
A simple implementation of the MuZero algorithm for the Connect4 game
Language: Jupyter Notebook 95 11 420
rlglab/minizero
MiniZero: An AlphaZero and MuZero Training Framework
Language: C++ 72 6 418
DHDev0/Stochastic-muzero
PyTorch implementation of Stochastic MuZero for gym environments. This algorithm supports a wide range of action and observation spaces, both discrete and continuous.
Language: Python 57 5 1010
Hwhitetooth/jax_muzero
An implementation of MuZero in JAX.
Language: Python 53 4 36
hr0nix/omega
A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environments.
Language: Python 39 5 34
tuero/muzero-cpp
A C++ PyTorch implementation of MuZero
Language: C++ 32 5 08
sail-sg/rosmo
Code for "Efficient Offline Policy Optimization with a Learned Model", ICLR 2023
Language: Python 28 6 30
DHDev0/Muzero-unplugged
PyTorch implementation of MuZero Unplugged for gym environments. This algorithm supports a wide range of action and observation spaces, both discrete and continuous.
Language: Python 27 3 02
michaelnny/muzero
A PyTorch implementation of DeepMind's MuZero agent
Language: Python 27 1 13
bellerb/chappie.ai
Generalized AI, written in Python 3, that performs a multitude of tasks
Language: Jupyter Notebook 21 3 16
DHDev0/Muzero
PyTorch implementation of MuZero for gym environments. It supports any Discrete, Box, and Box2D configuration for the action space and observation space.
Language: Python 16 3 01
Itomigna2/Muesli-lunarlander
Muesli RL algorithm implementation (PyTorch) (LunarLander-v2)
Language: Jupyter Notebook 15 2 45
rystrauss/dopamax
Reinforcement learning in pure JAX.
Language: Python 10 3 01
jianzhnie/RLZero
A clean and easy implementation of MuZero, AlphaZero and Self-Play reinforcement learning algorithms for any game.
Language: Python 8 4 0
benborder/drla
C++ Deep Reinforcement Learning Agent library
Language: C++ 6 3 01
seawee1/efficientalphazero
AlphaZero for singleplayer environments implemented efficiently using Ray
Language: Python 6 1 02
hayashimasa/Robust_MuZero
A robust variant of MuZero
Language: Python 5 2 00
BIGBALLON/Toward-AGZ
Materials for AlphaGo
4 3 01
AntoniovanDijck/BlackJackRL
Deep Q Learning blackbox strategies for casino games
Language: Jupyter Notebook 2 1 01
abrahamabel/Muzero-GDM_Pseudo_Code
A notebook implementation of the pseudocode from the original MuZero paper
Language: Jupyter Notebook 1 1 00
Atze00/muzero-cartpole
Language: Python 1 2 00
benborder/drla-atari
Trains deep reinforcement learning agents in Atari environments via the DRLA library.
Language: C++ 1 2 20
benborder/drla-sim
Trains a deep reinforcement learning agent in simulation testbed environments with the DRLA library.
Language: C++ 1 1 11
mdhiebert/meta-minichess
Meta-learning experiments for the game of minichess and related rule variants.
Language: Python 1 2 01
Nebraskinator/SuperMarioBrosAI
MuZero for Super Mario Bros
Language: Python 1 0 01
souvikshanku/tic-tac-toe-zero
MuZero - tic-tac-toe
Language: Python 1 1 00
svenssona/muzero
Learning how MuZero works
Language: Jupyter Notebook 1 1 00
abrahamabel/GenesisZero
GenesisZERO: potential applications for MCTS agents with LLMs for sequential decision-making
0 2 00
ChukwumaChukwuma/enyimba_ai
Applying AlphaZero Self-Play Tactics to LLaMA for Enhanced Chatbot Interaction
Language: Python 0 2 00
trunghng/muzero
Language: Python 0 1 00
