Pinned Repositories
dcd
Implementations of robust Dual Curriculum Design (DCD) algorithms for unsupervised environment design.
level-replay
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the fact that not all levels are equally useful for agents to learn from during training.
minihack
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
minimax
Efficient baselines for autocurricula in JAX.
alphazero
Generic implementation of AlphaZero
hnatt
Train and visualize Hierarchical Attention Networks
learning-to-communicate-pytorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
procgen
Procgen Benchmark: Procedurally Generated Game-Like Gym Environments
PyMDP
Markov decision processes in Python
wordcraft
An environment for benchmarking commonsense agents
minqi's Repositories
minqi/learning-to-communicate-pytorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
minqi/hnatt
Train and visualize Hierarchical Attention Networks
minqi/wordcraft
An environment for benchmarking commonsense agents
minqi/alphazero
Generic implementation of AlphaZero
minqi/PyMDP
Markov decision processes in Python
minqi/procgen
Procgen Benchmark: Procedurally Generated Game-Like Gym Environments
minqi/auto-drac
Automatic Data-Regularized Actor-Critic (Auto-DrAC)
minqi/awesome-open-ended
minqi/babyai
BabyAI platform. A testbed for training agents to understand and execute language commands.
minqi/baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
minqi/basicnn
Common neural networks in NumPy
minqi/carracingf1
minqi/cma_mae
A Python implementation of Covariance Matrix Adaptation MAP-Annealing
minqi/EGG
EGG: Emergence of lanGuage in Games
minqi/facenet
Face recognition using TensorFlow
minqi/gym-minigrid
Minimalistic gridworld package for OpenAI Gym
minqi/minimax-updates
Efficient baselines for autocurricula in JAX.
minqi/minqi.github.io
minqi/papers
minqi/pytorch-a2c-ppo-acktr-gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), the scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR), and Generative Adversarial Imitation Learning (GAIL).
minqi/random-network-distillation
Code for the paper "Exploration by Random Network Distillation"
minqi/scikit-learn
scikit-learn: machine learning in Python
minqi/scipy
SciPy library main repository
minqi/seq2seq
Example attention-seq2seq implementations.
minqi/tfjs
A WebGL accelerated JavaScript library for training and deploying ML models.
minqi/tfjs-converter
Convert TensorFlow SavedModel and Keras models to TensorFlow.js
minqi/ued
Open-Ended Autocurricula
minqi/v139
Proceedings of ICML 2021
minqi/vae
VAE implementations
minqi/vqvae
A PyTorch implementation of the vector quantized variational autoencoder (https://arxiv.org/abs/1711.00937)