Pinned Repositories
agents
TF-Agents is a library for Reinforcement Learning in TensorFlow
alberdice
Official PyTorch implementation of AlberDICE
BCQ
PyTorch implementation of BCQ for "Off-Policy Deep Reinforcement Learning without Exploration"
CL1AD
Compositionality Level 1 Action Dataset
CL2AD
Compositionality Level 2 Action Dataset
continuous-policy-learning
DSTC10-SIMMC
Repository (preliminary code) for the DSTC10 SIMMC track.
imitation-dice
kmifqe
Kernel Metric learning for In-sample Fitted Q Evaluation (KMIFQE)
kmis
Local kernel metric learning for IS (KMIS) OPE estimation
haanvid's Repositories
haanvid/DSTC10-SIMMC
Repository (preliminary code) for the DSTC10 SIMMC track.
haanvid/imitation-dice
haanvid/kmifqe
Kernel Metric learning for In-sample Fitted Q Evaluation (KMIFQE)
haanvid/kmis
Local kernel metric learning for IS (KMIS) OPE estimation
haanvid/agents
TF-Agents is a library for Reinforcement Learning in TensorFlow
haanvid/alberdice
Official PyTorch implementation of AlberDICE
haanvid/BCQ
PyTorch implementation of BCQ for "Off-Policy Deep Reinforcement Learning without Exploration"
haanvid/continuous-policy-learning
haanvid/dice_rl
haanvid/DJL
haanvid/generative-models
Collection of generative models, e.g. GAN and VAE, in PyTorch and TensorFlow.
haanvid/google-research
Google Research
haanvid/GPT-Critic
GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems
haanvid/haanvid.github.io
Personal website
haanvid/LSPI
LSPI (Least-Squares Policy Iteration) with TF 1.5
haanvid/MC-LAVE-RL
ICLR 2021: "Monte-Carlo Planning and Learning with Language Action Value Estimates"
haanvid/models
Models built with TensorFlow
haanvid/Nadaraya-Watson-Regression-Metric
haanvid/NeuralPipeline_DSTC8
haanvid/palr
haanvid/probability
Probabilistic reasoning and statistical analysis in TensorFlow
haanvid/RepBM
Representation Balancing MDPs for Off-Policy Policy Evaluation
haanvid/rllab
rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.
haanvid/rllab-colab-tutorial
haanvid/SBV
haanvid/slope-experiments
haanvid/softlearning
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.
haanvid/SVGD
TensorFlow Implementation of Stein Variational Gradient Descent (SVGD)
haanvid/tutorial-git
Let's quickly learn how to use Git.
haanvid/zr-obp
Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation