haanvid

KAIST, Daejeon, Republic of Korea

Pinned Repositories

agents
TF-Agents is a library for Reinforcement Learning in TensorFlow
Language: Python
alberdice
Official PyTorch implementation of AlberDICE
Language: Python
BCQ
PyTorch implementation of BCQ for "Off-Policy Deep Reinforcement Learning without Exploration"
Language: Python
CL1AD
Compositionality Level 1 Action Dataset
CL2AD
Compositionality Level 2 Action Dataset
continuous-policy-learning
Language: Jupyter Notebook
DSTC10-SIMMC
Repository (preliminary codes) for DSTC10 SIMMC track.
Language: Python
imitation-dice
Language: Python
kmifqe
Kernel Metric learning for In-sample Fitted Q Evaluation (KMIFQE)
Language: Python
kmis
Local kernel metric learning for importance sampling (KMIS) off-policy evaluation (OPE)
Language: Python

haanvid's Repositories

haanvid/DSTC10-SIMMC
Repository (preliminary codes) for DSTC10 SIMMC track.
Language: Python
haanvid/imitation-dice
Language: Python
haanvid/kmifqe
Kernel Metric learning for In-sample Fitted Q Evaluation (KMIFQE)
Language: Python
haanvid/kmis
Local kernel metric learning for importance sampling (KMIS) off-policy evaluation (OPE)
Language: Python
haanvid/agents
TF-Agents is a library for Reinforcement Learning in TensorFlow
Language: Python
haanvid/alberdice
Official PyTorch implementation of AlberDICE
Language: Python
haanvid/BCQ
PyTorch implementation of BCQ for "Off-Policy Deep Reinforcement Learning without Exploration"
Language: Python
haanvid/continuous-policy-learning
Language: Jupyter Notebook
haanvid/dice_rl
Language: Python
haanvid/DJL
haanvid/generative-models
Collection of generative models (e.g. GAN, VAE) in PyTorch and TensorFlow.
Language: Python
haanvid/google-research
Google Research
haanvid/GPT-Critic
GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems
Language: Python
haanvid/haanvid.github.io
Personal website
haanvid/LSPI
LSPI (Least-Squares Policy Iteration) with TF 1.5
Language: Python
haanvid/MC-LAVE-RL
ICLR 2021: "Monte-Carlo Planning and Learning with Language Action Value Estimates"
Language: Python
haanvid/models
Models built with TensorFlow
Language: Python
haanvid/Nadaraya-Watson-Regression-Metric
Language: MATLAB
haanvid/NeuralPipeline_DSTC8
Language: Python
haanvid/palr
Language: Python
haanvid/probability
Probabilistic reasoning and statistical analysis in TensorFlow
Language: Jupyter Notebook
haanvid/RepBM
Representation Balancing MDPs for Off-Policy Policy Evaluation
Language: Python
haanvid/rllab
rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.
Language: Python
haanvid/rllab-colab-tutorial
Language: Jupyter Notebook
haanvid/SBV
Language: PureBasic
haanvid/slope-experiments
haanvid/softlearning
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.
Language: Python
haanvid/SVGD
TensorFlow Implementation of Stein Variational Gradient Descent (SVGD)
Language: Python
haanvid/tutorial-git
Quickly learn how to use Git.
haanvid/zr-obp
Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation