| Title | Venue | arXiv | Note |
| --- | --- | --- | --- |
| Trust Region Policy Optimization | ICML2015 | 1502.05477 | policy-based |
| The Option-Critic Architecture | AAAI2017 | 1609.05140 | HRL, option-critic |
| Learning to Act by Predicting the Future | ICLR2017 | 1611.01779 | VizDoom |
| Meta Networks | ICML2017 | 1703.00837 | meta-learning, MetaNet, few-shot classification |
| FeUdal Networks for Hierarchical Reinforcement Learning | ICML2017 | 1703.01161 | FeUdal Networks (FuN), HRL |
| Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks | ICML2017 | 1703.03400 | meta-learning, MAML |
| Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World | IROS2017 | 1703.06907 | sim-to-real, domain randomization |
| One-Shot Imitation Learning | NIPS2017 | 1703.07326 | imitation, demonstration |
| Multi-Level Discovery of Deep Options | - | 1703.08294 | DDO, HRL |
| DART - Noise Injection for Robust Imitation Learning | CoRL2017 | 1703.09327 | imitation learning, noise injected into demonstrations for robustness |
| Stochastic Neural Networks for Hierarchical Reinforcement Learning | ICLR2017 | 1704.03012 | HRL, stochastic NN |
| Deep Q-learning from Demonstrations | AAAI2018 | 1704.03732 | DQfD: imitation + RL, discrete actions |
| Parameter Space Noise for Exploration | ICLR2018 | 1706.01905 | OpenAI NoisyNet |
| Noisy Networks for Exploration | ICLR2018 | 1706.10295 | DeepMind NoisyNet, part of Rainbow |
| Deep Reinforcement Learning from Human Preferences | NIPS2017 | 1706.03741 | RL + human feedback (easier to provide than demonstrations) |
| Hindsight Experience Replay | NIPS2017 | 1707.01495 | HER: goal-based envs, sparse rewards, learn from failed episodes (see the sketch after this table) |
| Emergence of Locomotion Behaviours in Rich Environments | - | 1707.02286 | PPO |
| Robust Imitation of Diverse Behaviors | NIPS2017 | 1707.02747 | imitation learning: VAE (behavioral cloning) + GAIL |
| Imitation from Observation - Learning to Imitate Behaviors from Raw Video via Context Translation | ICRA2018 | 1707.03374 | imitation learning from observation, context translation |
| Reverse Curriculum Generation for Reinforcement Learning | CoRL2017 | 1707.05300 | reverse curriculum |
| Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards | - | 1707.08817 | DDPGfD: DDPG + DQfD, off-policy imitation, continuous goal-based envs |
| When Waiting is not an Option - Learning Options with a Deliberation Cost | AAAI2018 | 1709.04571 | HRL, A2OC: A3C + option-critic + deliberation cost |
| Autonomous Extracting a Hierarchical Structure of Tasks in Reinforcement Learning and Multi-task Reinforcement Learning | - | 1709.04579 | HRL, association rules |
| One-Shot Visual Imitation Learning via Meta-Learning | CoRL2017 | 1709.04905 | MIL: meta-learning (MAML) + imitation learning (BC) |
| Overcoming Exploration in Reinforcement Learning with Demonstrations | ICRA2018 | 1709.10089 | similar to DDPGfD: imitation + DDPG + HER |
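The HER note above (1707.01495) only hints at the core trick: failed episodes still become useful training data once their goals are relabeled with goals that were actually achieved. Below is a minimal sketch of the "future" relabeling strategy, assuming transitions are stored as plain dicts and `reward_fn(achieved_goal, goal)` computes the environment's sparse reward; the function and key names are illustrative, not the paper's reference code.

```python
import random

def her_relabel(episode, reward_fn, k=4):
    """Hindsight relabeling sketch ("future" strategy).

    Assumptions (not from the paper's code): each transition in `episode`
    is a dict with keys 'obs', 'action', 'next_obs', 'achieved_goal'
    (the goal reached after the transition), 'desired_goal', 'reward'.
    For every transition we also store k copies whose desired goal is
    replaced by a goal achieved later in the same episode, so a failed
    episode still yields positive reward signals.
    """
    relabeled = []
    for t, tr in enumerate(episode):
        relabeled.append(dict(tr))  # keep the original transition
        for _ in range(k):
            future = random.choice(episode[t:])   # a step from the rest of the episode
            new_goal = future["achieved_goal"]    # pretend this was the goal all along
            new_tr = dict(tr)
            new_tr["desired_goal"] = new_goal
            new_tr["reward"] = reward_fn(tr["achieved_goal"], new_goal)
            relabeled.append(new_tr)
    return relabeled
```

The relabeled transitions go into an ordinary off-policy replay buffer (e.g., for DQN or DDPG), which is why HER combines cleanly with the demonstration-based methods listed above.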