Contains notes on Reinforcement Learning papers that I have read. Based on this list from OpenAI Spinning Up.
2018 December:
- Playing Atari with Deep Reinforcement Learning, Mnih et al, 2013. Algorithm: DQN. paper note
- Domain Adaptation for Reinforcement Learning on the Atari paper
2019 March:
- Policy invariance under reward transformations: Theory and application to reward shaping. paper
2019 May:
2019 June:
- ICML2015 - Universal Value Function Approximators
2019 July:
- Deep Reinforcement Learning that Matters
2019 Sept:
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Pixel-Attentive Policy Gradient for Multi-Fingered Grasping in Cluttered Scenes
- kPAM: KeyPoint Affordances for Category-Level Robotic Manipulation
- MAT: Multi-fingered adaptive tactile grasping via deep reinforcement learning
- Self-supervised correspondence in visuomotor policy learning
2019 Nov:
- Learning to Manipulate Object Collections Using Grounded State Representations (CoRL 2019) [note]
- Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments (CoRL 2019) [note]
2019 Dec:
- Playing FPS Games with DRL note
- Visual Reinforcement Learning with Imagined Goals note
- Deep Recurrent Q-Learning for POMDPs note
2020 Jan:
- Deep Q-learning from Demonstrations [note] [paper]
- Deep Reinforcement Learning with Double Q-Learning [note] paper
- Dueling Network Architectures for Deep Reinforcement Learning [note] [paper]
- Learning Latent Plans from Play [note] [paper] [project-site] CoRL 2019
- Time Limit in RL [note] [paper] ICML 2018
- Multi-modal imitation learning in partially observable environments [note] AAMAS 2020 (extended abstract)
- Learning belief representations for imitation learning in POMDPs [note] [code] Alg: Belief-module imitation learning (BMIL), UAI 2019
- Learning deep policies for robot bin picking by simulating robust grasping sequences [note] CoRL 2017
2020 Aug:
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, a.k.a. CycleGAN [slide]