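A collection of papers I have read. Because downloads from GitHub are slow, the download links redirect automatically to the Gitee repository.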

Book

Reinforcement Learning: An Introduction (by Richard S. Sutton and Andrew G. Barto)
Spinning Up Documentation
A Primer in Game Theory (博弈论基础, by Robert Gibbons)
Dynamic Cooperative Games (动态合作博弈, by Gao Hongwei (高红伟) and Petrosyan (彼得罗相))

Reinforcement Learning

Model-Free

DQN: Human-level control through deep reinforcement learning
Playing Atari with Deep Reinforcement Learning
DPG: Deterministic policy gradient algorithms
DDPG: Continuous control with deep reinforcement learning
PG: Policy Gradient Methods for Reinforcement Learning with Function Approximation
A3C: Asynchronous methods for deep reinforcement learning
TRPO: Approximately optimal approximate reinforcement learning (prior work)
Trust region policy optimization
PPO: Proximal policy optimization algorithms

Hierarchical Reinforcement Learning

HAM: Reinforcement Learning with Hierarchies of Machines
MAXQ: The MAXQ Method for Hierarchical Reinforcement Learning
Option: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
OC: The Option-Critic Architecture
A2OC: When Waiting is not an Option: Learning Options with a Deliberation Cost
Feudal: Feudal Reinforcement Learning
FuNs: FeUdal Networks for Hierarchical Reinforcement Learning
h-DQN: Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
H-DRLN: A Deep Hierarchical Approach to Lifelong Learning in Minecraft
UVFA: Universal Value Function Approximators
HER: Hindsight Experience Replay
HAC: Learning Multi-Level Hierarchies with Hindsight
HIRO: Data-Efficient Hierarchical Reinforcement Learning
SC: Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining
DSC: Option Discovery Using Deep Skill Chaining
Information-Constrained Primitives: Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
DADS: Dynamics-Aware Unsupervised Discovery of Skills
DIAYN: Diversity Is All You Need: Learning Skills Without A Reward Function
VIM: Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning

Multi-Agent

Survey: A Survey and Critique of Multiagent Deep Reinforcement Learning
RIAL/DIAL: Learning to Communicate with Deep Multi-agent Reinforcement Learning
CommNet: Learning Multiagent Communication with Backpropagation
Emergence of Grounded Compositional Language in Multi-Agent Populations
BiCNet: Multiagent Bidirectionally-Coordinated Nets
Experience Replay with Fingerprint: Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Parameter Sharing: Cooperative Multi-Agent Control Using Deep Reinforcement Learning
Lenient-DQN: Lenient Multi-Agent Deep Reinforcement Learning
MADDPG: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

Multi-Task

Learning Shared Representations: Learning Shared Representations in Multi-Task Reinforcement Learning
Progressive Neural Networks: Progressive Neural Networks
PathNet: PathNet: Evolution Channels Gradient Descent in Super Neural Networks
Gato: A Generalist Agent