Representation Learning for Reinforcement Learning

A curated list of papers that apply representation learning (RepL) in reinforcement learning (RL).

Why RepL for RL?

A major reason to apply RepL in RL is to handle problems with high-dimensional state-action spaces. Another motivation for applying RepL in RL is to improve sample efficiency. In both cases, we usually want to incorporate inductive biases, i.e., structural information about the tasks/envs, into the representations to achieve better performance.

  • Prevalent RL methods require a large amount of supervision.
    • Instead of only learning from reward signals, we can also learn from the collected data.
  • Existing methods are sample-inefficient in vision-based RL.
    • Good representations can accelerate learning from images.
  • Most current RL agents are task-specific.
    • Good representations can generalize well across different tasks, or adapt quickly to new tasks.
  • Effective exploration is challenging in many RL tasks.
    • Good representations can accelerate exploration.

Challenges

  • Sequential data: observations arrive as temporally correlated trajectories rather than i.i.d. samples.
  • Interactive learning tasks: the data distribution shifts as the policy changes during training.

Methods

Popular ways of applying RepL in RL:

  • Auxiliary tasks, e.g., reconstruction, mutual information (MI) maximization, entropy maximization, and dynamics prediction; see the reconstruction sketch after this list.
    • ACL, APS, AVFs, CIC, CPC, DBC, Dreamer, DreamerV2, DyNE, IDDAC, PBL, PI-SAC, PlaNet, RCRL, SLAC, SAC-AE, SPR, ST-DIM, TIA, UNREAL, Value-Improvement Path, World Model.
  • Contrastive learning; see the InfoNCE sketch after this list.
    • ACL, ATC, Contrastive Fourier, CURL, RCRL, CoBERL.
  • Data augmentation; see the random-shift sketch after this list.
    • DrQ, DrQ-v2, PSEs, RAD.
  • Bisimulation.
    • DBC, PSEs.
  • Causal inference.
    • MISA.
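
The entries above name the techniques without showing them; the following are minimal, illustrative PyTorch sketches. Network sizes, observation shapes (e.g., 84x84 frames), and hyperparameters are assumptions made for the sake of runnable examples, not details taken from any listed paper.

A reconstruction auxiliary task, in the spirit of pixel-reconstruction methods such as SAC-AE: the RL encoder is additionally trained to reconstruct its input, which regularizes its features.

```python
# Minimal reconstruction auxiliary loss (sketch). The encoder would be
# shared with the RL agent; the MLP sizes here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 84 * 84, 50))
decoder = nn.Sequential(nn.Linear(50, 3 * 84 * 84), nn.Unflatten(1, (3, 84, 84)))

def reconstruction_loss(obs: torch.Tensor) -> torch.Tensor:
    # obs: (batch, 3, 84, 84) image batch; the auxiliary gradient shapes
    # the encoder's features in addition to the RL loss.
    latent = encoder(obs)
    recon = decoder(latent)
    return F.mse_loss(recon, obs)

loss = reconstruction_loss(torch.rand(8, 3, 84, 84))
```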
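A contrastive (InfoNCE-style) objective, roughly in the spirit of CURL: embeddings of two views of the same observation are pulled together, while other observations in the batch act as negatives. The bilinear similarity and feature size are assumptions, not the exact design of any listed method.

```python
# Minimal InfoNCE-style contrastive loss (sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveHead(nn.Module):
    def __init__(self, feature_dim: int):
        super().__init__()
        # Bilinear similarity between anchor and positive embeddings.
        self.W = nn.Parameter(torch.randn(feature_dim, feature_dim))

    def forward(self, anchors: torch.Tensor, positives: torch.Tensor) -> torch.Tensor:
        # anchors, positives: (batch, feature_dim) encodings of two views
        # of the same observations (e.g., from online and momentum encoders).
        logits = anchors @ self.W @ positives.t()                  # (batch, batch)
        logits = logits - logits.max(dim=1, keepdim=True).values   # numerical stability
        labels = torch.arange(logits.size(0), device=logits.device)
        # The matching row is the positive; every other row is a negative.
        return F.cross_entropy(logits, labels)

head = ContrastiveHead(feature_dim=50)
z_anchor, z_positive = torch.randn(32, 50), torch.randn(32, 50)
loss = head(z_anchor, z_positive)
```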
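A random-shift image augmentation, in the spirit of DrQ / RAD: pad the observation and take a random crop of the original size, applied independently to each sample before computing the RL loss. The pad size and tensor layout are assumptions.

```python
# Minimal random-shift augmentation (sketch).
import torch
import torch.nn.functional as F

def random_shift(obs: torch.Tensor, pad: int = 4) -> torch.Tensor:
    # obs: (batch, channels, height, width) image batch.
    n, _, h, w = obs.shape
    padded = F.pad(obs, (pad, pad, pad, pad), mode="replicate")
    out = torch.empty_like(obs)
    for i in range(n):
        # Independent random offset per sample.
        top = torch.randint(0, 2 * pad + 1, (1,)).item()
        left = torch.randint(0, 2 * pad + 1, (1,)).item()
        out[i] = padded[i, :, top:top + h, left:left + w]
    return out

# Typical use: augment the same batch twice and average the resulting losses.
batch = torch.rand(32, 3, 84, 84)
aug1, aug2 = random_shift(batch), random_shift(batch)
```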

Workshops

Related Work

  • Self-Supervised Learning
  • Invariant Representation Learning

Papers

Vision-based Control

Theory

Low-rank MDPs

  • [Model-free Representation Learning and Exploration in Low-rank MDPs]
  • [FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs]
  • [Representation Learning for Online and Offline RL in Low-rank MDPs]
  • [Provably Efficient Representation Learning in Low-rank Markov Decision Processes]
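For context: these works build on the low-rank MDP assumption that the transition kernel factorizes through low-dimensional feature maps, T(s' | s, a) = ⟨φ(s, a), μ(s')⟩, so that representation learning amounts to recovering the feature map φ from data.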

Offline RL

Model-based RL

Multi-task RL

Exploration

Generalization