
Offline Reinforcement Learning Reading Group

Weekly Reading Group on Offline RL at GRAIL

Organizers: Mathieu Godbout and Ulysse Côté-Allard

Structure

Everyone reads the weekly paper, and a discussion guide is (pseudo-)randomly drawn at the beginning of each meeting. The discussion guide's role is only to drive the conversation and make sure we cover the entire paper within the scheduled hour. Starting 1st November, 2022, meetings are held every Tuesday from 9 AM to 10 AM (EST) on Google Meet.

This reading group is open to anyone interested. If you wish to join, simply send us an email so we can add you to our discussion channel.

Summer 2022

For this semester, we will look at various state-of-the-art reinforcement learning approaches, with no constraint on their particular domain. The reading group will take the form of a discussion, so no one will be designated as the main presenter.

Fall 2022

Date Paper
1st November, 2022 Discovering faster matrix multiplication algorithms with reinforcement learning
8th November, 2022 The Primacy Bias in Deep Reinforcement Learning
15th November, 2022 Deep Reinforcement Learning at the Edge of the Statistical Precipice
22nd November, 2022 Explainable Reinforcement Learning: A Survey

Spring 2022

Date Paper
8th April, 2022 Why is Posterior Sampling Better than Optimism for Reinforcement Learning
15th April, 2022 Collaborating with Humans without Human Data
22nd April, 2022 Mastering the game of Go without human knowledge
29th April, 2022 AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning
6th May, 2022 GFlowNet
13th May, 2022 Asymmetric self-play for automatic goal discovery in robotic manipulation
20th May, 2022 Continuous multi-task bayesian optimisation with correlation
27th May, 2022 Outracing champion Gran Turismo drivers with deep reinforcement learning
3rd June, 2022 Planning with Diffusion for Flexible Behavior Synthesis
Summer Break The reading group will start again at the end of July. More details regarding the next paper will be communicated closer to the start date.

Winter 2021-2022

For this semester, we will follow the reinforcement learning class by Emma Brunskill (https://www.youtube.com/playlist?list=PLoROMvodv4rOSOPzutgyCTapiGlY2Nd8u) at a rhythm of one class per week.

Date Paper Presenter
3rd December, 2021 Class 1-2 Introduction & Given a Model of the World
10th December, 2021 Class 3 Model-Free Policy Evaluation
17th December, 2021 Class 4 Model Free Control
7th January, 2022 Class 5 Value Function Approximation
14th January, 2022 Class 6 CNNs and Deep Q Learning
21st January, 2022 Class 6 CNNs and Deep Q Learning
28th January, 2022 Class 7 Value Function Approximation
4th February, 2022 Class 8 Imitation Learning
11th February, 2022 Class 9 Policy Gradient I
18th February, 2022 Class 10 Policy Gradient II
25th February, 2022 Class 11 Policy Gradient III & Review
4th March, 2022 Class 12 Fast Reinforcement Learning
11th March, 2022 Class 13 Fast Reinforcement Learning II
18th March, 2022 Class 14 Fast Reinforcement Learning III
25th March, 2022 Class 15 Batch Reinforcement Learning
1st April, 2022 Class 16 Monte Carlo Tree Search

Fall 2021 (part 2)

For this portion of the fall session, we will allow papers outside the usual offline RL scope. Presenters for submitted papers that are not related to offline RL will no longer be randomly sampled; instead, each such paper is automatically assigned to the person who submitted it.

Date Paper Presenter
29th October, 2021 Pareto Front Identification from Stochastic Bandit Feedback Alexandre Larouche
5th November, 2021 Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning Peng Cheng (Frank)
12th November, 2021 COMBO: Conservative Offline Model-Based Policy Optimization Random
19th November, 2021 Logistic Q-Learning (a 15-minute author presentation is also available) Mathieu Godbout
26th November, 2021 Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning (code for the work's implementation is also available) Ulysse Côté-Allard

Fall 2021

Date Paper
2nd September, 2021 Skipped to attend the DEEL workshop on adversarial attacks (free registration)
10th September, 2021 Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains
17th September, 2021 Causal Reinforcement Learning (ICML Tutorial) Part 1&2
1st October, 2021 Efficient Counterfactual Learning from Bandit Feedback
8th October, 2021 Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL
15th October, 2021 A Workflow for Offline Model-Free Robotic Reinforcement Learning
22nd October, 2021 Offline Reinforcement Learning with Implicit Q-Learning

Summer 2021

Date Paper
19th May, 2021 Conservative Q-Learning for Offline Reinforcement Learning
26th May, 2021 Offline Reinforcement Learning with Fisher Divergence Critic Regularization
2nd June, 2021 Universal Off-Policy Evaluation
9th June, 2021 What are the Statistical Limits of Offline RL with Linear Function Approximation?
16th June, 2021 An Optimistic Perspective on Offline Reinforcement Learning
23rd June, 2021 Optimism in Reinforcement Learning with Generalized Linear Function Approximation
30th June, 2021 Is Pessimism Provably Efficient for Offline RL? (The paper is pretty long for a single week's reading. We will base our meeting around this talk given by the author)
7th July, 2021 Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism (Talk given by the author)
14th July, 2021 Summer break
21st July, 2021 Summer break
28th July, 2021 Summer break
4th August, 2021 S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning
11th August, 2021 Sample-Efficient Reinforcement Learning via Counterfactual-Based Data Augmentation
19th August, 2021 MOPO: Model-based Offline Policy Optimization [Second Round]
26th August, 2021 Risk-Averse Offline Reinforcement Learning

Winter 2021

Date Paper
11th March, 2021 Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (1/4)
18th March, 2021 Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (2/4)
24th March, 2021 Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (3/4)
31st March, 2021 Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (4/4)
7th April, 2021 MOPO: Model-based Offline Policy Optimization