Weekly Reading Group on Offline RL at GRAIL

Organizers: Mathieu Godbout and Ulysse Côté-Allard

Structure

Everyone reads the weekly paper, and a discussion guide is (pseudo-)randomly drawn at the beginning of each meeting. The discussion guide's role is only to drive the conversation and make sure we cover the entire paper within the scheduled hour. Starting 1st November, 2022, meetings take place every Tuesday from 9 AM to 10 AM (EST) on Google Meet.

This reading group is open to anyone interested. If you wish to join, simply send us an email so we can add you to our discussion channel.

SUMMER 2022

For this semester, we will look at various state-of-the-art reinforcement learning approaches, with no constraint on their particular domain. The reading group will take the form of a discussion, so no one will be considered the main presenter.

Fall 2022

Date	Paper
1st November, 2022	Discovering faster matrix multiplication algorithms with reinforcement learning
8th November, 2022	The Primacy Bias in Deep Reinforcement Learning
15th November, 2022	Deep Reinforcement Learning at the Edge of the Statistical Precipice
22nd November, 2022	Explainable Reinforcement Learning: A Survey

Spring 2022

Date	Paper
8th April, 2022	Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
15th April, 2022	Collaborating with Humans without Human Data
22nd April, 2022	Mastering the game of Go without human knowledge
29th April, 2022	AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning
6th May, 2022	GFlowNet
13th May, 2022	Asymmetric self-play for automatic goal discovery in robotic manipulation
20th May, 2022	Continuous multi-task Bayesian optimisation with correlation
27th May, 2022	Outracing champion Gran Turismo drivers with deep reinforcement learning
3rd June, 2022	Planning with Diffusion for Flexible Behavior Synthesis
Summer Break	The reading group will start again at the end of July. More details regarding the next paper will be communicated closer to the start date.

WINTER 2021-2022

For this semester, we will follow the reinforcement learning class by Emma Brunskill (https://www.youtube.com/playlist?list=PLoROMvodv4rOSOPzutgyCTapiGlY2Nd8u) at a rhythm of one class per week.

Date	Class	Topic
3rd December, 2021	Class 1-2	Introduction & Given a Model of the World
10th December, 2021	Class 3	Model-Free Policy Evaluation
17th December, 2021	Class 4	Model Free Control
7th January, 2022	Class 5	Value Function Approximation
14th January, 2022	Class 6	CNNs and Deep Q Learning
21st January, 2022	Class 6	CNNs and Deep Q Learning
28th January, 2022	Class 7	Value Function Approximation
4th February, 2022	Class 8	Imitation Learning
11th February, 2022	Class 9	Policy Gradient I
18th February, 2022	Class 10	Policy Gradient II
25th February, 2022	Class 11	Policy Gradient III & Review
4th March, 2022	Class 12	Fast Reinforcement Learning
11th March, 2022	Class 13	Fast Reinforcement Learning II
18th March, 2022	Class 14	Fast Reinforcement Learning III
25th March, 2022	Class 15	Batch Reinforcement Learning
1st April, 2022	Class 16	Monte Carlo Tree Search

Fall 2021 (part 2)

For this portion of the fall session, we will allow papers outside the usual offline RL scope. Presenters for submitted papers that aren't related to offline RL will no longer be randomly sampled; instead, each such paper will be automatically assigned to the person who submitted it.

Date	Paper	Presenter
29th October, 2021	Pareto Front Identification from Stochastic Bandit Feedback	Alexandre Larouche
5th November, 2021	Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning	Peng Cheng (Frank)
12th November, 2021	COMBO: Conservative Offline Model-Based Policy Optimization	Random
19th November, 2021	Logistic Q-Learning. A 15-minute author presentation is also available	Mathieu Godbout
26th November, 2021	Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning. Code for the work's implementation available here	Ulysse Côté-Allard

Fall 2021

Date	Paper
2nd September, 2021	Skipped to attend the DEEL workshop on adversarial attack (free registration)
10th September, 2021	Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains
17th September, 2021	Causal Reinforcement Learning (ICML Tutorial) Part 1&2
1st October, 2021	Efficient Counterfactual Learning from Bandit Feedback
8th October, 2021	Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL
15th October, 2021	A Workflow for Offline Model-Free Robotic Reinforcement Learning
22nd October, 2021	Offline Reinforcement Learning with Implicit Q-Learning

Summer 2021

Date	Paper
19th May, 2021	Conservative Q-Learning for Offline Reinforcement Learning
26th May, 2021	Offline Reinforcement Learning with Fisher Divergence Critic Regularization
2nd June, 2021	Universal Off-Policy Evaluation
9th June, 2021	What are the Statistical Limits of Offline RL with Linear Function Approximation?
16th June, 2021	An Optimistic Perspective on Offline Reinforcement Learning
23rd June, 2021	Optimism in Reinforcement Learning with Generalized Linear Function Approximation
30th June, 2021	Is Pessimism Provably Efficient for Offline RL? (The paper is pretty long for a single week's reading. We will base our meeting around this talk given by the author)
7th July, 2021	Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism (Talk given by the author)
14th July, 2021	Summer break
21st July, 2021	Summer break
28th July, 2021	Summer break
4th August, 2021	S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning
11th August, 2021	Sample-Efficient Reinforcement Learning via Counterfactual-Based Data Augmentation
19th August, 2021	MOPO: Model-based Offline Policy Optimization [Second Round]
26th August, 2021	Risk-Averse Offline Reinforcement Learning

Winter 2021

Date	Paper
11th March, 2021	Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (1/4)
18th March, 2021	Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (2/4)
24th March, 2021	Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (3/4)
31st March, 2021	Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (4/4)
7th April, 2021	MOPO: Model-based Offline Policy Optimization
