RL

##Reading Group on Reinforcement Learning topics
##NYU, Fall 2016

###Logistics

- Meetings run every Wednesday at 9:30 am (before the CILVR lab meeting) in the large conference room at 715/719 Broadway, 12th floor. Breakfast will be provided.
- Paper discussion and review plan: each week we will assign one or two papers to volunteers, who will present them the following week. During the reading and presentation, we will edit a review of the paper, which will be posted here [also subject to change].
- Guest speakers: we will try to invite RL experts (e.g. G. Tesauro) from time to time.
- Other communication channels (Facebook group? Slack?) [TBD].

###Organization

The RG is initially organized by J. Bruna, K. Cho, S. Sukhbaatar, K. Ross, and D. Sontag, with help from the rest of the CILVR group.

9/21: Tutorial on MDPs, Policy Gradient (part 1). [Keith Ross]
- Markov Decision Process Paradigm
- Discounted and average cost criteria
- Model-free Reinforcement Learning Paradigm
- Policy Gradient: parameterized policies; policy gradient theorem; Monte Carlo Policy Gradient (REINFORCE)
- Using Policy Gradient and deep neural networks to learn the Atari game "Pong".
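
As a rough illustration of the Monte Carlo Policy Gradient (REINFORCE) topic above, here is a minimal sketch on a toy problem. It is not the material presented in the session. The two-armed Bernoulli bandit, the function names `softmax` and `reinforce_bandit`, and the step size are all assumptions chosen for brevity. Each episode lasts one step, so the return is just the sampled reward, and no baseline is used.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(win_probs, steps=2000, lr=0.1, seed=0):
    """Run REINFORCE on a Bernoulli bandit with one-step episodes.

    win_probs describes a hypothetical environment: arm k pays
    reward 1 with probability win_probs[k], and 0 otherwise.
    Returns the final action probabilities of the softmax policy.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(win_probs)  # one preference parameter per arm
    for _ in range(steps):
        probs = softmax(theta)
        # Sample an action from the current (parameterized) policy.
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = 1.0 if rng.random() < win_probs[action] else 0.0
        # For a softmax policy: d/d theta_k log pi(a) = 1[k == a] - pi(k).
        for k in range(len(theta)):
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            # Stochastic gradient ascent on expected reward.
            theta[k] += lr * reward * grad_log
    return softmax(theta)


if __name__ == "__main__":
    print(reinforce_bandit([0.2, 0.8]))
```

With these settings the policy should end up strongly preferring the second arm. Learning Pong follows the same update rule in principle, with a neural network in place of the preference table and full-episode returns in place of the single reward.
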
9/28: Tutorial on MDPs, Policy Gradient (part 2). [Keith]
- Dynamic Programming equations for MDPs
- Policy iteration
- Value iteration
- Monte Carlo methods for RL
- Q-learning for RL
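
To make the contrast between dynamic programming and model-free learning concrete, the sketch below runs value iteration and tabular Q-learning on the same toy problem. It is only an illustration under stated assumptions, not the session's material: the four-state chain, the constants, and the function names `step`, `value_iteration`, and `q_learning` are all hypothetical. Value iteration reads the model directly, while Q-learning only sees sampled transitions, here generated by a uniformly random behaviour policy (which is allowed because Q-learning is off-policy).

```python
import random

N_STATES = 4      # states 0..3; state 3 is the terminal goal
ACTIONS = (0, 1)  # 0 = left, 1 = right
GAMMA = 0.9       # discount factor


def step(state, action):
    """Deterministic chain dynamics: returns (next_state, reward, done)."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality backup until the values stop changing."""
    v = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):  # the terminal state keeps value 0
            best = max(r + GAMMA * v[n]
                       for n, r, _ in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < tol:
            return v


def q_learning(episodes=500, lr=0.5, seed=0):
    """Tabular Q-learning from sampled transitions only."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.choice(ACTIONS)  # uniformly random behaviour policy
            nxt, r, done = step(s, a)
            # Bootstrap from the greedy value of the next state.
            target = r if done else r + GAMMA * max(q[nxt])
            q[s][a] += lr * (target - q[s][a])
            s = nxt
    return q


if __name__ == "__main__":
    print(value_iteration())
    print([max(row) for row in q_learning()])
```

On this chain the optimal values are 0.81, 0.9, and 1.0 for states 0 to 2 (and 0 for the terminal state), and the greedy values learned by Q-learning should approach the same numbers.
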
10/5 and 10/12: Actor-Critic. [Martin]
- Deterministic Policy Gradient
- Off-Policy variants
- Relevant Papers:
10/19: Tutorial on OpenAI Gym and MazeBase; also Twitter's new twrl. [Sainaa and Ilya]
- MazeBase: https://github.com/facebook/MazeBase
10/26: Apprenticeship Learning via Inverse Reinforcement Learning and Model-Free Imitation Learning with Policy Optimization [Arthur]
10/31: Trust region policy optimization (TRPO) [Elman, Ilya]
