Implemented most of the methods in Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning (Richard S. Sutton, Doina Precup, Satinder Singh – Artificial Intelligence, 1999).
Experimented with the transfer learning abilities of the Option-Critic Architecture (Pierre-Luc Bacon, Jean Harb, Doina Precup – AAAI 2017) and implemented model learning for the learnt options.