REINFORCE == REward Increment = Nonnegative Factor times Offset Reinforcement times Characteristic Eligibility

#Nice #TattooMaterial - source
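
For reference, the acronym spells out the shape of the weight update in Williams's original REINFORCE formulation. A rough rendering is below; the symbols are not defined anywhere else in this README and follow the usual textbook notation: \alpha_{ij} is a nonnegative learning-rate factor, r is the received reward, b_{ij} is a baseline (the "offset"), and e_{ij} is the characteristic eligibility of weight w_{ij}, with g_i the probability the unit assigns to its own output.

```latex
\Delta w_{ij}
  = \underbrace{\alpha_{ij}}_{\text{nonnegative factor}}
    \;\underbrace{(r - b_{ij})}_{\text{offset reinforcement}}
    \;\underbrace{e_{ij}}_{\text{characteristic eligibility}},
\qquad
e_{ij} = \frac{\partial \ln g_i}{\partial w_{ij}}
```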

Reinforcement Learning

Screen captures of solved simulations:

Blackjack Details	Cliff Walking Details
Taxi Details	Lunar Lander Details
Banana Collector Details	Mountain Car Details
Cart Pole Details	Atari Pong Details
Reacher Arms Details	Two-Player Tennis Details

A collection of reinforcement learning projects I have done in OpenAI Gym and Unity ML-Agents. I learned and implemented basic to complex reinforcement learning algorithms, from the Monte Carlo approach for solving puzzles to the Multi-Agent Deep Deterministic Policy Gradient method for training table tennis players. A detailed description of each project can be found by clicking on the project titles in the table above.

This is one of the most interesting topics I have had a chance to peek into. In my view, it involves more mathematical concepts than most other deep learning algorithms, but the fact that it is also one of the hardest challenges for some of the smartest minds on Earth soothes the pain of needing to open 10 Google tabs just to comprehend one page of a paper.

Main Topics/Methods

Monte Carlo Methods - Epsilon-Greedy policies, GLIE, state and action value functions, Bellman Equations

Temporal-Difference Methods - Sarsa, Q-Learning, and Expected Sarsa

Continuous Spaces - Discretization, Tile Coding, and Function Approximation

Value-Based Methods - Implementation of Deep Q-Networks, Double Q-Networks

Policy-Based Methods - Stochastic Policy Search, Hill Climbing Algorithm, REINFORCE, Proximal Policy Optimization, A3C, A2C, N-step bootstrapping, GAE, DDPG, Continuous Control

Multi-Agent Reinforcement Learning (MARL) - Cooperative and Competitive Behaviors, Multi-Agent DDPG, Monte Carlo Tree Search
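
To make the difference between the three temporal-difference methods listed above concrete, here is a minimal tabular sketch, not the code used in the projects. It assumes a NumPy Q-table of shape (states, actions) and an epsilon-greedy policy. The function names (`epsilon_greedy_probs`, `td_target`, `td_update`) are hypothetical and chosen for illustration, and terminal states are not handled.

```python
import numpy as np


def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy for one state's Q-values."""
    n_actions = len(q_row)
    # Every action gets epsilon / n_actions; the greedy action gets the rest.
    probs = np.full(n_actions, epsilon / n_actions)
    probs[int(np.argmax(q_row))] += 1.0 - epsilon
    return probs


def td_target(method, Q, reward, next_state, next_action, gamma, epsilon):
    """One-step TD target; the methods differ only in how they bootstrap."""
    q_next = Q[next_state]
    if method == "sarsa":
        # On-policy: use the action actually taken in the next state.
        bootstrap = q_next[next_action]
    elif method == "q_learning":
        # Off-policy: use the greedy (maximum) action value.
        bootstrap = np.max(q_next)
    elif method == "expected_sarsa":
        # Expectation of the next action value under the epsilon-greedy policy.
        bootstrap = np.dot(epsilon_greedy_probs(q_next, epsilon), q_next)
    else:
        raise ValueError(f"unknown method: {method}")
    return reward + gamma * bootstrap


def td_update(Q, state, action, target, alpha):
    """Move Q[state, action] a fraction alpha toward the TD target (in place)."""
    Q[state, action] += alpha * (target - Q[state, action])
    return Q


if __name__ == "__main__":
    Q = np.zeros((2, 3))
    Q[1] = [1.0, 2.0, 4.0]
    for method in ("sarsa", "q_learning", "expected_sarsa"):
        print(method, td_target(method, Q, 1.0, 1, 0, gamma=0.5, epsilon=0.3))
```

All three methods share the same update step. Only the bootstrap term changes, which is why Expected Sarsa reduces to Q-Learning when epsilon is 0.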

Resources

Textbook

Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto. An awesome textbook that is not afraid to go in depth into the mathematics of RL.

Paper

Structured

The Udacity Nanodegree program provided great assistance. It helped a lot with configuring the OpenAI Gym and Unity environments, provided me with a pretty good GPU, and even supplied skeleton code for some of the early projects to pull me through the initial learning curve. However, since the course does not seem to be very popular due to low demand, it is quite unstructured as a resource to rely on for knowledge.

Jacklu0831/Fun-With-RL