Reinforcement Learning Projects:

1) Policy Evaluation in the Cliff Walking Environment:

Model-free reinforcement learning agents for prediction


Design an agent for policy evaluation in the Cliff Walking environment

[Figure: the Cliff Walking environment]


The agent uses one-step temporal-difference learning, TD(0), to estimate the value function of each of the policies below; that is, it runs policy evaluation experiments.
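As a rough illustration of the TD(0) prediction step, the sketch below applies the update V(S) <- V(S) + alpha * [R + gamma * V(S') - V(S)] after every transition. It is not the project's code. The 4x12 grid layout, the rewards (-1 per step, -100 and a reset to the start for falling off the cliff), the helper names (step, td0_evaluate, edge_path_policy), and the values of alpha, gamma, and the episode count are all assumptions chosen to match the standard textbook version of Cliff Walking.

```python
import numpy as np

# Assumed layout: 4x12 grid, start bottom-left, goal bottom-right,
# cliff cells between them on the bottom row.
N_ROWS, N_COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3
MOVES = {UP: (-1, 0), RIGHT: (0, 1), DOWN: (1, 0), LEFT: (0, -1)}


def step(state, action):
    """Return (next_state, reward, done) for one move on the grid."""
    d_row, d_col = MOVES[action]
    row = min(max(state[0] + d_row, 0), N_ROWS - 1)
    col = min(max(state[1] + d_col, 0), N_COLS - 1)
    if row == 3 and 1 <= col <= 10:
        # Fell off the cliff: large penalty, back to the start, episode continues.
        return START, -100.0, False
    return (row, col), -1.0, (row, col) == GOAL


def td0_evaluate(policy, episodes=500, alpha=0.1, gamma=1.0, max_steps=1000):
    """Estimate the state-value function of `policy` with tabular TD(0)."""
    V = np.zeros((N_ROWS, N_COLS))
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):  # guard against policies that never terminate
            next_state, reward, done = step(state, policy(state))
            # The terminal state has value 0, so bootstrap only when not done.
            target = reward + (0.0 if done else gamma * V[next_state])
            V[state] += alpha * (target - V[state])
            state = next_state
            if done:
                break
    return V


def edge_path_policy(state):
    """Hypothetical deterministic policy: up from the start, right, then down."""
    row, col = state
    if col == N_COLS - 1:
        return DOWN
    if row == 3:
        return UP
    return RIGHT


V = td0_evaluate(edge_path_policy)
print(np.round(V[2], 2))  # estimated values along the row just above the cliff
```

Because the example policy is deterministic, only the states on its path are ever visited, so the estimates for all other states stay at their initial value of zero. Evaluating a stochastic policy would only require that the policy function sample its action.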


a) Optimal policy:

[Figure: optimal policy]

b) Safety policy:

[Figure: safety policy]
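The two policy figures are not reproduced here, so the sketch below only guesses at what they might look like. It assumes the common textbook reading: the optimal policy walks along the row directly above the cliff, while the safety policy climbs to the top row before heading right. The grid layout, the rewards, and the function names (optimal_policy, safety_policy, rollout) are hypothetical and may differ from the policies actually evaluated in this project.

```python
# Assumed layout: 4x12 grid, start bottom-left, goal bottom-right,
# cliff cells between them on the bottom row.
N_ROWS, N_COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3
MOVES = {UP: (-1, 0), RIGHT: (0, 1), DOWN: (1, 0), LEFT: (0, -1)}


def optimal_policy(state):
    """Assumed optimal policy: shortest path, hugging the cliff edge."""
    row, col = state
    if col == N_COLS - 1:
        return DOWN
    if row == 3:
        return UP
    return RIGHT


def safety_policy(state):
    """Assumed safety policy: climb to the top row, go right, then descend."""
    row, col = state
    if col == N_COLS - 1:
        return DOWN
    if row > 0:
        return UP
    return RIGHT


def rollout(policy, max_steps=100):
    """Follow `policy` from the start; return (undiscounted return, path)."""
    state, total, path = START, 0.0, [START]
    for _ in range(max_steps):
        d_row, d_col = MOVES[policy(state)]
        row = min(max(state[0] + d_row, 0), N_ROWS - 1)
        col = min(max(state[1] + d_col, 0), N_COLS - 1)
        if row == 3 and 1 <= col <= 10:
            state, reward = START, -100.0  # fell off the cliff
        else:
            state, reward = (row, col), -1.0
        total += reward
        path.append(state)
        if state == GOAL:
            break
    return total, path


optimal_return, optimal_path = rollout(optimal_policy)
safety_return, safety_path = rollout(safety_policy)
print(optimal_return, safety_return)
```

Under these assumptions the optimal path is shorter but passes next to the cliff at every step, whereas the safety path spends extra steps to stay far from it. With deterministic policies neither path ever falls, so the difference between them only matters once the policy or the environment is made stochastic.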