jjxxmiin/reinforcement-learning

Reinforcement Learning study notes

Reinforcement

DP

Bellman Expectation Equation
- policy iteration
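
Policy iteration alternates two steps: policy evaluation, which applies the Bellman expectation backup until the values of the current policy settle, and greedy policy improvement. A minimal sketch follows. The 4-state chain environment, the reward of -1 per step, `GAMMA = 0.9`, and every function name are assumptions made for illustration and are not taken from the notebooks. The policy is deterministic, so the expectation over actions reduces to a single term.

```python
import numpy as np

# Hypothetical toy MDP: states 0..3 on a line, state 3 is terminal.
N_STATES, TERMINAL, GAMMA = 4, 3, 0.9
MOVES = (-1, +1)  # action 0 = left, action 1 = right


def step(s, a):
    """Known deterministic model: returns (next_state, reward)."""
    s2 = min(max(s + MOVES[a], 0), N_STATES - 1)
    return s2, -1.0


def policy_evaluation(policy, theta=1e-8):
    """Sweep the Bellman expectation backup until the values stop changing."""
    V = np.zeros(N_STATES)
    while True:
        delta = 0.0
        for s in range(TERMINAL):
            s2, r = step(s, policy[s])
            v_new = r + GAMMA * V[s2]
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            return V


def policy_improvement(V):
    """Act greedily with respect to the current value estimate."""
    policy = np.zeros(N_STATES, dtype=int)
    for s in range(TERMINAL):
        q = []
        for a in range(len(MOVES)):
            s2, r = step(s, a)
            q.append(r + GAMMA * V[s2])
        policy[s] = int(np.argmax(q))
    return policy


def policy_iteration():
    policy = np.zeros(N_STATES, dtype=int)  # start with "always left"
    while True:
        V = policy_evaluation(policy)
        new_policy = policy_improvement(V)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy
```

Calling `policy_iteration()` on this toy chain should end with "right" in every non-terminal state and values close to -2.71, -1.9 and -1.0 for states 0, 1 and 2.
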
Bellman Optimality Equation
- value iteration
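
Value iteration skips the explicit policy and applies the Bellman optimality backup directly: each state value becomes the maximum over actions of reward plus discounted next-state value. The sketch below reuses the same kind of toy setup. The 4-state chain, the -1 step reward, `GAMMA = 0.9`, and the function names are illustrative assumptions, not the repository's code.

```python
import numpy as np

# Hypothetical toy MDP: states 0..3 on a line, state 3 is terminal.
N_STATES, TERMINAL, GAMMA = 4, 3, 0.9
MOVES = (-1, +1)  # action 0 = left, action 1 = right


def step(s, a):
    """Known deterministic model: returns (next_state, reward)."""
    s2 = min(max(s + MOVES[a], 0), N_STATES - 1)
    return s2, -1.0


def action_values(V, s):
    """One-step lookahead: value of each action in state s."""
    values = []
    for a in range(len(MOVES)):
        s2, r = step(s, a)
        values.append(r + GAMMA * V[s2])
    return values


def value_iteration(theta=1e-8):
    V = np.zeros(N_STATES)
    while True:
        delta = 0.0
        for s in range(TERMINAL):
            best = max(action_values(V, s))  # Bellman optimality backup
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Read the greedy policy off the converged values.
    policy = [int(np.argmax(action_values(V, s))) for s in range(TERMINAL)]
    return V, policy
```

On this chain the result should match policy iteration: the greedy policy moves right everywhere, and the values are the discounted step costs to the terminal state.
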

RL

Monte Carlo Prediction
- prediction (value)
- model-free
- sampling
- on-policy
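
As a rough illustration of the properties listed above, the sketch below estimates state values from complete sampled episodes, without touching a model of the environment. It assumes a 5-state random walk (states 1 to 5, terminals at 0 and 6, reward 1 only when exiting on the right) and a fixed uniformly random policy. The environment, the first-visit variant, and the function names are all assumptions chosen for the example.

```python
import random

N = 5  # non-terminal states 1..N; 0 and N + 1 are terminal


def generate_episode(rng):
    """Follow the fixed random policy; return a list of (state, reward)."""
    s, episode = 3, []
    while 0 < s < N + 1:
        s2 = s + rng.choice((-1, 1))
        r = 1.0 if s2 == N + 1 else 0.0
        episode.append((s, r))
        s = s2
    return episode


def mc_prediction(n_episodes=5000, gamma=1.0, seed=0):
    """First-visit Monte Carlo prediction with incremental averaging."""
    rng = random.Random(seed)
    V = [0.0] * (N + 2)
    counts = [0] * (N + 2)
    for _ in range(n_episodes):
        episode = generate_episode(rng)
        # Remember the first time step at which each state appears.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk backwards so G is the full return from step t onwards.
        G = 0.0
        for t in reversed(range(len(episode))):
            s, r = episode[t]
            G = r + gamma * G
            if first_visit[s] == t:
                counts[s] += 1
                V[s] += (G - V[s]) / counts[s]
    return V
```

For this random walk the true value of state `s` is `s / 6`, so the estimates should drift toward 1/6, 2/6, and so on as the number of episodes grows. No bootstrapping happens here: each update waits for the episode to finish and uses the actual return.
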
Temporal Difference Prediction
- prediction (value)
- model-free
- sampling
- bootstrap
- on-policy
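
The difference from Monte Carlo is the bootstrap: the update target is the sampled reward plus the current estimate of the next state, so learning happens after every step instead of at the end of the episode. A minimal TD(0) sketch follows, assuming the same hypothetical 5-state random walk (terminals at 0 and 6, reward 1 on the right exit) and a constant step size `alpha`. These choices and the function name are illustrative only.

```python
import random

N = 5  # non-terminal states 1..N; 0 and N + 1 are terminal


def td0_prediction(n_episodes=20000, alpha=0.01, gamma=1.0, seed=0):
    """TD(0) prediction for a fixed uniformly random policy."""
    rng = random.Random(seed)
    V = [0.0] * (N + 2)  # terminal entries are never updated and stay 0
    for _ in range(n_episodes):
        s = 3
        while 0 < s < N + 1:
            s2 = s + rng.choice((-1, 1))      # sample one transition
            r = 1.0 if s2 == N + 1 else 0.0
            td_target = r + gamma * V[s2]     # bootstrap from V[s2]
            V[s] += alpha * (td_target - V[s])
            s = s2
    return V
```

With a constant `alpha` the estimates keep fluctuating around the true values `s / 6` rather than converging exactly. A smaller `alpha` reduces that noise at the cost of slower learning.
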
SARSA
- control (policy)
- model-free
- sampling
- bootstrap
- on-policy
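
SARSA moves from prediction to control by learning action values `Q(s, a)` and improving an epsilon-greedy policy as it goes. It is on-policy because the target uses the action `a2` that the same epsilon-greedy policy actually picks in the next state. The sketch below assumes a hypothetical 4-state chain with a -1 reward per step. The environment, hyperparameters, and function names are assumptions for illustration.

```python
import random

# Hypothetical toy environment: states 0..3 on a line, state 3 is terminal.
N_STATES, TERMINAL = 4, 3
MOVES = (-1, +1)  # action 0 = left, action 1 = right


def step(s, a):
    """Environment step: returns (next_state, reward, done)."""
    s2 = min(max(s + MOVES[a], 0), N_STATES - 1)
    return s2, -1.0, s2 == TERMINAL


def epsilon_greedy(Q, s, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(MOVES))
    return max(range(len(MOVES)), key=lambda a: Q[s][a])


def sarsa(n_episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(n_episodes):
        s = 0
        a = epsilon_greedy(Q, s, epsilon, rng)
        done = False
        while not done:
            s2, r, done = step(s, a)
            a2 = epsilon_greedy(Q, s2, epsilon, rng)
            # On-policy target: uses the action a2 that will be taken next.
            target = r + (0.0 if done else gamma * Q[s2][a2])
            Q[s][a] += alpha * (target - Q[s][a])
            s, a = s2, a2
    return Q
```

Because the learned values include the cost of the occasional exploratory move, they describe the epsilon-greedy policy itself, not the purely greedy one.
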
Q-Learning
- control (policy)
- model-free
- off-policy
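
Off-policy means the policy that collects experience can differ from the policy being learned. In the sketch below the behaviour policy is uniformly random, while the update target takes the maximum over next actions, so the learned values describe the greedy policy. The 4-state chain, the -1 step reward, the purely random behaviour policy, and the function names are assumptions made for the example.

```python
import random

# Hypothetical toy environment: states 0..3 on a line, state 3 is terminal.
N_STATES, TERMINAL = 4, 3
MOVES = (-1, +1)  # action 0 = left, action 1 = right


def step(s, a):
    """Environment step: returns (next_state, reward, done)."""
    s2 = min(max(s + MOVES[a], 0), N_STATES - 1)
    return s2, -1.0, s2 == TERMINAL


def q_learning(n_episodes=2000, alpha=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(n_episodes):
        s, done = 0, False
        while not done:
            a = rng.randrange(len(MOVES))  # behaviour policy: uniform random
            s2, r, done = step(s, a)
            # Off-policy target: greedy value of the next state,
            # regardless of which action the behaviour policy takes there.
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

Even though the agent never acts greedily here, the greedy policy read from `Q` should still move right in every state. In practice an epsilon-greedy behaviour policy is a common alternative to the fully random one used in this sketch.
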