/reinforcement-learning

Study notes on reinforcement learning

Primary language: Jupyter Notebook

Reinforcement

DP

  • Bellman Expectation Equation

    • policy iteration
  • Bellman Optimality Equation

    • value iteration
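
A minimal sketch of how the two equations lead to the two algorithms above. The 4-state chain environment, the `step` model, and all function names are hypothetical and chosen only for illustration; they are not taken from the notebooks. Policy iteration alternates Bellman expectation backups (evaluation) with greedy improvement, while value iteration applies the Bellman optimality backup directly.

```python
import numpy as np

GAMMA = 0.9
N_STATES, N_ACTIONS = 4, 2  # hypothetical chain: states 0..3, state 3 is terminal


def step(s, a):
    """Known deterministic model: action 0 moves left, action 1 moves right."""
    if s == 3:
        return s, 0.0  # terminal state absorbs with no further reward
    s2 = max(s - 1, 0) if a == 0 else s + 1
    return s2, (1.0 if s2 == 3 else 0.0)


def q_from_v(V, s):
    """One-step lookahead: Q(s, a) = r + gamma * V(s') for every action."""
    q = np.zeros(N_ACTIONS)
    for a in range(N_ACTIONS):
        s2, r = step(s, a)
        q[a] = r + GAMMA * V[s2]
    return q


def greedy_policy(V):
    return np.array([int(np.argmax(q_from_v(V, s))) for s in range(N_STATES)])


def policy_iteration(tol=1e-8):
    policy = np.zeros(N_STATES, dtype=int)
    V = np.zeros(N_STATES)
    while True:
        # Policy evaluation: repeat Bellman expectation backups for the fixed policy.
        while True:
            delta = 0.0
            for s in range(N_STATES):
                v_new = q_from_v(V, s)[policy[s]]
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the evaluated values.
        new_policy = greedy_policy(V)
        if np.array_equal(new_policy, policy):
            return V, policy
        policy = new_policy


def value_iteration(tol=1e-8):
    V = np.zeros(N_STATES)
    while True:
        # Bellman optimality backup: take the max over actions in every sweep.
        delta = 0.0
        for s in range(N_STATES):
            v_new = q_from_v(V, s).max()
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V, greedy_policy(V)
```

On this toy chain both functions should arrive at the same values and the same "always move right" policy. Both need the model (`step`) up front, which is what separates them from the model-free methods listed under RL.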

RL

  • Monte Carlo Prediction

    • prediction (value)
    • model-free
    • sampling
    • on-policy
  • Temporal Difference Prediction

    • prediction (value)
    • model-free
    • sampling
    • bootstrap
    • on-policy
  • SARSA

    • control (policy)
    • model-free
    • sampling
    • bootstrap
    • on-policy
  • Q-Learning

    • control (policy)
    • model-free
    • off-policy
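
The properties listed for the four methods show up directly in their tabular update rules. The sketch below is illustrative only: the chain environment `env_step`, the hyperparameters, and every function name are assumptions and not code from the notebooks. Monte Carlo uses the full sampled return (no bootstrap), TD(0) and SARSA bootstrap from the next estimate under the policy being followed (on-policy), and Q-learning bootstraps from the greedy next action regardless of what is actually taken (off-policy).

```python
import random
from collections import defaultdict

GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1
ACTIONS = [0, 1]  # hypothetical chain environment: 0 = left, 1 = right


def env_step(s, a):
    """Hypothetical 4-state chain; reaching state 3 ends the episode with reward 1."""
    s2 = max(s - 1, 0) if a == 0 else s + 1
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3


def epsilon_greedy(Q, s, rng):
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(Q[s])
    return rng.choice([a for a in ACTIONS if Q[s][a] == best])  # random tie-break


def mc_update(V, episode):
    """Every-visit Monte Carlo prediction; episode is a list of (state, reward) pairs."""
    G = 0.0
    for s, r in reversed(episode):
        G = r + GAMMA * G            # full sampled return, no bootstrapping
        V[s] += ALPHA * (G - V[s])


def td0_update(V, s, r, s2, done):
    """TD(0) prediction: bootstrap from the current estimate V(s')."""
    target = r + (0.0 if done else GAMMA * V[s2])
    V[s] += ALPHA * (target - V[s])


def sarsa_update(Q, s, a, r, s2, a2, done):
    """SARSA: the target uses the action a2 that the behaviour policy really takes."""
    target = r + (0.0 if done else GAMMA * Q[s2][a2])
    Q[s][a] += ALPHA * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s2, done):
    """Q-learning: the target uses the greedy action in s2, whatever is taken next."""
    target = r + (0.0 if done else GAMMA * max(Q[s2]))
    Q[s][a] += ALPHA * (target - Q[s][a])


def train(update="q_learning", episodes=500, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(lambda: [0.0, 0.0])
    for _ in range(episodes):
        s, done = 0, False
        a = epsilon_greedy(Q, s, rng)
        while not done:
            s2, r, done = env_step(s, a)
            a2 = epsilon_greedy(Q, s2, rng)  # next behaviour action (SARSA needs it)
            if update == "sarsa":
                sarsa_update(Q, s, a, r, s2, a2, done)
            else:
                q_learning_update(Q, s, a, r, s2, done)
            s, a = s2, a2
    return Q
```

Calling `train("sarsa")` or `train("q_learning")` returns a table of action values; on this toy chain the greedy policy read from it is expected to move right toward the goal. Unlike the DP methods, none of these updates queries a model: each one sees only sampled transitions.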