These notes follow David Silver's reinforcement learning course.
Textbooks:
- Reinforcement Learning: An Introduction by Sutton and Barto
- Algorithms for Reinforcement Learning by Szepesvári
Reinforcement learning (RL) is the science of decision making.
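RL ideas appear in many fields under different names:
- Computer science: Machine Learning
- Engineering: Optimal Control
- Neuroscience: Reward System
- Mathematics: Operations Research
- Economics: Bounded Rationality
- Psychology: Classical/Operant Conditioning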
RL differs from other machine learning paradigms in several ways:
- There is no supervisor, only a reward signal.
- Feedback is delayed, not instantaneous. Whether an action was good or bad can often only be judged in retrospect.
- Time really matters: the data is sequential, not i.i.d.
- The agent's actions affect the subsequent data it receives.
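
The four points above can be seen in a minimal agent-environment loop. The sketch below is only an illustration, not something from the course: the `Corridor` environment, `run_episode`, and `random_policy` are hypothetical names made up for this example. The agent never sees a correct action, only a reward; the reward stays at zero until the goal is reached; and each observation depends on the action taken just before it.

```python
import random


class Corridor:
    """Hypothetical toy environment: cells 0..length-1, reward only at the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the next observation depends on it,
        # so the agent's choices shape the data it sees next.
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        # Delayed feedback: the reward is zero until the goal is reached.
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return a list of (action, observation, reward) tuples."""
    observation = env.reset()
    history = []
    for _ in range(max_steps):
        action = policy(observation)  # no supervisor tells the agent what to do
        observation, reward, done = env.step(action)
        history.append((action, observation, reward))
        if done:
            break
    return history


_rng = random.Random(0)  # seeded only so runs are repeatable


def random_policy(observation):
    """Assumed baseline policy: ignore the observation and move at random."""
    return _rng.choice([-1, 1])


if __name__ == "__main__":
    for action, observation, reward in run_episode(Corridor(), random_policy):
        print(f"action={action:+d} position={observation} reward={reward}")
```

With a policy that always moves right, such as `lambda obs: 1`, the episode ends after four steps and only the last step carries a nonzero reward, so the earlier moves can only be credited in retrospect.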