clay-curry/flapPy-RL

An RL algorithm that solves Flappy Bird. Each episode ends in a crash, which fixes a final score R, so a natural choice is to define q : S × A → ℝ as the expected final score E[R] from the state-action pair (s, a). The experiment confirms that a tabular n-step Sarsa algorithm estimating q approximates q* with sufficient precision to determine a policy π* that achieves an arbitrarily large R.
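For readers unfamiliar with the method, here is a minimal sketch of tabular n-step Sarsa. It is not the repository's code: `ChainEnv` is a hypothetical five-cell corridor standing in for the game (the only reward is a score of 1 on reaching the far end), and the function names, the `reset`/`step` interface, and the hyperparameters (`n`, `alpha`, `gamma`, `epsilon`) are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


class ChainEnv:
    """Hypothetical toy environment (not Flappy Bird): a short corridor."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left (clamped at cell 0).
        move = 1 if action == 1 else -1
        self.state = max(0, self.state + move)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0  # score is only decided at the end
        return self.state, reward, done


def epsilon_greedy(Q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(Q[(state, a)] for a in actions)
    # Break ties at random so the all-zero initial table still explores.
    return rng.choice([a for a in actions if Q[(state, a)] == best])


def n_step_sarsa(env, actions=(0, 1), n=3, episodes=300,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Estimate q for the epsilon-greedy policy with n-step returns."""
    rng = random.Random(seed)
    Q = defaultdict(float)  # the table: (state, action) -> value
    for _ in range(episodes):
        states = [env.reset()]
        acts = [epsilon_greedy(Q, states[0], actions, epsilon, rng)]
        rewards = [0.0]  # rewards[i] arrives with states[i]; index 0 unused
        T = float("inf")  # episode length, unknown until termination
        t = 0
        while True:
            if t < T:
                s_next, r, done = env.step(acts[t])
                states.append(s_next)
                rewards.append(r)
                if done:
                    T = t + 1
                else:
                    acts.append(epsilon_greedy(Q, s_next, actions, epsilon, rng))
            tau = t - n + 1  # time step whose estimate is updated now
            if tau >= 0:
                last = min(tau + n, T)
                # Discounted sum of the rewards actually observed.
                G = sum(gamma ** (i - tau - 1) * rewards[i]
                        for i in range(tau + 1, last + 1))
                # Bootstrap from the table if the episode is still running.
                if tau + n < T:
                    G += gamma ** n * Q[(states[tau + n], acts[tau + n])]
                key = (states[tau], acts[tau])
                Q[key] += alpha * (G - Q[key])
            if tau == T - 1:
                break
            t += 1
    return Q
```

Calling `n_step_sarsa(ChainEnv())` returns the learned table; a greedy policy is then read off by taking the action with the largest value in each state. Applying the same idea to Flappy Bird would additionally require discretizing the game state into table keys, which this sketch does not attempt.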

Python
