An RL algorithm that solves Flappy Bird. Each episode ends in a crash that fixes a final score R, so a natural choice for q : S × A → ℝ is the expected final score E(R) given the state-action pair (s, a). The experiment confirms that a tabular n-step Sarsa algorithm estimating q approximates q* precisely enough to yield a policy π* that achieves arbitrarily large R.
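To make the idea concrete, here is a minimal sketch of tabular n-step Sarsa in Python. It is not the repository's code: `ToyFlappy`, `n_step_sarsa`, `greedy_score`, and every hyperparameter value are hypothetical names and choices made for illustration. The toy environment is a five-cell vertical band with a 50-step cap standing in for the real game, the reward is 1 per surviving step, and gamma is 1, so an episode's return equals its final score R.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = fall, 1 = flap


class ToyFlappy:
    """Hypothetical stand-in for the game: heights 0..4, crash outside the band."""
    HEIGHT, MAX_STEPS = 5, 50

    def reset(self):
        self.h, self.t = 2, 0
        return self.h

    def step(self, action):
        self.h += 1 if action == 1 else -1
        self.t += 1
        crashed = self.h < 0 or self.h >= self.HEIGHT
        done = crashed or self.t >= self.MAX_STEPS  # cap keeps episodes finite
        return self.h, (0.0 if crashed else 1.0), done


def n_step_sarsa(env, n=4, alpha=0.1, gamma=1.0, epsilon=0.1,
                 episodes=2000, seed=0):
    """Estimate q(s, a) with tabular n-step Sarsa under an epsilon-greedy policy."""
    rng = random.Random(seed)
    q = defaultdict(float)  # the table: (state, action) -> estimated return

    def policy(s):
        if rng.random() < epsilon:
            return rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: q[(s, a)])

    for _ in range(episodes):
        s = env.reset()
        states, actions, rewards = [s], [policy(s)], [0.0]  # rewards[0] is unused
        T, t = float("inf"), 0
        while True:
            if t < T:
                s, r, done = env.step(actions[t])
                states.append(s)
                rewards.append(r)
                if done:
                    T = t + 1
                else:
                    actions.append(policy(s))
            tau = t - n + 1  # time step whose estimate is updated now
            if tau >= 0:
                last = min(tau + n, T)
                g = sum(gamma ** (i - tau - 1) * rewards[i]
                        for i in range(tau + 1, last + 1))
                if tau + n < T:  # bootstrap from the table if not yet terminal
                    g += gamma ** n * q[(states[tau + n], actions[tau + n])]
                key = (states[tau], actions[tau])
                q[key] += alpha * (g - q[key])
            if tau == T - 1:
                break
            t += 1
    return q


def greedy_score(env, q):
    """Play one episode greedily with respect to q and return the score."""
    s, score, done = env.reset(), 0.0, False
    while not done:
        action = max(ACTIONS, key=lambda a: q[(s, a)])
        s, r, done = env.step(action)
        score += r
    return score


if __name__ == "__main__":
    table = n_step_sarsa(ToyFlappy())
    print("greedy score:", greedy_score(ToyFlappy(), table))
```

The actual game would need a discretized state (for example, binned offsets to the next pipe) in place of the single height value, and the step cap exists only so the toy episodes terminate.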
clay-curry/flapPy-RL