An RL algorithm that solves Flappy Bird. Each episode yields a final score R when the bird crashes, so a natural choice for q : S × A → ℝ is the expected final score E(R) from the state-action pair (s, a). The experiment confirms that a tabular n-step Sarsa algorithm estimating q approximates q* precisely enough to determine a policy π* that achieves arbitrarily large R.
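
To make the idea concrete, here is a minimal sketch of tabular n-step Sarsa in the textbook style. It is not the repository's code: the three-state toy environment, the `step` dynamics, and all names and hyperparameters (`N_STATES`, `n_step_sarsa`, `alpha`, `epsilon`, and so on) are assumptions chosen for illustration. "Flap" advances one state and scores 1, "fall" crashes and ends the episode, and with `gamma=1.0` the learned value of a state-action pair plays the role of the expected remaining score.

```python
import random
from collections import defaultdict

N_STATES = 3          # hypothetical toy corridor, not the real game
ACTIONS = (0, 1)      # 0 = fall (crash), 1 = flap (advance and score 1)


def step(state, action):
    """Toy dynamics (assumed): returns (next_state, reward, done)."""
    if action == 0:
        return state, 0.0, True            # crash ends the episode
    nxt = state + 1
    return nxt, 1.0, nxt == N_STATES       # clearing the corridor also ends it


def epsilon_greedy(q, state, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def n_step_sarsa(episodes=2000, n=2, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)                 # tabular estimate, q[(s, a)]
    for _ in range(episodes):
        states = [0]
        actions = [epsilon_greedy(q, 0, epsilon, rng)]
        rewards = [0.0]                    # rewards[t] holds R_t; index 0 unused
        T = float("inf")                   # episode length, unknown until the end
        t = 0
        while True:
            if t < T:
                s_next, r, done = step(states[t], actions[t])
                states.append(s_next)
                rewards.append(r)
                if done:
                    T = t + 1
                else:
                    actions.append(epsilon_greedy(q, s_next, epsilon, rng))
            tau = t - n + 1                # time step whose estimate is updated
            if tau >= 0:
                end = min(tau + n, T)
                # n-step return: discounted rewards, then a bootstrap from q
                G = sum(gamma ** (i - tau - 1) * rewards[i]
                        for i in range(tau + 1, end + 1))
                if tau + n < T:
                    G += gamma ** n * q[(states[tau + n], actions[tau + n])]
                key = (states[tau], actions[tau])
                q[key] += alpha * (G - q[key])
            if tau == T - 1:
                break
            t += 1
    return q


if __name__ == "__main__":
    q = n_step_sarsa()
    for s in range(N_STATES):
        greedy = max(ACTIONS, key=lambda a: q[(s, a)])
        print(s, greedy, round(q[(s, 1)], 3))
```

In this toy setting the greedy policy read off the table is to flap in every state, which mirrors how a policy would be derived from the estimate of q in the description above. A real Flappy Bird agent would additionally need a discretization of the game state into table indices, which this sketch does not attempt.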

clay-curry/flapPy-RL
