This repository contains a solution to the Lunar Lander problem from the Gymnasium library, solved using Deep Q-learning (DQL) in PyTorch. Notes on DQL are shown here.
Training was stopped once the average reward over the most recent 100 episodes reached +200, which took 1764 episodes.
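The stopping rule described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `SolvedTracker` and its parameters are hypothetical, and requiring a full 100-episode window before checking the average is an assumption.

```python
from collections import deque


class SolvedTracker:
    """Tracks the mean reward over the most recent `window` episodes (hypothetical helper)."""

    def __init__(self, window=100, threshold=200.0):
        self.threshold = threshold
        # A bounded deque drops the oldest episode automatically once it is full.
        self.rewards = deque(maxlen=window)

    def update(self, episode_reward):
        """Record one episode's total reward; return True once the criterion is met."""
        self.rewards.append(episode_reward)
        # Assumption: wait for a full window so a few lucky early episodes
        # cannot end training prematurely.
        if len(self.rewards) < self.rewards.maxlen:
            return False
        return sum(self.rewards) / len(self.rewards) >= self.threshold
```

In a training loop, one would call `tracker.update(total_reward)` at the end of each episode and break out of the loop when it returns `True`.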
After 100 episodes:
lunar_lander_100.mp4
After training completed: