This project attempts to solve the Square Packing in a Square problem for N=11 squares with reinforcement learning, using a Deep Deterministic Policy Gradient (DDPG) setup that works over continuous observation and action spaces.
That was a very intelligent-sounding sentence. Let's break it down:
- The Square Packing in a Square problem is a problem in mathematics where the goal is to pack N squares with a side length of 1 into another square, while wasting as little space as possible. See the Wikipedia page for more details.
- Optimal configurations are known for N=1-10 squares, but for N=11 (and some others) there are only best-known packings, with no proof that they are optimal. This tries to find a better configuration for N=11 squares by using RL instead of pure math.
- DDPG is a kind of actor-critic setup (so not technically a single model) that allows continuous rather than discrete observation and action spaces. This is important because I want to find a very precise solution, and with a discrete action space I would have to keep increasing the resolution of the steps the AI can take to get there.
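
To make the problem and the continuous spaces concrete, here is a minimal sketch of how a packing could be scored. It is not taken from Attempt3.ipynb: the `(cx, cy, theta)` representation of each unit square and the function names `corners`, `enclosing_side`, `overlaps`, and `wasted_area` are all assumptions made for illustration, and the notebook's actual observation, action, and reward definitions may differ.

```python
import math

def corners(cx, cy, theta):
    """Four corners of a unit square centred at (cx, cy), rotated by theta radians."""
    h = 0.5
    c, s = math.cos(theta), math.sin(theta)
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c)
            for dx, dy in ((-h, -h), (h, -h), (h, h), (-h, h))]

def enclosing_side(squares):
    """Side of the smallest axis-aligned square that contains every unit square."""
    pts = [p for sq in squares for p in corners(*sq)]
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def overlaps(a, b, eps=1e-9):
    """Separating-axis test: True if the interiors of two unit squares intersect."""
    ca, cb = corners(*a), corners(*b)
    for theta in (a[2], b[2]):
        # The two edge directions of a square rotated by theta.
        for ax in ((math.cos(theta), math.sin(theta)),
                   (-math.sin(theta), math.cos(theta))):
            pa = [x * ax[0] + y * ax[1] for x, y in ca]
            pb = [x * ax[0] + y * ax[1] for x, y in cb]
            if max(pa) <= min(pb) + eps or max(pb) <= min(pa) + eps:
                return False  # found a separating axis, so no overlap
    return True

def wasted_area(squares):
    """Area of the enclosing square that is not covered (assumes no overlaps)."""
    side = enclosing_side(squares)
    return side * side - len(squares)

# A perfect 2x2 grid of four unit squares wastes no space.
grid = [(0.5, 0.5, 0.0), (1.5, 0.5, 0.0), (0.5, 1.5, 0.0), (1.5, 1.5, 0.0)]
print(enclosing_side(grid))  # 2.0
print(wasted_area(grid))     # 0.0
```

Because every position and angle here is a plain float, an agent with a continuous action space can nudge a square by arbitrarily small amounts. A reward could, for example, penalise `enclosing_side` and any pair for which `overlaps` returns `True`, but that is only one possible design.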
I have a requirements.txt for completeness' sake, but you can just clone the repo and run all the cells in Attempt3.ipynb; it should install the dependencies for you and then start running.