This project contains my implementation of the deep deterministic policy gradient (DDPG) algorithm, which I wrote to solve Project 2 (Continuous Control) of Udacity's Deep Reinforcement Learning Nanodegree. My version of DDPG is inspired by chapter 14 of Maxim Lapan’s book "Deep Reinforcement Learning Hands-On".
For more information on the implemented features refer to "Report.ipynb". The notebook includes a summary of all essential concepts used in the code.
The goal of this project is to train an agent, represented by a double-jointed arm, to keep its hand at the target location (the large green sphere) for as many time steps as possible.
- a reward of +0.1 is provided for each step that the agent's hand is in the goal location
- the state space has 33 dimensions
- these correspond to the position, rotation, velocity, and angular velocity of the arm
- the action space has four dimensions
- every action is a continuous number between -1 and 1
- the task is episodic
- the agent has to keep its hand at the target location (the large green sphere) for as many time steps as possible
- to solve the environment, the agent must get an average score of +30 over 100 consecutive episodes
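The numbers in the list above can be turned into two small helpers. This is a minimal sketch rather than the code in this repository: `is_solved`, `clip_action`, and the constant names are hypothetical, and the episode scores in the usage part are made up for illustration.

```python
from collections import deque

SOLVED_SCORE = 30.0  # target average score from the task description
WINDOW = 100         # number of consecutive episodes that are averaged
ACTION_SIZE = 4      # the action space has four dimensions


def is_solved(scores_window):
    """Return True once a full window of episodes averages at least SOLVED_SCORE."""
    if len(scores_window) < WINDOW:
        return False
    return sum(scores_window) / len(scores_window) >= SOLVED_SCORE


def clip_action(action):
    """Clip every component of a raw action into the valid range [-1, 1]."""
    if len(action) != ACTION_SIZE:
        raise ValueError("expected an action with %d components" % ACTION_SIZE)
    return [max(-1.0, min(1.0, float(a))) for a in action]


# Usage with made-up episode scores (illustrative only, not training results).
scores_window = deque(maxlen=WINDOW)  # keeps only the last 100 scores
for episode_score in [31.0] * WINDOW:
    scores_window.append(episode_score)
print(is_solved(scores_window))
print(clip_action([2.0, -3.0, 0.5, 0.0]))
```

Using a `deque` with `maxlen=100` means old scores drop out automatically, so the average always covers the most recent 100 episodes.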
- Create (and activate) a new environment with Python 3.6.
conda create --name env_name python=3.6
source activate env_name
- Download the environment from one of the links below and place it into \p2_continuous-control\Reacher_One_Linux
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
- your folder should now look something like this:
\Reacher_One_Linux
\Reacher_Data
\Reacher.x86
\Reacher.x86_64
- Install the source code dependencies
conda install -c pytorch pytorch
conda install -c anaconda numpy
pip install tensorboardX
- unityagents is also required
- an easy way to get it is to install the Python package from the Deep Reinforcement Learning Nanodegree repository, which pulls in its dependencies
git clone https://github.com/udacity/deep-reinforcement-learning.git
cd deep-reinforcement-learning/python
pip install .
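After the installation steps above, a quick check can confirm that the required packages are importable in the active environment. This is a small sketch and not part of the repository: the helper name `missing_modules` is hypothetical, and the module names are assumed to match the package names listed above (`torch` for PyTorch).

```python
import importlib.util

# Import names of the dependencies listed in the installation steps.
REQUIRED = ["torch", "numpy", "tensorboardX", "unityagents"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found.")
```

`importlib.util.find_spec` only locates a module without importing it, so the check stays fast even for large packages such as PyTorch.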
You can run the project by executing main.py from the console.
- open the console and run: python main.py -c "your_config_file.json"
- to train the agent from scratch set "run_training" in the config file to true
- to run the pre-trained agent set "run_training" in the config file to false
optional arguments:
-h, --help
- show help message
-c , --config
- config file name; the file must exist as a .json file in ./configs
Example: python main.py -c "reacher_one.json"
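To illustrate how the -c option and the "run_training" switch fit together, here is a minimal sketch of the argument parsing and config loading. It is an assumption about how this could be done, not the actual content of main.py: the functions `parse_args` and `load_config` are hypothetical, and the only config key used is "run_training" as described above.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    """Parse the command line; -c/--config names a file inside ./configs."""
    parser = argparse.ArgumentParser(description="Train or run the DDPG agent")
    parser.add_argument("-c", "--config", required=True,
                        help="config file name; must exist as .json in ./configs")
    return parser.parse_args(argv)


def load_config(file_name, config_dir="configs"):
    """Load the JSON config file and return it as a dictionary."""
    path = Path(config_dir) / file_name
    with path.open() as config_file:
        return json.load(config_file)


# Parsing an explicit argument list, equivalent to:
#   python main.py -c "reacher_one.json"
args = parse_args(["-c", "reacher_one.json"])
print(args.config)

# In a real run the config would then decide between training and playback:
#   config = load_config(args.config)
#   if config["run_training"]: train the agent, else run the pre-trained one
```

Passing `argv` explicitly keeps the parser easy to test; calling `parse_args()` without arguments falls back to the real command line.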