GA3C-cpp

This repository contains a multithreaded C++ implementation of the asynchronous advantage actor-critic (A3C) algorithm, based on NVIDIA's GA3C. It has been tested on the CartPole-v0 OpenAI Gym environment using TensorFlow and integeruser/gym-uds-api, with the model configuration and parameters described in jaromiru/AI-blog.

Requisites

This project requires building TensorFlow 1.3 from source. The example instructions below are for macOS (tested on macOS Catalina 10.15.3).

  1. Install Homebrew.

  2. Install OpenJDK 8:

     ~$ brew cask install homebrew/cask-versions/adoptopenjdk8
    
  3. Download bazel-0.4.5-jdk7-installer-darwin-x86_64.sh, make it executable with chmod, and install Bazel 0.4.5 (to $HOME/bin/bazel):

     Downloads$ env JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home" ./bazel-0.4.5-jdk7-installer-darwin-x86_64.sh --user
    
  4. Install pyenv and Python 3.6.10 (to $HOME/.pyenv/versions/3.6.10/bin/python):

     ~$ brew install pyenv
     ~$ pyenv install 3.6.10
    
  5. Install the wheel, NumPy and dlib pip packages:

     ~$ $HOME/.pyenv/versions/3.6.10/bin/pip install wheel numpy dlib
    
  6. Clone TensorFlow from the official repository:

     ~$ git clone https://github.com/tensorflow/tensorflow
    
  7. cd to the TensorFlow directory (assumed to be the working directory for all subsequent steps), then check out the r1.3 branch (TensorFlow 1.3):

     tensorflow$ git checkout r1.3
    
  8. Configure TensorFlow. When asked for the location of Python, enter $HOME/.pyenv/versions/3.6.10/bin/python with $HOME expanded to its actual value:

     tensorflow$ env PATH="$HOME/bin:$PATH" JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home" ./configure
    
  9. Build the TensorFlow shared library (to bazel-bin/tensorflow/libtensorflow_cc.so):

     tensorflow$ env PATH="$HOME/bin:$PATH" JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home" bazel build //tensorflow:libtensorflow_cc.so
    

    The build may fail if protobuf (in particular, an incompatible version of it) is already installed on the machine. In that case, make sure that TensorFlow builds and uses its internal version of protobuf instead.

  10. Build and install the TensorFlow pip package:

     tensorflow$ env PATH="$HOME/bin:$PATH" JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home" bazel build //tensorflow/tools/pip_package:build_pip_package
     tensorflow$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
     tensorflow$ $HOME/.pyenv/versions/3.6.10/bin/pip install /tmp/tensorflow_pkg/tensorflow-1.3.1-cp36-cp36m-macosx_10_15_x86_64.whl
    
  11. Install the OpenAI Gym pip package:

     $ $HOME/.pyenv/versions/3.6.10/bin/pip install gym
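
Before moving on, it can help to confirm that the steps above left everything where the rest of this README expects it. The following is a minimal sketch, not part of the repository: `require_file` and `find_wheel` are hypothetical helper names, and the paths in the trailing comments are the ones quoted in the steps above (with the TensorFlow clone assumed to be at `$HOME/tensorflow`). `find_wheel` is there because the wheel filename in step 10 depends on the TensorFlow, Python and macOS versions, so it may differ from the one shown.

```shell
#!/usr/bin/env bash
# Sketch under stated assumptions: helper names are hypothetical and the
# paths in the usage comments are the ones mentioned in the steps above.

# Print "ok"/"missing" for a path and return non-zero when it is absent.
require_file() {
    if [ -e "$1" ]; then
        echo "ok       $1"
    else
        echo "missing  $1"
        return 1
    fi
}

# Print the first TensorFlow wheel found in directory $1, or return
# non-zero if there is none.
find_wheel() {
    for wheel in "$1"/tensorflow-*.whl; do
        if [ -f "$wheel" ]; then
            echo "$wheel"
            return 0
        fi
    done
    return 1
}

# Example usage (assumes TensorFlow was cloned to $HOME/tensorflow):
#   require_file "$HOME/bin/bazel"
#   require_file "$HOME/.pyenv/versions/3.6.10/bin/python"
#   require_file "$HOME/tensorflow/bazel-bin/tensorflow/libtensorflow_cc.so"
#   find_wheel /tmp/tensorflow_pkg
```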
    

Installation

  1. Clone this repository:

     $ git clone https://github.com/integeruser/GA3C-cpp.git
    
  2. cd to the GA3C-cpp directory and compile the code for testing on CartPole-v0:

     GA3C-cpp$ make TENSORFLOW_DIRPATH=/absolute/path/to/tensorflow/repository GA3C-cartpole-v0
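
Since `TENSORFLOW_DIRPATH` must be an absolute path, a small guard can catch a mistyped or relative path before `make` runs. This is a sketch only: `validate_tensorflow_dirpath` is a hypothetical helper, and it assumes that the build needs the `libtensorflow_cc.so` produced in the Requisites section (the Makefile's actual requirements are not restated here).

```shell
#!/usr/bin/env bash
# Sketch: validate the value intended for TENSORFLOW_DIRPATH.
# Hypothetical helper, not part of the repository.

# Return 0 only if $1 is an absolute path under which
# bazel-bin/tensorflow/libtensorflow_cc.so exists.
validate_tensorflow_dirpath() {
    case "$1" in
        /*) ;;  # absolute path: keep checking
        *)
            echo "error: '$1' is not an absolute path" >&2
            return 1
            ;;
    esac
    if [ ! -f "$1/bazel-bin/tensorflow/libtensorflow_cc.so" ]; then
        echo "error: libtensorflow_cc.so not found under '$1/bazel-bin/tensorflow'" >&2
        return 1
    fi
}

# Example usage, from the GA3C-cpp directory (assumes the TensorFlow clone
# is at $HOME/tensorflow):
#   validate_tensorflow_dirpath "$HOME/tensorflow" &&
#       make TENSORFLOW_DIRPATH="$HOME/tensorflow" GA3C-cartpole-v0
```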
    

Usage

  1. cd to the GA3C-cpp/cartpole-v0 directory and start the gym-uds servers:

     GA3C-cpp/cartpole-v0$ env PATH="$HOME/.pyenv/versions/3.6.10/bin:$PATH" ./start-gym-uds-servers.sh
    
  2. Generate the untrained TensorFlow model:

     GA3C-cpp/cartpole-v0$ $HOME/.pyenv/versions/3.6.10/bin/python ./cartpole-v0.py generate
    
  3. Train the agent (setting DYLD_LIBRARY_PATH so that libtensorflow_cc.so can be found):

     GA3C-cpp/cartpole-v0$ env DYLD_LIBRARY_PATH="/absolute/path/to/tensorflow/repository/bazel-bin/tensorflow" bin/GA3C-cartpole-v0
    
  4. Training saves the updated model weights back to disk. Finally, watch the trained agent in action:

     GA3C-cpp/cartpole-v0$ $HOME/.pyenv/versions/3.6.10/bin/python ./cartpole-v0.py test
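
The training step can be wrapped so that `DYLD_LIBRARY_PATH` is derived from the TensorFlow checkout instead of being typed by hand. The sketch below is an illustration rather than part of the repository: `tensorflow_lib_dir` and `train_cartpole` are hypothetical names, and the wrapper assumes it is called from the GA3C-cpp/cartpole-v0 directory with the gym-uds servers already started and the model already generated.

```shell
#!/usr/bin/env bash
# Sketch: derive DYLD_LIBRARY_PATH from the TensorFlow checkout.
# Hypothetical helpers, not part of the repository.

# Print the directory in which the Bazel build leaves libtensorflow_cc.so,
# given the absolute path of the TensorFlow repository.
tensorflow_lib_dir() {
    echo "$1/bazel-bin/tensorflow"
}

# Start training, refusing to run if the shared library is missing.
# Must be called from the GA3C-cpp/cartpole-v0 directory.
train_cartpole() {
    lib_dir=$(tensorflow_lib_dir "$1")
    if [ ! -f "$lib_dir/libtensorflow_cc.so" ]; then
        echo "error: libtensorflow_cc.so not found in '$lib_dir'" >&2
        return 1
    fi
    env DYLD_LIBRARY_PATH="$lib_dir" bin/GA3C-cartpole-v0
}

# Example usage (assumes the TensorFlow clone is at $HOME/tensorflow):
#   train_cartpole "$HOME/tensorflow"
```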
    
