You can analyze a ChainerRL agent's behavior through rich visualizations, which makes debugging easier.
You can easily inspect your ChainerRL agent's behavior from a browser UI.
- Roll out one episode from the UI
- Step through timesteps, with env and agent behavior visualized at each step
- If the input of the agent's model is a raw pixel image, a saliency map can be created
- Various visualization methods are supported
- For now, even if the input of an agent's model is a raw pixel image, the saliency map cannot be created when the model includes an RNN. This will be fixed.
- If you use a gym classic control environment, the env window may appear on your display when you send the rollout command from the UI.
Follow the instructions of each example.
To install ChainerRL Visualizer, use pip.
$ pip install chainerrl-visualizer
To install ChainerRL Visualizer from source:
$ git clone https://github.com/chainer/chainerrl-visualizer.git
$ cd chainerrl-visualizer/frontend
$ npm install && npm run build && cd ..
$ pip install -e .
Simply pass the agent and env objects to the chainerrl_visualizer.launch_visualizer function.
from chainerrl_visualizer import launch_visualizer
# Prepare agent and env object here
#
# Prepare dictionary which explains meanings of each action
ACTION_MEANINGS = {
    0: 'hoge',
    1: 'fuga',
    ...
}

launch_visualizer(
    agent,                   # required
    env,                     # required
    ACTION_MEANINGS,         # required
    port=5002,               # optional (default: 5002)
    log_dir='log_space',     # optional (default: 'log_space')
    raw_image_input=False,   # optional (default: False)
    contains_rnn=False,      # optional (default: False)
)
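When the action labels are already available as an ordered list, the dictionary can be built instead of typed by hand. The following is a minimal sketch: build_action_meanings and the example labels are hypothetical and not part of chainerrl-visualizer.

```python
def build_action_meanings(names):
    """Map each discrete action index to a human-readable label."""
    return {index: name for index, name in enumerate(names)}


# Hypothetical labels for an environment with two discrete actions.
ACTION_MEANINGS = build_action_meanings(['push left', 'push right'])
```

For gym's Atari environments, a list of this kind can typically be obtained from env.unwrapped.get_action_meanings().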
- The agent object must be an instance of an Agent class provided by ChainerRL, which extends the chainerrl.agent.Agent class.
- The env object must implement the three gym-like methods below. Of course, gym's env object is accepted.
  - reset: Reset the environment to its initial state.
  - step: Take a numpy.ndarray action as an argument and advance the environment one step.
  - render: Return a 3D numpy.ndarray of shape (height, width, 3) that represents an RGB image describing the env state.
- If you'd like to change this app's log directory name, pass the name as the log_dir argument. This directory is assumed to be relative to where the main Python script is executed.
- If your agent's model contains an RNN part, you have to specify contains_rnn=True.
- If the input of your agent's model is raw image pixel data (or a modified version of it), you have to specify raw_image_input=True.
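To make the env requirements concrete, here is a minimal sketch of a custom environment that exposes the three methods. The GridEnv class, its corridor dynamics, and the 10-pixel cell size are all hypothetical choices made for illustration. Only the method names and return shapes follow the requirements above.

```python
import numpy as np


class GridEnv:
    """Hypothetical 1-D corridor env exposing reset, step, and render."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def _observation(self):
        # One-hot encoding of the agent's position.
        obs = np.zeros(self.length, dtype=np.float32)
        obs[self.position] = 1.0
        return obs

    def reset(self):
        self.position = 0
        return self._observation()

    def step(self, action):
        # Action 1 moves right, any other action moves left.
        move = 1 if int(action) == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self._observation(), reward, done, {}

    def render(self):
        # RGB image of shape (height, width, 3): agent cell white, rest black.
        image = np.zeros((10, 10 * self.length, 3), dtype=np.uint8)
        start = 10 * self.position
        image[:, start:start + 10, :] = 255
        return image
```

An instance such as GridEnv(length=5) could then be passed as the env argument, with raw_image_input and contains_rnn set according to the agent's model.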
The reset method resets the environment state and returns the initial array-like observation object.
Returns:
- observation (array-like object): agent's observation of the current environment
The step method runs one timestep of the environment's dynamics.
Args:
- action (numpy.ndarray): ndarray representing next action to take
Returns:
- observation (array-like object): agent's observation of the current environment
- reward (float): amount of reward returned after taking the given action
- done (boolean): whether the episode has ended or not
- info (dict): contains various information helpful for debugging
The render method returns a 3D numpy.ndarray that represents an RGB image of the current environment appearance.
The render method is assumed not to take any arguments.
Although gym's env render method returns an RGB numpy.ndarray only when mode='rgb_array' is passed as an argument, a gym env object is wrapped appropriately inside this app so that it returns an RGB numpy.ndarray by default (so all you have to do is pass the gym env object).
Returns:
- image (3d numpy.ndarray): RGB image of current environment appearance.
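The wrapping described above could look roughly like the sketch below. This is not the app's actual implementation: RGBRenderWrapper and FakeGymEnv are hypothetical names, and the sketch assumes the classic gym API in which render(mode='rgb_array') returns the image.

```python
class RGBRenderWrapper:
    """Hypothetical wrapper that makes render() return an RGB array by default."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)

    def render(self):
        # Request an image from the wrapped env instead of opening a window.
        return self.env.render(mode='rgb_array')


class FakeGymEnv:
    """Stand-in for a gym env, used only to exercise the wrapper."""

    def reset(self):
        return [0.0]

    def step(self, action):
        return [0.0], 0.0, False, {}

    def render(self, mode='human'):
        # A real env would return a (height, width, 3) array here.
        return 'rgb image' if mode == 'rgb_array' else None
```

With such a wrapper, callers only ever invoke render() without arguments, which matches the interface described above.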
This library is under development, and not all modules in ChainerRL are supported yet. Supported and unsupported modules are listed below. If your agent's model returns an unsupported one, this app will stop with an error message.
- ActionValue
  - DiscreteActionValue: supported
  - DistributionalDiscreteActionValue: supported
  - QuadraticActionValue: unsupported
  - SingleActionValue: unsupported
- Distribution
  - SoftmaxDistribution: supported
  - MellowmaxDistribution: unsupported
  - GaussianDistribution: supported
  - ContinuousDeterministicDistribution: unsupported
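If you want to check your model's output type before launching the app, the support list can be encoded as a lookup table. This is only a sketch: SUPPORTED_OUTPUTS and is_supported are hypothetical helpers, and the app's own check may work differently.

```python
# Support status copied from the list above, keyed by class name.
SUPPORTED_OUTPUTS = {
    'DiscreteActionValue': True,
    'DistributionalDiscreteActionValue': True,
    'QuadraticActionValue': False,
    'SingleActionValue': False,
    'SoftmaxDistribution': True,
    'MellowmaxDistribution': False,
    'GaussianDistribution': True,
    'ContinuousDeterministicDistribution': False,
}


def is_supported(output):
    """Return True if the class name of a model output is listed as supported."""
    return SUPPORTED_OUTPUTS.get(type(output).__name__, False)
```

Matching by class name keeps the sketch free of a ChainerRL import. Unknown types are treated as unsupported.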
For now, a saliency map can be created only in the situations below. This will be fixed in the future.
- Distribution is SoftmaxDistribution && StateValue is returned && contains_rnn=False
- (ActionValue is DiscreteActionValue or DistributionalDiscreteActionValue) && contains_rnn=False
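The two conditions above can be written as a single predicate. This is a sketch with a hypothetical function name and parameters. It encodes only the conditions listed here and does not check the separate requirement that the model input is a raw pixel image.

```python
def saliency_available(output_type, returns_state_value, contains_rnn):
    """Return True if the listed conditions for creating a saliency map hold.

    output_type is the class name of the model output, for example
    'SoftmaxDistribution' or 'DiscreteActionValue'.
    """
    if contains_rnn:
        return False
    if output_type == 'SoftmaxDistribution':
        return bool(returns_state_value)
    return output_type in ('DiscreteActionValue',
                           'DistributionalDiscreteActionValue')
```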
If you use macOS, you may encounter the crash message below when sending the rollout or saliency command from the UI.
objc[42564]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
This behavior is due to a change in macOS High Sierra. If you would like to know the details, see here.
The workaround is to set the environment variable OBJC_DISABLE_INITIALIZE_FORK_SAFETY to YES prior to executing python.
$ export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
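If exporting the variable in the shell is inconvenient, the same effect can be approximated by starting Python as a child process with the variable already set. The helper below is a hypothetical sketch and not part of this library. It relies on the variable being present before the child interpreter starts, which is what the export above achieves.

```python
import os
import subprocess
import sys


def run_with_fork_safety_disabled(script_args, python=sys.executable):
    """Run a Python command with OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES."""
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, OBJC_DISABLE_INITIALIZE_FORK_SAFETY='YES')
    return subprocess.run(
        [python] + list(script_args),
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )


# Example (main.py is a hypothetical script that calls launch_visualizer):
# run_with_fork_safety_disabled(['main.py'])
```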