This repository contains the code for our paper "Learning Guidance Rewards with Trajectory-space Smoothing", published at the Conference on Neural Information Processing Systems (NeurIPS 2020).
The code reuses the PyTorch SAC code from this awesome repository. It was tested with the following packages:
- python 3.6.6
- pytorch 0.4.1
- gym 0.10.8
- hydra 0.11.3
To run the SAC experiments on MuJoCo, use the command below. The hyperparameters are specified in the config
folder. See the file run_cmds.sh for further commands.
python main.py env_name="Hopper-v2" seed=$RANDOM
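To repeat the command above over several seeds, a small launcher along the following lines may be convenient. This is only a sketch and not part of the repository: the helper names build_command and build_sweep and the seed values are hypothetical, and the only overrides it assumes are env_name and seed, as they appear in the command above. It prints the commands rather than running them.

```python
import shlex


def build_command(env_name, seed):
    """Build the argument list for one training run.

    Mirrors `python main.py env_name="Hopper-v2" seed=$RANDOM`. The shell
    strips the quotes, so the override is passed as env_name=Hopper-v2.
    """
    return ["python", "main.py", f"env_name={env_name}", f"seed={seed}"]


def build_sweep(env_names, seeds):
    """Return one command per (environment, seed) pair, in a fixed order."""
    return [build_command(env, seed) for env in env_names for seed in seeds]


# Hypothetical sweep: Hopper-v2 is the only environment named in this README,
# and the seeds are arbitrary fixed values used in place of $RANDOM.
commands = build_sweep(["Hopper-v2"], seeds=[0, 1, 2])

for cmd in commands:
    # Dry run: print each command. To launch it for real, pass the list to
    # subprocess.run(cmd, check=True) from the repository root.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Using fixed seeds instead of $RANDOM makes a sweep repeatable; run_cmds.sh remains the reference for the commands actually used.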