Implementation of Schmidhuber's Upside Down Reinforcement Learning paper
Link to paper with theory:
Link to paper with implementation details and results:
Use as you wish. Tweet (@mfharoon) or email me any interesting results you find, along with sets of hyperparameters that work for particular environments, and I will share them here. Thanks!
replay_size = 600
last_few = 50
batch_size = 64
n_warm_up_episodes = 50
n_episodes_per_iter = 50
n_updates_per_iter = 100
command_scale = 0.02
lr = 0.001
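
The README does not explain the hyperparameters above. As a rough illustration only, here is how `last_few` and `command_scale` are commonly used in Upside Down RL: the `last_few` highest-return episodes in the replay buffer are used to sample the next exploratory command (desired return, desired horizon), and the command is multiplied by `command_scale` before being fed to the network. This is a minimal sketch based on my reading of the implementation paper, not the code in this repository. The function names `sample_command` and `scale_command` and the `(total_return, length)` episode format are hypothetical.

```python
import random
import statistics

# Values copied from the hyperparameter list above.
last_few = 50
command_scale = 0.02


def sample_command(episodes, last_few=last_few, rng=random):
    """Sample an exploratory (desired_return, desired_horizon) command.

    `episodes` is a list of (total_return, length) tuples; this format is
    an assumption made for the sketch.
    """
    # Keep the `last_few` episodes with the highest total return.
    best = sorted(episodes, key=lambda e: e[0])[-last_few:]
    returns = [r for r, _ in best]
    lengths = [n for _, n in best]

    # Desired horizon: mean length of the selected episodes.
    desired_horizon = statistics.mean(lengths)

    # Desired return: uniform between the mean return and mean + one std.
    mean_r = statistics.mean(returns)
    std_r = statistics.pstdev(returns)
    desired_return = rng.uniform(mean_r, mean_r + std_r)
    return desired_return, desired_horizon


def scale_command(desired_return, desired_horizon, scale=command_scale):
    """Scale the command so its magnitude suits the network input."""
    return desired_return * scale, desired_horizon * scale


if __name__ == "__main__":
    # Fake replay buffer: returns 0..99, lengths 10..109.
    buffer = [(float(i), 10 + i) for i in range(100)]
    command = sample_command(buffer)
    print(scale_command(*command))
```

Sampling the desired return slightly above the recent best average is what pushes the agent to attempt better episodes than it has already seen; a larger `last_few` makes that target more conservative.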