This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

Soft Actor Critic

Each experiment is run with 3 seeds and trained for 3M environment steps. The SAC hyperparameters are the same as those described in the original paper.

Inverted Pendulum SAC - single worker

coach -p Mujoco_SAC -lvl inverted_pendulum

Inverted Pendulum SAC

Hopper SAC - single worker

coach -p Mujoco_SAC -lvl hopper

Hopper SAC

Half Cheetah SAC - single worker

coach -p Mujoco_SAC -lvl half_cheetah

Half Cheetah SAC

Walker 2D SAC - single worker

coach -p Mujoco_SAC -lvl walker2d

Walker 2D SAC

Humanoid SAC - single worker

coach -p Mujoco_SAC -lvl humanoid

Humanoid SAC
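
The five commands above differ only in the level name, so they can be generated from one list. The following is a minimal sketch, not part of Coach itself: the `build_cmd` and `run_all` helpers and the `DRY_RUN` variable are hypothetical names introduced here for illustration, and only the `-p` and `-lvl` flags shown above are used. By default the script just prints the commands, and it does not handle seeds.

```shell
#!/usr/bin/env bash
# Levels taken from the commands listed in this README.
LEVELS="inverted_pendulum hopper half_cheetah walker2d humanoid"

# Print the coach command for one level (same preset and flags as above).
build_cmd() {
    printf 'coach -p Mujoco_SAC -lvl %s\n' "$1"
}

# Walk over all levels. DRY_RUN is a hypothetical switch for this sketch:
# with DRY_RUN=1 (the default) commands are only printed, otherwise they
# are executed one after another.
run_all() {
    for level in $LEVELS; do
        cmd=$(build_cmd "$level")
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

run_all
```

Setting `DRY_RUN=0` before running the script would launch the experiments sequentially, assuming `coach` is installed and on the `PATH`.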