Python package implementing a Reinforcement Learning-based stepping strategy for the Kyon robot

ADVRHumanoids/KyonRLStepping

KyonRLStepping package

The preferred way of using the KyonRLStepping package is to employ the provided mamba environment.

Installation instructions:

  • First install Mamba by running curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh" and then bash Mambaforge-$(uname)-$(uname -m).sh.

  • Create the mamba environment by running create_mamba_env.sh. This will properly set up a Python 3.7 mamba environment named kyonrlstepping with (almost) all necessary dependencies.

  • Activate the environment with mamba activate kyonrlstepping

  • From the root folder, install the package with pip install -e ..

  • Test the Lunar Lander example from StableBaselines3 v2.0 with python kyonrlstepping/tests/test_lunar_lander_stable_bs3.py.

  • Download the Omniverse Launcher, go to the "exchange" tab and install Omniverse Cache and Isaac Sim 2023.1.0 (this might take a while). You can then launch Isaac Sim from the Launcher GUI or by navigating to ${HOME}/.local/share/ov/pkg/isaac_sim-2023.1.0 and running the isaac-sim.sh script. When launching Isaac Sim for the first time, the ray tracing shaders are compiled, which may take a while. If the resources of the workstation/PC are limited (e.g. RAM < 16 GB), the compilation may abort after a while; you can still compile the shaders by adding sufficient swap memory to the system. Before retrying the compilation, however, remember to first delete the cache at .cache/ov/Kit/*.

  • To be able to run any script with dependencies on Omniverse packages, it's necessary to first source ${HOME}/.local/share/ov/pkg/isaac_sim-*/setup_conda_env.sh.

  • You can now test a simple simulation with Kyon by running python kyonrlstepping/tests/spawn_kyon_isaac_sim.py or python kyonrlstepping/tests/test_kyon_cloning.py.

  • To be able to use the controllers, you also need to install the remaining external dependencies (horizon, phase_manager).

External dependencies to be installed separately:

  • horizon-casadi, a trajectory optimization tool tailored to robotics, based on CasADi. Branch to be used: add_nodes_rl. Clone this repo at your preferred location and, from its root, run pip install --no-deps -e .. This will install the package in editable mode without its dependencies (necessary to circumvent current issues with horizon's pip distribution).
  • phase_manager. Branch to be used: devel. Build this CMake package in your workspace (after activating the kyonrlstepping environment) and set CMAKE_INSTALL_PREFIX to ${HOME}/mambaforge/envs/kyonrlstepping.
  • Omniverse Isaac Sim, a photo-realistic, GPU-accelerated simulator from NVIDIA.

Other dependencies, included in the environment through Anaconda, which can optionally be installed directly from source for development purposes:

  • CoClusterBridge: utilities to create a CPU-based controller cluster to be interfaced with GPU-based simulators
  • OmniRoboGym: custom implementations of Tasks and Gyms for Omniverse Isaac Sim based on Gymnasium. Easy URDF and SRDF import/cloning and simulation configuration exploiting the Omniverse API.

Short-term ToDo list:

  • Test Stable Baselines3 with Gymnasium to understand exactly the interfaces needed to set up a custom environment and use it

  • Decide exactly which simulation environment to use. Options:

    • ❌ IsaacGym Preview4 → this is a standalone distribution. It won't be maintained nor developed by NVIDIA. All development will focus on Omniverse ecosystem.
    • ✔️ Omniverse isaac_sim-2023.1.0 → this is maintained and actively developed by the NVIDIA team. Modular structure and a different API from IsaacGym. More complicated, but a definitely more complete software environment.
    • ❌ No external simulator, or Gazebo/MuJoCo/PyBullet (CPU) → might be doable in an initial stage, but requires setting up custom rendering for visualizing and debugging training, doesn't provide realistic dynamics simulation (we would have to use Horizon's integration), and requires a CPU-based dynamics integration for each environment.
  • Decide which gym library to use (this defines the environment interface the learning algorithm runs on). Viable options might be:

    • ❌ OpenAI’s Gym library (no longer maintained). Currently used by IsaacGym and also Omniverse's VecEnvBase.

    • ✔️ Gymnasium, a maintained fork of OpenAI’s Gym library. Most notably, the library already provides environments for OpenAI's MuJoCo simulator, and users can implement their own custom environments if needed. Gymnasium brings several improvements over the old Gym, and migration from Gym should be trivial.
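As a sketch of what a custom environment involves, the toy class below follows the shape of the Gymnasium reset/step API (reset returning (observation, info), step returning a 5-tuple) without depending on the library itself; the task and all names are made up for illustration:

```python
# Illustrative sketch only: a real environment would subclass gymnasium.Env
# and declare observation_space / action_space. This toy class just mirrors
# the Gymnasium API shape.

class ToyStepEnv:
    """1-D toy task: drive a scalar state toward zero."""

    def __init__(self, max_steps: int = 50):
        self.max_steps = max_steps
        self.state = 0.0
        self.steps = 0

    def reset(self, seed=None):
        # Gymnasium's reset returns (observation, info)
        self.state = 1.0
        self.steps = 0
        return self.state, {}

    def step(self, action: float):
        # Gymnasium's step returns (obs, reward, terminated, truncated, info)
        self.state += action
        self.steps += 1
        reward = -abs(self.state)
        terminated = abs(self.state) < 1e-3
        truncated = self.steps >= self.max_steps
        return self.state, reward, terminated, truncated, {}
```

With the real library, the same class would inherit from gymnasium.Env, define Box spaces, and could then be wrapped by any Gymnasium-compatible learning library.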

  • Decide which library/algorithms to use for learning. Options:

    • ✔️ Stable Baselines 3, version 2.0.*, with support for Gymnasium and PyTorch >= 1.11. An advantage of Baselines3 is that it provides a reliable collection of many state-of-the-art learning algorithms (useful for debugging). This should be the starting point.

    • 〰️ SKRL, an open-source modular library for Reinforcement Learning written in PyTorch. Out-of-the-box support of environment interfaces provided by Isaac Orbit, Omniverse Isaac Gym, IsaacGym, OpenAI Gymnasium and Gym. This might be considered in the future.

    • rl_games. Used by both the OmniIsaacGymEnvs and IsaacGymEnvs benchmark repos, with support for PyTorch (1.9+) and CUDA 11.1+. The algorithms supported with PyTorch are PPO and an asymmetric actor-critic variant of PPO. Apparently, this library is tailored to GPUs and shows considerably higher performance than, e.g., stable-baselines3.

  • Code implementation options:

    • ❌ directly develop on top of Isaac Orbit, a framework for creating robot learning environments based on Omniverse Isaac Sim. It allows creating custom wrappers for interfacing with learning libraries, but it already provides interfaces for Stable-Baselines3, SKRL, rl_games and rsl_rl (see here). It also shows an example of converting a URDF to USD (the format used by Isaac Sim). Orbit's GitHub repo hosts a custom IsaacEnv which inherits from gym.Env (same as omni.isaac.gym.VecEnvBase), implementations of actuator models, controllers (e.g. joint impedance), static and dynamic markers, sensors, and a base class for robots, RobotBase, which wraps the omni.isaac.core.articulations.Articulations class
    • ✔️ develop an as-much-as-possible self-contained package with simple environments, tasks, etc., exploiting and possibly modifying currently available code in Omniverse, the Isaac Orbit framework and Omniverse Isaac Gym → this is the preferred option as of now. Code reimplementation should however be kept to a minimum.
  • First simple test of the Isaac Sim simulator. What was done:

    • Kyon's prim is spawned in the scene at a user-defined prim_path by calling the URDFParseAndImportFile script.
    • A simple joint-space impedance controller is attached to the robot. This allows sending position, velocity and effort commands.
    • Checked simulation stability and collision behavior.
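The joint-space impedance law mentioned above can be sketched as a plain PD-plus-feedforward torque rule; the function name and gains below are illustrative assumptions, and the actual controller is applied through the Isaac Sim articulation API:

```python
# Illustrative sketch of a joint-space impedance law:
#   tau_i = kp_i * (q_des_i - q_i) + kd_i * (v_des_i - v_i) + tau_ff_i
# All names and gains are made up for the example.

def impedance_torques(q, v, q_des, v_des, kp, kd, tau_ff=None):
    """Per-joint PD-plus-feedforward torque command."""
    n = len(q)
    tau_ff = tau_ff or [0.0] * n
    return [kp[i] * (q_des[i] - q[i]) + kd[i] * (v_des[i] - v[i]) + tau_ff[i]
            for i in range(n)]
```

Sending pure position, velocity or effort commands then reduces to choosing which of q_des, v_des and tau_ff are non-trivial.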
  • Integration and testing of casadi_kin_dyn and horizon in Conda

  • [] First proof-of-concept integration of the Horizon-based Kyon MPC controller within the simulation:

    • abstract ControlCluster class for spawning n controllers and running their solve in parallel, while synchronizing with the simulation.
    • abstract class for RHC controller
    • implementation of RHC controller and ControlCluster for kyon.
    • implementation of a joint space impedance controller for simultaneous application of control actions through all environments
    • [] integration of the joint impedance controller inside the environment and task + testing
    • [] testing of joystick-based MPC control in IsaacSim
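The ControlCluster / RHC abstractions listed above can be sketched roughly as follows; class and method names are illustrative assumptions, not the repo's actual API (see CoClusterBridge for the real implementation):

```python
# Hedged sketch of an abstract RHC controller plus a cluster that runs
# n controllers' solve() calls in parallel, once per simulation step.

from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor

class RHCController(ABC):
    """One receding-horizon controller instance (one per environment)."""

    @abstractmethod
    def solve(self, state):
        """Return a control action for the current measured state."""

class ControlCluster:
    """Dispatches all controllers' solves in parallel and gathers results."""

    def __init__(self, controllers):
        self.controllers = list(controllers)
        self.pool = ThreadPoolExecutor(max_workers=len(self.controllers))

    def solve_all(self, states):
        # All results are gathered before the simulation is stepped,
        # keeping the controllers synchronized with the simulation.
        futures = [self.pool.submit(c.solve, s)
                   for c, s in zip(self.controllers, states)]
        return [f.result() for f in futures]
```

A process-based pool would be the natural next step if the per-controller solves are CPU-heavy, at the cost of shared-memory plumbing for the state exchange.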
  • [] Setting up the RL task:

    • Testing vectorization of Kyon's simulation
    • Test first skeleton of Kyon's Gymnasium environment and task (without rewards, observations...)
    • [] Implementing observations, rewards, env resets, etc.
