Merge pull request #38 from rohanpsingh/topic/new-updates
Massive new updates
rohanpsingh authored Feb 25, 2025
2 parents 7c65a23 + 3e6c58f commit 4ab0ad8
Showing 30 changed files with 1,542 additions and 966 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
@@ -4,3 +4,6 @@
[submodule "models/cassie_mj_description"]
path = models/cassie_mj_description
url = git@github.com:rohanpsingh/cassie_mj_description.git
[submodule "models/mujoco_menagerie"]
path = models/mujoco_menagerie
url = git@github.com:google-deepmind/mujoco_menagerie
57 changes: 46 additions & 11 deletions README.md
@@ -1,12 +1,19 @@
# LearningHumanoidWalking

<p align="center">
<a href="https://www.youtube.com/watch?v=ZgfNzGAkk2Q"><img src="https://github.com/user-attachments/assets/5211cdcd-2267-497b-bd66-ac833703a134" alt="humanoid-walk" style="width:1000px"/></a>
</p>

Code for the papers:
- [**Learning Bipedal Walking On Planned Footsteps For Humanoid Robots**](https://arxiv.org/pdf/2207.12644.pdf) (Humanoids2022)
- [**Robust Humanoid Walking on Compliant and Uneven Terrain with Deep Reinforcement Learning**](https://ieeexplore.ieee.org/abstract/document/10769793)
[Rohan P. Singh](https://rohanpsingh.github.io), [Mitsuharu Morisawa](https://unit.aist.go.jp/jrl-22022/en/members/member-morisawa.html), [Mehdi Benallegue](https://unit.aist.go.jp/jrl-22022/en/members/member-benalleguem.html), [Zhaoming Xie](https://zhaomingxie.github.io/), [Fumio Kanehiro](https://unit.aist.go.jp/jrl-22022/en/members/member-kanehiro.html)

- [**Learning Bipedal Walking for Humanoids with Current Feedback**](https://arxiv.org/pdf/2303.03724.pdf)
[Rohan P. Singh](https://rohanpsingh.github.io), [Zhaoming Xie](https://zhaomingxie.github.io/), [Pierre Gergondet](https://unit.aist.go.jp/jrl-22022/en/members/member-gergondet.html), [Fumio Kanehiro](https://unit.aist.go.jp/jrl-22022/en/members/member-kanehiro.html)

- [**Learning Bipedal Walking On Planned Footsteps For Humanoid Robots**](https://arxiv.org/pdf/2207.12644.pdf)
[Rohan P. Singh](https://rohanpsingh.github.io), [Mehdi Benallegue](https://unit.aist.go.jp/jrl-22022/en/members/member-benalleguem.html), [Mitsuharu Morisawa](https://unit.aist.go.jp/jrl-22022/en/members/member-morisawa.html), [Rafael Cisneros](https://unit.aist.go.jp/jrl-22022/en/members/member-cisneros.html), [Fumio Kanehiro](https://unit.aist.go.jp/jrl-22022/en/members/member-kanehiro.html)

- [**Learning Bipedal Walking for Humanoids with Current Feedback**](https://arxiv.org/pdf/2303.03724.pdf) (arxiv)
[Rohan P. Singh](https://rohanpsingh.github.io), [Zhaoming Xie](https://zhaomingxie.github.io/), [Pierre Gergondet](https://unit.aist.go.jp/jrl-22022/en/members/member-gergondet.html), [Fumio Kanehiro](https://unit.aist.go.jp/jrl-22022/en/members/member-kanehiro.html)
(WIP on branch `topic/omnidirectional-walk`)

## Code structure:
A rough outline for the repository that might be useful for adding your own robot:
@@ -16,19 +23,18 @@ LearningHumanoidWalking/
├── tasks/ <-- Reward function, termination conditions, and more...
├── rl/ <-- Code for PPO, actor/critic networks, observation normalization process...
├── models/ <-- MuJoCo model files: XMLs/meshes/textures
├── trained/ <-- Contains pretrained model for JVRC
└── scripts/ <-- Utility scripts, etc.
```

## Requirements:
- Python version: 3.7.11
- [Pytorch](https://pytorch.org/)
- Python version: 3.12.4
- pip install:
- mujoco==2.2.0
- mujoco==3.2.2
- ray==2.40.0
- pytorch=2.5.1
- intel-openmp
- [mujoco-python-viewer](https://github.com/rohanpsingh/mujoco-python-viewer)
- ray==1.9.2
- transforms3d
- matplotlib
- scipy
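
An optional, illustrative check (not part of the repository) that the pinned packages above are what ended up installed; note PyTorch is distributed under the name `torch`:

```python
# Optional sanity check for the versions pinned above.
import importlib.metadata as md

for pkg in ("mujoco", "ray", "torch", "mujoco-python-viewer",
            "transforms3d", "matplotlib", "scipy"):
    try:
        print(f"{pkg}: {md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")
```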

## Usage:
@@ -37,6 +43,7 @@ Environment names supported:

| Task Description | Environment name |
| ----------- | ----------- |
| Basic Standing Task | 'h1' |
| Basic Walking Task | 'jvrc_walk' |
| Stepping Task (using footsteps) | 'jvrc_step' |

@@ -69,7 +76,35 @@ $ PYTHONPATH=.:$PYTHONPATH python scripts/debug_stepper.py --path <path_to_exp_d


## Citation
If you find this work useful in your own research:
If you find this work useful in your own research, please cite the following works:

For omnidirectional walking:
```
@inproceedings{singh2024robust,
title={Robust Humanoid Walking on Compliant and Uneven Terrain with Deep Reinforcement Learning},
author={Singh, Rohan P and Morisawa, Mitsuharu and Benallegue, Mehdi and Xie, Zhaoming and Kanehiro, Fumio},
booktitle={2024 IEEE-RAS 23rd International Conference on Humanoid Robots (Humanoids)},
pages={497--504},
year={2024},
organization={IEEE}
}
```

For simulating "back-emf" effect and other randomizations:
```
@article{xie2023learning,
title={Learning bipedal walking for humanoids with current feedback},
author={Xie, Zhaoming and Gergondet, Pierre and Kanehiro, Fumio and others},
journal={IEEE Access},
volume={11},
pages={82013--82023},
year={2023},
publisher={IEEE}
}
```

For walking on footsteps:

```
@inproceedings{singh2022learning,
title={Learning Bipedal Walking On Planned Footsteps For Humanoid Robots},
17 changes: 17 additions & 0 deletions envs/common/config_builder.py
@@ -0,0 +1,17 @@
import yaml

class Configuration:
def __init__(self, **kwargs):
for key, value in kwargs.items():
if isinstance(value, dict):
setattr(self, key, Configuration(**value))
else:
setattr(self, key, value)

def __repr__(self):
return str(self.__dict__)

def load_yaml(file_path):
with open(file_path, 'r') as file:
config_data = yaml.safe_load(file)
return Configuration(**config_data)
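
A minimal, self-contained sketch of how the new `Configuration` wrapper behaves; the keys (`sim_dt`, `control_dt`, `reward.*`) are made up for illustration and the import assumes the repository root is on `PYTHONPATH`:

```python
# Illustrative use of the Configuration/load_yaml helpers added above.
import yaml
from envs.common.config_builder import Configuration

raw = yaml.safe_load("""
sim_dt: 0.002
control_dt: 0.02
reward:
  upper_body: 0.3
  foot_contact: 0.7
""")

cfg = Configuration(**raw)       # nested dicts become nested Configuration objects
print(cfg.control_dt)            # 0.02 -- top-level keys are plain attributes
print(cfg.reward.foot_contact)   # 0.7  -- nested keys are reached by attribute access
print(cfg)                       # __repr__ prints the underlying attribute dict
```

`load_yaml(file_path)` produces the same kind of object starting from a YAML file on disk.
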
59 changes: 55 additions & 4 deletions envs/common/mujoco_env.py
@@ -1,3 +1,4 @@
import contextlib
import os
import numpy as np
import mujoco
@@ -16,7 +17,10 @@ def __init__(self, model_path, sim_dt, control_dt):
raise Exception("Provide full path to robot description package.")
if not os.path.exists(fullpath):
raise IOError("File %s does not exist" % fullpath)
self.model = mujoco.MjModel.from_xml_path(fullpath)

self.spec = mujoco.MjSpec()
self.spec.from_file(fullpath)
self.model = self.spec.compile()
self.data = mujoco.MjData(self.model)
self.viewer = None

@@ -48,24 +52,71 @@ def viewer_setup(self):
self.viewer.cam.lookat[2] = 1.5
self.viewer.cam.lookat[0] = 2.0
self.viewer.cam.elevation = -20
self.viewer.vopt.geomgroup[0] = 1
self.viewer.vopt.geomgroup[2] = 0
self.viewer._render_every_frame = True

def viewer_is_paused(self):
return self.viewer._paused

# -----------------------------
# (some methods are taken directly from dm_control)

@contextlib.contextmanager
def disable(self, *flags):
"""Context manager for temporarily disabling MuJoCo flags.
Args:
*flags: Positional arguments specifying flags to disable. Can be either
lowercase strings (e.g. 'gravity', 'contact') or `mjtDisableBit` enum
values.
Yields:
None
Raises:
ValueError: If any item in `flags` is neither a valid name nor a value
from `mujoco.mjtDisableBit`.
"""
old_bitmask = self.model.opt.disableflags
new_bitmask = old_bitmask
for flag in flags:
if isinstance(flag, str):
try:
field_name = "mjDSBL_" + flag.upper()
flag = getattr(mujoco.mjtDisableBit, field_name)
except AttributeError:
valid_names = [
field_name.split("_")[1].lower()
for field_name in list(mujoco.mjtDisableBit.__members__)[:-1]
]
raise ValueError("'{}' is not a valid flag name. Valid names: {}"
.format(flag, ", ".join(valid_names))) from None
elif isinstance(flag, int):
flag = mujoco.mjtDisableBit(flag)
new_bitmask |= flag.value
self.model.opt.disableflags = new_bitmask
try:
yield
finally:
self.model.opt.disableflags = old_bitmask

def reset(self):
mujoco.mj_resetData(self.model, self.data)
ob = self.reset_model()
return ob

def set_state(self, qpos, qvel):
assert qpos.shape == (self.model.nq,) and qvel.shape == (self.model.nv,)
assert qpos.shape == (self.model.nq,), \
f"qpos shape {qpos.shape} is expected to be {(self.model.nq,)}"
assert qvel.shape == (self.model.nv,), \
f"qvel shape {qvel.shape} is expected to be {(self.model.nv,)}"
self.data.qpos[:] = qpos
self.data.qvel[:] = qvel
mujoco.mj_forward(self.model, self.data)
self.data.act = []
self.data.plugin_state = []
# Disable actuation since we don't yet have meaningful control inputs.
with self.disable('actuation'):
mujoco.mj_forward(self.model, self.data)

@property
def dt(self):
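
The `disable` context manager above (adapted from dm_control) is what `set_state` now uses to run `mj_forward` with actuation switched off. A small usage sketch, assuming `env` is an instance of a subclass of this `MujocoEnv` with a compiled model:

```python
# Usage sketch for disable(); `env` stands for any environment built on MujocoEnv.
import mujoco

# Flags can be passed as lowercase strings...
with env.disable("contact", "gravity"):
    mujoco.mj_forward(env.model, env.data)   # forward pass with contacts and gravity off

# ...or as mujoco.mjtDisableBit enum values.
with env.disable(mujoco.mjtDisableBit.mjDSBL_ACTUATION):
    mujoco.mj_forward(env.model, env.data)

# On exit, the original model.opt.disableflags bitmask is restored.
```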

