
[Looking for contributors] Fixing BC and RL training #162

Open
micahcarroll opened this issue Mar 22, 2025 · 2 comments

Comments

@micahcarroll
Member

micahcarroll commented Mar 22, 2025

Hi there, I'm the original creator of the repo. Since the original release, the packages we were using for BC and RL training have had major updates, and people have transitioned from tensorflow to pytorch to jax, etc.

This has broken the BC and RL training. Over the years, I've tried to fix things locally without spending too much time on it, but I don't think that's worth doing anymore: it means investing in a stack (mainly built on tensorflow) which is outdated and should be rewritten. The good news is that writing this stack today is much easier than it once was.

What would this look like?

  • For BC, you could quickly get the current tensorflow implementation in human_aware_rl to work (I was able to at least), but it's probably best to start from scratch and write it in pytorch (or even better, JAX). This shouldn't be too hard. If you do so, please make a PR!
  • For RL training, I'd recommend using JaxMARL: it's much faster than our training was. I don't think salvaging our rllib-based training is worth it.
  • For evaluating/visualizing/interacting with the trained RL policies, this would require some (hopefully minimal) infrastructure to port the JaxMARL-trained policies back to the overcooked-ai environment.
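To make the BC suggestion above concrete, here is a minimal framework-agnostic sketch of behavior cloning on a toy discrete dataset, written in plain numpy. Everything here (the toy state/action setup, the linear policy) is illustrative and hypothetical, not the repo's human_aware_rl code; an actual contribution would use pytorch or JAX modules, but the training objective is the same cross-entropy fit to demonstrated actions.

```python
import numpy as np

# Toy BC setup (all hypothetical): 4 one-hot states, 3 discrete actions,
# an "expert" that plays one fixed action per state.
n_states, n_actions, n_samples = 4, 3, 512
rng = np.random.default_rng(0)

states = np.tile(np.arange(n_states), n_samples // n_states)  # every state covered
true_policy = rng.integers(0, n_actions, n_states)            # expert action per state
actions = true_policy[states]                                 # demonstrated actions

X = np.eye(n_states)[states]          # (n_samples, n_states) one-hot features
W = np.zeros((n_states, n_actions))   # linear logits, the whole "policy network"

for _ in range(200):
    # Softmax over logits, then gradient of mean cross-entropy w.r.t. W.
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    grad = X.T @ (probs - np.eye(n_actions)[actions]) / n_samples
    W -= 1.0 * grad                   # plain gradient descent step

learned = W.argmax(axis=1)            # greedy action per state
print((learned == true_policy).all())  # → True
```

In a pytorch version, `W` becomes an `nn.Module`, the softmax/cross-entropy math is `nn.CrossEntropyLoss`, and the loop is an optimizer step; the structure above otherwise carries over directly.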

Old instructions from the README about testing whether human_aware_rl is working correctly

To check whether human_aware_rl is installed correctly, you can run the following command from the src/human_aware_rl directory:

$ ./run_tests.sh
@micahcarroll micahcarroll changed the title Fixing BC and RL training [For those using the repo] Fixing BC and RL training Mar 22, 2025
@micahcarroll micahcarroll changed the title [For those using the repo] Fixing BC and RL training [Looking for contributors] Fixing BC and RL training Mar 22, 2025
@meghbhalerao

meghbhalerao commented Mar 24, 2025

  1. btw, I have a quick-and-dirty pytorch implementation of the BC training algorithm, but I don't think it's ready for a PR yet; I'll let you know.
  2. Regarding this, I think RLlib is pretty good too, right? (https://docs.ray.io/en/latest/rllib/index.html) They claim industry-standard RL, etc. Plus, I think JaxMARL already has an overcooked environment, although with different graphics.

PS - I am new to RL, so my knowledge might be incorrect, but just putting down my thoughts.

@micahcarroll
Member Author

RLlib with a pytorch backend would be good, but using JaxMARL would still lead to faster training, as far as I understand. That said, adding new functionality in Jax is more annoying because it requires one to vectorize everything. Indeed, the JaxMARL implementation of overcooked (which they based on this repo) has fewer features, e.g. no tomato ingredients IIRC.

While JaxMARL does have the basic version of overcooked implemented, the graphics are less nice/intuitive, and they don't have associated demo code to be able to deploy agents with humans to e.g. collect human data. So I think this repo still has some value for that, and for ease of implementation of new features relative to JAX.
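The vectorization point above can be illustrated without JAX itself. The per-agent logic and the specific game rule below are hypothetical, not taken from either repo: the loop version is how one would naturally add a feature in eager pytorch/numpy code, while the array-op version is the shape that JAX's jit/vmap transformations require, since they disallow data-dependent python control flow.

```python
import numpy as np

# Hypothetical new feature: an agent drops its onion into a pot only if it
# is holding an onion AND standing at a pot.
holding_onion = np.array([True, False, True, True])
at_pot        = np.array([True, True, False, True])

# Loop version: natural to write, but not jit-compilable in JAX because the
# branch depends on runtime array values.
dropped_loop = np.zeros(4, dtype=bool)
for i in range(4):
    if holding_onion[i] and at_pot[i]:
        dropped_loop[i] = True

# Vectorized version: the same rule as a single array expression, which is
# the form a JAX implementation would need (e.g. via jnp.where).
dropped_vec = np.where(holding_onion & at_pot, True, False)

print(dropped_loop.tolist())  # → [True, False, False, True]
print(dropped_vec.tolist())   # → [True, False, False, True]
```

For one boolean rule this rewrite is trivial, but game logic with many interacting conditions (ingredient types, timers, agent collisions) has to be expressed this way throughout, which is the extra cost of adding features to a JAX environment.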
