
[Looking for contributors] Fixing BC and RL training #162

Open
micahcarroll opened this issue Mar 22, 2025 · 2 comments

Comments

@micahcarroll
Member

micahcarroll commented Mar 22, 2025

Hi there, I'm the original creator of the repo. Since the original release, the packages we were using for BC and RL training have had major updates, and people have transitioned from tensorflow to pytorch to jax, etc.

This has broken the BC and RL training. Over the years, I've tried to fix things locally without spending too much time on it, but I don't think that's worth doing anymore: it means investing in a stack (mainly built on tensorflow) which is outdated and should be rewritten. The good news is that writing this stack today is much easier than it once was.

What would this look like?

  • For BC, you could quickly get the current tensorflow implementation in human_aware_rl to work (I was able to at least), but it's probably best to start from scratch and write it in pytorch (or even better, JAX). This shouldn't be too hard. If you do so, please make a PR!
  • For RL training, I'd recommend using JaxMARL: it's much faster than our training was. I don't think salvaging our rllib-based training is worth it.
  • For evaluating/visualizing/interacting with the trained RL policies, this would require some (hopefully minimal) infrastructure to port the JaxMARL-trained policies back to the overcooked-ai environment.
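To make the BC suggestion above concrete, here is a minimal framework-agnostic sketch of behavior cloning on a toy discrete dataset, written in plain numpy. Everything here (the toy state/action setup, the linear policy) is illustrative and hypothetical, not the repo's human_aware_rl code; an actual contribution would use pytorch or JAX modules, but the training objective is the same cross-entropy fit to demonstrated actions.

```python
import numpy as np

# Toy BC setup (all hypothetical): 4 one-hot states, 3 discrete actions,
# an "expert" that plays one fixed action per state.
n_states, n_actions, n_samples = 4, 3, 512
rng = np.random.default_rng(0)

states = np.tile(np.arange(n_states), n_samples // n_states)  # every state covered
true_policy = rng.integers(0, n_actions, n_states)            # expert action per state
actions = true_policy[states]                                 # demonstrated actions

X = np.eye(n_states)[states]          # (n_samples, n_states) one-hot features
W = np.zeros((n_states, n_actions))   # linear logits, the whole "policy network"

for _ in range(200):
    # Softmax over logits, then gradient of mean cross-entropy w.r.t. W.
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    grad = X.T @ (probs - np.eye(n_actions)[actions]) / n_samples
    W -= 1.0 * grad                   # plain gradient descent step

learned = W.argmax(axis=1)            # greedy action per state
print((learned == true_policy).all())  # → True
```

In a pytorch version, `W` becomes an `nn.Module`, the softmax/cross-entropy math is `nn.CrossEntropyLoss`, and the loop is an optimizer step; the structure above otherwise carries over directly.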

Old instructions from the README about testing whether human_aware_rl is working correctly

To check whether human_aware_rl is installed correctly, you can run the following command from the src/human_aware_rl directory:

$ ./run_tests.sh
@micahcarroll micahcarroll changed the title Fixing BC and RL training [For those using the repo] Fixing BC and RL training Mar 22, 2025
@micahcarroll micahcarroll changed the title [For those using the repo] Fixing BC and RL training [Looking for contributors] Fixing BC and RL training Mar 22, 2025
@meghbhalerao

meghbhalerao commented Mar 24, 2025

  1. btw, I have a quick-and-dirty pytorch implementation of the BC training algorithm, but I don't think it's ready for a PR yet; I'll let you know.
  2. Regarding this, I think RLlib is pretty good too, right? (https://docs.ray.io/en/latest/rllib/index.html) They claim industry-standard RL, etc. Plus, I think JaxMARL already has an overcooked environment, although with different graphics.

PS - I am new to RL, so my knowledge might be incorrect, but just putting down my thoughts.

@micahcarroll
Member Author

RLlib with a pytorch backend would be good, but using JaxMARL would still lead to faster training, as far as I understand. That said, adding new functionality in Jax is more annoying because it requires one to vectorize everything. Indeed, the JaxMARL implementation of overcooked (which they based on this repo) has fewer features, e.g. no tomato ingredients IIRC.

While JaxMARL does have the basic version of overcooked implemented, the graphics are less nice/intuitive, and they don't have associated demo code to be able to deploy agents with humans to e.g. collect human data. So I think this repo still has some value for that, and for ease of implementation of new features relative to JAX.
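The vectorization point above can be illustrated without JAX itself. The per-agent logic and the specific game rule below are hypothetical, not taken from either repo: the loop version is how one would naturally add a feature in eager pytorch/numpy code, while the array-op version is the shape that JAX's jit/vmap transformations require, since they disallow data-dependent python control flow.

```python
import numpy as np

# Hypothetical new feature: an agent drops its onion into a pot only if it
# is holding an onion AND standing at a pot.
holding_onion = np.array([True, False, True, True])
at_pot        = np.array([True, True, False, True])

# Loop version: natural to write, but not jit-compilable in JAX because the
# branch depends on runtime array values.
dropped_loop = np.zeros(4, dtype=bool)
for i in range(4):
    if holding_onion[i] and at_pot[i]:
        dropped_loop[i] = True

# Vectorized version: the same rule as a single array expression, which is
# the form a JAX implementation would need (e.g. via jnp.where).
dropped_vec = np.where(holding_onion & at_pot, True, False)

print(dropped_loop.tolist())  # → [True, False, False, True]
print(dropped_vec.tolist())   # → [True, False, False, True]
```

For one boolean rule this rewrite is trivial, but game logic with many interacting conditions (ingredient types, timers, agent collisions) has to be expressed this way throughout, which is the extra cost of adding features to a JAX environment.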
