Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

CleanRL can't be used by importing? #197

Closed
cool-RR opened this issue Jun 10, 2022 · 2 comments · Fixed by #200
Closed

CleanRL can't be used by importing? #197

cool-RR opened this issue Jun 10, 2022 · 2 comments · Fixed by #200

Comments

@cool-RR
Copy link
Contributor

cool-RR commented Jun 10, 2022

Hi @vwxyzjn !

Jordan recommended that I use CleanRL when working with Petting Zoo, and it took me a bit to figure out something. Let me know whether I understand correctly. CleanRL isn't meant to be imported and used directly, like SB3. It's supposed to have reference implementations of algorithms which developers could copy off to their own program and tweak as necessary to serve their needs.

Is this correct? If so:

  1. It should be clearly stated in the readme, even in the title. Using the phrase "reference implementations" is helpful, and there needs to be at least a sentence that explains that CleanRL isn't meant to be imported, it's meant to be copied from.
  2. It would be cool if the implementations in CleanRL would have been useful in importable forms in some way. It could be a separate package. Many people want to just import PPO and have it work.
@vwxyzjn
Copy link
Owner

vwxyzjn commented Jun 10, 2022

Hi, @cool-RR thanks for raising this issue. You are absolutely correct that CleanRL cannot be imported. Does https://github.com/vwxyzjn/cleanrl/blob/b0d00df8b926617651638923e622b51c0b477305/README.md better than clarify this?

It would be cool if the implementations in CleanRL would have been useful in importable forms in some way. It could be a separate package. Many people want to just import PPO and have it work.

CleanRL doesn't really support this, unfortunately. For the import PPO use cases, we'd recommend checking out Stable-baselines3 (SB3) which is a great modular library.

That said though, prototyping with SB3 can be challenging. As an example, it takes considerably more code to implement invalid action masking in SB3 (see PR) whereas in CleanRL it's just about 30 lines of code (see "invalid action masking" in this post). Unfortunately, there is no perfect solution for doing prototypes.

CleanRL can help you to understand the core of implementing invalid action masking to work with pettingzoo's chess, empowering you to build prototypes faster. Once you understand the core of the implementation, it would help you more seamlessly contribute the same feature to SB3 for wider adoption.

@cool-RR
Copy link
Contributor Author

cool-RR commented Jun 11, 2022

Thanks for the quick PR, I commented there.

CleanRL doesn't really support this, unfortunately. For the import PPO use cases, we'd recommend checking out Stable-baselines3 (SB3) which is a great modular library.

That said though, prototyping with SB3 can be challenging. As an example, it takes considerably more code to implement invalid action masking in SB3 (see PR) whereas in CleanRL it's just about 30 lines of code (see "invalid action masking" in this post). Unfortunately, there is no perfect solution for doing prototypes.

CleanRL can help you to understand the core of implementing invalid action masking to work with pettingzoo's chess, empowering you to build prototypes faster. Once you understand the core of the implementation, it would help you more seamlessly contribute the same feature to SB3 for wider adoption.

I understand, thanks.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants