CleanRL can't be used by importing? #197

cool-RR · 2022-06-10T19:59:46Z

Jordan recommended that I use CleanRL when working with Petting Zoo, and it took me a bit to figure out something. Let me know whether I understand correctly. CleanRL isn't meant to be imported and used directly, like SB3. It's supposed to have reference implementations of algorithms which developers could copy off to their own program and tweak as necessary to serve their needs.

Is this correct? If so:

It should be clearly stated in the readme, even in the title. Using the phrase "reference implementations" is helpful, and there needs to be at least a sentence that explains that CleanRL isn't meant to be imported, it's meant to be copied from.
It would be cool if the implementations in CleanRL would have been useful in importable forms in some way. It could be a separate package. Many people want to just import PPO and have it work.

The text was updated successfully, but these errors were encountered:

vwxyzjn · 2022-06-10T23:14:15Z

Hi, @cool-RR thanks for raising this issue. You are absolutely correct that CleanRL cannot be imported. Does https://github.com/vwxyzjn/cleanrl/blob/b0d00df8b926617651638923e622b51c0b477305/README.md better than clarify this?

It would be cool if the implementations in CleanRL would have been useful in importable forms in some way. It could be a separate package. Many people want to just import PPO and have it work.

CleanRL doesn't really support this, unfortunately. For the import PPO use cases, we'd recommend checking out Stable-baselines3 (SB3) which is a great modular library.

That said though, prototyping with SB3 can be challenging. As an example, it takes considerably more code to implement invalid action masking in SB3 (see PR) whereas in CleanRL it's just about 30 lines of code (see "invalid action masking" in this post). Unfortunately, there is no perfect solution for doing prototypes.

CleanRL can help you to understand the core of implementing invalid action masking to work with pettingzoo's chess, empowering you to build prototypes faster. Once you understand the core of the implementation, it would help you more seamlessly contribute the same feature to SB3 for wider adoption.

cool-RR · 2022-06-11T07:41:02Z

Thanks for the quick PR, I commented there.

CleanRL doesn't really support this, unfortunately. For the import PPO use cases, we'd recommend checking out Stable-baselines3 (SB3) which is a great modular library.

That said though, prototyping with SB3 can be challenging. As an example, it takes considerably more code to implement invalid action masking in SB3 (see PR) whereas in CleanRL it's just about 30 lines of code (see "invalid action masking" in this post). Unfortunately, there is no perfect solution for doing prototypes.

CleanRL can help you to understand the core of implementing invalid action masking to work with pettingzoo's chess, empowering you to build prototypes faster. Once you understand the core of the implementation, it would help you more seamlessly contribute the same feature to SB3 for wider adoption.

I understand, thanks.

vwxyzjn mentioned this issue Jun 10, 2022

Clarify CleanRL is a non-modular library #200

Merged

1 task

vwxyzjn closed this as completed in #200 Jun 18, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CleanRL can't be used by importing? #197

CleanRL can't be used by importing? #197

cool-RR commented Jun 10, 2022

vwxyzjn commented Jun 10, 2022

cool-RR commented Jun 11, 2022

CleanRL can't be used by importing? #197

CleanRL can't be used by importing? #197

Comments

cool-RR commented Jun 10, 2022

vwxyzjn commented Jun 10, 2022

cool-RR commented Jun 11, 2022