Question about D4RL-gym dataset version #4

Open
FineArtz opened this issue Jan 12, 2022 · 1 comment

Comments

@FineArtz

Hi, I recently read your paper and it inspired me a lot; it is without doubt a good paper. However, I am confused about which version of the D4RL datasets was used for the baselines you compare against. In "Appendix C Baseline performance sources", the results for BC, MOPO (by the way, I could not find MOPO in your experiments section), and MBOP are taken from their original papers, all of which use the D4RL-gym-v0 datasets.
Because the performance of CQL on D4RL-gym-v0 [1] differs greatly from that on D4RL-gym-v2 [2] on several datasets, I wonder whether the scores of the above baselines would also change greatly on D4RL-gym-v2. Or do you have evidence that this will not happen, given that you compare these scores directly?

@jannerm
Owner

jannerm commented Feb 1, 2022

Nice catch!

BC on v2 performs 4.1 percentage points higher than on v0, with an average score of 51.8 versus 47.7 [1]. I'll update this in the next arXiv version.

I have reached out to the authors of MBOP to see if they can share code for reevaluation on the v2 datasets.
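
For readers who want to check the numbers in this thread, here is a minimal sketch of the arithmetic. It assumes the usual D4RL convention that a normalized score of 0 corresponds to a random policy and 100 to an expert policy. The helper names `normalized_score` and `average_gap` are hypothetical, and the returns passed to `normalized_score` are made-up illustration values, not real D4RL reference returns. Only the two BC averages (51.8 and 47.7) come from the reply above.

```python
def normalized_score(raw_return, random_return, expert_return):
    """D4RL-style normalized score: 0 = random policy, 100 = expert policy."""
    return 100.0 * (raw_return - random_return) / (expert_return - random_return)


def average_gap(new_avg, old_avg):
    """Difference between two average normalized scores, rounded to one decimal."""
    return round(new_avg - old_avg, 1)


# Averages quoted in the reply above (BC on v2 versus BC on v0).
bc_gap = average_gap(51.8, 47.7)
print(bc_gap)

# Hypothetical returns, for illustration only; not real D4RL reference values.
example = normalized_score(raw_return=50.0, random_return=0.0, expert_return=100.0)
print(example)
```

Because the scores are normalized, the gap is in score points on that 0 to 100 scale, so a comparison between baselines is only meaningful when both sides are evaluated on the same dataset version.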
