
question about paper result #35

Open
cycaoyang opened this issue Sep 2, 2024 · 3 comments

Comments

@cycaoyang

Dear author,
Hello! I am a graduate student at a Chinese university working on a multi-agent reinforcement learning project. I would like to plug my algorithm into the environment you developed to test its performance. However, while reproducing the results from your paper, I found something confusing:
[two screenshots omitted]
The two screenshots correspond to the L1-33 scenario: the first shows the metrics recorded by eval(), and the second shows the metrics recorded by train(). There is a huge gap between the two sets of values. Why is this?

@hsvgbkhgbv
Member

Hi,

The main reason is that exploration is applied during training, i.e., some suboptimal random actions are executed. During evaluation, only the (deterministic) optimal actions are taken, so the reward is naturally higher.
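For intuition, here is a minimal sketch of this pattern (illustrative only, not this repository's actual code; GaussianPolicy and its layer sizes are made up), where training samples from the policy distribution while evaluation takes the deterministic mean action:

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    # Toy Gaussian policy with a state-independent log_std parameter.
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.mu_net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def act(self, obs, deterministic=False):
        mu = self.mu_net(obs)
        if deterministic:
            return mu  # evaluation: greedy mean action, no exploration noise
        std = self.log_std.exp()
        # training: sample around the mean, so some suboptimal actions occur
        return torch.distributions.Normal(mu, std).sample()
```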

Hope this can help you!

@cycaoyang
Author

Thanks for your reply!
After reading your code, though, I have a hypothesis: the control-rate metric is low in training mode because, by default, the Gaussian with a network-generated std is not adopted, so log_std stays fixed. In my algorithm, log_std is generated by the network, so my training and test results do not differ much. Could this be why your test results differ from your training results?
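To make the contrast concrete, here is a hedged sketch (hypothetical code, not taken from this repository) of the state-dependent variant described above, where log_std is an output head of the network rather than a fixed parameter:

```python
import torch
import torch.nn as nn

class StateDependentGaussianPolicy(nn.Module):
    # Hypothetical variant: log_std is produced by the network, so the
    # exploration noise can shrink as training converges.
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh())
        self.mu_head = nn.Linear(64, act_dim)
        self.log_std_head = nn.Linear(64, act_dim)

    def forward(self, obs):
        h = self.body(obs)
        mu = self.mu_head(h)
        # clamp keeps the learned std in a numerically sane range
        log_std = torch.clamp(self.log_std_head(h), -20.0, 2.0)
        return torch.distributions.Normal(mu, log_std.exp())
```

With a learned std, the sampled (training) actions concentrate around the mean as log_std is driven down, which would make training and evaluation curves nearly coincide; a fixed log_std keeps injecting the same amount of noise throughout training.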

@Zebei99

Zebei99 commented Sep 26, 2024

> (quoting @cycaoyang's original question above)

Sir,
Could you please tell me which visualization package was used to produce this graph?
