Removing the regular advantage calculation in PPO #207
Comments
Yup, this can also be easily verified from the GAE computation formula, with
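For reference, here is a sketch of how that check goes when gae_lambda is set to 1. The notation is the standard GAE notation and is an assumption, not something taken from this thread: $\delta_t$ is the TD residual, $V$ the value estimate, $T$ the rollout length, and episode-termination masks are ignored for brevity.

```latex
\begin{align*}
\delta_t &= r_t + \gamma V(s_{t+1}) - V(s_t) \\
\hat{A}_t^{\mathrm{GAE}(\gamma,\lambda)} &= \sum_{l=0}^{T-t-1} (\gamma\lambda)^l \, \delta_{t+l} \\
\hat{A}_t^{\mathrm{GAE}(\gamma,1)} &= \sum_{l=0}^{T-t-1} \gamma^l \bigl( r_{t+l} + \gamma V(s_{t+l+1}) - V(s_{t+l}) \bigr) \\
&= \sum_{l=0}^{T-t-1} \gamma^l r_{t+l} + \gamma^{T-t} V(s_T) - V(s_t)
\end{align*}
```

The value terms telescope, so what remains is the bootstrapped discounted return minus $V(s_t)$, which is the regular advantage estimate.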
bragajj added 17 commits to bragajj/cleanrl that referenced this issue on Oct 3, 2022. Seven of them carry a message:
* To resolve issue vwxyzjn#207 in cleanrl, extra advantage code not needed
* To resolve issue vwxyzjn#207 in cleanrl, extra advantage calc code unnecessary
* Updated to resolve issue vwxyzjn#207, unncessary additional advantage calc code
* Updated to resolve issue vwxyzjn#207
* Updated to resolve issue vwxyzjn#207, unnecessary additional advantage calc code for ppo implementations
* Updated to resolve issue vwxyzjn#207
* Updated to resolve issue vwxyzjn#207
The remaining ten are listed without a message.
vwxyzjn pushed a commit that referenced this issue on Oct 4, 2022:
* Update ppo.py
  To resolve issue #207 in cleanrl, extra advantage code not needed
* Update ppo_atari.py
  To resolve issue #207 in cleanrl, extra advantage calc code unnecessary
* Update ppo_atari_envpool.py
  Updated to resolve issue #207, unncessary additional advantage calc code
* Update ppo_continuous_action.py
  Updated to resolve issue #207
* Update ppo_atari_lstm.py
  Updated to resolve issue #207, unnecessary additional advantage calc code for ppo implementations
* Update ppo_pettingzoo_ma_atari.py
  Updated to resolve issue #207
* Update ppo_procgen.py
  Updated to resolve issue #207
* GAE revisions #207
* GAE revisions #207
* GAE revisions #207
* GAE revisions #207
* GAE revisions #207
* GAE revisions #207
* GAE revisions #207
* GAE revisions #207
* GAE revisions #207
* GAE revisions #207
* Update ppo_rnd_envpool.py
  Fixed styling of lines 432-436
Closed by #287
Problem description.
The regular advantage calculation in PPO is a special case of the GAE advantage calculation when gae_lambda=1; we demonstrate this empirically with the debugging output at the bottom. Based on this result, we should remove the regular advantage calculation from cleanrl/cleanrl/ppo.py (lines 232 to 242 in 94a685d).
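The equivalence can also be checked numerically. The following is a minimal NumPy sketch, not the code from ppo.py: the function names `gae_advantages` and `regular_advantages`, the rollout length, and the random data are all hypothetical and chosen only for illustration. It assumes the convention that `dones[t] == 1` means step `t` is the first step of a new episode.

```python
import numpy as np


def gae_advantages(rewards, values, dones, next_value, next_done, gamma, gae_lambda):
    """GAE computed backwards over a rollout (hypothetical helper)."""
    T = len(rewards)
    advantages = np.zeros(T)
    lastgaelam = 0.0
    for t in reversed(range(T)):
        if t == T - 1:
            nextnonterminal = 1.0 - next_done
            nextvalue = next_value
        else:
            nextnonterminal = 1.0 - dones[t + 1]
            nextvalue = values[t + 1]
        # TD residual, masked so that it never bootstraps across an episode boundary
        delta = rewards[t] + gamma * nextvalue * nextnonterminal - values[t]
        lastgaelam = delta + gamma * gae_lambda * nextnonterminal * lastgaelam
        advantages[t] = lastgaelam
    return advantages


def regular_advantages(rewards, values, dones, next_value, next_done, gamma):
    """Bootstrapped discounted returns minus values (hypothetical helper)."""
    T = len(rewards)
    returns = np.zeros(T)
    for t in reversed(range(T)):
        if t == T - 1:
            nextnonterminal = 1.0 - next_done
            next_return = next_value
        else:
            nextnonterminal = 1.0 - dones[t + 1]
            next_return = returns[t + 1]
        returns[t] = rewards[t] + gamma * nextnonterminal * next_return
    return returns - values


# Random rollout data, purely illustrative
rng = np.random.default_rng(0)
T = 16
rewards = rng.normal(size=T)
values = rng.normal(size=T)
dones = (rng.random(T) < 0.2).astype(float)
next_value, next_done, gamma = 0.3, 0.0, 0.99

gae = gae_advantages(rewards, values, dones, next_value, next_done, gamma, gae_lambda=1.0)
regular = regular_advantages(rewards, values, dones, next_value, next_done, gamma)
print(np.allclose(gae, regular))
```

With `gae_lambda=1.0` the two arrays should agree up to floating-point error, so a separate code path for the regular calculation adds nothing that the GAE path does not already cover.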
Debugging output