Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

fixed new gym API related to step() and reset() #87

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

spyroot
Copy link

@spyroot spyroot commented Oct 13, 2022

Description

There are several issues related to changes in API. I re-adjusted all code, so it returned truncated in step and readjusted reset to match what the gym expected. In parallel, fixed Mario env. Note old code won't work with this since GYM now return five elements in a tuple vs. 4. Same for reset. I also re-adjusted all the unit tests and examples, so it matched and passed all tests.

  • Fixes #Incorrect number of arguments from call to env.step(action) #119

Type of change

Please select all relevant options:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • [ x] Unit test
  • run gyp-super Mario ( with readjusted step /reset) and its unit test.

Test Configuration

  • Operating System: 12.5.1 mac os
  • Python version: 3.10
  • C++ compiler version: Apple clang version 13.1.6 (clang-1316.0.21.2.5)
    Target: x86_64-apple-darwin21.6.0
    Thread model: posix

Checklist

  • [ x] My code follows the style guidelines of this project
  • [ x] I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ x] I have added tests that prove my fix is effective or that my feature works

Copy link

@pseudo-rnd-thoughts pseudo-rnd-thoughts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, Im one of the developers of Gym so thanks for starting this PR
I have added a couple of changes for updating the API

This page includes a migration guide, https://gymnasium.farama.org/content/migration-guide/

I couldn't see a function like this, but I would add a testing function like this in Gym to test that environment follow the API.

action = env.action_space.sample()
_, reward, done, info = env.step(action)
_, reward, done, _, info = env.step(action)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace with

_, reward, terminated, truncated, info = env.step(action)
done = terminated or truncated

@@ -243,7 +255,7 @@ def seed(self, seed=None):
# return the list of seeds used by RNG(s) in the environment
return [seed]

def reset(self, seed=None, options=None, return_info=None):
def reset(self, seed=None, options=None, return_info=None) -> Tuple[ObsType, dict]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove return_info parameter

@@ -352,7 +371,7 @@ def close(self):
if self.viewer is not None:
self.viewer.close()

def render(self, mode='human'):
def render(self, mode='human') -> Optional[Union[RenderFrame, List[RenderFrame]]]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove mode parameter and add render_mode to __init__ for specifying the type of rendering

action = env.action_space.sample()
_, _, done, _ = env.step(action)
_, _, done, _, _ = env.step(action)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace done with terminated and truncated as in the comment before

action = envs[idx].action_space.sample()
_, _, dones[idx], _ = envs[idx].step(action)
_, _, dones[idx], _, _ = envs[idx].step(action)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above

# check each output
state, reward, done, info = output
state, reward, done, truncated, info = output

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

terminated, truncated, as terminated != done

done = False
state, _, done, _ = env.step(0)
state, _, done, _, _ = env.step(0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment on terminated and truncated

@@ -120,9 +121,9 @@ def test(self):
if done:
state = env.reset()
done = False
state, _, done, _ = env.step(0)
state, _, done, _, _ = env.step(0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment on terminated and truncated

done = False
else:
state, reward, done, info = env.step(env.action_space.sample())
state, reward, done, truncated, info = env.step(env.action_space.sample())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment on terminated and truncated

@@ -37,7 +37,7 @@

setup(
name='nes_py',
version='8.2.1',
version='8.2.2',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor point, but I would make this a minor or major release due to the significant code changes

@alrdebugne
Copy link

Any reason why this shouldn't be merged with the suggested edits?

@ItaiBear ItaiBear mentioned this pull request Jul 17, 2023
10 tasks
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants