fixed new gym API related to step() and reset() #87

spyroot · 2022-10-13T09:01:45Z

Description

There are several issues related to changes in the Gym API. I adjusted the code so that step() now also returns truncated, and reset() matches what Gym expects. In parallel, I fixed the Mario env. Note that old code won't work with this, since Gym's step() now returns a tuple of five elements instead of four; reset() changed in the same way. I also adjusted all the unit tests and examples to match, and all tests pass.
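For callers that must cope with both return shapes during the transition, a small shim along the following lines may help. This is only a sketch: `unpack_step` is a hypothetical helper that is not part of this PR, and it assumes the older convention where Gym's time-limit wrapper flagged truncation via `info["TimeLimit.truncated"]`.

```python
from typing import Any, Dict, Tuple


def unpack_step(result: tuple) -> Tuple[Any, float, bool, bool, Dict]:
    """Normalize a step() result to (obs, reward, terminated, truncated, info).

    Hypothetical helper, not part of nes-py or Gym.
    """
    if len(result) == 5:
        # Already the new five-element shape.
        return result
    if len(result) == 4:
        obs, reward, done, info = result
        # Assumption: old-style truncation was signalled through this info key.
        truncated = bool(info.get("TimeLimit.truncated", False))
        terminated = bool(done) and not truncated
        return obs, reward, terminated, truncated, info
    raise ValueError(f"unexpected step() result of length {len(result)}")
```

With this in place, calling code can always unpack five values, whichever version of the environment it is talking to.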

Fixes #119 (Incorrect number of arguments from call to env.step(action))

Type of change

Please select all relevant options:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

[x] Unit test
Ran gym-super-mario (with the readjusted step()/reset()) and its unit tests.

Test Configuration

Operating System: macOS 12.5.1
Python version: 3.10
C++ compiler version: Apple clang version 13.1.6 (clang-1316.0.21.2.5)
Target: x86_64-apple-darwin21.6.0
Thread model: posix

Checklist

[x] My code follows the style guidelines of this project
[x] I have performed a self-review of my own code
[ ] I have commented my code, particularly in hard-to-understand areas
[ ] I have made corresponding changes to the documentation
[x] I have added tests that prove my fix is effective or that my feature works

pseudo-rnd-thoughts

Hey, I'm one of the developers of Gym, so thanks for starting this PR.
I have added a couple of suggested changes for updating the API.

This page includes a migration guide: https://gymnasium.farama.org/content/migration-guide/

I couldn't see one in this PR, but I would add a test, like the testing function in Gym, that checks that the environment follows the API.
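Gym's own environment checker (`gym.utils.env_checker.check_env`, as far as I know) is the real tool for this. Purely to illustrate the kind of checks involved, here is a hand-rolled, dependency-free sketch; `api_problems`, `NewStyleEnv` and `OldStyleEnv` are hypothetical names invented for the example, not nes-py or Gym code.

```python
from typing import Any, List


class NewStyleEnv:
    """Hypothetical stand-in that follows the new API."""

    def reset(self, seed=None, options=None):
        return 0, {}

    def step(self, action):
        return 1, 0.0, False, False, {}


class OldStyleEnv:
    """Hypothetical stand-in that still uses the older return shapes."""

    def reset(self):
        return 0

    def step(self, action):
        return 1, 0.0, False, {}


def api_problems(env: Any) -> List[str]:
    """Return a list of API mismatches; an empty list means none were found."""
    problems = []
    try:
        result = env.reset(seed=0)
    except TypeError:
        problems.append("reset() does not accept seed=")
        result = env.reset()
    if not (isinstance(result, tuple) and len(result) == 2
            and isinstance(result[1], dict)):
        problems.append("reset() should return (obs, info)")
    out = env.step(0)
    if not (isinstance(out, tuple) and len(out) == 5):
        problems.append("step() should return five values")
        return problems
    _, _, terminated, truncated, info = out
    if not isinstance(terminated, bool) or not isinstance(truncated, bool):
        problems.append("terminated and truncated should be bool")
    if not isinstance(info, dict):
        problems.append("info should be a dict")
    return problems
```

A unit test could then assert that `api_problems(env)` is empty for each environment the package registers.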

pseudo-rnd-thoughts · 2022-10-26T12:54:23Z

nes_py/app/play_random.py

            action = env.action_space.sample()
-            _, reward, done, info = env.step(action)
+            _, reward, done, _, info = env.step(action)


Replace with

    _, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
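To show the suggested pattern in a full loop, here is a minimal sketch. `CorridorEnv` and `run_episode` are hypothetical and exist only so the example runs without Gym or nes-py installed; the loop body is the part that would carry over to play_random.py.

```python
class CorridorEnv:
    """Hypothetical 1-D corridor: reach `goal` before `max_steps` run out."""

    def __init__(self, goal=3, max_steps=5):
        self.goal = goal
        self.max_steps = max_steps
        self.pos = 0
        self.t = 0

    def reset(self, seed=None, options=None):
        self.pos = 0
        self.t = 0
        return self.pos, {}

    def step(self, action):
        self.pos += action  # action is 0 (stay) or 1 (move right)
        self.t += 1
        terminated = self.pos >= self.goal  # the task itself ended
        truncated = (not terminated) and self.t >= self.max_steps  # time ran out
        reward = 1.0 if terminated else 0.0
        return self.pos, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode and report the return plus how the episode ended."""
    obs, info = env.reset()
    total = 0.0
    terminated = truncated = False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
    return total, terminated, truncated
```

A policy that always moves right ends with `terminated=True`, while one that never moves ends with `truncated=True`; in both cases `done = terminated or truncated` stops the loop.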

pseudo-rnd-thoughts · 2022-10-26T12:54:44Z

nes_py/nes_env.py

@@ -243,7 +255,7 @@ def seed(self, seed=None):
        # return the list of seeds used by RNG(s) in the environment
        return [seed]

-    def reset(self, seed=None, options=None, return_info=None):
+    def reset(self, seed=None, options=None, return_info=None) -> Tuple[ObsType, dict]:


Remove return_info parameter
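For reference, a reset() without return_info might look roughly like the sketch below. `ResetSketchEnv` is a hypothetical stand-in, not the NESEnv implementation: it only shows the signature and the unconditional `(obs, info)` return.

```python
import random
from typing import Any, Dict, Optional, Tuple


class ResetSketchEnv:
    """Hypothetical environment showing only the new reset() shape."""

    def __init__(self):
        self._rng = random.Random()
        self._state = 0

    def reset(
        self,
        seed: Optional[int] = None,
        options: Optional[dict] = None,
    ) -> Tuple[int, Dict[str, Any]]:
        # Re-seed only when a seed is given, so repeated resets stay random.
        if seed is not None:
            self._rng.seed(seed)
        self._state = self._rng.randint(0, 9)
        # Always return (obs, info); there is no return_info switch any more.
        return self._state, {}
```

Callers then unpack two values, for example `state, info = env.reset(seed=1)`.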

pseudo-rnd-thoughts · 2022-10-26T12:55:39Z

nes_py/nes_env.py

@@ -352,7 +371,7 @@ def close(self):
        if self.viewer is not None:
            self.viewer.close()

-    def render(self, mode='human'):
+    def render(self, mode='human') -> Optional[Union[RenderFrame, List[RenderFrame]]]:


Remove mode parameter and add render_mode to __init__ for specifying the type of rendering
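As a rough illustration of that change, the sketch below moves the mode into the constructor. `RenderSketchEnv` and its placeholder frame are hypothetical; the real environment would return its screen buffer and drive its own viewer.

```python
class RenderSketchEnv:
    """Hypothetical environment showing render_mode set at construction."""

    metadata = {"render_modes": ["human", "rgb_array"]}

    def __init__(self, render_mode=None):
        modes = self.metadata["render_modes"]
        if render_mode is not None and render_mode not in modes:
            raise ValueError(f"unsupported render_mode: {render_mode!r}")
        self.render_mode = render_mode
        # Placeholder 2x2 RGB image standing in for the emulator screen.
        self._frame = [[(0, 0, 0)] * 2 for _ in range(2)]

    def render(self):
        # No mode argument: behaviour depends on what __init__ was given.
        if self.render_mode == "rgb_array":
            return self._frame
        # "human" would draw to a window here; both it and None return None.
        return None
```

The mode is then chosen once, e.g. `RenderSketchEnv(render_mode="rgb_array")`, and every later `render()` call takes no arguments.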

pseudo-rnd-thoughts · 2022-10-26T12:56:11Z

nes_py/tests/test_multiple_makes.py

        action = env.action_space.sample()
-        _, _, done, _ = env.step(action)
+        _, _, done, _, _ = env.step(action)


Replace done with terminated and truncated as in the comment before

pseudo-rnd-thoughts · 2022-10-26T12:56:26Z

nes_py/tests/test_multiple_makes.py

                action = envs[idx].action_space.sample()
-                _, _, dones[idx], _ = envs[idx].step(action)
+                _, _, dones[idx], _, _ = envs[idx].step(action)


Same comment as above

pseudo-rnd-thoughts · 2022-10-26T12:56:59Z

nes_py/tests/test_nes_env.py

            # check each output
-            state, reward, done, info = output
+            state, reward, done, truncated, info = output


Use terminated, truncated here, as terminated != done
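One place where the distinction matters is value bootstrapping in learning code built on top of the environment. The sketch below is a generic one-step TD target, not anything from nes-py; `td_target` and its parameters are hypothetical names chosen for the example.

```python
def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target that bootstraps unless the episode truly ended.

    A truncated episode (e.g. a time limit) still has a meaningful next
    state, so only `terminated` should cut off the bootstrap term.
    """
    if terminated:
        return reward
    return reward + gamma * next_value
```

If `done` were passed in place of `terminated`, a time-limit cutoff would be treated as a real terminal state and the target would be biased.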

pseudo-rnd-thoughts · 2022-10-26T12:57:31Z

nes_py/tests/test_nes_env.py

                done = False
-            state, _, done, _ = env.step(0)
+            state, _, done, _, _ = env.step(0)


Same comment on terminated and truncated

pseudo-rnd-thoughts · 2022-10-26T12:57:42Z

nes_py/tests/test_nes_env.py

@@ -120,9 +121,9 @@ def test(self):
            if done:
                state = env.reset()
                done = False
-            state, _, done, _ = env.step(0)
+            state, _, done, _, _ = env.step(0)


Same comment on terminated and truncated

pseudo-rnd-thoughts · 2022-10-26T12:58:01Z

scripts/run.py

            done = False
        else:
-            state, reward, done, info = env.step(env.action_space.sample())
+            state, reward, done, truncated, info = env.step(env.action_space.sample())


Same comment on terminated and truncated

pseudo-rnd-thoughts · 2022-10-26T12:58:31Z

setup.py

@@ -37,7 +37,7 @@

 setup(
    name='nes_py',
-    version='8.2.1',
+    version='8.2.2',


Minor point, but I would make this a minor or major release due to the significant code changes

alrdebugne · 2023-06-01T17:36:04Z

Any reason why this shouldn't be merged with the suggested edits?

spyroot added 5 commits October 13, 2022 12:42

Fixed new gym API.

80c4172

fixed issue with new API related to truncated

654da0d

fixed unit test

198b058

added flags

134d5ab

fixed issue in app

099260a

ItaiBear mentioned this pull request Jul 17, 2023

Gymnasium support #94

