fix: Improve LunarLander-v2 step performance by >1.5x (#170) #235

Merged 2 commits on Jan 3, 2023

PaulMest (Contributor) commented Jan 2, 2023

Description

Improve model training performance by computing particles only when a render mode is set. In initial runs, training was between 1.52x and 1.75x faster, depending on the seed.

Fixes #170
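
The idea in the description can be illustrated with a minimal sketch. This is not the code from the PR: `ToyLander`, `_create_particle`, and the particle fields are hypothetical names chosen for illustration, and the only assumption taken from the description is that particles are needed solely for rendering and can be skipped when no render mode is set.

```python
class ToyLander:
    """Toy stand-in for an environment; NOT the real LunarLander class."""

    def __init__(self, render_mode=None):
        # render_mode is None for headless training, e.g. "human" when drawing.
        self.render_mode = render_mode
        self.particles = []
        self.velocity = 0.0

    def _create_particle(self, x, y):
        # Hypothetical helper: stands in for allocating a visual-only object.
        particle = {"x": x, "y": y, "ttl": 1.0}
        self.particles.append(particle)
        return particle

    def step(self, main_engine_on):
        if main_engine_on:
            # State that feeds the observation is always updated.
            self.velocity += 1.0
            # Particles are assumed to be purely visual, so they are only
            # created when something will actually be rendered.
            if self.render_mode is not None:
                self._create_particle(0.0, 0.0)
        return self.velocity


if __name__ == "__main__":
    headless = ToyLander()
    rendered = ToyLander(render_mode="human")
    for _ in range(5):
        headless.step(True)
        rendered.step(True)
    print(len(headless.particles), len(rendered.particles))
```

In this sketch both instances end up with the same `velocity`, which mirrors the intent that skipping the visual work should not change what the agent observes.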

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Results

$ python -m ml-stuff.lunar_lander_turbo --seed 44
Training took 67.50 seconds (baseline)
Training took 44.39 seconds (with_change)
1.52x faster

$ python -m ml-stuff.lunar_lander_turbo --seed 88
Training took 73.72 seconds (baseline)
Training took 42.09 seconds (with_change)
1.75x faster

$ md5 ppo_lunar_lander_with_change-44/policy.pth ppo_lunar_lander_baseline-44/policy.pth
MD5 (ppo_lunar_lander_with_change-44/policy.pth) = 46266d319de7912428268cccff20d966
MD5 (ppo_lunar_lander_baseline-44/policy.pth) = 46266d319de7912428268cccff20d966

$ md5 ppo_lunar_lander_with_change-88/policy.pth ppo_lunar_lander_baseline-88/policy.pth
MD5 (ppo_lunar_lander_with_change-88/policy.pth) = 859a945a9472447e32b0f180d9ca558e
MD5 (ppo_lunar_lander_baseline-88/policy.pth) = 859a945a9472447e32b0f180d9ca558e
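
The two checks in the results above (the speedup ratio and the matching checksums) can be reproduced with a short script. This is a sketch under stated assumptions: the helper names `speedup`, `md5_of_file`, and `same_file_contents` are hypothetical, and the timings are the ones reported above rather than new measurements.

```python
import hashlib


def speedup(baseline_seconds, changed_seconds):
    """Return how many times faster the changed run is than the baseline."""
    return baseline_seconds / changed_seconds


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True when both files have the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)


if __name__ == "__main__":
    # Timings copied from the results above: (seed, baseline, with_change).
    for seed, baseline, changed in [(44, 67.50, 44.39), (88, 73.72, 42.09)]:
        print(f"seed {seed}: {speedup(baseline, changed):.2f}x faster")
```

Identical digests for the baseline and changed policies under the same seed suggest that the change did not alter the training trajectory for those runs.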
Simple test harness
# lunar_lander_turbo.py
import gymnasium as gym
import sys

sys.modules['gym'] = gym
# Using a special version of stable_baselines3 that works with gymnasium (not gym)
# https://github.com/DLR-RM/stable-baselines3/pull/780
# pip install git+https://github.com/carlosluis/stable-baselines3@fix_tests
from stable_baselines3 import PPO
import time


def run_model(seed=None, description=None):
    print(f'Running with seed {seed} ({description})')
    env = gym.make("LunarLander-v2")
    env.reset(seed=seed)
    model = PPO("MlpPolicy", env, verbose=0, seed=seed)

    # Train the model
    # perf_counter is a monotonic clock, better suited to measuring elapsed time
    start_time = time.perf_counter()
    model.learn(total_timesteps=100000)
    end_time = time.perf_counter()
    print(f'Training took {end_time - start_time:.2f} seconds')

    # Save the model
    model.save(f"misc2/ppo_lunar_lander_{description}-{seed}")
    # model.save(f"misc2/ppo_lunar_lander_baseline-{seed}")


if __name__ == '__main__':
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--seed', type=int, default=42)
    parser.add_argument('--description', type=str)
    args = parser.parse_args()
    seed = args.seed
    description = args.description

    run_model(seed=seed, description=description)
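
Since the PR title refers to step performance, a step-level timing that leaves out the learner may also be useful. The following is a minimal sketch, not part of the PR: `steps_per_second` is a hypothetical helper, and `CountingEnv` is a tiny stand-in that only mimics the Gymnasium `reset`/`step`/`action_space.sample()` interface so the example runs without Box2D installed.

```python
import time


class CountingEnv:
    """Tiny stand-in with a Gymnasium-style interface, for demonstration only."""

    class _Space:
        def sample(self):
            return 0

    def __init__(self):
        self.action_space = self._Space()
        self.total_steps = 0
        self.episode_steps = 0

    def reset(self, seed=None):
        self.episode_steps = 0
        return 0.0, {}

    def step(self, action):
        self.total_steps += 1
        self.episode_steps += 1
        # End each toy episode after 100 steps.
        terminated = self.episode_steps >= 100
        return 0.0, 0.0, terminated, False, {}


def steps_per_second(env, n_steps, seed=None):
    """Run n_steps random actions and return the measured step throughput."""
    env.reset(seed=seed)
    start = time.perf_counter()
    for _ in range(n_steps):
        _, _, terminated, truncated, _ = env.step(env.action_space.sample())
        if terminated or truncated:
            env.reset()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return n_steps / max(elapsed, 1e-9)


if __name__ == "__main__":
    print(f"{steps_per_second(CountingEnv(), 10_000):.0f} steps/s")
```

With Gymnasium and Box2D installed, the same function could presumably be called on `gym.make("LunarLander-v2")` before and after the change to compare raw step throughput.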

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

pseudo-rnd-thoughts changed the title from "fix: Improve LunarLander-v2 model training performance by >1.5x (#170)" to "fix: Improve LunarLander-v2 step performance by >1.5x (#170)" on Jan 3, 2023

pseudo-rnd-thoughts (Member) left a comment:

Thanks for the PR, this looks great.

Development

Successfully merging this pull request may close these issues.

[Bug Report] Lunar Lander Runs Slowly