Releases: kakaoenterprise/JORLDY
JORLDY Beta 0.5.0
❗Important
🛠️ Fixes & Improvements
🔩 Minor fix
- Reset the rollout buffer stamp to 0 (#165)
⏰ Known Issues
- R2D2 needs to be optimized
- IQN-based algorithms still need to be debugged
- VMPO performance is unstable (#164)
🙏 Acknowledgement
- Thanks to everyone who contributed to JORLDY v0.5.0: @leonard-q, @ramanuzan, @kan-s0, @erinn-lee
JORLDY Beta 0.4.0
🛠️ Fixes & Improvements
- Update PyTorch version to 1.10 and update other packages (#139)
- ICM and RND debugging is done (#145)
- APE-X debugging is done (#147)
- SAC discrete implemented (#150)
🔩 Minor fix
- Update Readme (contributors) (#138)
- Update distributed architecture flowchart and timeline (#143)
- Learning rate decay can be set as optional (#151)
- Split optimizer of ICM and RND from PPO (#152)
- Modify the async step calculation (#154)
⏰ Known Issues
- R2D2 needs to be optimized
- IQN-based algorithms have to be evaluated
🙏 Acknowledgement
- Thanks to everyone who contributed to JORLDY v0.4.0: @leonard-q, @ramanuzan, @kan-s0, @erinn-lee
JORLDY Beta 0.3.0
❗Important
- Integrate scripts into one main script (#125)
- TD3 is implemented (#127)
- R2D2 is implemented, but it needs to be optimized (#104)
🛠️ Fixes & Improvements
- Change the stamp step calculation; reset to 0 → -= period step (#130)
- Implement a gather thread that fetches items from the queue in a separate thread, and update the manage process to use it (#130)
- Integrate DQN network, deterministic policy actor, and critic (#129)
- Add lr scheduler to all RL algorithms (#108)
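The stamp step change in #130 can be illustrated with a small sketch. The names `run`, `period`, and `carry_over` are hypothetical and not taken from the JORLDY code base; the sketch only assumes that the stamp is a counter that accumulates steps and triggers a periodic event (for example logging or evaluation) once it reaches the period. Subtracting the period keeps the overshoot, while resetting to 0 discards it.

```python
def run(step_increments, period, carry_over):
    """Return the global step counts at which a periodic event fires.

    carry_over=True  -> stamp -= period (the overshoot is kept)
    carry_over=False -> stamp = 0       (the overshoot is discarded)
    """
    stamp, total, fired_at = 0, 0, []
    for inc in step_increments:
        stamp += inc
        total += inc
        if stamp >= period:
            fired_at.append(total)
            stamp = stamp - period if carry_over else 0
    return fired_at


# Assumed scenario: steps arrive in chunks of 3 and the period is 10.
chunks = [3] * 20
reset_steps = run(chunks, period=10, carry_over=False)
carry_steps = run(chunks, period=10, carry_over=True)
print(reset_steps)  # events drift to every 12 steps
print(carry_steps)  # events stay close to every 10 steps on average
```

With the reset variant the event effectively fires every 12 steps in this scenario, whereas the subtracting variant averages one event per 10 steps.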
🔩 Minor fix
- Delete unused variable in ddqn (#128)
⏰ Known Issues
- ICM PPO and RND PPO performance degraded after PPO was modified. It needs to be fixed
- R2D2 needs to be optimized
- APE-X debugging has to be done
- IQN-based algorithms have to be evaluated
🙏 Acknowledgement
- Thanks to everyone who contributed to JORLDY v0.3.0: @leonard-q, @ramanuzan, @kan-s0, @erinn-lee
JORLDY Beta 0.2.0
❗Important
- The Atari wrapper is modified with reference to the OpenAI Baselines wrapper (#92)
- EpisodicLifeEnv, MaxAndSkipEnv, ClipRewardEnv(sign) are applied
- reference: https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py
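Reward clipping by sign, as named in the ClipRewardEnv(sign) item above, maps every reward to -1, 0, or +1. A minimal pure-Python sketch of that mapping follows; `clip_reward_sign` is a hypothetical helper name used only for illustration, and the actual wrapper is the one in the referenced atari_wrappers.py.

```python
def clip_reward_sign(reward):
    """Map a raw reward to -1.0, 0.0, or +1.0 according to its sign."""
    if reward > 0:
        return 1.0
    if reward < 0:
        return -1.0
    return 0.0


# Illustrative raw rewards with very different magnitudes.
raw_rewards = [200.0, -5.0, 0.0, 0.5]
clipped = [clip_reward_sign(r) for r in raw_rewards]
print(clipped)
```

Clipping this way puts games with very different score scales on the same reward range, at the cost of losing reward magnitude.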
🛠️ Fixes & Improvements
- Error in Drone Delivery Env Mac build is fixed (#94)
- Mujoco is supported in docker (#96)
- PPO algorithm debugging is done (#103)
- Implement value-clip
- Update log calc to prevent gradient divergence; prob_tensor.log() → Categorical.log_prob()
- Change the advantage standardization order; before value calc → after value calc
- Add custom LR scheduler (DQN, PPO) (#103)
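The value-clip item above can be sketched as follows. This is the commonly used PPO formulation (the new value prediction is kept within `clip_eps` of the old one, and the larger of the clipped and unclipped squared errors is taken); whether JORLDY uses exactly this form is an assumption, and `clipped_value_loss` and `clip_eps` are hypothetical names. Plain Python lists stand in for tensors.

```python
def clipped_value_loss(values, old_values, returns, clip_eps=0.2):
    """Mean of max(unclipped, clipped) squared value errors."""
    losses = []
    for v, v_old, ret in zip(values, old_values, returns):
        # Limit how far the new prediction may move from the old one.
        delta = max(-clip_eps, min(clip_eps, v - v_old))
        v_clipped = v_old + delta
        # Taking the max makes the loss pessimistic about large value updates.
        losses.append(max((v - ret) ** 2, (v_clipped - ret) ** 2))
    return sum(losses) / len(losses)


# First sample: the prediction jumped by 0.5, more than clip_eps allows.
loss = clipped_value_loss(
    values=[1.0, 0.0], old_values=[0.5, 0.0], returns=[1.0, 0.0]
)
print(loss)
```

In the example the unclipped error of the first sample is zero, but the clipped prediction (0.7) still incurs an error, so the loss stays positive.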
⏰ Known Issues
- ICM PPO and RND PPO performance degraded after PPO was modified. It needs to be fixed
🙏 Acknowledgement
- Thanks to everyone who contributed to JORLDY v0.2.0: @leonard-q, @ramanuzan
JORLDY Beta 0.1.0
❗Important
- Unit test codes are implemented!
- M-DQN, M-IQN are implemented! (#79)
- Mujoco envs are supported! (#83)
🛠️ Fixes & Improvements
- RND code refactoring (#52) caused a fatal error → solved by changing the parameter name of RND (#71)
- Change default initialization method (Xavier → Orthogonal) (#81)
- Change Softmax to exp(log_softmax) (#82)
- Unit test for Mujoco env is done (#93)
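The Softmax → exp(log_softmax) change above (#82) can be illustrated without any framework. The release note does not state the motivation; this sketch only shows that a log-softmax computed with the maximum subtracted stays finite for large logits, where a naive exponentiation overflows. All function names are hypothetical.

```python
import math


def log_softmax(logits):
    """Log-softmax via the log-sum-exp trick (subtract the max first)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def softmax_via_log(logits):
    """Probabilities obtained as exp(log_softmax(logits))."""
    return [math.exp(lp) for lp in log_softmax(logits)]


def naive_softmax(logits):
    """Direct exp / sum(exp); overflows for large logits."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1000.0, 1001.0, 1002.0]
probs = softmax_via_log(logits)

try:
    naive_softmax(logits)
    naive_ok = True
except OverflowError:
    naive_ok = False

print(probs, naive_ok)
```

A side benefit of this form is that the log-probabilities are available directly for loss terms that need them.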
🙏 Acknowledgement
- Thanks to everyone who contributed to JORLDY v0.1.0: @leonard-q, @ramanuzan, @lkm2835
JORLDY Beta 0.0.3
- Important
- GitHub Actions is applied for Python code style (PEP8). Please refer to the style guide in CONTRIBUTING.md
- New environment: Drone Delivery ML-Agents Environment is added! 🛸
- ML-Agents Server builds are removed! The Linux build with the no_graphics option can be run on the server. (#58)
- Fixes & Improvements
- JORLDY supports envs that provide multi-modal input (image, vector)
- mlagents Windows issue
- Issue #44 occurred when mlagents envs were run on Windows
- #46 solved this problem (Thank you so much @zenoengine)
- mlagents Linux build Issue
- mlagents envs failed because .gitignore contained *.so, which removed all the .so files in mlagents envs. All the .so files are restored and .gitignore is modified.
- ICM and RND code is refactored to remove duplicated functions (#52)
- ICM PPO bug fix: remove softmax before calculating cross-entropy (#49)
- *_timers.json files in mlagents envs caused conflicts when using git, so *_timers.json files are added to .gitignore (#59)
- Benchmark is developed! → config, script, spec are added
- Acknowledgement
- Thanks to everyone who contributed to JORLDY v0.0.3: @zenoengine, @ramanuzan, @leonard-q
JORLDY Beta 0.0.2
📢 Important
- Now JORLDY fully supports Windows, Mac and Linux!
🛠️ Fixes & Improvements
- README minor fix
- Remove $, >
- Fix typos
- Modify .gitignore; add Python gitignore template
- Support WSL, Windows and Mac
- Change agent instantiation code (#28)
- Custom dict can be pickled
- multiprocessing qsize() → empty(), full()
- Remove _nomp.py files
- Solve multiprocessing issue on all OS
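The qsize() → empty, full item above can be sketched with the standard library. The Python documentation notes that `multiprocessing.Queue.qsize()` may raise `NotImplementedError` on platforms such as macOS, which is a plausible reason for the change, although the release note does not say so. The helper `try_put` is hypothetical. Note that `empty()` and `full()` are only approximate when other processes are using the queue at the same time.

```python
import multiprocessing as mp


def try_put(q, item):
    """Put an item unless the queue reports itself full; return success."""
    if q.full():
        return False
    q.put(item)
    return True


# A bounded queue with room for two items; no worker processes are started.
q = mp.Queue(maxsize=2)
accepted = [try_put(q, i) for i in range(3)]  # the third put is rejected

# Read back what was stored, then check emptiness without qsize().
received = [q.get(timeout=5) for _ in range(2)]
is_empty_after = q.empty()

q.close()
q.join_thread()
print(accepted, received, is_empty_after)
```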
🙏 Acknowledgement
- Thanks to everyone who contributed to JORLDY v0.0.2: @zenoengine, @ramanuzan, @leonard-q
JORLDY Beta 0.0.1
Hello WoRLd! ✋ This is the first version of JORLDY, an open-source Reinforcement Learning (RL) framework provided by KakaoEnterprise! We hope that JORLDY helps researchers and students who study RL. The features of JORLDY are as follows ⭐.
- 20+ RL algorithms and various RL environments are provided
- Algorithms and environments can be added and customized
- An RL algorithm and environment can be run with a single command
- Distributed RL algorithms are provided using Ray
- Benchmarks of the algorithms are conducted in many RL environments
🤖 The implemented algorithms are as follows:
- Deep Q Network (DQN), Double DQN, Dueling DQN, Multistep DQN, Prioritized Experience Replay (PER), C51, Noisy Network, Rainbow (DQN, IQN), QR-DQN, IQN, Curiosity Driven Exploration (ICM), Random Network Distillation (RND), APE-X, REINFORCE, DDPG, PPO, SAC, MPO, V-MPO
🌎 The provided environments are as follows:
- GYM classic control, Unity ML-Agents, Procgen
- GYM Atari and Super Mario Bros are excluded from the requirements because of license issues. You should install these environments manually.