This platform runs on both Windows and Ubuntu as long as Python 3 is installed. There are currently no ROS-related packages; it only requires PyTorch, NumPy, and OpenCV.
Python 3.8 or higher is recommended.
Pre-installed: Anaconda3 (any version whose default Python is 3.8 or higher is fine).

```
pip install opencv-python
pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
```
The PyTorch build depends on the device you have: choose the CPU-only build or the wheel matching your GPU's CUDA version.
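A quick way to check which device PyTorch will actually use after installation (a minimal sketch, not part of this repository):

```python
import torch

# Pick the first CUDA device if a compatible GPU is available, otherwise fall back to CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(torch.__version__, device)
```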
Currently, this repository consists of five parts: algorithm, common, datasave, environment, and simulation.
Algorithm
Algorithm includes some commonly used reinforcement learning algorithms.
The following table lists RL algorithms in the corresponding directories.
Directory | Algorithm | Description |
---|---|---|
actor_critic | A2C, DDPG, SAC, TD3 | ---- |
policy_base | PPO, DPPO, DPPO2 | DPPO2 does not work |
value_base | DQN, DoubleDQN, DuelingDQN | ---- |
rl_base | ---- | Base classes inherited by the other algorithms |
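For reference, the value-based methods above differ mainly in how the bootstrap target is built. Below is a minimal PyTorch sketch of the DQN vs. Double DQN targets (illustrative only; the repository's own implementations live in value_base):

```python
import torch

def dqn_targets(q_net, target_net, rewards, next_states, dones, gamma=0.99, double=False):
    """Bootstrap targets for (Double) DQN; all tensors are batched along dim 0."""
    with torch.no_grad():
        if double:
            # Double DQN: the online net selects the action, the target net evaluates it.
            next_actions = q_net(next_states).argmax(dim=1, keepdim=True)
            next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        else:
            # Vanilla DQN: the target net both selects and evaluates the action.
            next_q = target_net(next_states).max(dim=1).values
        return rewards + gamma * (1.0 - dones) * next_q
```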
Common
Common includes common_func.py and common_cls.py, which contain some basic classes and functions.
The following table lists the contents of the two py files.
File | Description |
---|---|
common_cls.py | ReplayBuffer, RolloutBuffer, OUNoise, NeuralNetworks, etc. |
common_func.py | Basic mathematical functions, geometry operations, etc. |
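As an illustration of the kind of utility kept here, a minimal replay buffer might look like the following (a sketch only; see common_cls.py for the actual implementation):

```python
import numpy as np

class ReplayBuffer:
    """Fixed-size FIFO buffer of (s, a, r, s', done) transitions."""
    def __init__(self, capacity, state_dim, action_dim):
        self.capacity, self.ptr, self.size = capacity, 0, 0
        self.s = np.zeros((capacity, state_dim), dtype=np.float32)
        self.a = np.zeros((capacity, action_dim), dtype=np.float32)
        self.r = np.zeros(capacity, dtype=np.float32)
        self.s2 = np.zeros((capacity, state_dim), dtype=np.float32)
        self.done = np.zeros(capacity, dtype=np.float32)

    def store(self, s, a, r, s2, done):
        i = self.ptr
        self.s[i], self.a[i], self.r[i], self.s2[i], self.done[i] = s, a, r, s2, done
        self.ptr = (self.ptr + 1) % self.capacity   # overwrite oldest when full
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        idx = np.random.randint(0, self.size, size=batch_size)
        return self.s[idx], self.a[idx], self.r[idx], self.s2[idx], self.done[idx]
```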
Datasave
Datasave saves networks trained by RL algorithms and some data files.
Environment
Environment contains the physical models, which serve as the 'environments' in RL.
The 'config' directory contains the *.xml files, the model description files of all environments.
The 'envs' directory contains the ordinary differential equations (ODEs) of the physical environments.
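The general pattern is a class that integrates the ODE one step per action. A minimal sketch of that pattern (illustrative only, using a toy double integrator; not the repository's actual interface):

```python
import numpy as np

class ToySecondOrderIntegration:
    """Toy 1-D double integrator stepped with forward Euler, RL-environment style."""
    def __init__(self, dt=0.01):
        self.dt = dt
        self.x = 0.0  # position
        self.v = 0.0  # velocity

    def reset(self):
        self.x, self.v = 0.0, 0.0
        return np.array([self.x, self.v], dtype=np.float32)

    def step(self, u):
        # ODE: dx/dt = v, dv/dt = u (control input acts as acceleration)
        self.v += u * self.dt
        self.x += self.v * self.dt
        state = np.array([self.x, self.v], dtype=np.float32)
        reward = -(self.x ** 2 + 0.1 * self.v ** 2)  # quadratic cost as negative reward
        done = abs(self.x) > 10.0
        return state, reward, done
```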
The following table lists all the current environments.
Environment | Directory | Description |
---|---|---|
CartPole | ./CartPole/ | continuous, position and angle |
CartPoleAngleOnly | ./CartPole/ | continuous, just angle |
CartPoleAngleOnlyDiscrete | ./CartPole/ | discrete, just angle |
FlightAttitudeSimulator | ./FlightAttitudeSimulator/ | discrete |
FlightAttitudeSimulator2StateContinuous | ./FlightAttitudeSimulator/ | continuous, states are only theta and dtheta |
FlightAttitudeSimulatorContinuous | ./FlightAttitudeSimulator/ | continuous |
UAVHover | ./UAV/ | continuous, other files in ./UAV are not RL environments |
UGVBidirectional | ./UGV/ | continuous, the vehicle can move forward and backward |
UGVForward | ./UGV/ | continuous, the vehicle can only move forward |
UGVForwardDiscrete | ./UGV/ | discrete, the vehicle can only move forward |
UGVForwardObstacleContinuous | ./UGV/ | continuous, the vehicle needs to avoid obstacles |
UGVForwardObstacleDiscrete | ./UGV/ | discrete, the vehicle needs to avoid obstacles |
UGVForward_pid | ./UGV_PID/ | UGV forward with PID controller tuned by RL |
UGVBidirectional_pid | ./UGV_PID/ | UGV bidirectional with PID controller tuned by RL |
TwoLinkManipulator | ./RobotManipulators/ | continuous, fully actuated |
Simulation
Simulation is where we implement our simulation experiments,
i.e., where different algorithms are run in different environments.
Currently, we have the following well-trained controllers:
A DDPG controller for:
- FlightAttitudeSimulator
- UGVBidirectional (motion planner)
- UGVForward (motion planner)
- UGVForwardObstacleAvoidance (motion planner)
A DQN controller for:
- FlightAttitudeSimulator
- SecondOrderIntegration
- SecondOrderIntegration_Discrete
A Dueling DQN controller for:
- FlightAttitudeSimulator
A TD3 trajectory planner for:
- UGVForwardObstacleAvoidance
- CartPole
- CartPoleAngleOnly
- FlightAttitudeSimulator
- SecondOrderIntegration
- UGVForward_pid
A PPO controller for:
- CartPoleAngleOnly
- FlightAttitudeSimulator2State
- SecondOrderIntegration_Discrete
- UGVForward_pid
- UGVBidirectional_pid
- TwoLinkManipulator
A DPPO controller for:
- CartPoleAngleOnly
- CartPole
- FlightAttitudeSimulator2State
- SecondOrderIntegration
- UGVBidirectional_pid
- TwoLinkManipulator
All runnable scripts are in './simulation/'.
In 'DQN-4-Flight-Attitude-Simulator.py', set the following flags (set TRAIN to True if you want to train a new controller):

```python
TRAIN = False
RETRAIN = False
TEST = not TRAIN
```

Then, in a terminal:

```
cd simulation/DQN_based/
python3 DQN-4-Flight-Attitude-Simulator.py
```
The result should be similar to the following.
In 'DDPG-4-UGV-Forward-Obstacle.py', set the following flags (set TRAIN to True if you want to train a new motion planner):

```python
TRAIN = False
RETRAIN = False
TEST = not TRAIN
```

Then, in a terminal:

```
cd simulation/PG_based/
python DDPG-4-UGV-Forward-Obstacle.py
```
The result should be similar to the following.
TODO
- Add A2C
- Add A3C
- Add PPO
- Add DPPO
- Add D4PG
- Train controllers for CartPole
- Add some PPO demos
- Add some DPPO demos
- Add some A3C demos
- Modify UGV (add acceleration loop)
- Add a UAV regulator
- Add a UAV tracker
- Add a 2nd-order integration system
- Add a dual-joint robotic arm
- Add a 2nd-order cartpole (optional)
- Debug DPPO2
- Debug DQN-based algorithms (multi-action agents)