⚔️ Dueling DQN
🏅 PER (Prioritized Experience Replay)
🎰 C51
🎲 QR-DQN (Quantile Regression DQN)
📊 IQN (Implicit Quantile Network)
👪 Ape-X
🤖 R2D2
🌴 DDPG (Deep Deterministic Policy Gradient)
🌀 PPO (Proximal Policy Optimization)
⚾ MPO (Maximum a Posteriori Policy Optimization)
🥎 V-MPO (On-Policy Maximum a Posteriori Policy Optimization)