Multi-agent reinforcement learning (MARL) is an active research area with strong connections to single-agent RL, multi-agent systems, game theory, evolutionary computation, and optimization theory, and with applications in large language models (LLMs) and robotics.
This is a collection of research and review papers on multi-agent reinforcement learning. The papers are sorted by time. Suggestions and pull requests are welcome.
These references are shared for research purposes only. If any author does not want their paper listed here, please feel free to contact us.
- Tutorial
- Review Papers
- Research Papers
- Framework
- Joint action learning
- Cooperation and competition
- Coordination
- Security
- Self-Play
- Learning To Communicate
- Transfer Learning
- Imitation and Inverse Reinforcement Learning
- Meta Learning
- Application
- Networked MARL (Decentralized Training Decentralized Execution)
- MARL in LLMs (Large Language Models)
- MARL in Robotics
- Multi-Agent Reinforcement Learning: Foundations and Modern Approaches by Stefano V. Albrecht, Filippos Christianos, Lukas Schäfer, 2023.
- Many-agent Reinforcement Learning by Yaodong Yang, 2021. PhD Thesis.
- Deep Multi-Agent Reinforcement Learning by Jakob N Foerster, 2018. PhD Thesis.
- Multi-Agent Machine Learning: A Reinforcement Approach by H. M. Schwartz, 2014.
- Multiagent Reinforcement Learning by Daan Bloembergen, Daniel Hennes, Michael Kaisers, Peter Vrancx. ECML, 2013.
- Multiagent systems: Algorithmic, game-theoretic, and logical foundations by Shoham Y, Leyton-Brown K. Cambridge University Press, 2008.
- Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects by Xihuai Wang, Zhicheng Zhang, and Weinan Zhang. 2022.
- An overview of multi-agent reinforcement learning from game theoretical perspective by Yaodong Yang and Jun Wang. 2020.
- A Survey and Critique of Multiagent Deep Reinforcement Learning by Pablo Hernandez-Leal, Bilal Kartal and Matthew E. Taylor. 2019.
- Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms by Kaiqing Zhang, Zhuoran Yang, Tamer Başar. 2019.
- A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems by Silva, Felipe Leno da; Costa, Anna Helena Reali. JAIR, 2019.
- Autonomously Reusing Knowledge in Multiagent Reinforcement Learning by Silva, Felipe Leno da; Taylor, Matthew E.; Costa, Anna Helena Reali. IJCAI, 2018.
- Deep Reinforcement Learning Variants of Multi-Agent Learning Algorithms by Castaneda A O. 2016.
- Evolutionary Dynamics of Multi-Agent Learning: A Survey by Bloembergen, Daan, et al. JAIR, 2015.
- Game theory and multi-agent reinforcement learning by Nowé A, Vrancx P, De Hauwere Y M. Reinforcement Learning. Springer Berlin Heidelberg, 2012.
- Multi-agent reinforcement learning: An overview by Buşoniu L, Babuška R, De Schutter B. Innovations in multi-agent systems and applications-1. Springer Berlin Heidelberg, 2010.
- A comprehensive survey of multi-agent reinforcement learning by Buşoniu L, Babuška R, De Schutter B. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2008.
- If multi-agent learning is the answer, what is the question? by Shoham Y, Powers R, Grenager T. Artificial Intelligence, 2007.
- From single-agent to multi-agent reinforcement learning: Foundational concepts and methods by Neto G. Learning theory course, 2005.
- Evolutionary game theory and multi-agent reinforcement learning by Tuyls K, Nowé A. The Knowledge Engineering Review, 2005.
- An Overview of Cooperative and Competitive Multiagent Learning by Pieter Jan ’t Hoen, Karl Tuyls, Liviu Panait, Sean Luke, J. A. La Poutré. LAMAS workshop at AAMAS, 2005.
- Cooperative multi-agent learning: the state of the art by Liviu Panait and Sean Luke, 2005.
- Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task by Shao Zhang*, Xihuai Wang*, Wenhao Zhang, Yongshan Chen, Landi Gao, Dakuo Wang, Weinan Zhang, Xinbing Wang, and Ying Wen. 2024.
- Large language model based multi-agents: A survey of progress and challenges by Guo, Taicheng, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, and Xiangliang Zhang. 2024.
- Leveraging Large Language Models for Optimised Coordination in Textual Multi-Agent Reinforcement Learning by Slumbers, Oliver, David Henry Mguni, Kun Shao, and Jun Wang. 2024.
- Theory of mind for multi-agent collaboration via large language models by Li, Huao, Yu Quan Chong, Simon Stepputtis, Joseph Campbell, Dana Hughes, Michael Lewis, and Katia Sycara. 2023.
- Multi-Agent Constrained Policy Optimisation by Shangding Gu, Jakub Grudzien Kuba, Muning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois Knoll, and Yaodong Yang, 2021.
- Settling the Variance of Multi-Agent Policy Gradients by Jakub Grudzien Kuba, Muning Wen, Linghui Meng, Shangding Gu, Haifeng Zhang, David Mguni, Jun Wang, and Yaodong Yang, NeurIPS 2021.
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning by Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson. ICML 2018.
- Mean Field Multi-Agent Reinforcement Learning by Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, and Jun Wang. ICML 2018.
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments by Lowe R, Wu Y, Tamar A, et al. arXiv, 2017.
- Deep Decentralized Multi-task Multi-Agent RL under Partial Observability by Omidshafiei S, Pazis J, Amato C, et al. arXiv, 2017.
- Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games by Peng P, Yuan Q, Wen Y, et al. arXiv, 2017.
- Robust Adversarial Reinforcement Learning by Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta. arXiv, 2017.
- Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning by Foerster J, Nardelli N, Farquhar G, et al. arXiv, 2017.
- Multiagent reinforcement learning with sparse interactions by negotiation and knowledge transfer by Zhou L, Yang P, Chen C, et al. IEEE transactions on cybernetics, 2016.
- Decentralised multi-agent reinforcement learning for dynamic and uncertain environments by Marinescu A, Dusparic I, Taylor A, et al. arXiv, 2014.
- CLEANing the reward: counterfactual actions to remove exploratory action noise in multiagent learning by HolmesParker C, Taylor M E, Agogino A, et al. AAMAS, 2014.
- Bayesian reinforcement learning for multiagent systems with state uncertainty by Amato C, Oliehoek F A. MSDM Workshop, 2013.
- Multiagent learning: Basics, challenges, and prospects by Tuyls, Karl, and Gerhard Weiss. AI Magazine, 2012.
- Classes of multiagent q-learning dynamics with epsilon-greedy exploration by Wunder M, Littman M L, Babes M. ICML, 2010.
- Conditional random fields for multi-agent reinforcement learning by Zhang X, Aberdeen D, Vishwanathan S V N. ICML, 2007.
- Multi-agent reinforcement learning using strategies and voting by Partalas, Ioannis, Ioannis Feneris, and Ioannis Vlahavas. ICTAI, 2007.
- A reinforcement learning scheme for a partially-observable multi-agent game by Ishii S, Fujita H, Mitsutake M, et al. Machine Learning, 2005.
- Asymmetric multiagent reinforcement learning by Könönen V. Web Intelligence and Agent Systems, 2004.
- Adaptive policy gradient in multiagent learning by Banerjee B, Peng J. AAMAS, 2003.
- Reinforcement learning to play an optimal Nash equilibrium in team Markov games by Wang X, Sandholm T. NIPS, 2002.
- Multiagent learning using a variable learning rate by Michael Bowling and Manuela Veloso, 2002.
- Value-function reinforcement learning in Markov games by Littman M L. Cognitive Systems Research, 2001.
- Hierarchical multi-agent reinforcement learning by Makar, Rajbala, Sridhar Mahadevan, and Mohammad Ghavamzadeh. The fifth international conference on Autonomous agents, 2001.
- An analysis of stochastic game theory for multiagent reinforcement learning by Michael Bowling and Manuela Veloso, 2000.
- AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents by Conitzer V, Sandholm T. Machine Learning, 2007.
- Extending Q-Learning to General Adaptive Multi-Agent Systems by Tesauro, Gerald. NIPS, 2003.
- Multiagent reinforcement learning: theoretical framework and an algorithm by Hu, Junling, and Michael P. Wellman. ICML, 1998.
- The dynamics of reinforcement learning in cooperative multiagent systems by Claus C, Boutilier C. AAAI, 1998.
- Markov games as a framework for multi-agent reinforcement learning by Littman, Michael L. ICML, 1994.
- Order Matters: Agent-by-agent Policy Optimization by Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, Jun Wang, Weinan Zhang, ICLR 2023.
- Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning by Shunyu Liu, Jie Song, Yihe Zhou, Na Yu, Kaixuan Chen, Zunlei Feng, Mingli Song. TPAMI, 2024.
- Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition by Shunyu Liu, Yihe Zhou, Jie Song, Tongya Zheng, Kaixuan Chen, Tongtian Zhu, Zunlei Feng, Mingli Song. AAAI, 2023.
- Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL? by Yihe Zhou, Shunyu Liu, Yunpeng Qing, Kaixuan Chen, Tongya Zheng, Yanhao Huang, Jie Song, Mingli Song. 2023.
- Multi-Agent Reinforcement Learning is a Sequence Modeling Problem by Wen, Muning, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, and Yaodong Yang, 2022.
- The Complexity of Markov Equilibrium in Stochastic Games by Daskalakis, Constantinos, Noah Golowich, and Kaiqing Zhang, 2022.
- Trust region policy optimisation in multi-agent reinforcement learning by Kuba, Jakub Grudzien, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, and Yaodong Yang, ICLR 2022.
- Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts by Weinan Zhang, Xihuai Wang, Jian Shen, and Ming Zhou. IJCAI 2021.
- The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games by Chao Yu, Akash Velu, Eugene Vinitsky, Yu Wang, Alexandre Bayen, Yi Wu, 2021.
- Human-level performance in 3D multiplayer games with population-based reinforcement learning by Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, et al. Science 364.6443: 859-865, 2019.
- Emergent complexity through multi-agent competition by Trapit Bansal, Jakub Pachocki, Szymon Sidor, Ilya Sutskever, Igor Mordatch, 2018.
- Learning with opponent learning awareness by Jakob Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch, 2018.
- Multi-agent Reinforcement Learning in Sequential Social Dilemmas by Leibo J Z, Zambaldi V, Lanctot M, et al. arXiv, 2017.
- Cooperative Multi-Agent Control Using Deep Reinforcement Learning by Gupta, J. K., Egorov, M., & Kochenderfer, M. AAMAS 2017.
- Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies with PAC Bounds by Roi Ceren, Prashant Doshi, and Bikramjit Banerjee, pp. 530-538, AAMAS 2016.
- Opponent Modeling in Deep Reinforcement Learning by He H, Boyd-Graber J, Kwok K, et al. ICML, 2016.
- Multiagent cooperation and competition with deep reinforcement learning by Tampuu A, Matiisen T, Kodelja D, et al. arXiv, 2015.
- Emotional multiagent reinforcement learning in social dilemmas by Yu C, Zhang M, Ren F. International Conference on Principles and Practice of Multi-Agent Systems, 2013.
- Multi-agent reinforcement learning in common interest and fixed sum stochastic games: An experimental study by Bab, Avraham, and Ronen I. Brafman. Journal of Machine Learning Research, 2008.
- Combining policy search with planning in multi-agent cooperation by Ma J, Cameron S. Robot Soccer World Cup, 2008.
- Collaborative multiagent reinforcement learning by payoff propagation by Kok J R, Vlassis N. JMLR, 2006.
- Learning to cooperate in multi-agent social dilemmas by de Cote E M, Lazaric A, Restelli M. AAMAS, 2006.
- Learning to compete, compromise, and cooperate in repeated general-sum games by Crandall J W, Goodrich M A. ICML, 2005.
- Sparse cooperative Q-learning by Kok J R, Vlassis N. ICML, 2004.
- Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games by Leonardos, Stefanos, Will Overman, Ioannis Panageas, and Georgios Piliouras. 2021.
- Markov α-Potential Games: Equilibrium Approximation and Regret Analysis by Xin G, et al. 2023.
- A Natural Actor-Critic Framework for Zero-Sum Markov Games by Ahmet A., et al. ICML, 2022.
- ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination by Xihuai Wang, Shao Zhang, Wenhao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, and Weinan Zhang. NeurIPS 2024.
- Collaborating with Humans without Human Data by DJ Strouse, Kevin R. McKee, Matt Botvinick, Edward Hughes, Richard Everett. NeurIPS 2021.
- Coordinated Multi-Agent Imitation Learning by Le H M, Yue Y, Carr P. arXiv, 2017.
- Reinforcement social learning of coordination in networked cooperative multiagent systems by Hao J, Huang D, Cai Y, et al. AAAI Workshop, 2014.
- Coordinating multi-agent reinforcement learning with limited communication by Zhang, Chongjie, and Victor Lesser. AAMAS, 2013.
- Coordination guided reinforcement learning by Lau Q P, Lee M L, Hsu W. AAMAS, 2012.
- Coordination in multiagent reinforcement learning: a Bayesian approach by Chalkiadakis G, Boutilier C. AAMAS, 2003.
- Coordinated reinforcement learning by Guestrin C, Lagoudakis M, Parr R. ICML, 2002.
- Reinforcement learning of coordination in cooperative multi-agent systems by Kapetanakis S, Kudenko D. AAAI/IAAI, 2002.
- Markov Security Games: Learning in Spatial Security Problems by Klima R, Tuyls K, Oliehoek F. The Learning, Inference and Control of Multi-Agent Systems at NIPS, 2016.
- Cooperative Capture by Multi-Agent using Reinforcement Learning, Application for Security Patrol Systems by Yasuyuki S, Hirofumi O, Tadashi M, et al. Control Conference (ASCC), 2015.
- Improving learning and adaptation in security games by exploiting information asymmetry by He X, Dai H, Ning P. INFOCOM, 2015.
- A Comparison of Self-Play Algorithms Under a Generalized Framework by Daniel Hernandez, Kevin Denamganai, Sam Devlin, et al. IEEE Transactions on Games, 2021.
- A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning by Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel. NIPS 2017.
- Deep reinforcement learning from self-play in imperfect-information games by Heinrich, Johannes, and David Silver. arXiv, 2016.
- Fictitious Self-Play in Extensive-Form Games by Heinrich, Johannes, Marc Lanctot, and David Silver. ICML, 2015.
- Hammer: Multi-level coordination of reinforcement learning agents via learned messaging by Nikunj Gupta, G. Srinivasaraghavan, Swarup Mohalik, Nishant Kumar, and Matthew E. Taylor. Neural Computing and Applications, 2023.
- Learning to ground multi-agent communication with autoencoders by Lin, Toru, Jacob Huh, Christopher Stauffer, Ser Nam Lim, and Phillip Isola. 2021.
- Emergent Communication through Negotiation by Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z Leibo, Karl Tuyls, Stephen Clark, 2018.
- Emergence of Linguistic Communication From Referential Games with Symbolic and Pixel Input by Angeliki Lazaridou, Karl Moritz Hermann, Karl Tuyls, Stephen Clark. ICLR 2018.
- Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols by Serhii Havrylov, Ivan Titov. ICLR Workshop, 2017.
- Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning by Abhishek Das, Satwik Kottur, et al. arXiv, 2017.
- Emergence of Grounded Compositional Language in Multi-Agent Populations by Igor Mordatch, Pieter Abbeel. arXiv, 2017.
- Cooperation and communication in multiagent deep reinforcement learning by Hausknecht M J. 2017.
- Multi-agent cooperation and the emergence of (natural) language by Lazaridou A, Peysakhovich A, Baroni M. arXiv, 2016.
- Learning to communicate to solve riddles with deep distributed recurrent q-networks by Foerster J N, Assael Y M, de Freitas N, et al. arXiv, 2016.
- Learning to communicate with deep multi-agent reinforcement learning by Foerster J, Assael Y M, de Freitas N, et al. NIPS, 2016.
- Learning multiagent communication with backpropagation by Sukhbaatar S, Szlam A, Fergus R. NIPS, 2016.
- Efficient distributed reinforcement learning through agreement by Varshavskaya P, Kaelbling L P, Rus D. Distributed Autonomous Robotic Systems, 2009.
- Simultaneously Learning and Advising in Multiagent Reinforcement Learning by Silva, Felipe Leno da; Glatt, Ruben; and Costa, Anna Helena Reali. AAMAS, 2017.
- Accelerating Multiagent Reinforcement Learning through Transfer Learning by Silva, Felipe Leno da; and Costa, Anna Helena Reali. AAAI, 2017.
- Accelerating multi-agent reinforcement learning with dynamic co-learning by Garant D, da Silva B C, Lesser V, et al. Technical report, 2015.
- Transfer learning in multi-agent systems through parallel transfer by Taylor, Adam, et al. ICML, 2013.
- Transfer learning in multi-agent reinforcement learning domains by Boutsioukis, Georgios, Ioannis Partalas, and Ioannis Vlahavas. European Workshop on Reinforcement Learning, 2011.
- Transfer Learning for Multi-agent Coordination by Vrancx, Peter, Yann-Michaël De Hauwere, and Ann Nowé. ICAART, 2011.
- On the Utility of Learning about Humans for Human-AI Coordination by Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan. NeurIPS 2019.
- Multi-Agent Adversarial Inverse Reinforcement Learning by Lantao Yu, Jiaming Song, Stefano Ermon. ICML 2019.
- Multi-Agent Generative Adversarial Imitation Learning by Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon. NeurIPS 2018.
- Cooperative inverse reinforcement learning by Hadfield-Menell D, Russell S J, Abbeel P, et al. NIPS, 2016.
- Comparison of Multi-agent and Single-agent Inverse Learning on a Simulated Soccer Example by Lin X, Beling P A, Cogill R. arXiv, 2014.
- Multi-agent inverse reinforcement learning for zero-sum games by Lin X, Beling P A, Cogill R. arXiv, 2014.
- Multi-robot inverse reinforcement learning under occlusion with interactions by Bogert K, Doshi P. AAMAS, 2014.
- Multi-agent inverse reinforcement learning by Natarajan S, Kunapuli G, Judah K, et al. ICMLA, 2010.
- Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments by Al-Shedivat M, et al. 2018.
- MuZero with Self-competition for Rate Control in VP9 Video Compression by Amol Mandhane, Anton Zhernov, Maribeth Rauh, Chenjie Gu, et al. arXiv 2022.
- MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence by Zheng L et al. NIPS 2017 & AAAI 2018 Demo.
- Collaborative Deep Reinforcement Learning for Joint Object Search by Kong X, Xin B, Wang Y, et al. arXiv, 2017.
- Multi-Agent Stochastic Simulation of Occupants for Building Simulation by Chapman J, Siebers P, Robinson D. Building Simulation, 2017.
- Extending No-MASS: Multi-Agent Stochastic Simulation for Demand Response of residential appliances by Sancho-Tomás A, Chapman J, Sumner M, Robinson D. Building Simulation, 2017.
- Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving by Shalev-Shwartz S, Shammah S, Shashua A. arXiv, 2016.
- Applying multi-agent reinforcement learning to watershed management by Mason, Karl, et al. Proceedings of the Adaptive and Learning Agents workshop at AAMAS, 2016.
- Crowd Simulation Via Multi-Agent Reinforcement Learning by Torrey L. AAAI, 2010.
- Traffic light control by multiagent reinforcement learning systems by Bakker, Bram, et al. Interactive Collaborative Information Systems, 2010.
- Multiagent reinforcement learning for urban traffic control using coordination graphs by Kuyer, Lior, et al. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2008.
- A multi-agent Q-learning framework for optimizing stock trading systems by Lee J W, Jangmin O. DEXA, 2002.
- Multi-agent reinforcement learning for traffic light control by Wiering, Marco. ICML, 2000.
- QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations by Soummya Kar, José M. F. Moura, H. Vincent Poor. IEEE Transactions on Signal Processing 2013.
- Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents by Kaiqing Zhang, Zhuoran Yang, Han Liu, Tong Zhang, Tamer Başar. ICML 2018.
- Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning by Chao Qu, Shie Mannor, Huan Xu, Yuan Qi, Le Song, Junwu Xiong. NeurIPS 2019.
- Multi-agent Reinforcement Learning for Networked System Control by Tianshu Chu, Sandeep Chinchali, Sachin Katti. ICLR 2020.
- F2A2: Flexible fully-decentralized approximate actor-critic for cooperative multi-agent reinforcement learning by Wenhao Li, Bo Jin, Xiangfeng Wang, Junchi Yan, Hongyuan Zha. arXiv 2020.
- Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems by Guannan Qu, Adam Wierman, Na Li. L4DC 2020.
- Finite-Sample Analysis For Decentralized Batch Multi-Agent Reinforcement Learning With Networked Agents by Kaiqing Zhang, Zhuoran Yang, Han Liu, Tong Zhang, Tamer Başar. TAC 2021.