Weighted Mean Field Q-Learning for Large Scale Multiagent Systems

dc.contributor.author: Chen, Z.
dc.contributor.author: Li, H.
dc.contributor.author: Wang, Z.
dc.contributor.author: Yan, B.
dc.date.issued: 2025
dc.description.abstract: Mean field reinforcement learning (MFRL) addresses the problem of dimensional explosion in large-scale multiagent systems. However, MFRL averages the actions of neighbors equally while discarding the diversity and distinct features of individuals, which may lead to poor performance in many application scenarios. In this article, a new MFRL algorithm termed temporal weighted mean field Q-learning (TWMFQ) is proposed. TWMFQ introduces a temporal compensated multihead attention structure to construct the weighted mean-field framework, which can reduce the complex relationships within the swarm to the interactions between a specific agent and the weighted virtual mean agent. This approach allows the mean Q-function to represent the swarm behavior more informatively and comprehensively. In addition, an advanced sampling mechanism called mixed experience replay is established, which enriches the diversity of samples and prevents the algorithm from falling into a local optimal solution. Comparison experiments on MAgent and a multi-USV platform demonstrate the superior performance of TWMFQ across different population sizes.
dc.description.statementofresponsibility: Zhuoying Chen, Huiping Li, Zhaoxu Wang, and Bing Yan
dc.identifier.citation: IEEE Transactions on Industrial Informatics, 2025; 21(9):7368-7378
dc.identifier.doi: 10.1109/tii.2025.3575139
dc.identifier.issn: 1551-3203
dc.identifier.issn: 1941-0050
dc.identifier.orcid: Yan, B. [0000-0003-3945-3069]
dc.identifier.uri: https://hdl.handle.net/2440/147609
dc.language.iso: en
dc.publisher: Institute of Electrical and Electronics Engineers
dc.relation.grant: http://purl.org/au-research/grants/arc/DE250101335
dc.rights: © 2025 IEEE. All rights reserved, including rights for text and data mining, and training of artificial intelligence and similar technologies. Personal use is permitted, but republication/redistribution requires IEEE permission.
dc.source.uri: https://doi.org/10.1109/tii.2025.3575139
dc.subject: experience replay; mean field reinforcement learning (MFRL); multi-unmanned surface vehicle (USV)
dc.title: Weighted Mean Field Q-Learning for Large Scale Multiagent Systems
dc.type: Journal article
pubs.publication-status: Published
