Cooperative learning with joint state value approximation for multi-agent systems

Source: Journal of Control Theory and Applications
This paper addresses the 'curse of dimensionality', which makes reinforcement learning intractable when it is scaled to multi-agent systems. The problem worsens exponentially as the number of agents increases, leading to large memory requirements and slow learning. For cooperative systems, which are common among multi-agent systems, the paper proposes a new multi-agent Q-learning algorithm that decomposes learning over the joint state and joint action into two processes: learning the individual action, and approximately learning the maximum value of the joint state. The latter process takes the other agents' actions into account to ensure that the joint action is optimal, and it supports the updates of the former. Simulation results show that the proposed algorithm learns the optimal joint behavior with less memory and at a faster learning speed than friend-Q learning and independent learning.
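The abstract does not give the update rules, so the following is only a minimal sketch of how such a decomposition might look, not the paper's algorithm. It assumes a deterministic, fully cooperative task with a shared non-negative reward. Each agent keeps a table over (joint state, own action), and a single shared table approximates the maximum value of each joint state. The class name `DecomposedQLearner`, the optimistic update rule, and the table-size helpers are all hypothetical choices made for illustration. The helpers compare the storage of a joint-action table, as a friend-Q style learner would keep, with the storage of the decomposed tables.

```python
from collections import defaultdict
from itertools import product


def joint_table_size(n_states, n_actions, n_agents):
    # One entry per (joint state, joint action): grows exponentially in n_agents.
    return n_states * n_actions ** n_agents


def decomposed_table_size(n_states, n_actions, n_agents):
    # One (joint state, own action) table per agent, plus one shared state-value table.
    return n_agents * n_states * n_actions + n_states


class DecomposedQLearner:
    """Illustrative sketch only; the update rules are assumptions, not the paper's."""

    def __init__(self, n_agents, n_actions, alpha=0.5, gamma=0.9):
        self.n_agents = n_agents
        self.n_actions = n_actions
        self.alpha = alpha
        self.gamma = gamma
        # q[i][state][a_i]: agent i's value for its own action in a joint state.
        self.q = [defaultdict(lambda: [0.0] * n_actions) for _ in range(n_agents)]
        # v[state]: shared estimate of the maximum value of the joint state.
        self.v = defaultdict(float)

    def greedy_joint_action(self, state):
        # Each agent picks its own greedy action; ties resolve to the lowest index.
        return tuple(
            max(range(self.n_actions), key=lambda a, i=i: self.q[i][state][a])
            for i in range(self.n_agents)
        )

    def update(self, state, joint_action, reward, next_state, done=False):
        target = reward + (0.0 if done else self.gamma * self.v[next_state])
        # Assumed optimistic rule: keep the best return seen so far for this state.
        self.v[state] = max(self.v[state], target)
        for i, a in enumerate(joint_action):
            q_sa = self.q[i][state][a]
            # Only move an individual estimate upward, so that poor choices by
            # teammates do not drag down the value of a good individual action.
            if target >= q_sa:
                self.q[i][state][a] = q_sa + self.alpha * (target - q_sa)


# Toy one-state game: the team is rewarded only when both agents pick action 1.
learner = DecomposedQLearner(n_agents=2, n_actions=2)
for _ in range(20):
    for joint in product(range(2), repeat=2):
        team_reward = 1.0 if joint == (1, 1) else 0.0
        learner.update("s0", joint, team_reward, "s0", done=True)

print(learner.greedy_joint_action("s0"))  # (1, 1)
print(joint_table_size(100, 5, 3), decomposed_table_size(100, 5, 3))  # 12500 1600
```

In this toy setting, with 100 joint states, 5 actions per agent, and 3 agents, the joint-action table needs 12500 entries and the decomposed tables need 1600. The optimistic update would not be suitable for stochastic rewards or transitions without modification.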