Cooperative learning with joint state value approximation for multi-agent systems

Source: Journal of Control Theory and Applications

【Authors】: Xin CHEN, Gang CHEN, Weihua CAO, Min WU

【Affiliation】: School of Information Science and Engineering, Central South University,

【Publication】: Journal of Control Theory and Applications

【Issue】: 2013, No. 2

【Keywords】: Multi-agent system; Q-learning; Cooperative system; Curse of dimensionality; Decompo

Excerpt from the paper

This paper alleviates the 'curse of dimensionality' problem, which becomes intractable when scaling reinforcement learning to multi-agent systems. The problem is aggravated exponentially as the number of agents increases, resulting in large memory requirements and slow learning. For cooperative systems, which are common among multi-agent systems, this paper proposes a new multi-agent Q-learning algorithm that decomposes joint-state, joint-action learning into two learning processes: learning the individual action, and approximately learning the maximum value of the joint state. The latter process considers the other agents' actions to ensure that the joint action is optimal, and it supports the updating of the former. The simulation results illustrate that the proposed algorithm can learn the optimal joint behavior with less memory and faster learning than friend-Q learning and independent learning.
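The memory argument can be illustrated with a rough table-size count. This is only a sketch under stated assumptions, not the authors' implementation: it assumes a learner that indexes its Q-table by joint state and joint action (as friend-Q learning is usually described), and it assumes the decomposed scheme stores, per agent, one table over joint state and individual action plus one value estimate per joint state. The function names and the example sizes are hypothetical.

```python
def joint_action_table_size(num_states: int, num_actions: int, num_agents: int) -> int:
    """Entries in one Q-table indexed by (joint state, joint action).

    Assumes every agent has the same number of individual actions, so the
    joint action space has num_actions ** num_agents elements.
    """
    return num_states * num_actions ** num_agents


def decomposed_table_size(num_states: int, num_actions: int) -> int:
    """Entries per agent under the ASSUMED decomposition.

    One table over (joint state, individual action) plus one value
    estimate per joint state. The exact layout is an assumption; the
    excerpt does not specify it.
    """
    return num_states * num_actions + num_states


if __name__ == "__main__":
    num_states, num_actions = 100, 4  # hypothetical sizes
    for num_agents in (2, 3, 5):
        print(
            num_agents,
            joint_action_table_size(num_states, num_actions, num_agents),
            decomposed_table_size(num_states, num_actions),
        )
```

Note that `num_states` here counts joint states, which itself grows with the number of agents; the sketch isolates only the saving that comes from dropping the joint-action dimension.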
