Multi-agent reinforcement learning using modular neural network Q-learning algorithms

Source: Journal of Chongqing University
Reinforcement learning is an excellent approach used in artificial intelligence, automatic control, and related fields. However, ordinary reinforcement learning algorithms, such as Q-learning with a lookup table, cannot cope with extremely complex and dynamic environments because the state space is too large. To reduce the state space, a modular neural network Q-learning algorithm is proposed, which combines the Q-learning algorithm with neural networks and a modular method. A feedforward neural network, an Elman neural network, and a radial basis function neural network are each employed to construct such an algorithm. The results show that the Elman neural network Q-learning algorithm performs best when the same training method, i.e., gradient-descent error back-propagation, is applied to all three networks.
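The core idea in the abstract, replacing the Q-learning lookup table with a neural network trained by gradient-descent back-propagation, can be illustrated with a minimal sketch. It is not the paper's implementation: it covers only a single small feedforward network, and it omits the modular decomposition as well as the Elman and radial basis function variants. The class `TinyQNetwork`, the function `td_target`, the tanh hidden layer, and all hyperparameters (hidden size, learning rate, discount factor) are assumptions chosen for illustration.

```python
import math
import random


def td_target(reward, next_q_values, gamma=0.9, done=False):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


class TinyQNetwork:
    """Hypothetical one-hidden-layer feedforward net: state vector -> Q-values."""

    def __init__(self, n_inputs, n_hidden, n_actions, lr=0.1, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w1 = init(n_hidden, n_inputs)   # input -> hidden weights
        self.b1 = [0.0] * n_hidden
        self.w2 = init(n_actions, n_hidden)  # hidden -> output weights
        self.b2 = [0.0] * n_actions
        self.lr = lr

    def _hidden(self, state):
        # tanh hidden activations (an assumed choice of nonlinearity)
        return [math.tanh(sum(w * s for w, s in zip(row, state)) + b)
                for row, b in zip(self.w1, self.b1)]

    def q_values(self, state):
        h = self._hidden(state)
        return [sum(w * hj for w, hj in zip(row, h)) + b
                for row, b in zip(self.w2, self.b2)]

    def update(self, state, action, target):
        """One gradient-descent step on 0.5 * (target - Q(state, action))^2."""
        h = self._hidden(state)
        q = sum(w * hj for w, hj in zip(self.w2[action], h)) + self.b2[action]
        err = target - q
        for j, hj in enumerate(h):
            # Back-propagate through tanh using the weight before it changes.
            grad_pre = err * self.w2[action][j] * (1.0 - hj * hj)
            self.w2[action][j] += self.lr * err * hj
            for i, s in enumerate(state):
                self.w1[j][i] += self.lr * grad_pre * s
            self.b1[j] += self.lr * grad_pre
        self.b2[action] += self.lr * err
        return err
```

In a learning loop, the agent would compute `td_target(reward, net.q_values(next_state))` after each transition and pass it to `net.update(state, action, target)`. Memory then scales with the number of network weights rather than with the number of states, which is the motivation the abstract gives for moving away from a lookup table.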