A Survey of Reinforcement Learning for Solving Combinatorial Optimization Problems

Source: 计算机科学与探索 (Journal of Frontiers of Computer Science and Technology)

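【Authors】: Wang Yang, Chen Zhibin, Wu Zhaorui, Gao Yuan

【Affiliation】: Faculty of Science, Kunming University of Science and Technology, Kunming 650000

【Published in】: 计算机科学与探索 (Journal of Frontiers of Computer Science and Technology)

【Publication date】: 2022, Issue 2

【Keywords】: reinforcement learning (RL); deep reinforcement learning (DRL); combinatorial optimization problem (COP)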

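【Abstract】

Methods for solving combinatorial optimization problems (COPs) have spread into many fields, including artificial intelligence and operations research. As data volumes keep growing and problem instances change ever faster, traditional methods for solving COPs face serious challenges in speed, accuracy, and generalization ability. In recent years, reinforcement learning (RL) has been widely applied in areas such as autonomous driving and industrial automation, where it has shown strong decision-making and learning capabilities. Many researchers have therefore tried to use RL to solve COPs, which offers a new approach to this class of problems. This survey first briefly reviews common COPs and the basic principles of RL. Second, it describes the difficulties of solving COPs with RL, analyzes the advantages of applying RL to combinatorial optimization (CO), and examines the principles by which RL is combined with COPs. It then summarizes recent theoretical methods and applied research on solving COPs with RL, comparing the key points, algorithmic logic, and optimization results of representative studies in order to highlight the strengths of RL models, and it summarizes the limitations and applicable scenarios of the different methods. Finally, it proposes four potential research directions for solving COPs with RL.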
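As a rough illustration of how a COP can be cast as an RL problem, the sketch below applies tabular Q-learning to a tiny travelling salesman instance. It is not taken from the survey: the city coordinates, hyperparameters, and all function names are assumptions chosen for illustration. The state is the current city plus the set of visited cities, the action is the next city, and the reward is the negative travel distance.

```python
import itertools
import math
import random

# Hypothetical 5-city instance; the coordinates are made up for illustration.
CITIES = [(0.0, 0.0), (1.0, 5.0), (5.0, 2.0), (6.0, 6.0), (8.0, 3.0)]
N = len(CITIES)


def dist(i, j):
    """Euclidean distance between cities i and j."""
    return math.dist(CITIES[i], CITIES[j])


def tour_length(tour):
    """Length of the closed tour that returns to its first city."""
    return sum(dist(tour[k], tour[(k + 1) % N]) for k in range(N))


def train_q(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning; keys are (current_city, visited_set, next_city)."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        current, visited = 0, frozenset([0])
        while len(visited) < N:
            actions = [c for c in range(N) if c not in visited]
            if rng.random() < epsilon:
                nxt = rng.choice(actions)  # explore
            else:
                nxt = max(actions, key=lambda a: q.get((current, visited, a), 0.0))
            reward = -dist(current, nxt)
            new_visited = visited | {nxt}
            if len(new_visited) == N:
                # Terminal step: also pay for the trip back to the start city.
                target = reward - dist(nxt, 0)
            else:
                remaining = [c for c in range(N) if c not in new_visited]
                best_next = max(q.get((nxt, new_visited, a), 0.0) for a in remaining)
                target = reward + gamma * best_next
            old = q.get((current, visited, nxt), 0.0)
            q[(current, visited, nxt)] = old + alpha * (target - old)
            current, visited = nxt, new_visited
    return q


def greedy_tour(q):
    """Follow the learned Q-values greedily, starting from city 0."""
    current, visited, tour = 0, frozenset([0]), [0]
    while len(visited) < N:
        actions = [c for c in range(N) if c not in visited]
        nxt = max(actions, key=lambda a: q.get((current, visited, a), 0.0))
        tour.append(nxt)
        visited = visited | {nxt}
        current = nxt
    return tour


q_table = train_q()
tour = greedy_tour(q_table)

# Brute-force optimum for comparison; feasible only because the instance is tiny.
optimal = min(tour_length([0, *p]) for p in itertools.permutations(range(1, N)))
print("learned tour:", tour, "length:", round(tour_length(tour), 3))
print("optimal length:", round(optimal, 3))
```

A table like this grows exponentially with the number of cities, so the sketch only works on toy instances. Replacing the table with a neural network is the general idea behind the deep reinforcement learning (DRL) named in the keywords.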
