论文部分内容阅读
深度数据包检测(Deep Packet Inspection,DPI)采用正则表达式匹配算法,将每个数据包内容与一组预定义的特征进行匹配.正则表达式匹配算法是一种多模式特征匹配算法,采用确定型有限自动机(Deterministic Finite Automaton,DFA)表示一组正则表达式特征,实现一次内容扫描可匹配多个特征.基于硬件的正则表达式匹配算法面临存储空间需求大等挑战,即片上嵌入式存储器难以存储日益增长的DFA存储空间需求,从而限制了DPI的性能和可伸缩性.近年来,Smith等人提出了一种基于扩展有限自动机(eXtended Finite Automaton,XFA)的正则表达式匹配算法,即在状态上增加辅助变量和简单操作指令,消除了DFA状态空间爆炸问题,从状态方面减少存储空间需求.为了进一步减少XFA存储空间需求,本文提出了一种基于紧凑型有限自动机(Compact Finite Automaton,CFA)的正则表达式匹配算法,称为紧凑型正则表达式匹配算法.CFA是一种存储高效的有限自动机,即从迁移边方面减少XFA存储空间需求.在CFA构建过程,本文提出了基于优先级的迁移边压缩方法,融合相同目的状态最多的迁移边,从而减少存储空间需求;在CEA匹配过程,本文提出了基于位图的迁移边查找方法,并行查找不同优先级的迁移边子集,从而确保匹配效率.Snort特征规则集的实验结果表明:与XFA相比,CFA在迁移边条数上减少了88.2%,在存储空间大小上减少了83%,在匹配时间上减少了12%.
Deep Packet Inspection (DPI) adopts a regular expression matching algorithm to match the content of each packet with a set of predefined features.Regression expression matching algorithm is a multi-mode feature matching algorithm, Deterministic Finite Automaton (DFA) represents a set of regular expression features that enable a content scan to match multiple features. Hardware-based regular expression matching algorithms face the challenge of high memory requirements such as on-chip embedded memory It is difficult to store the increasing demand of DFA storage space, which limits the performance and scalability of DPI.In recent years, Smith et al. Proposed a regular expression matching algorithm based on eXtended Finite Automaton (XFA) Namely, adding auxiliary variables and simple operation instructions to the state, eliminating the problem of explosion in DFA state space and reducing the requirement of storage space from the state.In order to further reduce the requirement of XFA storage space, this paper proposes a Compact Finite Automaton, CFA) regular expression matching algorithm, called compact regular expression hops Algorithm.CFA is a storage-efficient finite automaton, which reduces the need of XFA storage space from the migration side.In the process of CFA construction, this paper proposes a priority-based migration-side compression method, which combines the migration edge with the same destination state , So as to reduce the storage space requirement.In the process of CEA matching, we propose a bitmap-based migration edge search method to find out the migration priority subsets in parallel, so as to ensure the matching efficiency.Experimental results of Snort signature rule set show that Compared with XFA, CFA decreased by 88.2% in the number of migration edges, decreased by 83% in storage space, and decreased by 12% in matching time.