HK1208103B - Method and apparatus for processing finite automata - Google Patents
Method and apparatus for processing finite automata Download PDFInfo
- Publication number
- HK1208103B HK1208103B HK15108607.1A HK15108607A HK1208103B HK 1208103 B HK1208103 B HK 1208103B HK 15108607 A HK15108607 A HK 15108607A HK 1208103 B HK1208103 B HK 1208103B
- Authority
- HK
- Hong Kong
- Prior art keywords
- pattern
- nfa
- node
- offset
- payload
- Prior art date
Links
Description
发明背景Background of the Invention
开放系统互连(OSI)参考模型定义了用于通过传输介质进行通信的7个网络协议层(L1-L7)。上层(L4-L7)表示端到端通信并且下层(L1-L3)表示本地通信。The Open Systems Interconnection (OSI) reference model defines seven network protocol layers (L1-L7) for communication over a transmission medium. The upper layers (L4-L7) represent end-to-end communication and the lower layers (L1-L3) represent local communication.
联网应用感知系统需要处理、过滤和切换L3到L7网络协议层的范围,例如,L7网络协议层诸如超文本传输协议(HTTP)和简单邮件传输协议(SMTP),以及L4网络协议层诸如传输控制协议(TCP)。除了处理网络协议层以外,联网应用感知系统需要以线速通过L4-L7网络协议层来同时通过基于访问和内容的安全性来保护这些协议,这些协议层包括防火墙、虚拟专用网(VPN)、安全套接字层(SSL)、入侵检测系统(IDS)、互联网协议安全(IPSec)、防病毒(AV)和防垃圾邮件功能。Networked application-aware systems need to process, filter, and switch a range of network protocol layers from L3 to L7, for example, L7 network protocol layers such as Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP), and L4 network protocol layers such as Transmission Control Protocol (TCP). In addition to processing network protocol layers, networked application-aware systems need to protect these protocols at line speed through L4-L7 network protocol layers with both access-based and content-based security, including firewalls, virtual private networks (VPNs), secure sockets layers (SSL), intrusion detection systems (IDS), Internet Protocol Security (IPSec), antivirus (AV), and anti-spam capabilities.
网络处理器可用于高吞吐量L2和L3网络协议处理,即,执行数据包处理从而以线速转发数据包。通常,通用处理器用于处理需要更多智能处理的L4-L7网络协议。虽然通用处理器可以执行计算密集型任务,但是没有提供足够用于处理数据使得其能够被以线速转发的性能。Network processors are used for high-throughput Layer 2 and Layer 3 network protocol processing, that is, they perform packet processing to forward packets at line speed. Typically, general-purpose processors are used to process Layer 4-7 network protocols, which require more intelligent processing. While general-purpose processors can perform computationally intensive tasks, they do not provide sufficient performance to process data at line speed.
内容感知联网需要以“线速”对数据包的内容进行检查。可以对内容进行分析,以确定是否存在安全漏洞或入侵。应用大量正则表达式形式的图样和规则以确保所有的安全漏洞或入侵被检测到。正则表达式是用于描述字符串中的图样的紧凑型方法。由正则表达式所匹配的最简单图样是单个字符或字符串,例如,/c/或/cat/。正则表达式还包括具有特殊含义的运算符和元字符。Content-aware networking requires inspecting the contents of data packets at "wire speed." This content can be analyzed to determine if there are security vulnerabilities or intrusions. A wide range of patterns and rules in the form of regular expressions are applied to ensure that all security vulnerabilities or intrusions are detected. Regular expressions are a compact method for describing patterns within strings. The simplest pattern matched by a regular expression is a single character or string, such as /c/ or /cat/. Regular expressions also include operators and metacharacters with special meanings.
通过使用元字符,正则表达式可以用于更复杂的搜索,诸如“abc.*xyz”。即,在“abc”和“xyz”之间的无限量字符的情况下,找到字符串“abc”,之后是字符串“xyz”。另一示例是正则表达式“abc..abc.*xyz”,即,找到字符串“abc”,在两个字符以后跟随字符串“abc”并且在无限量字符后跟随字符串“xyz”。By using metacharacters, regular expressions can be used for more complex searches, such as "abc.*xyz". That is, with an unlimited number of characters between "abc" and "xyz", find the string "abc" followed by the string "xyz". Another example is the regular expression "abc..abc.*xyz", which finds the string "abc" followed by the string "abc" two characters later and followed by the string "xyz" after an unlimited number of characters.
入侵检测系统(IDS)应用检查所有流过网络的单独数据包的内容,并且标识可能指示尝试闯入或危害系统的可疑图样。可疑图样的一个示例可以是数据包中的特定文本串,该特定文本串在100个字符以后跟随另一特定文本串。Intrusion Detection System (IDS) applications examine the contents of all individual packets flowing through a network and identify suspicious patterns that may indicate an attempt to break into or compromise a system. An example of a suspicious pattern might be a specific string of text in a packet that is followed 100 characters later by another specific string of text.
通常使用搜索方法(如用于处理正则表达式的确定有限自动机(DFA)或非确定有限自动机(NFA))执行内容搜索。Content searches are typically performed using a search method such as a Deterministic Finite Automaton (DFA) or a Non-Deterministic Finite Automaton (NFA) for processing regular expressions.
发明概述SUMMARY OF THE INVENTION
本发明的实施例提供了一种用于有限自动机的编译和运行时间处理的方法、装置、计算机程序产品、和相应的系统。Embodiments of the present invention provide a method, apparatus, computer program product, and corresponding system for compilation and runtime processing of finite automata.
根据一个实施例,在操作性地耦合至一个网络的一个安全装置中的操作性地耦合至至少一个存储器的至少一个处理器中,一种方法可以:通过用来自一个有效载荷的多个字符遍历该至少一个存储器中所存储的一个统一确定有限自动机(DFA)的多个节点,使该有效载荷的多个字符走过该统一DFA,由基于至少一种启发法从一个或多个正则表达式图样的一个集合中的每个图样所选择的多个子图样生成该统一DFA。该方法可以:通过用来自该有效载荷的多个字符遍历该至少一个存储器中所存储的至少一个非确定有限自动机(NFA)的多个节点,使该有效载荷的多个字符走过该至少一个NFA,针对该集合中的至少一个图样生成的该至少一个NFA、该至少一个图样的用于生成该至少一个NFA的一部分、以及用于使多个字符走过该至少一个NFA的至少一个行走方向基于选自该至少一个图样的一个子图样的一个长度是固定的还是可变的以及所选择的该子图样在该至少一个图样内的一个位置。According to one embodiment, in at least one processor operatively coupled to at least one memory in a security device operatively coupled to a network, a method may include: traversing a unified deterministic finite automaton (DFA) stored in the at least one memory with a plurality of characters from a payload through the DFA, the unified DFA being generated from a plurality of subpatterns selected from each pattern in a set of one or more regular expression patterns based on at least one heuristic. The method may traverse at least one nondeterministic finite automaton (NFA) stored in the at least one memory with a plurality of characters from the payload through the NFA, the at least one NFA generated for at least one pattern in the set, a portion of the at least one pattern used to generate the at least one NFA, and at least one direction for traversing the plurality of characters through the NFA being based on whether a length of a subpattern selected from the at least one pattern is fixed or variable and a position of the selected subpattern within the at least one pattern.
该方法可以在遍历该至少一个NFA的与指示该至少一个图样的一次最终匹配的元数据相关联的一个NFA节点的基础上上报该有效载荷中的该至少一个图样的一次匹配。The method may report a match of the at least one pattern in the payload based on traversing an NFA node of the at least one NFA associated with metadata indicating a final match of the at least one pattern.
该方法可以使一个用于该DFA的一次给定行走的事务标识符与用于匹配该有效载荷中的该至少一个图样的该至少一个NFA相关联。该方法可以在以下内容的基础上上报该有效载荷中的该至少一个图样的一次匹配:遍历该统一DFA的具有指示该至少一个图样的一次DFA部分匹配的元数据的一个DFA节点;随后遍历该至少一个NFA的具有指示该至少一个图样的一次NFA部分匹配的元数据的至少一个NFA节点;以及The method may associate a transaction identifier for a given walk of the DFA with the at least one NFA used to match the at least one pattern in the payload. The method may report a match of the at least one pattern in the payload based on: traversing a DFA node of the unified DFA having metadata indicating a DFA partial match of the at least one pattern; subsequently traversing at least one NFA node of the at least one NFA having metadata indicating a NFA partial match of the at least one pattern; and
使该遍历和随后的该遍历与该事务标识符相关。The traversal and subsequent traversals are correlated with the transaction identifier.
该方法可以基于以下内容将该有效载荷中与该至少一个图样的一个第一元素匹配的一个字符的一个偏移作为该有效载荷中的该至少一个图样的一个起始偏移上报:与该至少一个NFA的一个NFA节点相关联并指示该有效载荷中的该至少一个图样的一次最终匹配的元数据;以及与该统一DFA的一个DFA节点相关联并在该DFA节点处指示(i)为该至少一个图样选择的该子图样的一个长度、以及(ii)该有效载荷中与为该至少一个图样选择的该子图样的一个最后元素匹配的一个子图样字符的一个子图样结束偏移的元数据,该至少一个处理器在从该子图样结束偏移减去该长度的基础上确定该起始偏移。The method may report an offset of a character in the payload that matches a first element of the at least one pattern as a starting offset of the at least one pattern in the payload based on: metadata associated with an NFA node of the at least one NFA and indicating a final match of the at least one pattern in the payload; and metadata associated with a DFA node of the unified DFA and indicating at the DFA node (i) a length of the sub-pattern selected for the at least one pattern, and (ii) a sub-pattern end offset of a sub-pattern character in the payload that matches a last element of the sub-pattern selected for the at least one pattern, the at least one processor determining the starting offset based on subtracting the length from the sub-pattern end offset.
该方法可以:在使与该统一DFA的多个节点相关联的元数据中所指示的多个部分匹配结果与该至少一个图样的该至少一个NFA相关的基础上,上报与该至少一个NFA的一个NFA节点处的该至少一个图样的一个第一元素匹配的该有效载荷的一个字符的一个偏移作为该有效载荷中的该至少一个图样的一个起始偏移。The method may: on the basis of correlating multiple partial matching results indicated in metadata associated with multiple nodes of the unified DFA with the at least one NFA of the at least one pattern, report an offset of a character of the payload that matches a first element of the at least one pattern at an NFA node of the at least one NFA as a starting offset of the at least one pattern in the payload.
该方法可以:在与该NFA节点相关联的元数据以及在该NFA节点处为该有效载荷中的该至少一个图样所确定的一次最终匹配的基础上,上报该至少一个NFA的一个NFA节点处的与该至少一个图样的一个第一元素匹配的该有效载荷的一个字符的一个偏移作为该有效载荷中的该至少一个图样的一个起始偏移。The method may: report an offset of a character of the payload that matches a first element of the at least one pattern at an NFA node of the at least one NFA as a starting offset of the at least one pattern in the payload based on metadata associated with the NFA node and a final match determined for the at least one pattern in the payload at the NFA node.
该至少一种启发法可以包括将所选择的唯一子图样的数量和所选择的每个子图样的长度最大化,所选择的每个子图样的该长度至少具有一个最小阈值长度。The at least one heuristic may include maximizing the number of unique subpatterns selected and the length of each subpattern selected, the length of each subpattern selected having at least a minimum threshold length.
如果所选择的该子图样的一个第一元素是该至少一个子图样的一个第一元素并且所选择的该子图样的该长度是固定的,所选择的该子图样的该位置可以是该至少一个子图样的一个开始位置,该至少一个图样的用于生成该至少一个NFA的该部分可以是排除所选择的该子图样的该至少一个图样,该至少一个NFA可以是单个NFA,并且该至少一个NFA的该至少一个行走方向可以是一个前向行走方向。If a first element of the selected sub-pattern is a first element of the at least one sub-pattern and the length of the selected sub-pattern is fixed, the position of the selected sub-pattern may be a starting position of the at least one sub-pattern, the portion of the at least one pattern used to generate the at least one NFA may be the at least one pattern excluding the selected sub-pattern, the at least one NFA may be a single NFA, and the at least one walking direction of the at least one NFA may be a forward walking direction.
该方法可以在与所选择的该子图样的该最后元素和向该至少一个处理器指示一个指向该至少一个NFA的一个起始节点的指针的元数据相关联的该统一DFA的一个DFA节点处转换成在一个前向行走方向行走该至少一个NFA。该至少一个NFA的该起始节点可以与该至少一个图样的用于生成该至少一个NFA的该部分的一个第一元素相关联。该至少一个NFA的一个有效载荷起始偏移可以与所选择的该子图样的该结束偏移处的另一字节之后的一个字节的一个偏移相关联,并上报所选择的该子图样的一次匹配,与该DFA节点处所选择的该子图样的该最后元素匹配的一个前导字符的该有效载荷内的一个前导偏移作为所选择的该子图样的一个结束偏移,以及所选择的该子图样的一个长度。The method may transition to walking the at least one NFA in a forward walking direction at a DFA node of the unified DFA associated with the last element of the selected subpattern and metadata indicating a pointer to a starting node of the at least one NFA to the at least one processor. The starting node of the at least one NFA may be associated with a first element of the at least one pattern used to generate the portion of the at least one NFA. A payload starting offset of the at least one NFA may be associated with an offset of a byte after another byte at the ending offset of the selected subpattern, and a match of the selected subpattern is reported, as an ending offset of the selected subpattern, and a leading offset within the payload of a leading character that matches the last element of the selected subpattern at the DFA node, and a length of the selected subpattern.
该方法可以在该至少一个NFA的与元数据相关联的一个NFA节点处:终止该行走,该NFA节点与该至少一个图样的一个最后元素相关联,以及上报该NFA节点处匹配的一个滞后字符的该有效载荷内的一个滞后偏移作为该至少一个图样的一个结束偏移,以及该至少一个图样的一次最终匹配。The method may terminate the walk at an NFA node of the at least one NFA associated with metadata, the NFA node being associated with a last element of the at least one pattern, and reporting a lag offset within the payload of a lag character matched at the NFA node as an end offset of the at least one pattern and a final match of the at least one pattern.
如果所选择的该子图样的一个第一元素不是该至少一个图样的一个第一元素,并且所选择的该子图样的一个最后元素不是该至少一个图样的一个最后元素,所选择的该子图样的该位置可以是该至少一个图样的一个中间位置;并且如果所选择的该子图样的该长度是固定的,该至少一个图样的用于生成该至少一个NFA的该部分可以包括该至少一个图样的一个滞后部分和一个前导部分,该至少一个图样的该滞后部分可以是排除了该至少一个图样的所选择的该子图样和该前导部分的该至少一个图样,该至少一个图样的该前导部分可以排除该至少一个图样的所选择的该子图样和该滞后部分。该至少一个NFA可以包括一个滞后NFA和一个前导NFA,该至少一个行走方向可以包括一个前向行走方向和一个反向行走方向,该滞后NFA可以具有该前向行走方向,该前导NFA可以具有该反向行走方向,该至少一个图样的该滞后部分用于生成该滞后NFA,并且该至少一个图样的该前导部分用于生成该前导NFA。If a first element of the selected sub-pattern is not a first element of the at least one pattern, and a last element of the selected sub-pattern is not a last element of the at least one pattern, the position of the selected sub-pattern may be a middle position of the at least one pattern; and if the length of the selected sub-pattern is fixed, the portion of the at least one pattern used to generate the at least one NFA may include a lag portion and a leading portion of the at least one pattern, the lag portion of the at least one pattern may be the at least one pattern excluding the selected sub-pattern and the leading portion of the at least one pattern, and the leading portion of the at least one pattern may exclude the selected sub-pattern and the lag portion of the at least one pattern. The at least one NFA may include a lag NFA and a leading NFA, the at least one walking direction may include a forward walking direction and a reverse walking direction, the lag NFA may have the forward walking direction, the leading NFA may have the reverse walking direction, the lag part of the at least one pattern is used to generate the lag NFA, and the leading part of the at least one pattern is used to generate the leading NFA.
该方法可以在与所选择的该子图样的该最后元素和向该至少一个处理器指示一个指向该滞后NFA的一个起始节点的指针和一个指向该前导NFA的一个起始节点的指针的元数据相关联的该统一DFA的一个DFA节点处:使该统一DFA的行走转换成在该前向行走方向行走该滞后NFA,该滞后NFA的该起始节点可以与该滞后部分的一个第一元素相关联。该方法可以使行走该滞后NFA转换成在该反向行走方向行走该前导NFA,该前导NFA的该起始节点可以与该前导部分的一个最后元素相关联。该方法可以上报与该DFA节点处的所选择的该子图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移作为所选择的该子图样的一个结束偏移,所选择的该子图样的一次匹配,以及所选择的该子图样的一个长度。The method may, at a DFA node of the unified DFA associated with the last element of the selected subpattern and metadata indicating to the at least one processor a pointer to a start node of the lag NFA and a pointer to a start node of the predecessor NFA: convert a walk of the unified DFA to a walk of the lag NFA in the forward walk direction, the start node of the lag NFA being associated with a first element of the lag portion. The method may convert the walk of the lag NFA to a walk of the predecessor NFA in the reverse walk direction, the start node of the predecessor NFA being associated with a last element of the predecessor portion. The method may report an offset within the payload of a character that matches the last element of the selected subpattern at the DFA node as an end offset of the selected subpattern, a match of the selected subpattern, and a length of the selected subpattern.
该方法可以在与该至少一个图样的该最后元素相关联,与元数据相关联的该滞后NFA的一个滞后节点处终止行走该滞后NFA。该方法可以上报与该滞后节点处的该最后元素匹配的该有效载荷的一个滞后字符的该有效载荷内的一个滞后偏移,以及该至少一个图样的该滞后部分的一个匹配。该方法可以在与该至少一个图样的该第一元素相关联,与元数据相关联该前导NFA的一个前导节点处:终止行走该前导NFA,以及上报该至少一个图样的该前导部分的一次匹配,以及与该前导节点处的该第一元素匹配的该有效载荷的一个前导字符的该有效载荷内的一个前导偏移作为该至少一个图样的一个起始偏移,如果与该至少一个图样相关联的一个限定符需要的话。The method may terminate walking the lag NFA at a lag node of the lag NFA associated with the last element of the at least one pattern and associated with metadata. The method may report a lag offset within the payload of a lag character of the payload that matches the last element at the lag node, and a match of the lag portion of the at least one pattern. The method may terminate walking the leading NFA at a leading node of the leading NFA associated with metadata that is associated with the first element of the at least one pattern, and report a match of the leading portion of the at least one pattern and a leading offset within the payload of a leading character of the payload that matches the first element at the leading node as a starting offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
如果所选择的该子图样的一个第一元素不是该至少一个图样的一个第一元素,并且所选择的该子图样的一个最后元素不是该至少一个图样的一个最后元素,所选择的该子图样的该位置可以是该至少一个图样的一个中间位置,并且如果所选择的该子图样的该第一元素是该至少一个图样的该第一元素,所选择的该子图样的该位置可以是该至少一个图样的该开始位置。如果该子图样的该长度是固定的或可变的,该至少一个图样的用于生成该至少一个NFA的该部分可以包括该至少一个图样的一个滞后部分和一个整个部分,该至少一个图样的该滞后部分可以是排除了该至少一个图样的一个前导部分的该至少一个图样。该前导部分可以包括该至少一个图样的该第一元素、所选择的该子图样的该最后元素、以及其间的该至少一个图样中的所有元素。该至少一个图样的该整个部分可以是该至少一个图样。如果所选择的该子图样的位置可以是开始位置,该前导部分可以是所选择的该子图样。该至少一个NFA可以包括一个滞后NFA和一个伞状NFA,该至少一个行走方向可以包括一个前向行走方向和一个反向行走方向。该滞后NFA可以具有该前向行走方向。该伞状NFA可以具有该反向行走方向。该至少一个图样的该滞后部分已经可以用于生成该滞后NFA,并且该至少一个图样的该整个部分已经可以用于生成该伞状NFA。If the first element of the selected sub-pattern is not the first element of the at least one pattern, and the last element of the selected sub-pattern is not the last element of the at least one pattern, the position of the selected sub-pattern may be an intermediate position of the at least one pattern, and if the first element of the selected sub-pattern is the first element of the at least one pattern, the position of the selected sub-pattern may be the starting position of the at least one pattern. If the length of the sub-pattern is fixed or variable, the portion of the at least one pattern used to generate the at least one NFA may include a lag portion and an entire portion of the at least one pattern. The lag portion of the at least one pattern may be the at least one pattern excluding a leading portion of the at least one pattern. The leading portion may include the first element of the at least one pattern, the last element of the selected sub-pattern, and all elements of the at least one pattern therebetween. The entire portion of the at least one pattern may be the at least one pattern. If the position of the selected sub-pattern may be the starting position, the leading portion may be the selected sub-pattern. The at least one NFA may include a lag NFA and an umbrella NFA, and the at least one walking direction may include a forward walking direction and a reverse walking direction. The lag NFA may have the forward walking direction. The umbrella NFA may have the reverse walking direction. The lag portion of the at least one pattern may be used to generate the lag NFA, and the entire portion of the at least one pattern may be used to generate the umbrella NFA.
该方法可以在该统一DFA的与所选择的该子图样的该最后元素相关联,与向该至少一个处理器指示一个指向该滞后NFA的一个起始节点的指针的元数据相关联的一个DFA节点处使该统一DFA的行走转换成使在向前向行走方向行走该滞后NFA。该滞后NFA的起始节点可以与该滞后部分的一个第一元素相关联。该方法可以上报所选择的该子图样的一次匹配,以及与该DFA节点处的所选择的该子图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移作为所选择的该子图样的一个结束偏移,以及所选择的该子图样的一个长度,如果该长度是固定的。The method may convert a walk of the unified DFA to walking the lag NFA in a forward direction at a DFA node of the unified DFA associated with the last element of the selected subpattern and with metadata indicating a pointer to a start node of the lag NFA to the at least one processor. The start node of the lag NFA may be associated with a first element of the lag portion. The method may report a match of the selected subpattern, an offset within the payload of a character that matches the last element of the selected subpattern at the DFA node as an end offset of the selected subpattern, and a length of the selected subpattern, if the length is fixed.
该方法可以在与该至少一个图样的该最后元素相关联,与向该至少一个处理器指示一个指向该伞状NFA的一个起始节点的指针的元数据相关联的该至少一个NFA的一个滞后节点处使该至滞后NFA的行走转换成在反向行走方向行走该伞状NFA。该伞状NFA的起始节点可以与该至少一个图样的最后元素相关联。该方法可以可选地上报与该滞后节点处的该至少一个图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移。该方法可以可选地上报该至少一个图样的该滞后部分的一个匹配。该方法可以在与该至少一个图样的该第一元素相关联,与元数据相关联该伞状NFA的一个伞状节点处:终止该行走,以及上报该至少一个图样的一次最终匹配,以及与该伞状节点处的该至少一个图样的该第一元素匹配的一个起始字符的该有效载荷内的一个起始偏移作为该至少一个图样的一个起始偏移,如果与该至少一个图样相关联的一个限定符需要的话。The method may, at a lag node of the at least one NFA associated with the last element of the at least one pattern and metadata indicating a pointer to a start node of the umbrella NFA to the at least one processor, switch the walk to the lag NFA to a walk in the reverse direction of the umbrella NFA. The start node of the umbrella NFA may be associated with the last element of the at least one pattern. The method may optionally report an offset within the payload of a character that matches the last element of the at least one pattern at the lag node. The method may optionally report a match for the lag portion of the at least one pattern. The method may, at an umbrella node of the umbrella NFA associated with metadata associated with the first element of the at least one pattern, terminate the walk and report a final match for the at least one pattern and, if required by a qualifier associated with the at least one pattern, a starting offset within the payload of a starting character that matches the first element of the at least one pattern at the umbrella node as a starting offset for the at least one pattern.
如果所选择的该子图样的一个第一元素不是该至少一个图样的一个第一元素,并且所选择的该子图样的一个最后元素不是该至少一个图样的一个最后元素,所选择的该子图样的该位置可以是该至少一个图样的一个中间位置,并且如果所选择的该子图样的该第一元素是该至少一个图样的该第一元素,所选择的该子图样的该位置可以是该至少一个图样的一个开始位置,并且如果该子图样的该长度是固定的或可变的,该至少一个图样的用于生成该至少一个NFA的该部分可以包括该至少一个图样的一个滞后部分和一个前导部分。该至少一个图样的该滞后部分可以是排除了该至少一个图样的该前导部分的该至少一个图样。该前导部分可以包括该至少一个图样的该第一元素、所选择的该子图样的该最后元素、以及其间的该至少一个图样中的所有元素。如果所选择的该子图样的位置可以是该开始位置,该滞后部分可以是所选择的该子图样。该至少一个NFA可以包括一个滞后NFA和一个前导NFA。该至少一个行走方向可以包括一个前向行走方向和一个反向行走反向。该滞后NFA可以具有该前向行走方向。该前导NFA可以具有该反向行走方向。该至少一个图样的该滞后部分已经可以用于生成该滞后NFA,并且该至少一个图样的该前导部分已经可以用于生成该前导NFA。If the first element of the selected sub-pattern is not the first element of the at least one pattern, and the last element of the selected sub-pattern is not the last element of the at least one pattern, the position of the selected sub-pattern may be a middle position of the at least one pattern, and if the first element of the selected sub-pattern is the first element of the at least one pattern, the position of the selected sub-pattern may be a starting position of the at least one pattern, and if the length of the sub-pattern is fixed or variable, the portion of the at least one pattern used to generate the at least one NFA may include a lag portion and a leading portion of the at least one pattern. The lag portion of the at least one pattern may be the at least one pattern excluding the leading portion of the at least one pattern. The leading portion may include the first element of the at least one pattern, the last element of the selected sub-pattern, and all elements of the at least one pattern therebetween. If the position of the selected sub-pattern may be the starting position, the lag portion may be the selected sub-pattern. The at least one NFA may include a lag NFA and a leading NFA. The at least one walking direction may include a forward walking direction and a reverse walking direction. The lag NFA may have the forward walking direction. The leading NFA may have the reverse walking direction. The lag portion of the at least one pattern may have been used to generate the lag NFA, and the leading portion of the at least one pattern may have been used to generate the leading NFA.
该方法可以在该统一DFA的与所选择的该子图样的该最后元素相关联,与向该至少一个处理器指示一个指向该滞后NFA的一个起始节点的指针以及一个指向该前导NFA的一个起始节点的元数据相关联的一个DFA节点处使该统一DFA的行走转换成在该前向行走方向行走该滞后NFA。该滞后NFA的起始节点可以与该滞后部分的一个第一元素相关联。该方法可以使该统一DFA的行走转换成在反向行走方向行走前导NFA。该前导NFA的起始节点可以与所选择的该子图样的最后元素相关联。该方法可以上报所选择的该子图样的一次匹配,以及与该DFA节点处的所选择的该子图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移作为所选择的该子图样的一个结束偏移,以及所选择的该子图样的一个长度,如果该长度是固定的。The method may convert a walk of the unified DFA to a walk of the lag NFA in the forward direction at a DFA node of the unified DFA associated with the last element of the selected subpattern and metadata indicating to the at least one processor a pointer to a start node of the lag NFA and a start node of the predecessor NFA. The start node of the lag NFA may be associated with a first element of the lag portion. The method may convert a walk of the unified DFA to a walk of the predecessor NFA in the reverse direction. The start node of the predecessor NFA may be associated with the last element of the selected subpattern. The method may report a match of the selected subpattern, an offset within the payload of a character that matches the last element of the selected subpattern at the DFA node as an end offset of the selected subpattern, and a length of the selected subpattern, if the length is fixed.
该方法可以在与该至少一个图样的该最后元素相关联,与元数据相关联的该至少一个NFA的一个滞后节点处终止行走该滞后NFA。该方法可以上报与该滞后节点处的该至少一个图样的该最后元素匹配的滞后字符的该有效载荷内的一个滞后偏移,并上报该至少一个图样的该滞后部分的一个匹配。该方法可以在与该至少一个图样的该第一元素相关联,与元数据相关联的该至少一个NFA的一个前导节点处:终止行走该前导NFA,以及上报该前导部分的一次匹配,以及与该前导节点处的该至少一个图样的该第一元素匹配的一个前导字符的该有效载荷内的一个前导偏移。The method may terminate walking the lag NFA at a lag node of the at least one NFA associated with the last element of the at least one pattern and the metadata. The method may report a lag offset within the payload of the lag character that matches the last element of the at least one pattern at the lag node, and report a match of the lag portion of the at least one pattern. The method may terminate walking the lead NFA at a lead node of the at least one NFA associated with the first element of the at least one pattern and the metadata, and report a match of the lead portion and a lead offset within the payload of a lead character that matches the first element of the at least one pattern at the lead node.
如果所选择的该子图样的一个第一元素不是该至少一个图样的一个第一元素,并且所选择的该子图样的一个最后元素不是该至少一个图样的一个最后元素,所选择的该子图样的该位置可以是该至少一个图样的一个中间位置;并且如果所选择的该子图样的该长度是固定的或可变的,该至少一个NFA可以是单个NFA。该至少一个行走方向可以包括一个前向行走方向,用于与该至少一个图样的一个滞后部分的多个元素相关联的单个NFA的多个运行时间处理节点,以及一个反向行走方向,用于与该至少一个图样的所有元素相关联的该单个NFA的多个运行时间处理节点。该至少一个图样的该滞后部分可以是排除了该至少一个图样的一个前导部分的该至少一个图样。该前导部分可以包括该至少一个图样的该第一元素、所选择的该子图样的该最后元素、以及其间的该至少一个图样中的所有元素。If a first element of the selected subpattern is not a first element of the at least one pattern, and a last element of the selected subpattern is not a last element of the at least one pattern, the position of the selected subpattern may be an intermediate position of the at least one pattern; and if the length of the selected subpattern is fixed or variable, the at least one NFA may be a single NFA. The at least one walking direction may include a forward walking direction for runtime processing nodes of the single NFA associated with elements of a lag portion of the at least one pattern, and a reverse walking direction for runtime processing nodes of the single NFA associated with all elements of the at least one pattern. The lag portion of the at least one pattern may be the at least one pattern excluding a leading portion of the at least one pattern. The leading portion may include the first element of the at least one pattern, the last element of the selected subpattern, and all elements of the at least one pattern therebetween.
该方法可以在该统一DFA的与所选择的该子图样的该最后元素相关联,与向该至少一个处理器指示一个指向该单个NFA的一个起始节点的指针的元数据相关联的一个DFA节点处使行走该统一DFA转换成在向前向行走方向行走该单个NFA。该起始节点可以与该至少一个图样中紧随所选择的该子图样的最后元素的下一元素相关联。该方法可以上报所选择的该子图样的一次匹配,与该DFA节点处的所选择的该子图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移作为所选择的该子图样的一个结束偏移,以及所选择的该子图样的一个长度,如果该长度是固定的。The method may convert walking the unified DFA into walking the single NFA in a forward direction at a DFA node of the unified DFA associated with the last element of the selected subpattern and associated with metadata indicating a pointer to a start node of the single NFA to the at least one processor. The start node may be associated with the element immediately following the last element of the selected subpattern in the at least one pattern. The method may report a match of the selected subpattern, an offset within the payload of a character that matched the last element of the selected subpattern at the DFA node as an end offset of the selected subpattern, and a length of the selected subpattern, if the length is fixed.
该方法可以在与该至少一个图样的一个最后元素相关联,与元数据相关联的该至少一个NFA的一个滞后节点处使用与所选择的该子图样的结束偏移相关联的有效载荷起始偏移使得从行走该统一DFA转换成在反向行走方向行走该单个NFA。该方法可以在与该至少一个图样的该第一元素相关联,与元数据相关联的该至少一个NFA的一个前导节点处终止该行走。该方法可以上报与该前导节点处的该至少一个图样的该第一元素匹配的一个字符的该有效载荷内的一个偏移作为该至少一个图样的一个起始偏移,如果与该至少一个图样相关联的一个限定符需要的话,以及该至少一个图样的一次最终匹配。The method may use the payload start offset associated with the end offset of the selected sub-pattern at a lag node of the at least one NFA associated with metadata, associated with a last element of the at least one pattern, to switch from walking the unified DFA to walking the single NFA in a reverse walking direction. The method may terminate the walk at a predecessor node of the at least one NFA associated with metadata, associated with the first element of the at least one pattern. The method may report an offset within the payload of a character that matched the first element of the at least one pattern at the predecessor node as a starting offset for the at least one pattern, if required by a qualifier associated with the at least one pattern, and a final match for the at least one pattern.
如果所选择的该子图样的一个第一元素不是该至少一个图样的一个第一元素,并且所选择的该子图样的一个最后元素不是该至少一个图样的一个最后元素,所选择的该子图样的该位置可以是该至少一个图样的一个中间位置;并且如果所选择的该子图样的该长度是固定的,该至少一个NFA可以是单个NFA。该至少一个行走方向可以包括一个反向行走方向,用于与该至少一个图样的一个前导部分相关联的单个NFA的多个运行时间处理节点,以及一个前向行走方向,用于与该至少一个图样的所有元素相关联的该单个NFA的多个运行时间处理节点的。该前导部分可以是排除了该至少一个图样的一个滞后部分的该至少一个图样。该滞后部分可以包括所选择的该子图样的该第一元素、该至少一个图样的该最后元素、以及其间的该至少一个图样中的所有元素。If a first element of the selected subpattern is not a first element of the at least one pattern, and a last element of the selected subpattern is not a last element of the at least one pattern, the position of the selected subpattern may be an intermediate position of the at least one pattern; and if the length of the selected subpattern is fixed, the at least one NFA may be a single NFA. The at least one walking direction may include a reverse walking direction for multiple runtime processing nodes of the single NFA associated with a leading portion of the at least one pattern, and a forward walking direction for multiple runtime processing nodes of the single NFA associated with all elements of the at least one pattern. The leading portion may be the at least one pattern excluding a lag portion of the at least one pattern. The lag portion may include the first element of the selected subpattern, the last element of the at least one pattern, and all elements of the at least one pattern therebetween.
该方法可以在该统一DFA的与所选择的该子图样的该最后元素相关联,与向该至少一个处理器指示一个指向该单个NFA的一个起始节点的指针的元数据相关联的一个DFA节点处使该统一DFA的行走转换成在反向行走方向行走该单个NFA。该起始节点可以与该前导部分的一个最后元素相关联。可以通过从所选择的该子图样的结束偏移减去所选择的该子图样的长度来确定有效载荷起始偏移。该方法可以上报所选择的该子图样的一次匹配,与该DFA节点处的所选择的该子图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移作为所选择的该子图样的一个结束偏移,以及所选择的该子图样的该长度。The method may, at a DFA node of the unified DFA associated with the last element of the selected subpattern and associated with metadata indicating a pointer to a start node of the single NFA to the at least one processor, cause a walk of the unified DFA to switch to walking the single NFA in a reverse direction. The start node may be associated with a last element of the preamble. A payload start offset may be determined by subtracting the length of the selected subpattern from the end offset of the selected subpattern. The method may report a match of the selected subpattern, an offset within the payload of a character that matched the last element of the selected subpattern at the DFA node as an end offset of the selected subpattern, and the length of the selected subpattern.
该方法可以在与该至少一个图样的该第一元素相关联,与元数据相关联的该单个NFA的一个前导节点处在前向行走方向行走该单个NFA。该方法可以在与该至少一个图样的该最后元素相关联,与元数据相关联的该单个NFA的一个滞后节点处终止该行走。该方法可以上报与该滞后节点处的该至少一个图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移,以及该至少一个图样的一次最终匹配。The method may walk the single NFA in a forward walking direction at a preceding node of the single NFA associated with the first element of the at least one pattern and associated with metadata. The method may terminate the walking at a lag node of the single NFA associated with the last element of the at least one pattern and associated with metadata. The method may report an offset within the payload of a character that matches the last element of the at least one pattern at the lag node and a final match of the at least one pattern.
如果所选择的该子图样的一个最后元素是该至少一个子图样的一个最后元素,所选择的该子图样的该位置可以是该至少一个子图样的一个结束位置,并且如果所选择的该子图样的该长度是固定的,该至少一个图样的用于生成该至少一个NFA的该部分是可以排除了所选择的该子图样的该至少一个图样,并且该至少一个行走方向可以是一个反向行走方向。If a last element of the selected sub-pattern is a last element of the at least one sub-pattern, the position of the selected sub-pattern may be an end position of the at least one sub-pattern, and if the length of the selected sub-pattern is fixed, the portion of the at least one pattern used to generate the at least one NFA may be the at least one pattern excluding the selected sub-pattern, and the at least one walking direction may be a reverse walking direction.
该方法可以在该统一DFA的与所选择的该子图样的该最后元素相对应的与向该至少一个处理器指示一个指向该至少一个NFA的一个起始节点的指针的元数据相关联的一个DFA节点处使该统一DFA的行走转换成在反向行走方向行走该至少一个NFA。该至少一个NFA的该起始节点可以与该部分的一个最后元素相关联。该方法可以上报所选择的该子图样的一次匹配,以及与该DFA节点处的所选择的该子图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移作为所选择的该子图样的一个结束偏移。可以通过从所选择的该子图样的结束偏移减去所选择的该子图样的长度来确定该至少一个NFA的有效载荷起始偏移,如果该长度是固定的。The method may convert a walk of the unified DFA to a walk of the at least one NFA in a reverse direction at a DFA node of the unified DFA corresponding to the last element of the selected subpattern and associated with metadata indicating a pointer to a start node of the at least one NFA to the at least one processor. The start node of the at least one NFA may be associated with a last element of the portion. The method may report a match of the selected subpattern and an offset within the payload of a character that matches the last element of the selected subpattern at the DFA node as an end offset of the selected subpattern. The payload start offset of the at least one NFA may be determined by subtracting the length of the selected subpattern from the end offset of the selected subpattern, if the length is fixed.
该方法可以在与该部分的一个第一元素相关联,该至少一个NFA的与元数据相关联的一个NFA节点处:终止该行走,以及上报该至少一个图样的一次最终匹配,以及与该NFA节点处的该部分的该第一元素匹配的一个字符的该有效载荷内的一个偏移作为该至少一个图样的一个起始偏移,如果与该至少一个图样相关联的一个限定符需要的话。The method may, at an NFA node of the at least one NFA associated with metadata, associated with a first element of the portion: terminate the walk, and report a final match of the at least one pattern, and an offset within the payload of a character that matches the first element of the portion at the NFA node as a starting offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
如果所选择的该子图样的一个最后元素可以是该至少一个子图样的一个最后元素,所选择的该子图样的该位置可以是该至少一个子图样的一个结束位置,并且如果所选择的该子图样的该长度是可变的或固定的,该至少一个图样的用于生成该至少一个NFA的该部分可以是该至少一个图样,并且该至少一个行走方向可以是一个反向行走方向。If a last element of the selected sub-pattern may be a last element of the at least one sub-pattern, the position of the selected sub-pattern may be an end position of the at least one sub-pattern, and if the length of the selected sub-pattern is variable or fixed, the portion of the at least one pattern used to generate the at least one NFA may be the at least one pattern, and the at least one walking direction may be a reverse walking direction.
该方法可以在该统一DFA的与所选择的该子图样的该最后元素相对应的与向该至少一个处理器指示一个指向该至少一个NFA的一个起始节点的指针的元数据相关联的一个DFA节点处使该统一DFA的行走转换成在反向行走方向行走该至少一个NFA。该至少一个NFA的该起始节点可以与该所选择的该子图样的一个最后元素相关联。该方法可以上报所选择的该子图样的一次匹配,以及与该DFA节点处的所选择的该子图样的该最后元素匹配的一个字符的该有效载荷内的一个偏移作为所选择的该子图样的一个结束偏移,以及所选择的该子图样的一个长度,如果该长度是固定的,该至少一个NFA的一个有效载荷起始偏移与所选择的该子图样的该结束偏移相关联。The method may, at a DFA node of the unified DFA corresponding to the last element of the selected subpattern and associated with metadata indicating a pointer to a start node of the at least one NFA to the at least one processor, switch the walk of the unified DFA to a walk of the at least one NFA in a reverse direction. The start node of the at least one NFA may be associated with the last element of the selected subpattern. The method may report a match of the selected subpattern, an offset within the payload of a character that matches the last element of the selected subpattern at the DFA node as an end offset of the selected subpattern, and a length of the selected subpattern, whereby, if the length is fixed, a payload start offset of the at least one NFA is associated with the end offset of the selected subpattern.
该方法可以在与该部分的一个第一元素相关联,该至少一个NFA的与元数据相关联的一个NFA节点处:终止该行走,以及上报该至少一个图样的一次最终匹配,以及与该NFA节点处的该部分的该第一元素匹配的一个字符的该有效载荷内的一个偏移作为该至少一个图样的一个起始偏移,如果与该至少一个图样相关联的一个限定符需要的话。The method may, at an NFA node of the at least one NFA associated with metadata, associated with a first element of the portion: terminate the walk, and report a final match of the at least one pattern, and an offset within the payload of a character that matches the first element of the portion at the NFA node as a starting offset of the at least one pattern, if required by a qualifier associated with the at least one pattern.
可以将该统一DFA和该至少一个NFA存储为包括该统一DFA和该至少一个NFA的一个二进制图像。The unified DFA and the at least one NFA may be stored as a binary image including the unified DFA and the at least one NFA.
该至少一个处理器可以包括被配置成一个加速单元以分别分流DFA和NFA运行时间处理的一个DFA协处理器和一个NFA协处理器。The at least one processor may include a DFA coprocessor and an NFA coprocessor configured as an acceleration unit to offload DFA and NFA runtime processing, respectively.
在此所披露的另一示例实施例包括一种对应于与在此所披露的方法实施例一致的操作的装置。Another example embodiment disclosed herein includes an apparatus corresponding to operations consistent with the method embodiments disclosed herein.
进一步地,又一个示例实施例可以包括非瞬态计算机可读介质,该非瞬态计算机可读存储介质上存储有一个指令序列,当被处理器加载和执行时,该指令序列使处理器执行在此所披露的方法。Furthermore, yet another example embodiment may include a non-transitory computer-readable storage medium having stored thereon a sequence of instructions that, when loaded and executed by a processor, causes the processor to perform the methods disclosed herein.
附图简要说明BRIEF DESCRIPTION OF THE DRAWINGS
从本发明的示例实施例的以下更具体的说明中上述内容将是明显的,如在这些附图中所展示的,其中,贯穿这些不同的视图,相似的参考字符是指相同的部分。附图不一定按比例,而是着重于展示本发明的实施例。The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the invention.
图1是安全装置的实施例的框图,可以在该安全装置中实施在此所披露的实施例。FIG1 is a block diagram of an embodiment of a security device in which the embodiments disclosed herein may be implemented.
图2A-G是示例NFA和DFA图形和展示了图爆的概念的表格。2A-G are example NFA and DFA graphs and a table illustrating the concept of graph explosion.
图3A是安全装置的实施例的另一框图,可以在该安全装置中实施在此所披露的实施例。3A is another block diagram of an embodiment of a security device in which embodiments disclosed herein may be implemented.
图3B为可以在至少一个处理器中实施的方法的示例实施例的流程图(350),该处理器操作性地耦合至安全装置内的至少一个存储器,该安全装置操作性地耦合至网络。3B is a flow chart ( 350 ) of an example embodiment of a method that may be implemented in at least one processor operatively coupled to at least one memory within a security device operatively coupled to a network.
图3C为可以在至少一个处理器中实施的方法的示例实施例的流程图,该处理器操作性地耦合至安全装置内的至少一个存储器,该安全装置操作性地耦合至网络。3C is a flow diagram of an example embodiment of a method that may be implemented in at least one processor operatively coupled to at least one memory within a security device operatively coupled to a network.
图4是用于在所选择的子图样的长度是固定的并且所选择的该子图样的位置是正则表达式图样的开始位置的基础上生成统一DFA和至少一个NFA的实施例的框图。4 is a block diagram of an embodiment for generating a unified DFA and at least one NFA based on the length of a selected sub-pattern being fixed and the position of the selected sub-pattern being the start position of a regular expression pattern.
图5是用于在所选择的子图样的位置是正则表达式图样的中间位置并且所选择的该子图样的长度固定的基础上生成统一DFA和至少一个NFA的实施例的框图。FIG5 is a block diagram of an embodiment for generating a unified DFA and at least one NFA based on the position of a selected sub-pattern being the middle position of a regular expression pattern and the length of the selected sub-pattern being fixed.
图6是用于在所选择的子图样的位置是正则表达式图样的中间位置或开始位置并且该子图样的长度固定或可变的基础上生成统一DFA和至少一个NFA的实施例的框图。6 is a block diagram of an embodiment for generating a unified DFA and at least one NFA based on the position of a selected sub-pattern being the middle or the beginning of a regular expression pattern and the length of the sub-pattern being fixed or variable.
图7是用于在所选择的子图样的位置是正则表达式图样的中间位置或开始位置并且所选择的该子图样的长度固定或可变的基础上生成统一DFA和至少一个NFA的另一实施例的框图。7 is a block diagram of another embodiment for generating a unified DFA and at least one NFA based on the position of the selected sub-pattern being the middle or the beginning of the regular expression pattern and the length of the selected sub-pattern being fixed or variable.
图8是用于在所选择的子图样的位置是正则表达式图样的中间位置并且所选择的该子图样的长度固定或可变的基础上生成统一DFA和至少一个NFA的实施例的框图。8 is a block diagram of an embodiment for generating a unified DFA and at least one NFA based on the position of the selected sub-pattern being the middle position of the regular expression pattern and the length of the selected sub-pattern being fixed or variable.
图9是用于在所选择的该子图样的位置是正则表达式图样的中间位置并且所选择的该子图样的长度固定的基础上生成统一DFA和至少一个NFA的实施例的框图。FIG9 is a block diagram of an embodiment for generating a unified DFA and at least one NFA based on the position of the selected sub-pattern being the middle position of the regular expression pattern and the length of the selected sub-pattern being fixed.
图10是用于在所选择的子图样的位置是正则表达式图样的结束位置并且所选择的该子图样的长度固定的基础上生成统一DFA和至少一个NFA的实施例的框图。FIG10 is a block diagram of an embodiment for generating a unified DFA and at least one NFA based on the position of a selected sub-pattern being the end position of a regular expression pattern and the length of the selected sub-pattern being fixed.
图11是用于在所选择的子图样的位置是正则表达式图样的结束位置并且所选择的该子图样的长度可变或固定的基础上生成统一DFA和至少一个NFA的实施例的框图。11 is a block diagram of an embodiment for generating a unified DFA and at least one NFA based on the position of a selected sub-pattern being the end position of a regular expression pattern and the length of the selected sub-pattern being variable or fixed.
图12是可选地在此所披露的实施例内部的计算机的示例内部结构的框图。12 is a block diagram of an example internal structure of a computer optionally within embodiments disclosed herein.
发明详细说明Detailed Description of the Invention
在详细描述本发明的示例实施例之前,以下紧接着描述了可以使用确定有限自动机(DFA)和非确定有限自动机(NFA)在其中实施这些实施例的示例网络安全应用,以帮助读者理解本发明的发明特征。Before describing example embodiments of the present invention in detail, an example network security application in which these embodiments may be implemented using deterministic finite automata (DFA) and nondeterministic finite automata (NFA) is described below to help readers understand the inventive features of the present invention.
图1是安全装置102的实施例的框图,可以在该安全装置中实施本发明的实施例。安全装置102可以包括网络服务处理器100。安全装置102可以是可以将在一个网络接口103a接收到的数据包切换到另一个网络接口103b并且可以在转发这些数据包之前在所接收到的数据包上执行多种安全功能的独立系统。例如,安全装置102可以用于在数据包101a上执行安全处理,可以在将处理过的数据包101b转发至局域网(LAN)105b或任何其他合适的网络之前在广域网(WAN)105a或任何其他合适的网络上接收到这些数据包。FIG1 is a block diagram of an embodiment of a security appliance 102 in which embodiments of the present invention may be implemented. Security appliance 102 may include a network services processor 100. Security appliance 102 may be a standalone system that switches packets received on one network interface 103a to another network interface 103b and performs various security functions on the received packets before forwarding them. For example, security appliance 102 may be configured to perform security processing on packets 101a received on a wide area network (WAN) 105a or any other suitable network before forwarding the processed packets 101b to a local area network (LAN) 105b or any other suitable network.
网络服务处理器100可以被配置成用于对所接收到的数据包中所封装的开放系统互连(OSI)网络L2-L7层协议进行处理。如本领域技术人员所熟知的,OSI参考模型定义了七层网络协议层(L1-7)。物理层(L1)表示将设备连接到传输媒介的实际接口,包括电气接口和物理接口。数据链路层(L2)执行数据组帧。网络层(L3)将数据格式化为数据包。传输层(L4)处理端到端的传输。会话层(L5)管理设备之间的通信,例如,无论通信是半双工的还是全双工的。表现层(L6)管理数据格式化及表现,例如,语法、控制代码、特殊图形及字符集。应用层(L7)允许多个用户之间的通信,例如,文件传送及电子邮件。The network service processor 100 can be configured to process the open systems interconnection (OSI) network L2-L7 layer protocols encapsulated in the received data packets. As is well known to those skilled in the art, the OSI reference model defines seven network protocol layers (L1-7). The physical layer (L1) represents the actual interface that connects the device to the transmission medium, including the electrical interface and the physical interface. The data link layer (L2) performs data framing. The network layer (L3) formats the data into data packets. The transport layer (L4) handles end-to-end transmission. The session layer (L5) manages communication between devices, for example, whether the communication is half-duplex or full-duplex. The presentation layer (L6) manages data formatting and presentation, for example, syntax, control codes, special graphics and character sets. The application layer (L7) allows communication between multiple users, for example, file transfer and email.
网络服务处理器100可以为上层网络协议(例如,L4-L7)调度和排列工作(例如,数据包处理操作),并且使得能够在所接收到的数据包中进行上层网络协议的处理,以便以线速(即,网络的数据传送速率,可以在该网络上传输和接收数据)转发数据包。通过处理这些协议来以线速转发这些数据包,网络服务处理器100不会降低网络数据传送速率。网络服务处理器100可以从网络接口103a或103b接收数据包(这些网络接口可以是物理硬件接口),并且可以在所接收的数据包上执行L2-L7网络协议。网络服务处理器100可以随后通过网络接口103a或103b将处理过的数据包101b转发至网络中的另一跳(最终目的地),或通过另一总线(未示出)以用于由主处理器(未示出)进行进一步处理。网络协议处理可以包括网络安全协议的处理,如防火墙、应用防火墙、包括IP安全(IPSec)和/或安全套接字层(SSL)的虚拟专用网(VPN)、入侵检测系统(IDS)和防病毒(AV)。The network services processor 100 can schedule and queue work (e.g., packet processing operations) for upper layer network protocols (e.g., L4-L7) and enable processing of the upper layer network protocols in received packets to forward the packets at line speed (i.e., the data transfer rate of the network over which data can be transmitted and received). By processing these protocols to forward these packets at line speed, the network services processor 100 does not degrade the network data transfer rate. The network services processor 100 can receive packets from network interfaces 103a or 103b (these network interfaces can be physical hardware interfaces) and can execute L2-L7 network protocols on the received packets. The network services processor 100 can then forward the processed packets 101b via network interfaces 103a or 103b to another hop in the network (the final destination) or via another bus (not shown) for further processing by a host processor (not shown). Network protocol processing may include processing of network security protocols such as firewalls, application firewalls, virtual private networks (VPNs) including IP Security (IPSec) and/or Secure Sockets Layer (SSL), intrusion detection systems (IDS), and antivirus (AV).
网络服务处理器100可以使用多个处理器(即,内核)传送高应用性能。内核(未示出)中的每个可以专用于执行数据面或控制面操作。数据面操作可以包括数据包操作以便转发数据包。控制面操作可以包括处理复杂的高层协议的多个部分,如互联网协议安全(IPSec)、传输控制协议(TCP)和安全套接字层(SSL)。数据面操作可以包括处理这些复杂的高层协议的其他部分。The network services processor 100 can use multiple processors (i.e., cores) to deliver high application performance. Each of the cores (not shown) can be dedicated to performing data plane or control plane operations. Data plane operations can include packet operations to forward packets. Control plane operations can include processing multiple parts of complex high-level protocols such as Internet Protocol Security (IPSec), Transmission Control Protocol (TCP), and Secure Sockets Layer (SSL). Data plane operations can include processing other parts of these complex high-level protocols.
网络服务处理器100还可以包括分流内核从而使得网络服务处理器100实现高吞吐量的特定用途协处理器(未示出)。例如,网络服务处理器100可以包括一个加速单元106,该加速单元可以包括一个用于NFA处理的硬件加速的超非确定自动机(HNA)协处理器108和一个用于DFA处理的硬件加速的超级有限自动机(HFA)协处理器110。HNA 108和HFA 110协处理器可以被配置成用于分流网络服务处理器100通用内核(未示出)的执行计算和内存密集型图样匹配法的重负。The network services processor 100 may also include special purpose coprocessors (not shown) that offload cores to enable high throughput for the network services processor 100. For example, the network services processor 100 may include an acceleration unit 106, which may include a hardware-accelerated hyper nondeterministic automaton (HNA) coprocessor 108 for NFA processing and a hardware-accelerated hyper finite automaton (HFA) coprocessor 110 for DFA processing. The HNA 108 and HFA 110 coprocessors may be configured to offload the burden of executing computationally and memory-intensive pattern matching methods from the general purpose cores (not shown) of the network services processor 100.
网络服务处理器100可以执行图样搜索、正则表达式处理、内容验证、变换和安全以加速数据包处理。正则表达式处理和图样搜索可以用于针对AV和IDS应用以及需要字符串匹配的其他应用执行字符串匹配。网络服务处理器100中的存储器控制器(未示出)可以控制对存储器104的访问,该存储器操作性地耦合至网络服务处理器100。该存储器可以是内部的(即,片上)或外部(即,片外)、或其组合,并且可以被配置成用于存储所接收到的数据包,如用于由网络服务处理器100进行处理的数据包101a。该存储器可以被配置成用于存储DFA和NFA图形表达式搜索中查找和图样匹配所利用的编译规则数据。编译规则数据可以被存储为包括用于DFA和NFA两者的编译规则数据的二进制图像112,或被存储为将DFA编译规则数据从NFA编译规则数据分开的多个二进制图像。The network services processor 100 can perform pattern searching, regular expression processing, content validation, transformation, and security to accelerate packet processing. Regular expression processing and pattern searching can be used to perform string matching for AV and IDS applications, as well as other applications that require string matching. A memory controller (not shown) in the network services processor 100 can control access to a memory 104 that is operatively coupled to the network services processor 100. The memory can be internal (i.e., on-chip) or external (i.e., off-chip), or a combination thereof, and can be configured to store received packets, such as packet 101a for processing by the network services processor 100. The memory can be configured to store compiled rule data utilized for lookups and pattern matching in DFA and NFA graph expression searches. The compiled rule data can be stored as a binary image 112 that includes compiled rule data for both the DFA and NFA, or as multiple binary images that separate the DFA compiled rule data from the NFA compiled rule data.
典型的内容感知应用处理使用或者DFA或者NFA来识别所接收到的数据包的内容中的图样。DFA和NFA两者都是有限状态机,即,计算模型,计算模型中的每一个都包括状态集合、开始状态、输入字母表(所有可能的符号集合)和转换函数。计算在开始状态中开始,并且根据转换函数而改变为新的状态。Typical content-aware application processing uses either a DFA or an NFA to identify patterns in the content of received data packets. Both DFA and NFA are finite state machines, i.e., computational models, each of which consists of a set of states, a start state, an input alphabet (the set of all possible symbols), and a transition function. Computation begins in the start state and changes to a new state according to the transition function.
图样通常使用正则表达式来表达,正则表达式包括基本元素,例如,诸如A-Z、0-9的正常文本字符以及诸如*、^和|的元字符。正则表达式的基本元素是有待匹配的符号(单个字符)。基本元素可以与允许连结(+)交替(|)、以及Kleene星号(*)的元字符结合。用于连结的元字符可以用于从单个字符(或子串)创建多个字符匹配图样,而用于交替(|)的元字符可以用于创建可以匹配两个或更多个子串中任何一个的正则表达式。元字符Kleene星号(*)允许图样匹配任意次数,包括不会发生在前字符或字符串。Patterns are typically expressed using regular expressions, which include basic elements such as normal text characters such as A-Z, 0-9, and metacharacters such as *, ^, and |. The basic element of a regular expression is the symbol (single character) to be matched. The basic elements can be combined with metacharacters that allow concatenation (+), alternation (|), and the Kleene asterisk (*). The metacharacters used for concatenation can be used to create multiple character matching patterns from a single character (or substring), while the metacharacters used for alternation (|) can be used to create a regular expression that can match any of two or more substrings. The metacharacter Kleene asterisk (*) allows the pattern to be matched any number of times, including not occurring in the preceding character or string.
组合不同的运算符和单个字符允许构建表达式的复杂子图样。例如,如(th(is|at)*)的子图样可以与多个字符串匹配,如:th、this、that、thisis、thisat、thatis或thatat。表达式的复杂子图样的另一示例可以是结合字符类结构[...]的示例,该字符类结构允许列出要搜索的字符列表。例如,gr[ea]y查找grey和gray两者。其他复杂子图样示例是可以使用破折号指示字符范围(例如[A-Z])的示例或与任一字符匹配的元字符“.”。该图样的元素可以是基本元素或一个或多个基本元素与一个或多个元字符的组合。Combining different operators and single characters allows to build complex subpatterns of expressions. For example, a subpattern like (th(is|at)*) can match multiple strings like: th, this, that, thisis, thisat, thatis or thatat. Another example of a complex subpattern of an expression can be the one that combines the character class structure [...] which allows to list a list of characters to be searched for. For example, gr[ea]y finds both grey and gray. Other examples of complex subpatterns are the one that can use a dash to indicate a range of characters (e.g. [A-Z]) or the metacharacter "." which matches any character. The elements of the pattern can be basic elements or a combination of one or more basic elements with one or more metacharacters.
对DFA或NFA状态机的输入通常是(8位)字节串,即,字母可以是来自输入流(即,所接收到的数据包)的单个字节(一个字符或符号)。输入流中的每个字节可以产生从一个状态到另一状态的转换。可以用一个图形表示DFA或NFA状态机的状态和转换函数。该图形中的每个节点可以代表一个状态,并且该图形中的弧可以代表状态转换。状态机的当前状态可以由选择图形中的特定节点的节点标识符来表示。The input to a DFA or NFA state machine is typically a (8-bit) byte string, i.e., a letter can be a single byte (a character or symbol) from the input stream (i.e., the received data packet). Each byte in the input stream can generate a transition from one state to another. The states and transition functions of a DFA or NFA state machine can be represented by a graph. Each node in the graph can represent a state, and the arcs in the graph can represent state transitions. The current state of the state machine can be represented by the node identifier of a particular node in the selection graph.
可以将使用DFA来处理正则表达式并且找到字符的输入流中由正则表达式描述的一个或多个图样表征为具有确定的运行时间性能。可以从输入字符(或符号)以及DFA的当前状态确定DFA的下一个状态,因为每个DFA状态只有一次状态转换。这样,DFA的运行时间性能被认为是确定的并且完全可以从输入预测行为。然而,用于确定性的折中办法是其中节点的数量(或图形大小)可以随着图样的大小呈指数增长的图形。DFA can be used to process regular expressions and find one or more patterns described by regular expressions in the input stream of characters to be characterized as having definite runtime performance. The next state of DFA can be determined from the current state of input characters (or symbols) and DFA, because each DFA state has only one state transition. Like this, the runtime performance of DFA is considered to be definite and can predict behavior from input completely. However, a compromise for determinism is a graph in which the number of nodes (or graph size) can be exponentially increased along with the size of the pattern.
相比之下,可以将NFA图形的节点数量(或图形大小)可以表征为随着图样的大小而呈线性增长。然而,可以将使用NFA来处理正则表达式并且找到字符的输入流中由正则表达式所描述的一个或多个图样表征为具有非确定运行时间性能。例如,给定输入字符(或符号)和NFA的当前状态,可能存在不只一个要转换成的NFA的下一状态。这样,不能唯一地从NFA的输入和当前状态确定NFA的下一状态。因此,NFA的运行时间性能被认为是不确定的,因为不能完全从输入预测行为。In contrast, the number of nodes (or graph size) of an NFA graph can be characterized as growing linearly with the size of the pattern. However, using an NFA to process a regular expression and find one or more patterns described by the regular expression in an input stream of characters can be characterized as having non-deterministic runtime performance. For example, given an input character (or symbol) and the current state of the NFA, there may be more than one next state of the NFA to be converted to. Thus, the next state of the NFA cannot be uniquely determined from the input and the current state of the NFA. Therefore, the runtime performance of an NFA is considered to be non-deterministic because the behavior cannot be fully predicted from the input.
图2A-G示出了DFA“图爆”的概念。图2A、图2B、和图2C分别示出了图样“.*a[^\n]”、“.*a[^\n][^\n]”、“.*a[^\n][^\n][^\n]”的NFA图形,并且图2D、图2E、和图2F分别示出了相同图样的DFA图形。如图2A至图2F中所示以及图2G的表格所总结的,NFA对于某些图样可以呈线性增长,同时对于相同图样的DFA可以呈指数增长,从而导致图爆。如所示的,对于一个或多个给定图样而言,DFA状态的数量可以大于NFA状态的数量,通常近似多几百或多几千种状态。这是“图爆”的一个示例,这是DFA的一个标志特点。Figures 2A-G illustrate the concept of DFA "graph explosion." Figures 2A, 2B, and 2C respectively illustrate NFA graphs for the patterns ".*a[^\n]," ".*a[^\n][^\n]," and ".*a[^\n][^\n][^\n]," and Figures 2D, 2E, and 2F respectively illustrate DFA graphs for the same patterns. As shown in Figures 2A through 2F and summarized in the table of Figure 2G, the NFA can grow linearly for certain patterns, while the DFA for the same pattern can grow exponentially, leading to graph explosion. As shown, for one or more given patterns, the number of DFA states can be greater than the number of NFA states, typically by the order of hundreds or thousands of states more. This is an example of "graph explosion," a hallmark characteristic of DFAs.
根据在此所披露的实施例,可以使用DFA、NFA、或其组合进行内容搜索。根据一个实施例,运行时间处理器、协处理器、或其组合可以用硬件实施并且可以被配置成用于实施一个编译器或行走器。According to embodiments disclosed herein, content search may be performed using DFA, NFA, or a combination thereof. According to one embodiment, a runtime processor, coprocessor, or a combination thereof may be implemented in hardware and may be configured to implement a compiler or walker.
编译器可以将图样或图样的输入列表(也被称为特征或规则)编译成DFA、NFA、或其组合。DFA和NFA可以是二进制数据结构,如DFA和NFA图形和表格。The compiler can compile a pattern or an input list of patterns (also referred to as features or rules) into a DFA, NFA, or a combination thereof. The DFA and NFA can be binary data structures such as DFA and NFA graphs and tables.
行走器可以执行运行时间处理,即,用于标识输入流中的图样的存在、或将图样与输入流中的内容进行匹配的动作。内容可以是互联网协议(IP)数据报的有效载荷部分、或输入流中的任何其他合适的有效载荷。DFA或NFA图形的运行时间处理可以被称为用有效载荷行走DFA或NFA图形以确定图样匹配。处理器被配置成用于生成DFA、NFA、或其组合,在此可以被称为编译器。被配置成用于使用所生成的DFA、NFA、或其组合来实施有效载荷的运行时间处理的处理器在此可以被称为行走器。根据在此所披露的实施例,网络服务处理器100可以被配置成用于在安全装置102中实施编译器和行走器。The walker can perform runtime processing, that is, the action of identifying the existence of a pattern in an input stream or matching the pattern with the content in the input stream. The content can be the payload portion of an Internet Protocol (IP) datagram, or any other suitable payload in the input stream. The runtime processing of a DFA or NFA graph can be referred to as walking the DFA or NFA graph with a payload to determine pattern matching. The processor is configured to generate a DFA, NFA, or a combination thereof and can be referred to as a compiler at this time. The processor configured to use the generated DFA, NFA, or a combination thereof to implement runtime processing of a payload can be referred to as a walker at this time. According to the embodiments disclosed herein, the network service processor 100 can be configured to implement a compiler and a walker in the security device 102.
图3A是图1的安全装置102的另一实施例的框图,可以在该安全装置中实施本发明的实施例。如参照图1所描述的,安全装置102可以操作性地耦合至一个或多个网络并且可以包括存储器104和网络服务处理器100,该网络服务处理器可以包括加速单元106。参照图3A,网络服务处理器100可以被配置成用于实施生成二进制图像112的编译器306和使用二进制图像112的行走器320。例如,编译器306可以生成二进制图像112,该二进制图像包括被行走器320用于在所接收的数据包101a(图1中所示)上执行图样匹配方法的编译规则数据。根据在此所披露的实施例,编译器306可以通过基于如下文进一步描述的至少一种启发法确定针对DFA、NFA、或其组合来生成二进制图像112。编译器306可以确定有利地适合DFA和NFA的规则数据。FIG3A is a block diagram of another embodiment of the security device 102 of FIG1 , in which embodiments of the present invention may be implemented. As described with reference to FIG1 , the security device 102 may be operatively coupled to one or more networks and may include a memory 104 and a network services processor 100, which may include an acceleration unit 106. Referring to FIG3A , the network services processor 100 may be configured to implement a compiler 306 that generates a binary image 112 and a walker 320 that uses the binary image 112. For example, the compiler 306 may generate a binary image 112 that includes compiled rule data used by the walker 320 to perform a pattern matching method on a received data packet 101a (shown in FIG1 ). According to embodiments disclosed herein, the compiler 306 may generate the binary image 112 by determining, based on at least one heuristic method as further described below, for a DFA, an NFA, or a combination thereof. The compiler 306 may determine rule data that is advantageously suitable for both a DFA and an NFA.
根据在此披露的实施例,编译器306可以通过处理规则集310来生成二进制图像112,该规则集可以包括一个或多个正则表达式图样的集合304和可选的限定符308。根据规则集310,编译器306可以使用从一个或多个正则表达式图样的所有中选择的子图样和用于一个或多个正则表达式图样的集合304中的至少一个图样的至少一个NFA 314以供行走器320在运行时间处理过程中使用、以及包括映射信息的元数据(未示出)来生成统一DFA312,该映射信息用于将行走器320在统一DFA 312的状态(未示出)与至少一个NFA 314的状态之间进行转换。可以逐数据结构地将统一DFA 312和至少一个NFA 314表示为图形、或为任何其他合适的形式,并且可以逐数据结构地将元数据中的映射表示为一个或多个表、或用任何其他合适的形式。根据在此披露的实施例,如果从图样中选择的子图样为该图样,则不为该图样生成NFA。根据在此披露的实施例,所生成的每个NFA可以用于集合中的具体图样,而可以基于来自集合中所有图样的所有子图样来生成统一DFA。According to embodiments disclosed herein, compiler 306 can generate binary image 112 by processing rule set 310, which can include set 304 of one or more regular expression patterns and optional qualifiers 308. Based on rule set 310, compiler 306 can generate unified DFA 312 using subpatterns selected from all of the one or more regular expression patterns and at least one NFA 314 for at least one pattern in set 304 of one or more regular expression patterns for use by walker 320 during runtime processing, as well as metadata (not shown) including mapping information for use by walker 320 in transitioning between states (not shown) of unified DFA 312 and states of at least one NFA 314. Unified DFA 312 and at least one NFA 314 can be represented as a graph or in any other suitable form on a data structure basis, and the mapping in the metadata can be represented as one or more tables or in any other suitable form on a data structure basis. According to embodiments disclosed herein, if a sub-pattern selected from a pattern is the pattern, then no NFA is generated for the pattern. According to embodiments disclosed herein, each NFA generated can be used for a specific pattern in a set, and a unified DFA can be generated based on all sub-patterns from all patterns in the set.
基于从所接收到的数据包101a中的有效载荷消耗字节,行走器320通过转换统一DFA 312和至少一个NFA的状态来用有效载荷行走统一DFA 312和至少一个NFA 314。这样,行走器320使该有效载荷走过统一DFA 312和该至少一个NFA 314。Based on the bytes consumed from the payload in the received data packet 101a, the walker 320 walks the unified DFA 312 and the at least one NFA 314 with the payload by transitioning the states of the unified DFA 312 and the at least one NFA. In this way, the walker 320 walks the payload through the unified DFA 312 and the at least one NFA 314.
规则集310可以包括一个或多个正则表达式图样的集合304并且可以是Perl兼容正则表达式(PCRE)脚本文件的形式或任何其他合适的形式。PCRE已经成为安全和联网应用中正则表达式语法的事实标准。随着更多应用需要深度数据包检查已经兴起或更多威胁在互联网中变得普遍,用于标识病毒/攻击的相应特征/图样或应用也已经变得更加复杂。例如,特征数据库已经从具有简单字符串图样演进到具有通配符字符/范围/字符类和高级PCRE特征的正则表达式(regex)图样。The rule set 310 may include a set 304 of one or more regular expression patterns and may be in the form of a Perl Compatible Regular Expression (PCRE) script file or any other suitable form. PCRE has become the de facto standard for regular expression syntax in security and networking applications. As more applications requiring deep packet inspection have emerged and more threats have become prevalent on the Internet, the corresponding features/patterns or applications used to identify viruses/attacks have become more complex. For example, feature databases have evolved from having simple string patterns to regular expression (regex) patterns with wildcard characters/ranges/character classes and advanced PCRE features.
如图3A中所示,可选的限定符308可以各自与正则表达式图样集合304中的一个图样相关联。例如,可选的限定符322可以与图样316相关联。可选的限定符308可以各自是指定所希望的自定义、高级PCRE特征选项、或于处理与限定符相关联的图样的其他合适的选项的一个或多个限定符。例如,限定符322可以指示是否希望用于图样316的高级PCRE特征选项中的起始偏移(即,图样的在有效载荷中匹配的第一匹配字符的有效载荷中的位置)选项。As shown in FIG3A , optional qualifiers 308 can each be associated with a pattern in regular expression pattern set 304. For example, optional qualifier 322 can be associated with pattern 316. Optional qualifiers 308 can each be one or more qualifiers that specify desired customization, advanced PCRE feature options, or other suitable options for processing the pattern associated with the qualifier. For example, qualifier 322 can indicate whether the start offset (i.e., the position in the payload of the first matching character of the pattern that is matched in the payload) option in the advanced PCRE feature options for pattern 316 is desired.
由于出现的应用,起始偏移已经变得对深度数据包检查(DPI)系统中的处理比较重要。有利的是,有限自动机只需要上报给定图样在输入内的存在或不存在,并上报有效载荷中的已匹配的图样的结束偏移以用于处理。如下文参照图4至图11所描述的,如果限定符322指示希望起始偏移,编译器306可以用使得行走器320能够上报(即,声明)图样的在有效载荷中匹配的第一匹配字符的有效载荷的位置偏移的方式生产二进制图像112。Due to emerging applications, the starting offset has become more important for processing in deep packet inspection (DPI) systems. Advantageously, the finite automaton only needs to report the presence or absence of a given pattern in the input and the ending offset of the matched pattern in the payload for processing. As described below with reference to Figures 4 to 11, if the qualifier 322 indicates that a starting offset is desired, the compiler 306 can generate the binary image 112 in a manner that enables the walker 320 to report (i.e., declare) the position offset of the payload of the first matching character of the pattern that matches in the payload.
根据在此披露的实施例,编译器306可以使用从一个或多个正则表达式图样的集合304中的所有图样选择的子图样302生成统一DFA 312。编译器306可以基于如以下进一步描述的至少一种启发法来从一个或多个正则表达式图样的集合304中的每个图样中选择多个子图样302。编译器306还可以为该集合中的至少一个图样316生成至少一个NFA 314,可以基于所选择的子图样318的长度是否是固定的或可变的和所选择的子图样318在至少一个图样316内的位置来确定至少一个图样316的用于生成至少一个NFA 314的部分(未示出)和至少一个NFA 314的运行时间处理(即,行走)的至少一个行走方向。编译器306可以将统一DFA 312和至少一个NFA 314存储在至少一个存储器104内。According to embodiments disclosed herein, compiler 306 can generate a unified DFA 312 using subpatterns 302 selected from all patterns in a set 304 of one or more regular expression patterns. Compiler 306 can select multiple subpatterns 302 from each pattern in the set 304 of one or more regular expression patterns based on at least one heuristic, as described further below. Compiler 306 can also generate at least one NFA 314 for at least one pattern 316 in the set. Compiler 306 can determine a portion (not shown) of at least one pattern 316 used to generate at least one NFA 314 and at least one direction of run-time processing (i.e., walking) of at least one NFA 314 based on whether the length of the selected subpattern 318 is fixed or variable and the position of the selected subpattern 318 within the at least one pattern 316. Compiler 306 can store unified DFA 312 and at least one NFA 314 in at least one memory 104.
编译器可以确定所选择的潜在子图样的长度是否是固定的或可变的。例如,如“cdef”子图样的长度可以被确定为具有一个固定长度4,因为“cdef”为一个字符串,而包括运算符的复杂子图样可以被确定为具有一个可变长度。例如,如“a.*cd[^\n]{0,10}.*y”复杂子图样根据所选择的图样可以具有“cd[^\n]{0,10}”,其可以具有一个2到12的可变长度。The compiler can determine whether the length of a selected potential sub-pattern is fixed or variable. For example, a sub-pattern such as "cdef" can be determined to have a fixed length of 4 because "cdef" is a string, while a complex sub-pattern including operators can be determined to have a variable length. For example, a complex sub-pattern such as "a.*cd[^\n]{0,10}.*y" can have a length of "cd[^\n]{0,10}", which can have a variable length of 2 to 12, depending on the selected pattern.
根据在此披露的实施例,子图样选择可以基于至少一种启发法。子图样是来自图样的一个或多个连续元素的集合,其中,为了与来自有效载荷的字节或字符匹配,来自图样的每个元素可以用DFA或NFA图形中的节点来表示。如上所述的元素可以是用节点表示的单个文本字符或用节点表示的字符类。编译器306可以基于子图样是否很可能引起如以上参照图2A-G所描述的过多DFA图爆来确定图样中的哪些子图样更适用于NFA。例如,从包括连续文本字符的子图样生成DFA将不会引起DFA图爆,而如上所述的复杂子图样可以包括多个运算符和多个字符,并且从而可能引起DFA图爆。例如,包括通配符字符或重复多次的较大字符类(例如,[^\n]*或[^\n]{16})的子图样可以在DFA中产生过多状态,并且因此可以更有利地适用于NFA。According to the embodiments disclosed herein, subpattern selection can be based on at least one heuristic. A subpattern is a collection of one or more consecutive elements from a pattern, wherein each element from the pattern can be represented by a node in a DFA or NFA graph in order to match bytes or characters from a payload. The elements described above can be single text characters represented by nodes or character classes represented by nodes. The compiler 306 can determine which subpatterns in a pattern are more suitable for an NFA based on whether the subpattern is likely to cause excessive DFA graph explosion as described above with reference to Figures 2A-G. For example, generating a DFA from a subpattern including consecutive text characters will not cause a DFA graph explosion, while a complex subpattern as described above can include multiple operators and multiple characters and thus may cause a DFA graph explosion. For example, a subpattern including wildcard characters or a large character class repeated multiple times (e.g., [^\n]* or [^\n]{16}) can produce too many states in a DFA and therefore can be more advantageously suitable for an NFA.
如以上所披露的,从一个或多个正则表达式的集合304中的每个图样中选择子图样可以基于至少一种启发法。根据一个实施例,该至少一种启发法可以包括使所选择的唯一子图样的数量和所选择的每个图样的长度最大化。例如,如“ab.*cdef.*mn”图样可以具有多个潜在字符图样,如“ab.*”、“cdef”、和“.*mn”。编译器可以选择“cdef”作为该图样的子图样,因为其是图样“ab.*cdef.*mn”中不太可能引起DFA图爆的最大子图样。然而,如果已经为另一个图样选择了子图样“cdef”,则编译器可以为图样“ab.*cdef.*mn”选择替代子图样。可替代地,编译器可以用其他图样的另一个子图样替换子图样“cdef”,从而能够为图样“ab.*cdef.*mn”选择图样“cdef”。As disclosed above, the selection of a subpattern from each pattern in the set 304 of one or more regular expressions can be based on at least one heuristic. According to one embodiment, the at least one heuristic can include maximizing the number of unique subpatterns selected and the length of each selected pattern. For example, a pattern such as "ab.*cdef.*mn" can have multiple potential character patterns, such as "ab.*," "cdef," and ".*mn." The compiler can select "cdef" as the subpattern of this pattern because it is the largest subpattern in the pattern "ab.*cdef.*mn" that is unlikely to cause a DFA graph explosion. However, if the subpattern "cdef" has already been selected for another pattern, the compiler can select an alternative subpattern for the pattern "ab.*cdef.*mn." Alternatively, the compiler can replace the subpattern "cdef" with another subpattern of another pattern, thereby being able to select the pattern "cdef" for the pattern "ab.*cdef.*mn."
这样,编译器306可以基于图样304中每个图样的可能子图样的上下文为图样304选择子图样,从而能够使所选择的唯一子图样的数量和所选择的每个子图样的长度最大化。如此,编译器306可以从所选择的图样302生成统一DFA 312,该统一DFA通过增加至少一个NFA 314中的图样匹配的概率来使至少一个NFA 314的图样匹配中的误报(即,没有匹配或部分匹配)数量最小化。In this way, compiler 306 can select subpatterns for pattern 304 based on the context of the possible subpatterns of each pattern in pattern 304, thereby maximizing the number of unique subpatterns selected and the length of each subpattern selected. Thus, compiler 306 can generate unified DFA 312 from selected pattern 302 that minimizes the number of false positives (i.e., no match or partial match) in pattern matching in at least one NFA 314 by increasing the probability of a pattern match in at least one NFA 314.
通过使子图样长度最大化,可以避免NFA处理中的误报。NFA处理中的误报会引起非确定运行时间处理,并且因此会降低运行时间性能。进一步地,通过使所选择的唯一子图样的数量最大化,鉴于统一DFA中(来自图样)子图样的匹配,编译器306能够实现统一DFA到从集合中的图样生成的至少一个NFA 314之间的1:1转换。By maximizing the subpattern length, false positives in NFA processing can be avoided. False positives in NFA processing can cause non-deterministic runtime processing and, therefore, degrade runtime performance. Further, by maximizing the number of unique subpatterns selected, compiler 306 can achieve a 1:1 conversion between the unified DFA and at least one NFA 314 generated from a pattern in the set, given a match of the subpattern (from the pattern) in the unified DFA.
例如,如果多个图样共享所选择的子图样,则统一DFA的行走器将需要转换成多个至少一个NFA,因为每个至少一个NFA是每图样NFA,并且来自统一DFA的子图样匹配意味着针对该多个图样中每个图样的部分匹配。如此,使唯一子图样的数量最大化减少了DFA:NFA1:N转换的数量,从而减少了行走器320进行的运行时间处理。For example, if multiple patterns share the selected subpattern, the walker for the unified DFA will need to convert to multiple at least one NFA, because each at least one NFA is a per-pattern NFA, and a subpattern match from the unified DFA means a partial match for each of the multiple patterns. Thus, maximizing the number of unique subpatterns reduces the number of DFA:NFA1:N transitions, thereby reducing the runtime processing performed by the walker 320.
为了能够使唯一子图样的数量最大化,编译器302可以计算所选择的子图样318的散列值326并将所计算的散列值326结合从其中选择子图样318的图样316的标识符(未示出)一起存储。例如,针对集合304中的每个图样,编译器306可以计算所选择的子图样的散列值。所计算的散列值324可以按照表格的方式、或以任何合适的方式存储在至少一个存储器104中。所使用的散列法可以是任何合适的散列法。编译器可以将所计算的散列值与为该集合中的其他图样选择的子图样的散列值列表进行比较,以便确定所选择的子图样是否是唯一的。To maximize the number of unique subpatterns, compiler 302 may calculate a hash value 326 for selected subpattern 318 and store the calculated hash value 326 along with an identifier (not shown) of pattern 316 from which subpattern 318 was selected. For example, compiler 306 may calculate a hash value for the selected subpattern for each pattern in set 304. The calculated hash values 324 may be stored in at least one memory 104 in a table or in any other suitable manner. The hashing method used may be any suitable hashing method. The compiler may compare the calculated hash value with a list of hash values for subpatterns selected for other patterns in the set to determine whether the selected subpattern is unique.
如果在该列表找到所计算的散列值,则编译器可以确定是否用来自图样的另一个子图样来替换(i)所选择的子图样或用从该集合中的其他图样中选择的替代子图样替换(ii)为该集合中的另一个图样选择的子图样。可以基于与该列表中所计算的散列值的关联来标识该集合中的其他图样。确定是否替换(i)或(ii)可以基于对考虑替换的子图样的长度进行比较以便如上所述最大化所选择的唯一子图样的长度。替换所选择的子图样可以包括选择为给定图样标识的下一个最长子图样、或下一个最高优先级的子图样。例如,可以基于引起DFA图爆或所预期的DFA图爆幅度的可能性来确定潜在子图样的优先次序。If the calculated hash value is found in the list, the compiler can determine whether to replace (i) the selected sub-pattern with another sub-pattern from the pattern or replace (ii) the sub-pattern selected for another pattern in the set with an alternative sub-pattern selected from other patterns in the set. Other patterns in the set can be identified based on their association with the calculated hash value in the list. Determining whether to replace (i) or (ii) can be based on comparing the lengths of the sub-patterns being considered for replacement so as to maximize the length of the selected unique sub-pattern as described above. Replacing the selected sub-pattern can include selecting the next longest sub-pattern identified for a given pattern, or the next highest priority sub-pattern. For example, the priority of potential sub-patterns can be determined based on the likelihood of causing a DFA graph explosion or the expected DFA graph explosion magnitude.
根据在此披露的实施例,至少一种启发法可以包括标识每个图样的子图样和如果给定子图样具有一个小于最小阈值的长度则忽视每个图样的所标识的子图样中的给定子图样。例如,为了减少至少一个NFA中的误报,编译器可以忽视具有小于最小阈值的长度的子图样,因为此类子图样会引起至少一个NFA中的误报的更高概率。According to embodiments disclosed herein, at least one heuristic may include identifying subpatterns of each pattern and ignoring a given subpattern in the identified subpatterns of each pattern if the given subpattern has a length less than a minimum threshold. For example, to reduce false positives in at least one NFA, the compiler may ignore subpatterns having a length less than the minimum threshold because such subpatterns may cause a higher probability of false positives in the at least one NFA.
该至少一种启发法可以包括访问子图样的与使用指示符的历史频率相关联的知识库(未示出),以及如果所访问的知识库中针对给定子图样的使用指示符的历史频率大于等于频率使用阈值则忽视每个图样的所标识的子图样中的给定子图样。例如,特定用途或协议特定子图样可能具有高使用频率,如针对超文本传输协议(HTTP)有效载荷、“回车换行”、或流量清零(如来自二进制文件的多个连续0)、或任何其他频繁使用的子图样。The at least one heuristic may include accessing a knowledge base (not shown) of sub-patterns associated with historical frequencies of usage indicators, and disregarding a given sub-pattern in the identified sub-patterns of each pattern if the historical frequency of the usage indicator for the given sub-pattern in the accessed knowledge base is greater than or equal to a frequency usage threshold. For example, a specific usage or protocol-specific sub-pattern may have a high frequency of usage, such as for Hypertext Transfer Protocol (HTTP) payloads, "carriage return line feed," or flow zeros (e.g., multiple consecutive zeros from a binary file), or any other frequently used sub-pattern.
该至少一种启发法可以包括标识每个图样的和用于每个图样的子图样,通过基于具有所标识的子图样的最大数量的连续文本字符的给定子图样和基于为一个或多个正则表达式集合选择的所有子图样之间为唯一的给定子图样选择所标识的子图样中的给定子图样使所选择的子图样中的连续文本字符的数量最大化。如以上所披露的,使所选择的子图样的长度最大化可以使得能够实现至少一个NFA中匹配的更高概率。The at least one heuristic may include identifying subpatterns for each pattern and for each pattern, maximizing the number of consecutive text characters in the selected subpatterns by selecting a given subpattern from the identified subpatterns based on the given subpattern having the largest number of consecutive text characters in the identified subpatterns and based on the given subpattern being unique among all subpatterns selected for the one or more sets of regular expressions. As disclosed above, maximizing the length of the selected subpatterns may enable a higher probability of a match in the at least one NFA.
该至少一种启发法可以包括基于给定子图样中的每一个的子图样类型和给定子图样的长度来确定每个图样的给定子图样的优先次序。子图样类型可以是纯文本、交替、单字符重复、或多字符重复,并且子图样类型的从最高到最低的优先次序可以是纯文本、交替、单字符重复、或多字符重复。这样,可以将具有至少最小长度阈值长度的文本字符串的子图样的优先次序确定为比可变长度的复杂子图样更高。The at least one heuristic method can include determining a priority for a given sub-pattern of each pattern based on the sub-pattern type of each of the given sub-patterns and the length of the given sub-pattern. The sub-pattern type can be plain text, alternating, single character repeating, or multiple character repeating, and the sub-pattern type can be prioritized from highest to lowest by plain text, alternating, single character repeating, or multiple character repeating. In this way, sub-patterns having text strings of at least a minimum length threshold can be prioritized higher than complex sub-patterns of variable length.
编译器306可以将较长长度的子图样的优先次序确定为在较短长度的另一个子图样之上。编译器306可以基于确定优先次序来选择唯一子图样作为所选择的子图样。如上所述,所选择的唯一子图样可以具有至少最小长度阈值的长度。The compiler 306 may prioritize a sub-pattern of a longer length over another sub-pattern of a shorter length. The compiler 306 may select a unique sub-pattern as the selected sub-pattern based on the prioritization. As described above, the selected unique sub-pattern may have a length of at least a minimum length threshold.
如果给定子图样中没有一个是唯一的并且具有一个至少最小长度阈值的长度,编译器306可以在确定优先次序的基础上选择非唯一子图样作为所选择的子图样。如此,编译器306可以从图样中选择一个是从另一个图样中选择的子图样的副本的子图样而不是选择具有一个小于最小阈值的长度的子图样。为了方便子图样的最终确定,编译器306可以对图样进行多次忽略并按长度对可能的子图样进行分类。如此,可以在针对一个或多个正则表达式的集合304中的其他图样的子图样选择的上下文中执行针对一个或多个正则表达式的集合304中的给定图样的编译器子图样选择。If none of the given subpatterns is unique and has a length of at least a minimum length threshold, compiler 306 can select a non-unique subpattern as the selected subpattern based on a prioritization. Thus, compiler 306 can select a subpattern from a pattern that is a copy of a subpattern selected from another pattern rather than selecting a subpattern with a length less than the minimum length threshold. To facilitate the final determination of a subpattern, compiler 306 can perform multiple passes on the pattern and sort the possible subpatterns by length. Thus, compiler subpattern selection for a given pattern in set 304 of one or more regular expressions can be performed within the context of subpattern selection for other patterns in set 304 of one or more regular expressions.
如上所述,限定符322可以指示希望报告起始偏移。然而,起始偏移可能不是容易可辨别的。例如,鉴于如“axycamb”有效载荷,因为两个图样可以匹配,“axycamb”和“amb”,在如“a.*b”或“a.*d”有效载荷匹配图样中找到起始偏移会是困难的。如此,可能需要按照潜在起始偏移跟踪有效载荷中的“a”的两个实例的偏移。根据在此披露的实施例,不需要跟踪潜在起始偏移,因为直到确定已经在有效载荷中找到整个图样的匹配才确定起始偏移。可以利用来自统一DFA、至少一个NFA、或其组合的匹配结果找到确定整个图样的匹配。As described above, qualifier 322 may indicate that a starting offset is desired to be reported. However, the starting offset may not be readily discernible. For example, given a payload such as "axycamb," finding the starting offset in a matching pattern such as "a.*b" or "a.*d" may be difficult because two patterns may match: "axycamb" and "amb." Thus, it may be necessary to track the offsets of the two instances of "a" in the payload according to the potential starting offsets. According to the embodiments disclosed herein, tracking potential starting offsets is not necessary because the starting offset is not determined until a match for the entire pattern has been found in the payload. A match for the entire pattern can be found using matching results from a unified DFA, at least one NFA, or a combination thereof.
根据在此披露的实施例,如果所接收到的数据包101中的有效载荷包括与从图样316中选择的子图样318匹配的内容,则行走器可以转换成行走图样318的至少一个NFA。行走器320可以上报所选择的子图样318的匹配,以及对匹配的子图样的最后字符的所接收到的数据包中的一个位置进行标识的偏移作为有效载荷中的子图样的结束偏移。如果子图样是图样的子集,则子图样匹配可以是图样的部分匹配。如此,行走器320可以通过行走图样的至少一个NFA来继续搜索有效载荷中的图样的剩余部分,以便确定图样的最终匹配。应理解的是,图样可以遍历所接收到的数据包101a中的一个或多个有效载荷。According to embodiments disclosed herein, if the payload in a received data packet 101 includes content that matches a sub-pattern 318 selected from pattern 316, the walker can transition to walking at least one NFA of pattern 318. Walker 320 can report the match of the selected sub-pattern 318, along with an offset in the received data packet identifying a position in the received data packet of the last character of the matching sub-pattern as the end offset of the sub-pattern in the payload. If the sub-pattern is a subset of the pattern, the sub-pattern match can be a partial match of the pattern. Thus, walker 320 can continue searching for the remainder of the pattern in the payload by walking at least one NFA of the pattern to determine a final match for the pattern. It should be understood that the pattern can traverse one or more payloads in the received data packet 101a.
图3B为可以在至少一个处理器中实施的方法的示例实施例的流程图(350),该处理器操作性地耦合至安全装置内的至少一个存储器,该安全装置操作性地耦合至网络。该方法可以开始(352)并基于至少一种启发法从一个或多个正则表达式图样的集合中的每个图样中选择子图样(354)。该方法可以使用从该集合中的所有图样中选择的子图样来生成统一确定有限自动机(DFA)(356)。该方法可以为该集合中的至少一个图样生成至少一个非确定有限自动机(NFA),基于所选择的子图样的长度是否是固定的或可变的和所选择的子图样在至少一个图样内的位置来确定至少一个图样的用于生成该至少一个NFA的部分和用于该至少一个NFA的运行时间处理的至少一个行走方向(358)。该方法可以将统一DFA和所生成的至少一个NFA存储在至少一个存储器内(360)。之后,在该示例实施例中,该方法结束(362)。FIG3B is a flowchart (350) of an example embodiment of a method that can be implemented in at least one processor operatively coupled to at least one memory within a security device operatively coupled to a network. The method can begin (352) and select a subpattern from each pattern in a set of one or more regular expression patterns based on at least one heuristic (354). The method can generate a unified deterministic finite automaton (DFA) (356) using the subpatterns selected from all patterns in the set. The method can generate at least one nondeterministic finite automaton (NFA) for at least one pattern in the set, determining a portion of the at least one pattern for generating the at least one NFA and at least one walk direction for runtime processing of the at least one NFA based on whether the length of the selected subpattern is fixed or variable and the position of the selected subpattern within the at least one pattern (358). The method can store the unified DFA and the generated at least one NFA in at least one memory (360). Thereafter, in this example embodiment, the method ends (362).
图3C为可以在至少一个处理器中实施的方法的示例实施例的流程图(380),该处理器操作性地耦合至安全装置内的至少一个存储器,该安全装置操作性地耦合至网络。通过用来自有效载荷的多个字符遍历至少一个存储器中所存储的一个统一DFA的多个节点,可以开始(382)并使该有效载荷的多个字符走过该统一DFA(384),基于至少一种启发法从一个或多个正则表达式图样的集合中的每个图样选择的多个子图样生成该统一DFA。该方法可以:通过用来自该有效载荷的多个字符遍历该至少一个存储器中所存储的至少一个NFA的多个节点,使该有效载荷的多个字符走过该至少一个NFA,针对该集合中的至少一个图样生成的该至少一个NFA、该至少一个图样的用于生成该至少一个NFA的一部分、以及用于使多个字符走过该至少一个NFA的至少一个行走方向基于选自该至少一个图样的一个子图样的一个长度是固定的还是可变的以及所选择的该子图样在该至少一个图样(386)内的一个位置。之后,在该示例实施例中,该方法结束(388)。FIG3C is a flowchart (380) of an example embodiment of a method that can be implemented in at least one processor operatively coupled to at least one memory within a security device operatively coupled to a network. A unified DFA (384) can be started (382) by traversing a plurality of characters from a payload through a plurality of nodes of a unified DFA stored in at least one memory, the unified DFA being generated based on at least one heuristic selected from a plurality of subpatterns from each pattern in a set of one or more regular expression patterns. The method can include traversing a plurality of characters from the payload through a plurality of nodes of at least one NFA stored in the at least one memory, the at least one NFA generated for at least one pattern in the set, a portion of the at least one pattern used to generate the at least one NFA, and at least one direction for traversing the plurality of characters through the at least one NFA being based on whether a length of a subpattern selected from the at least one pattern is fixed or variable and a position of the selected subpattern within the at least one pattern (386). Thereafter, in this example embodiment, the method ends (388).
如上文所披露的,编译器306可以生成统一DFA 312和至少一个NFA 314以使行走器320能够搜索所接收到的数据包101a中的一个或多个正则表达式图样304的匹配。编译器306可以基于至少一种启发法从一个或多个正则表达式图样的集合304中的每个图样中选择子图样。可以使用从集合304中的所有图样中选择的子图样302来生成统一DFA 312。编译器306可以为集合304中的至少一个图样316生成至少一个NFA 314。如下文参照图4至图11所披露的,可以基于所选择的子图样318的长度是否是固定的或可变的和所选择的子图样318在至少一个图样316内的位置来确定至少一个图样316的用于生成至少一个NFA 314的部分、和用于至少一个NFA 314的运行时间处理的至少一个行走方向。As disclosed above, compiler 306 can generate unified DFA 312 and at least one NFA 314 to enable walker 320 to search for matches to one or more regular expression patterns 304 in received data packet 101a. Compiler 306 can select a subpattern from each pattern in set 304 of one or more regular expression patterns based on at least one heuristic. Unified DFA 312 can be generated using subpatterns 302 selected from all patterns in set 304. Compiler 306 can generate at least one NFA 314 for at least one pattern 316 in set 304. As disclosed below with reference to Figures 4 through 11, the portion of at least one pattern 316 used to generate at least one NFA 314 and at least one walking direction for runtime processing of at least one NFA 314 can be determined based on whether the length of the selected subpattern 318 is fixed or variable and the position of the selected subpattern 318 within the at least one pattern 316.
图4是用于在所选择的子图样404的长度是固定的并且所选择的该子图样的位置是至少一个图样406的开始位置的基础上生成统一DFA 312和至少一个NFA 314的框图400。如图4中所示,所选择的子图样404的第一元素408是该至少一个图样406的第一元素。该至少一个图样406的用于生成该至少一个NFA 402的部分410可以是排除了所选择的子图样404的该至少一个图样406。该至少一个NFA 314可以是单个NFA 402,并且该至少一个NFA314的至少一个行走方向可以是前向行走方向412。例如,对于给定图样(如“cavium”),前向行走方向将在一个从“c”到“m”的行走方向使输入有效载荷走过该至少一个NFA 314的多个节点,而反向行走方向将在一个从“m”到“c”的行走方向行走输入有效载荷。FIG4 is a block diagram 400 for generating a unified DFA 312 and at least one NFA 314 based on a selected subpattern 404 having a fixed length and a selected location of the subpattern as the starting location of at least one pattern 406. As shown in FIG4 , the first element 408 of the selected subpattern 404 is the first element of the at least one pattern 406. The portion 410 of the at least one pattern 406 used to generate the at least one NFA 402 may be the at least one pattern 406 excluding the selected subpattern 404. The at least one NFA 314 may be a single NFA 402, and the at least one walking direction of the at least one NFA 314 may be a forward walking direction 412. For example, for a given pattern (e.g., "cavium"), the forward walking direction would move an input payload through multiple nodes of the at least one NFA 314 in a walking direction from "c" to "m," while the reverse walking direction would move the input payload in a walking direction from "m" to "c."
根据图4的示例实施例,编译器306可以使统一DFA 312的DFA节点414与元数据418相关联,该DFA节点与所选择的子图样404的最后元素416相关联。元数据418可以向行走器320指示一个指向单个NFA 402的起始节点422的指针420,该行走器被配置成用于用有效载荷426行走统一DFA 312和至少一个NFA 314。元数据418可以包括转换成在前向行走方向412行走NFA 402的指令。单个NFA 402的起始节点422可以与该至少一个图样406的用于生成单个NFA 402的部分410的第一元素424相关联。元数据418可以指示行走器320上报所选择的子图样404的一次匹配,与DFA节点414处的所选择的子图样404的最后元素416匹配的(多个字符430中的)一个前导字符的有效载荷426内的(多个偏移428中的)一个前导偏移作为所选择的该子图样的一个结束偏移,以及所选择的该子图样的一个长度。有效载荷的用于行走单个NFA 402的起始偏移可以是有效载荷426中的结束偏移处的字节之后的一个字节的偏移。例如,该有效载荷中用于在起始节点422处开始行走单个NFA 402的下一字符可以被确定为是该有效载荷内的结束偏移处的字节之后的字节。由于所选择的该子图样的长度是固定的,编译器306可以确定所选择的该子图样的长度并将其包括在元数据418中。行走器320可以使用元数据418中所包括的长度以便确定有效载荷426内的图样406的起始偏移。例如,如果限定符308中的一个限定符需要的话,行走器320可以通过从所确定的结束偏移减去元数据418中所包括的长度来确定起始偏移。4 , compiler 306 may associate DFA node 414 of unified DFA 312 with metadata 418, the DFA node being associated with last element 416 of selected subgraph 404. Metadata 418 may indicate a pointer 420 to start node 422 of single NFA 402 to walker 320, which is configured to walk unified DFA 312 and at least one NFA 314 with payload 426. Metadata 418 may include instructions for translating into walking NFA 402 in forward walking direction 412. Start node 422 of single NFA 402 may be associated with first element 424 of portion 410 of at least one graph 406 used to generate single NFA 402. The metadata 418 may instruct the walker 320 to report a match of the selected sub-pattern 404, a leading offset (of the plurality of offsets 428) within the payload 426 that matches the last element 416 of the selected sub-pattern 404 at the DFA node 414 as an ending offset of the selected sub-pattern, and a length of the selected sub-pattern. The starting offset of the payload for walking the single NFA 402 may be an offset of one byte after the byte at the ending offset in the payload 426. For example, the next character in the payload for starting to walk the single NFA 402 at the start node 422 may be determined to be the byte after the byte at the ending offset in the payload. Since the length of the selected sub-pattern is fixed, the compiler 306 may determine the length of the selected sub-pattern and include it in the metadata 418. The walker 320 may use the length included in the metadata 418 to determine the starting offset of the pattern 406 within the payload 426. For example, if one of the qualifiers 308 requires it, the walker 320 may determine the starting offset by subtracting the length included in the metadata 418 from the determined ending offset.
应当理解的是,可以用任何合适的方式进行上报。例如,行走器320可以通过向网络服务处理器100声明结束偏移来上报该结束偏移,例如通过写到存储单元、触发中断、发送或张贴消息等。可替代地,行走器320可以通过声明结束偏移或其本身的数据结构中用于行走器本身的过程中的其他确定的结果来上报结束偏移或任何其他偏移或基于匹配结果的信息。It should be understood that reporting can be performed in any suitable manner. For example, the walker 320 can report the end offset by declaring the end offset to the network service processor 100, such as by writing to a storage unit, triggering an interrupt, sending or posting a message, etc. Alternatively, the walker 320 can report the end offset or any other offset or matching result-based information by declaring the end offset or other determination results in its own data structure for use in the walker's own process.
根据图4的示例实施例,编译器306可以使所生成的单个NFA的NFA节点432与由于已经标识了整个图样406的最终匹配而向行走器指示终止行走的指令的元数据434。NFA节点432可以与该至少一个图样406的一个最后元素436相关联。元数据434可以指示行走器320上报在NFA节点432处匹配的(多个字符430中的)一个滞后字符的有效载荷426内的(多个偏移428中的)一个滞后偏移作为该至少一个图样406的结束偏移,以及该至少一个图样406的最终匹配。4 , the compiler 306 may associate an NFA node 432 of the generated single NFA with metadata 434 indicating to the walker an instruction to terminate the walk because a final match has been identified for the entire pattern 406. The NFA node 432 may be associated with a last element 436 of the at least one pattern 406. The metadata 434 may instruct the walker 320 to report a lag offset (of the plurality of offsets 428) within the payload 426 of a lag character (of the plurality of characters 430) matched at the NFA node 432 as the end offset of the at least one pattern 406 and the final match for the at least one pattern 406.
行走器320可以使给定图样的每次行走与一个事务标识符相关。这样,可以结合相应的事务标识符上报子图样长度、有效载荷字符偏移、和图样匹配结果。在示例实施例中,网络服务处理器100可以基于事务标识符使给定图样的行走器结果信息相关,以供行走搜索该给定图样。The walker 320 can associate each walk of a given pattern with a transaction identifier. In this way, the sub-pattern length, payload character offset, and pattern matching results can be reported in conjunction with the corresponding transaction identifier. In an exemplary embodiment, the network services processor 100 can associate the walker result information for a given pattern based on the transaction identifier for use in a walk search for the given pattern.
图5是用于在所选择的子图样504的位置是至少一个图样506的中间位置并且所选择的子图样504的长度固定的基础上生成统一DFA 312和至少一个NFA 314的实施例的框图500。根据图5的示例实施例,该至少一个图样506的用于生成该至少一个NFA 314的部分包括该至少一个图样506的一个滞后部分508和一个前导部分510。如图5中所示,该至少一个图样506的滞后部分508可以是排除了该至少一个图样506所选择的子图样504和前导部分510的该至少一个图样506。该至少一个图样506的前导部分510排除了该至少一个图样506的所选择的子图样504和滞后部分508。FIG5 is a block diagram 500 of an embodiment for generating a unified DFA 312 and at least one NFA 314 based on a selected sub-pattern 504 being positioned in the middle of at least one pattern 506 and having a fixed length. According to the example embodiment of FIG5 , the portion of the at least one pattern 506 used to generate the at least one NFA 314 includes a lag portion 508 and a leading portion 510 of the at least one pattern 506. As shown in FIG5 , the lag portion 508 of the at least one pattern 506 may be the at least one pattern 506 excluding the selected sub-pattern 504 and the leading portion 510 of the at least one pattern 506. The leading portion 510 of the at least one pattern 506 excludes the selected sub-pattern 504 and the lag portion 508 of the at least one pattern 506.
根据图5的示例实施例,该至少一个NFA 314包括一个滞后NFA 512和一个前导NFA514。该至少一个行走方向包括一个前向行走方向516和一个反向行走反向518。可以在前向行走方向516行走滞后NFA 512,并且可以在反向行走方向518行走前导NFA 514。该至少一个图样506的滞后部分508可以用于生成滞后NFA 512,并且该至少一个图样506的前导部分510可以用于生成前导NFA 514。5 , the at least one NFA 314 includes a lag NFA 512 and a leading NFA 514. The at least one travel direction includes a forward travel direction 516 and a reverse travel direction 518. The lag NFA 512 can be traveled in the forward travel direction 516, and the leading NFA 514 can be traveled in the reverse travel direction 518. The lag portion 508 of the at least one pattern 506 can be used to generate the lag NFA 512, and the leading portion 510 of the at least one pattern 506 can be used to generate the leading NFA 514.
根据图5的示例实施例,编译器306可以将统一DFA 312的DFA节点515与元数据520相关联,该DFA节点具有所选择的子图样504的最后元素522。元数据520可以指示行走器,该行走器被配置成用于用有效载荷(如图4的有效载荷426)行走统一DFA 312和至少一个NFA314。元数据520可以包括指向滞后NFA 512的起始节点526的指针524,使行走器320转换成在前向行走方向516用起始于有效载荷426中的结束偏移处的字节之后的一个字节的偏移的有效载荷行走滞后NFA 512的指令。滞后NFA 512的起始节点526可以与滞后部分508的一个第一元素528相关联。元数据520可以指示一个指向前导NFA 514的起始节点532的指针530,以及供行走器320转换成在反向行走方向518行走前导NFA 514的指令。前导NFA 514的起始节点532可以与前导部分510的一个最后元素534相关联。元数据520可以指示行走器320上报与DFA节点515处的所选择的子图样522的该最后元素匹配的(多个字符430中的)一个字符的有效载荷426内的(多个偏移428中的)一个偏移作为所选择的子图样504的一个结束偏移,所选择的该子图样的一次匹配,以及所选择的该子图样的一个长度。行走器320可以使用元数据520中所包括的长度,以便通过从所选择的子图样504的结束偏移减去元数据520中的所选择的该子图样的长度来确定有效载荷的起始偏移,以用于在起始节点532处开始反向行走。According to the example embodiment of FIG5 , compiler 306 may associate DFA node 515 of unified DFA 312 with metadata 520, the DFA node having last element 522 of selected subgraph 504. Metadata 520 may indicate to a walker that the walker is configured to walk unified DFA 312 and at least one NFA 314 using a payload, such as payload 426 of FIG4 . Metadata 520 may include a pointer 524 to a start node 526 of lag NFA 512, causing walker 320 to convert instructions to walk lag NFA 512 in a forward walking direction 516 using a payload starting at an offset of one byte after the end offset of payload 426. Start node 526 of lag NFA 512 may be associated with a first element 528 of lag portion 508. The metadata 520 may indicate a pointer 530 to a start node 532 of the preceding NFA 514 and instructions for the walker 320 to convert into walking the preceding NFA 514 in the reverse walking direction 518. The start node 532 of the preceding NFA 514 may be associated with a last element 534 of the leading portion 510. The metadata 520 may instruct the walker 320 to report an offset (from a plurality of offsets 428) within the payload 426 of a character (from a plurality of characters 430) that matches the last element of the selected sub-pattern 522 at the DFA node 515 as an end offset of the selected sub-pattern 504, a match of the selected sub-pattern, and a length of the selected sub-pattern. The walker 320 may use the length included in the metadata 520 to determine the starting offset of the payload for starting the reverse walking at the start node 532 by subtracting the length of the selected sub-pattern in the metadata 520 from the end offset of the selected sub-pattern 504.
根据图5的示例实施例,编译器306可以将滞后NFA 512的滞后节点536与元数据540相关联,该滞后节点与该至少一个图样506的最后元素538相关联。元数据540可以向行走器320指示一个指令:终止行走滞后NFA 512,以及上报与滞后节点536处的最后元素538匹配的有效载荷426的(多个字符430中的)一个滞后字符的有效载荷426内的(多个偏移428中的)一个滞后偏移。元数据540可以指示行走器320上报该至少一个图样506的滞后部分508的一次匹配。5 , the compiler 306 may associate a lag node 536 of the lag NFA 512 with metadata 540, the lag node being associated with the last element 538 of the at least one pattern 506. The metadata 540 may indicate to the walker 320 an instruction to terminate walking the lag NFA 512 and to report a lag offset (from among the plurality of offsets 428) within the payload 426 of a lag character (from among the plurality of characters 430) of the payload 426 that matches the last element 538 at the lag node 536. The metadata 540 may instruct the walker 320 to report a match of the lag portion 508 of the at least one pattern 506.
根据图5的示例实施例,编译器306可以将前导NFA 514的前导节点542与元数据546相关联,该滞后节点与该至少一个图样506的第一元素544相关联,其中,元数据向行走器320指示一个终止行走前导NFA 514的指令。元数据546可以指示行走器320上报该至少一个图样506的前导部分510的一次匹配。元数据546可以指示行走器320上报与前导节点542处的第一元素544匹配的有效载荷426的(多个字符430中的)一个前导字符的有效载荷426内的(多个偏移428中的)一个前导偏移作为该至少一个图样506的起始偏移,如果与该至少一个图样506相关联的限定符(如这些限定符308之一)需要的话。5 , the compiler 306 may associate the predecessor node 542 of the predecessor NFA 514 with metadata 546, the lag node being associated with the first element 544 of the at least one pattern 506, wherein the metadata indicates to the walker 320 an instruction to terminate walking the predecessor NFA 514. The metadata 546 may instruct the walker 320 to report a match for the predecessor portion 510 of the at least one pattern 506. The metadata 546 may instruct the walker 320 to report a leading offset (from the plurality of offsets 428) within the payload 426 of a leading character (from the plurality of characters 430) of the payload 426 that matches the first element 544 at the predecessor node 542 as the starting offset of the at least one pattern 506, if required by a qualifier associated with the at least one pattern 506 (such as one of the qualifiers 308).
图6是用于在所选择的该子图样的位置是该至少一个图样的中间位置或开始位置并且该子图样的长度固定或可变的基础上生成统一DFA 312和至少一个NFA 314的实施例的框图600。根据图6的示例实施例,该至少一个图样606的用于生成该至少一个NFA 314的部分包括该至少一个图样606的一个滞后部分608和一个整个部分610。该至少一个图样606的滞后部分608可以是排除了该至少一个图样606的一个前导部分612的该至少一个图样606。前导部分612包括该至少一个图样606的第一元素614、所选择的子图样604的最后元素616、以及其间的该至少一个图样606中的所有元素。该至少一个图样606的整个部分610可以是该至少一个图样606。FIG6 is a block diagram 600 of an embodiment for generating a unified DFA 312 and at least one NFA 314 based on the position of the selected sub-pattern being at the middle or beginning of the at least one pattern and the length of the sub-pattern being fixed or variable. According to the example embodiment of FIG6 , the portion of the at least one pattern 606 used to generate the at least one NFA 314 includes a lag portion 608 and an entire portion 610 of the at least one pattern 606. The lag portion 608 of the at least one pattern 606 may be the at least one pattern 606 excluding a leading portion 612 of the at least one pattern 606. The leading portion 612 includes the first element 614 of the at least one pattern 606, the last element 616 of the selected sub-pattern 604, and all elements of the at least one pattern 606 therebetween. The entire portion 610 of the at least one pattern 606 may be the at least one pattern 606.
如果所选择的子图样604的一个第一元素618不是该至少一个图样606的一个第一元素614,并且所选择的子图样604的一个最后元素616不是该至少一个图样606的一个最后元素620,所选择的该子图样的该位置是该至少一个图样606的一个中间位置,并且一个开始部分622在该至少一个图样606中的所选择的子图样604之前。If a first element 618 of the selected sub-pattern 604 is not a first element 614 of the at least one pattern 606, and a last element 616 of the selected sub-pattern 604 is not a last element 620 of the at least one pattern 606, the position of the selected sub-pattern is a middle position of the at least one pattern 606, and a starting portion 622 precedes the selected sub-pattern 604 in the at least one pattern 606.
如果所选择的子图样604的第一元素618是该至少一个图样的第一元素614,所选择的该子图样的位置是该至少一个图样606的开始位置。如果所选择的该子图样的位置是该开始位置,开始部分622不存在,并且前导部分612是所选择的子图样604。If the first element 618 of the selected sub-pattern 604 is the first element 614 of the at least one pattern, the position of the selected sub-pattern is the starting position of the at least one pattern 606. If the position of the selected sub-pattern is the starting position, the starting portion 622 does not exist, and the leading portion 612 is the selected sub-pattern 604.
根据图6的示例实施例,该至少一个NFA包括一个滞后NFA 624和一个伞状NFA626。该至少一个行走方向包括一个前向行走方向628和一个反向行走反向630。滞后NFA624具有前向行走方向628,并且伞状NFA 626具有反向行走方向630。该至少一个图样606的滞后部分608可以被编译器306用于生成滞后NFA 624。该至少一个图样606的整个部分610可以被编译器306用于生成伞状NFA 626。According to the example embodiment of FIG6 , the at least one NFA includes a hysteresis NFA 624 and an umbrella NFA 626. The at least one walking direction includes a forward walking direction 628 and a reverse walking direction 630. The hysteresis NFA 624 has a forward walking direction 628, and the umbrella NFA 626 has a reverse walking direction 630. The hysteresis portion 608 of the at least one graph 606 can be used by the compiler 306 to generate the hysteresis NFA 624. The entire portion 610 of the at least one graph 606 can be used by the compiler 306 to generate the umbrella NFA 626.
根据图6的示例实施例,编译器306可以将统一DFA 312的DFA节点632与元数据634相关联,该DFA节点具有所选择的子图样604的最后元素616。元数据634可以向行走器320指示一个指向滞后NFA 624的起始节点638的指针636,以及转换成在前向行走方向628行走滞后NFA 624的指令。滞后NFA 624的起始节点638可以与滞后部分608的一个第一元素640相关联。元数据634可以指示行走器320上报所选择的子图样604的一次匹配,以及与该DFA节点处的所选择的子图样604的最后元素616匹配的(多个字符430中的)一个字符的有效载荷426内的(多个偏移428中的)一个偏移作为所选择的子图样604的一个结束偏移,以及所选择的子图样604的一个长度,如果该长度固定的话。6 , compiler 306 may associate DFA node 632 of unified DFA 312 with metadata 634, the DFA node having last element 616 of selected subgraph 604. Metadata 634 may indicate to walker 320 a pointer 636 to a start node 638 of lag NFA 624 and instructions to walk lag NFA 624 in a forward walking direction 628. Start node 638 of lag NFA 624 may be associated with a first element 640 of lag portion 608. The metadata 634 may instruct the walker 320 to report a match of the selected sub-pattern 604, an offset (of the offsets 428) within the payload 426 of a character (of the characters 430) that matches the last element 616 of the selected sub-pattern 604 at the DFA node as an ending offset of the selected sub-pattern 604, and a length of the selected sub-pattern 604, if the length is fixed.
根据图6的示例实施例,编译器306可以将滞后NFA 624的滞后节点642与元数据652相关联,该滞后节点与该至少一个图样606的最后元素620相关联。元数据652可以向行走器320指示一个指向伞状NFA 626的起始节点646的指针644,转换成在反向行走方向630行走伞状NFA 626的指令。伞状NFA 626的起始节点646可以与该至少一个图样606的最后元素620相关联。元数据652可以指示行走器可选地上报与滞后节点642处的该至少一个图样606的最后元素620匹配的(多个字符430中的)一个字符的有效载荷426内的(多个偏移428中的)一个偏移;以及可选地上报该至少一个图样606的滞后部分608的一次匹配。According to the example embodiment of FIG6 , the compiler 306 may associate a lag node 642 of a lag NFA 624 with metadata 652, the lag node being associated with the last element 620 of the at least one graph 606. The metadata 652 may indicate to the walker 320 a pointer 644 to a start node 646 of an umbrella NFA 626, translating into instructions to walk the umbrella NFA 626 in the reverse walking direction 630. The start node 646 of the umbrella NFA 626 may be associated with the last element 620 of the at least one graph 606. The metadata 652 may instruct the walker to optionally report an offset (from the plurality of offsets 428) within the payload 426 of a character (from the plurality of characters 430) that matches the last element 620 of the at least one graph 606 at the lag node 642, and optionally report a match for the lag portion 608 of the at least one graph 606.
根据图6的示例实施例,编译器306可以使伞状NFA 626的伞状节点648与元数据650相关联,该伞状节点与该至少一个图样606的第一元素614相关联。元数据650可以向行走器320指示一个指令:终止行走并上报该至少一个图样606的一次最终匹配。元数据650可以指示行走器将与伞状节点648处的该至少一个图样606的第一元素614匹配的一个起始字符的有效载荷426内的(多个偏移428中的)一个起始偏移作为该至少一个图样606的起始偏移上报,如果与该至少一个图样606相关联的这些限定符308中的一个限定符需要的话。According to the example embodiment of FIG6 , the compiler 306 may associate an umbrella node 648 of the umbrella NFA 626 with metadata 650, the umbrella node being associated with the first element 614 of the at least one pattern 606. The metadata 650 may indicate to the walker 320 an instruction to terminate the walk and report a final match for the at least one pattern 606. The metadata 650 may instruct the walker to report a starting offset (from the plurality of offsets 428) within the payload 426 of a starting character that matches the first element 614 of the at least one pattern 606 at the umbrella node 648 as the starting offset of the at least one pattern 606, if required by one of the qualifiers 308 associated with the at least one pattern 606.
图7是用于在所选择的子图样704的位置是该至少一个图样706的中间位置或开始位置并且所选择的子图样704的长度固定或可变的基础上生成统一DFA 312和至少一个NFA314的另一实施例的框图700。根据图7的示例实施例,该至少一个图样的用于生成该至少一个NFA 314的部分包括该至少一个图样706的一个滞后部分708和一个前导部分712。该至少一个图样706的滞后部分708可以是排除了该至少一个图样706的前导部分712的该至少一个图样706。前导部分712包括该至少一个图样706的第一元素714、所选择的子图样704的最后元素716、以及其间的该至少一个图样706中的所有元素。如果所选择的该子图样的位置是该开始位置,该前导部分712可以是所选择的子图样704。FIG7 is a block diagram 700 of another embodiment for generating a unified DFA 312 and at least one NFA 314 based on the position of a selected sub-pattern 704 being at the middle or starting position of at least one pattern 706 and the length of the selected sub-pattern 704 being fixed or variable. According to the example embodiment of FIG7 , the portion of the at least one pattern used to generate the at least one NFA 314 includes a lag portion 708 and a leading portion 712 of the at least one pattern 706. The lag portion 708 of the at least one pattern 706 may be the at least one pattern 706 excluding the leading portion 712 of the at least one pattern 706. The leading portion 712 includes the first element 714 of the at least one pattern 706, the last element 716 of the selected sub-pattern 704, and all elements of the at least one pattern 706 therebetween. If the position of the selected sub-pattern is at the starting position, the leading portion 712 may be the selected sub-pattern 704.
如果所选择的子图样704的一个第一元素718不是该至少一个图样706的一个第一元素714,并且所选择的子图样704的一个最后元素716不是该至少一个图样706的一个最后元素720,所选择的该子图样的该位置是该至少一个图样706的一个中间位置,并且一个开始部分722在该至少一个图样606中的所选择的子图样704之前。If a first element 718 of the selected sub-pattern 704 is not a first element 714 of the at least one pattern 706, and a last element 716 of the selected sub-pattern 704 is not a last element 720 of the at least one pattern 706, the position of the selected sub-pattern is a middle position of the at least one pattern 706, and a starting portion 722 is before the selected sub-pattern 704 in the at least one pattern 606.
如果所选择的子图样704的第一元素718是该至少一个图样的第一元素714,所选择的该子图样的位置是该至少一个图样706的开始位置。如果所选择的该子图样的位置是该开始位置,开始部分722不存在,并且前导部分712是所选择的子图样704。If the first element 718 of the selected sub-pattern 704 is the first element 714 of the at least one pattern, the position of the selected sub-pattern is the starting position of the at least one pattern 706. If the position of the selected sub-pattern is the starting position, the starting portion 722 does not exist, and the leading portion 712 is the selected sub-pattern 704.
根据图7的示例实施例,该至少一个NFA 314包括一个滞后NFA 724和一个前导NFA726,该至少一个行走方向包括一个前向行走方向728和一个反向行走方向730。滞后NFA724具有前向行走方向728。前导NFA 726具有反向行走方向730。该至少一个图样706的滞后部分708可以用于生成滞后NFA 724。该至少一个图样706的前导部分712可以用于生成前导NFA 726。According to the example embodiment of FIG7 , the at least one NFA 314 includes a lag NFA 724 and a leading NFA 726, and the at least one travel direction includes a forward travel direction 728 and a reverse travel direction 730. The lag NFA 724 has a forward travel direction 728. The leading NFA 726 has a reverse travel direction 730. The lag portion 708 of the at least one pattern 706 may be used to generate the lag NFA 724. The leading portion 712 of the at least one pattern 706 may be used to generate the leading NFA 726.
根据图7的示例实施例,编译器306可以使统一DFA 312的DFA节点732与元数据734相关联,该DFA节点与所选择的子图样704的最后元素716相关联。元数据734可以向行走器320指示一个指向滞后NFA 724的起始节点738的指针736,以及转换成在前向行走方向728行走滞后NFA 724的指令。滞后NFA 724的起始节点738可以与滞后部分708的一个第一元素740相关联。有效载荷的用于开始滞后NFA 724的前向行走的起始偏移可以是所选择的子图样704的结束偏移处的字节之后的一个字节的偏移。元数据734可以向行走器320指示一个指向前导NFA 726的起始节点746的指针744,以及转换成在反向行走方向730行走前导NFA726的指令。前导NFA 726的起始节点746可以与所选择的子图样704的一个最后元素716相关联。有效载荷的用于开始前导NFA 726的反向行走的偏移可以是所选择的子图样704的结束偏移。元数据734可以指示行走器320上报所选择的子图样704的一次匹配,以及与DFA节点732处的所选择的子图样704的最后元素716匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移作为所选择的子图样704的一个结束偏移,以及所选择的子图样704的一个长度,如果该长度固定的话。According to the example embodiment of FIG7 , compiler 306 may associate DFA node 732 of unified DFA 312 with metadata 734, which is associated with last element 716 of selected subgraph 704. Metadata 734 may indicate to walker 320 a pointer 736 to start node 738 of lag NFA 724 and may translate into instructions to walk lag NFA 724 in forward walking direction 728. Start node 738 of lag NFA 724 may be associated with first element 740 of lag portion 708. The starting offset of the payload for starting forward walking of lag NFA 724 may be a byte offset after the byte at the end offset of selected subgraph 704. Metadata 734 may indicate to walker 320 a pointer 744 to start node 746 of predecessor NFA 726 and may translate into instructions to walk predecessor NFA 726 in reverse walking direction 730. The starting node 746 of the leading NFA 726 may be associated with a last element 716 of the selected sub-pattern 704. The offset of the payload used to start the reverse walk of the leading NFA 726 may be the ending offset of the selected sub-pattern 704. The metadata 734 may instruct the walker 320 to report a match of the selected sub-pattern 704, an offset (of the offsets 428) within the payload 426 of a character (of the characters 430) that matches the last element 716 of the selected sub-pattern 704 at the DFA node 732 as an ending offset of the selected sub-pattern 704, and a length of the selected sub-pattern 704, if the length is fixed.
根据图7的示例实施例,编译器306可以将滞后NFA 724的滞后节点742与元数据752相关联,该滞后节点与该至少一个图样706的最后元素720相关联。元数据752可以指示行走器320终止行走滞后NFA,以及上报与滞后节点742处的该至少一个图样706的最后元素720匹配的(这些字符430中的)一个滞后字符的有效载荷426内的(这些偏移428中的)一个滞后偏移,以及上报该至少一个图样706的滞后部分708的一个匹配。7 , the compiler 306 may associate a lag node 742 of the lag NFA 724 with metadata 752, the lag node being associated with the last element 720 of the at least one pattern 706. The metadata 752 may instruct the walker 320 to terminate walking the lag NFA and to report a lag offset (of the offsets 428) within the payload 426 of a lag character (of the characters 430) that matches the last element 720 of the at least one pattern 706 at the lag node 742, and to report a match for the lag portion 708 of the at least one pattern 706.
根据图7的示例实施例,编译器306可以使所生成的前导NFA724的前导节点748与元数据750相关联,该前导节点与该至少一个图样706的第一元素714相关联。元数据750可以向行走器320指示一个指令:终止行走前导NFA 726,以及上报前导部分712的一次匹配,以及与前导节点748处的该至少一个图样706的第一元素714匹配的(这些字符430中的)一个前导字符的有效载荷内的(这些偏移428中的)一个前导偏移。7 , the compiler 306 may associate a predecessor node 748 of the generated predecessor NFA 724 with metadata 750, the predecessor node being associated with the first element 714 of the at least one pattern 706. The metadata 750 may indicate to the walker 320 an instruction to terminate walking the predecessor NFA 726 and to report a match of the predecessor portion 712 and a predecessor offset (one of the offsets 428) within the payload of a predecessor character (one of the characters 430) that matches the first element 714 of the at least one pattern 706 at the predecessor node 748.
图7的实施例可以被视为图6的实施例的优化,因为行走器320不需要在反向方向为滞后部分708遍历NFA。The embodiment of FIG. 7 can be viewed as an optimization of the embodiment of FIG. 6 , as the walker 320 does not need to traverse the NFA in the reverse direction for the lag portion 708 .
图8是用于在所选择的子图样804的位置是该至少一个图样806的中间位置,并且所选择的子图样804的长度固定或可变的基础上生成统一DFA 312和至少一个NFA 314的一个实施例的框图800。根据图8的示例实施例,该至少一个NFA 314是单个NFA 854。该至少一个行走方向包括一个前向行走方向828,用于单个NFA 854的与该至少一个图样806的一个滞后部分808的多个元素相关联的多个运行时间处理节点,以及一个反向行走方向830,用于单个NFA 854的与该至少一个图样806的所有元素相关联的多个运行时间处理节点。该至少一个图样806的滞后部分808是排除了该至少一个图样806的一个前导部分812的该至少一个图样806。前导部分812包括该至少一个图样806的第一元素814、所选择的子图样804的最后元素816、以及其间的该至少一个图样806中的所有元素。FIG8 is a block diagram 800 of one embodiment for generating a unified DFA 312 and at least one NFA 314 based on a selected sub-pattern 804 being positioned in the middle of at least one pattern 806 and having a fixed or variable length. According to the example embodiment of FIG8 , the at least one NFA 314 is a single NFA 854. The at least one walking direction includes a forward walking direction 828 for a plurality of runtime processing nodes associated with a plurality of elements of a lag portion 808 of the at least one pattern 806 of the single NFA 854, and a reverse walking direction 830 for a plurality of runtime processing nodes associated with all elements of the at least one pattern 806 of the single NFA 854. The lag portion 808 of the at least one pattern 806 is the at least one pattern 806 excluding a leading portion 812 of the at least one pattern 806. The leading portion 812 includes a first element 814 of the at least one pattern 806 , a last element 816 of the selected sub-pattern 804 , and all elements of the at least one pattern 806 therebetween.
根据图8的示例实施例,编译器306可以使统一DFA 312的DFA节点832与元数据834相关联,该DFA节点与所选择的子图样804的最后元素816相关联。元数据834可以向行走器320指示一个指向单个NFA 854的起始节点856的指针836,以及转换成在前向行走方向828行走单个NFA 854的指令。起始节点856可以与该至少一个图样806中紧随所选择的子图样804的最后元素816的下一元素840相关联。元数据834可以指示行走器320上报所选择的子图样804的一次匹配,与DFA节点832处的所选择的子图样804的最后元素816匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移作为所选择的子图样804的一个结束偏移,以及所选择的子图样804的一个长度,如果该长度固定的话。8 , compiler 306 may associate DFA node 832 of unified DFA 312 with metadata 834, the DFA node being associated with the last element 816 of the selected sub-graph 804. Metadata 834 may indicate to walker 320 a pointer 836 to a start node 856 of a single NFA 854 and may be converted into instructions for walking single NFA 854 in a forward walking direction 828. Start node 856 may be associated with the next element 840 in the at least one graph 806 that immediately follows the last element 816 of the selected sub-graph 804. The metadata 834 may instruct the walker 320 to report a match of the selected sub-pattern 804, an offset (of the offsets 428) within the payload 426 of a character (of the characters 430) that matches the last element 816 of the selected sub-pattern 804 at the DFA node 832 as an ending offset of the selected sub-pattern 804, and a length of the selected sub-pattern 804, if the length is fixed.
根据图8的示例实施例,编译器306可以将单个NFA 854的滞后节点842与元数据852相关联,该滞后节点与该至少一个图样806的最后元素820相关联,该元数据向行走器320指示一个转换成在反向行走方向830用在所选择的该子图样的结束偏移处开始的有效载荷行走单个NFA 854的指令。编译器306可以将单个NFA 854的前导节点848与元数据850相关联,该前导节点与该至少一个图样806的第一元素814相关联。元数据850可以向行走器320指示一个指令:终止行走,以及上报与前导节点848处的该至少一个图样806的第一元素814匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移作为该至少一个图样806的起始偏移(如果与该至少一个图样806相关联的这些限定符308中的一个限定符需要的话),以及该至少一个图样806的一次最终匹配。8 , the compiler 306 may associate a lag node 842 of a single NFA 854 associated with the last element 820 of the at least one pattern 806 with metadata 852, the metadata indicating to the walker 320 an instruction to convert the single NFA 854 into a walk in the reverse walking direction 830 with a payload starting at the end offset of the selected sub-pattern. The compiler 306 may also associate a predecessor node 848 of the single NFA 854 associated with the first element 814 of the at least one pattern 806 with metadata 850. The metadata 850 may indicate an instruction to the walker 320 to terminate the walk and report an offset (of the offsets 428) within the payload 426 of a character (of the characters 430) that matches the first element 814 of the at least one pattern 806 at the preceding node 848 as the starting offset of the at least one pattern 806 (if required by one of the qualifiers 308 associated with the at least one pattern 806), and a final match of the at least one pattern 806.
图9是用于在所选择的子图样904的位置是该至少一个图样906的中间位置,并且所选择的子图样904的长度固定的基础上生成统一DFA 312和至少一个NFA 314的一个实施例的框图900。根据图9的示例实施例,该至少一个NFA 314可以是单个NFA 954,并且该至少一个行走方向包括一个反向行走方向930,用于单个NFA 954的与该至少一个图样906的一个前导部分912相关联的多个运行时间处理节点,以及一个前向行走方向928,用于单个NFA954的与该至少一个图样906的所有元素相关联的多个运行时间处理节点。前导部分912可以是排除了该至少一个图样906的一个滞后部分908的该至少一个图样906。滞后部分908包括所选择的子图样904的第一元素918、该至少一个图样906的最后元素920、以及其间的该至少一个图样906中的所有元素。FIG9 is a block diagram 900 of one embodiment for generating a unified DFA 312 and at least one NFA 314 based on a selected sub-pattern 904 being positioned in the middle of at least one pattern 906 and having a fixed length. According to the example embodiment of FIG9 , the at least one NFA 314 may be a single NFA 954, and the at least one walking direction includes a reverse walking direction 930 for a plurality of runtime processing nodes associated with a leading portion 912 of the at least one pattern 906 of the single NFA 954, and a forward walking direction 928 for a plurality of runtime processing nodes associated with all elements of the at least one pattern 906 of the single NFA 954. The leading portion 912 may be the at least one pattern 906 excluding a lagging portion 908 of the at least one pattern 906. The lag portion 908 includes a first element 918 of the selected sub-pattern 904, a last element 920 of the at least one pattern 906, and all elements of the at least one pattern 906 therebetween.
根据图9的示例实施例,编译器306可以将统一DFA 312的DFA节点932与元数据956相关联,该DFA节点与有所选择的子图样904的最后元素916相关联。元数据956可以向行走器320指示一个指向单个NFA 954的起始节点946的指针936,以及一个转换成在反向行走方向930行走单个NFA 954的指令。起始节点946可以与前导部分912的一个最后元素912相关联。元数据956可以指示行走器320上报所选择的子图样904的一次匹配。元数据956可以指示行走器320上报与DFA节点932处的所选择的子图样904的最后元素916匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移作为所选择的子图样904的一个结束偏移,以及所选择的该子图样的一个长度。行走器320可以使用长度(如果包括在元数据956中的话),以便通过从所选择的该子图样的结束偏移减去元数据956中的所选择的该子图样的长度来确定起始节点946的有效载荷起始偏移。According to the example embodiment of FIG9 , compiler 306 may associate DFA node 932 of unified DFA 312 with metadata 956, which is associated with last element 916 of selected subgraph 904. Metadata 956 may indicate to walker 320 a pointer 936 to start node 946 of single NFA 954 and an instruction to walk single NFA 954 in reverse walking direction 930. Start node 946 may be associated with last element 912 of preamble 912. Metadata 956 may instruct walker 320 to report a match for selected subgraph 904. Metadata 956 may instruct walker 320 to report an offset (of offsets 428) within payload 426 of a character (of characters 430) that matches last element 916 of selected subgraph 904 at DFA node 932 as an ending offset for selected subgraph 904, and a length for the selected subgraph. The walker 320 may use the length, if included in the metadata 956 , to determine the payload start offset of the start node 946 by subtracting the length of the selected subpattern in the metadata 956 from the end offset of the selected subpattern.
根据图9的示例实施例,编译器306可以使单个NFA 954的前导节点948与元数据950相关联,该前导节点与该至少一个图样906的第一元素914相关联。元数据950可以向行走器320指示一个转换成在前向行走方向928行走单个NFA 954的指令。编译器306可以使单个NFA 954的滞后节点942与元数据952相关联,该前导节点与该至少一个图样906的最后元素920相关联。元数据952可以向行走器320指示一个终止行走的指令。元数据952可以指示行走器上报与滞后节点942处的该至少一个图样906的最后元素920匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移,以及该至少一个图样906的一次最终匹配。According to the example embodiment of FIG9 , the compiler 306 may associate a predecessor node 948 of a single NFA 954 with metadata 950, the predecessor node being associated with the first element 914 of the at least one pattern 906. The metadata 950 may indicate to the walker 320 an instruction to convert the single NFA 954 into a forward walk direction 928. The compiler 306 may associate a lag node 942 of the single NFA 954 with metadata 952, the predecessor node being associated with the last element 920 of the at least one pattern 906. The metadata 952 may indicate to the walker 320 an instruction to terminate the walk. The metadata 952 may instruct the walker to report an offset (of the offsets 428) within the payload 426 of a character (of the characters 430) that matches the last element 920 of the at least one pattern 906 at the lag node 942, and a final match of the at least one pattern 906.
图10是用于在所选择的子图样1004的位置是该至少一个图样1006的结束位置并且所选择的子图样1004的长度固定的基础上生成统一DFA 312和至少一个NFA 314的一个实施例的框图1000。根据图10的示例实施例,如果所选择的子图样1004的最后元素1016可以是该至少一个图样1016的最后元素,所选择的子图样1004的位置可以是该至少一个图样1006的结束位置,并且该至少一个NFA 314可以是单个NFA 1054。如果所选择的子图样1004的长度是固定的,该至少一个图样1006的用于生成单个NFA 1054的部分1012可以是排除了所选择的子图样1004的该至少一个图样1006。该至少一个行走方向可以是用于行走单个NFA 1054的反向行走方向1030。FIG10 is a block diagram 1000 of one embodiment for generating a unified DFA 312 and at least one NFA 314 based on the position of a selected sub-pattern 1004 being the end position of at least one pattern 1006 and the length of the selected sub-pattern 1004 being fixed. According to the example embodiment of FIG10 , if the last element 1016 of the selected sub-pattern 1004 can be the last element of the at least one pattern 1016, the position of the selected sub-pattern 1004 can be the end position of the at least one pattern 1006, and the at least one NFA 314 can be a single NFA 1054. If the length of the selected sub-pattern 1004 is fixed, the portion 1012 of the at least one pattern 1006 used to generate the single NFA 1054 can be the at least one pattern 1006 excluding the selected sub-pattern 1004. The at least one walking direction can be a reverse walking direction 1030 for walking the single NFA 1054.
根据图10的示例实施例,编译器306可以使对应于所选择的子图样1004的最后元素1016的DFA节点1032与元数据1052相关联。元数据1052可以向行走器320指示一个指向单个NFA 1054的起始节点1046的指针1036,以及一个转换成在反向行走方向1030行走单个NFA 1054的指令。单个NFA 1046的起始节点1046与部分1012的一个最后元素1034相关联。元数据1052可以指示行走器320上报所选择的子图样1004的一次匹配,以及与DFA节点1032处的所选择的子图样1004的最后元素1016匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移作为所选择的子图样1004的一个结束偏移,以及所选择的子图样1004的一个长度。行走器320可以使用长度(如果包括在元数据1052中的话),以便通过从所选择的子图样1004的结束偏移减去元数据1052中的所选择的该子图样的长度来确定起始节点1046的有效载荷起始偏移。According to the example embodiment of FIG10 , compiler 306 may associate metadata 1052 with DFA node 1032 corresponding to last element 1016 of selected sub-pattern 1004. Metadata 1052 may indicate to walker 320 a pointer 1036 to start node 1046 of single NFA 1054 and an instruction to walk single NFA 1054 in reverse walking direction 1030. Start node 1046 of single NFA 1046 is associated with last element 1034 of portion 1012. Metadata 1052 may instruct walker 320 to report a match of selected sub-pattern 1004, an offset (of offsets 428) within payload 426 of a character (of characters 430) that matches last element 1016 of selected sub-pattern 1004 at DFA node 1032 as an end offset of selected sub-pattern 1004, and a length of selected sub-pattern 1004. The walker 320 may use the length, if included in the metadata 1052 , to determine the payload start offset of the start node 1046 by subtracting the length of the selected sub-pattern 1004 in the metadata 1052 from the end offset of the selected sub-pattern.
根据图10的示例实施例,编译器306可以使与部分1012的第一元素1014相关联的NFA节点1048与元数据1050相关联。元数据1050可以向行走器320指示终止行走,以及上报该至少一个图样1006的一次最终匹配,以及与NFA节点1048处的部分1012的第一元素1014匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移作为该至少一个图样1006的起始偏移,如果与该至少一个图样1006相关联的这些限定符308中的一个限定符需要的话。10 , the compiler 306 may associate the NFA node 1048 associated with the first element 1014 of the portion 1012 with metadata 1050. The metadata 1050 may indicate to the walker 320 to terminate the walk and report a final match of the at least one pattern 1006, as well as an offset (of the offsets 428) within the payload 426 of the one (of the characters 430) that matched the first element 1014 of the portion 1012 at the NFA node 1048 as the starting offset of the at least one pattern 1006, if required by one of the qualifiers 308 associated with the at least one pattern 1006.
图11是用于在所选择的子图样1104的位置是该至少一个图样1106的结束位置并且所选择的子图样1004的长度可变或固定的基础上生成统一DFA 312和至少一个NFA 314的一个实施例的框图1100。根据图11的示例实施例,如果所选择的子图样1104的最后元素1116可以是该至少一个图样1116的最后元素,所选择的子图样1104的位置是该至少一个图样1106的结束位置,并且该至少一个NFA 314可以是单个NFA 1154。如果所选择的子图样1104的长度是固定的或可变的,该至少一个图样1106的用于生成单个NFA 1154的部分1112可以该至少一个图样1006。该至少一个行走方向可以是用于行走单个NFA 1154的反向行走方向1130。FIG11 is a block diagram 1100 of one embodiment for generating a unified DFA 312 and at least one NFA 314 based on the position of a selected sub-pattern 1104 being the end position of at least one pattern 1106 and the length of the selected sub-pattern 1104 being variable or fixed. According to the example embodiment of FIG11 , if the last element 1116 of the selected sub-pattern 1104 can be the last element of the at least one pattern 1116, the position of the selected sub-pattern 1104 is the end position of the at least one pattern 1106, and the at least one NFA 314 can be a single NFA 1154. If the length of the selected sub-pattern 1104 is fixed or variable, the portion 1112 of the at least one pattern 1106 used to generate the single NFA 1154 can be the at least one pattern 1006. The at least one walking direction can be a reverse walking direction 1130 for walking the single NFA 1154.
根据图11的示例实施例,编译器306可以使对应于所选择的子图样1104的最后元素1116的DFA节点1132与元数据1152相关联。元数据1152可以向行走器320指示一个指向单个NFA 1154的起始节点1146的指针1136,以及转换成在反向行走方向1130行走单个NFA1154的指令。单个NFA 1154的起始节点1146可以与所选择的子图样1104的一个最后元素1116相关联。元数据1152可以指示行走器320上报所选择的子图样1104的一次匹配,以及与DFA节点1132处的所选择的子图样1104的最后元素1116匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移作为所选择的子图样1104的一个结束偏移,以及所选择的子图样1104的一个长度,如果该长度固定的话。11 , compiler 306 may associate DFA node 1132 corresponding to last element 1116 of selected subgraph 1104 with metadata 1152. Metadata 1152 may indicate to walker 320 a pointer 1136 to a start node 1146 of a single NFA 1154 and instructions to walk single NFA 1154 in reverse walking direction 1130. Start node 1146 of single NFA 1154 may be associated with last element 1116 of selected subgraph 1104. The metadata 1152 may instruct the walker 320 to report a match of the selected sub-pattern 1104, an offset (of the offsets 428) within the payload 426 of a character (of the characters 430) that matches the last element 1116 of the selected sub-pattern 1104 at the DFA node 1132 as an ending offset of the selected sub-pattern 1104, and a length of the selected sub-pattern 1104, if the length is fixed.
根据图11的实施例,编译器306可以使与部分1112的第一元素1114相关联的NFA节点1148与元数据1150相关联。元数据1150可以向行走器320指示终止行走,并上报该至少一个图样1106的一次最终匹配。元数据1152可以向行走器320指示上报与NFA节点1148处的部分1112的第一元素1114匹配的(这些字符430中的)一个字符的有效载荷426内的(这些偏移428中的)一个偏移作为该至少一个图样1106的起始偏移,如果与该至少一个图样1106相关联的这些限定符304中的一个限定符需要的话。11 , compiler 306 may associate NFA node 1148 associated with first element 1114 of portion 1112 with metadata 1150. Metadata 1150 may indicate to walker 320 to terminate the walk and report a final match for at least one pattern 1106. Metadata 1152 may indicate to walker 320 to report an offset (of offsets 428) within payload 426 of a character (of characters 430) that matches first element 1114 of portion 1112 at NFA node 1148 as the starting offset for at least one pattern 1106, if required by one of the qualifiers 304 associated with at least one pattern 1106.
图12为计算机1200内部结构的示例的框图,可以在该计算机中实施本发明的各实施例。计算机1200包含一条系统总线1202,其中,总线为用于在计算机或处理系统的组件之间进行数据传输的硬件线集合。系统总线1202实质上为一条将计算机系统的不同元件(例如,处理器、磁盘存储器、存储器、输入/输出端口、网络端口等)耦合起来的共享管道,该管道能够在这些元件之间传输信息。对系统总线1202起作用的是用于将各输入和输出设备(例如,键盘、鼠标、显示器、打印机、扩音器等)耦合到计算机1200的I/O设备接口1204。网络接口1206允许计算机1200耦合到附接至网络的各其他设备。存储器1208为可以用于实施在本发明的实施例的计算机软件指令1210和数据1212提供易失性存储。磁盘存储器1214为可以用于实施在本发明的实施例的计算机软件指令1210和数据1212提供非易失性存储。中央处理器单元1218也对系统总线1202起作用并且提供计算机指令的执行。Figure 12 is a block diagram of an example of the internal structure of a computer 1200 in which embodiments of the present invention may be implemented. Computer 1200 includes a system bus 1202, where a bus is a collection of hardware lines used to transfer data between components of a computer or processing system. System bus 1202 is essentially a shared pipe that couples the various components of a computer system (e.g., processor, disk storage, memory, input/output ports, network ports, etc.), enabling information to be transferred between these components. Serving upon system bus 1202 is an I/O device interface 1204 for coupling various input and output devices (e.g., keyboard, mouse, display, printer, amplifier, etc.) to computer 1200. A network interface 1206 allows computer 1200 to be coupled to various other devices attached to a network. Memory 1208 provides volatile storage for computer software instructions 1210 and data 1212 that may be used to implement embodiments of the present invention. Disk storage 1214 provides non-volatile storage for computer software instructions 1210 and data 1212 that may be used to implement embodiments of the present invention. Central processor unit 1218 also functions with system bus 1202 and provides for the execution of computer instructions.
可以使用计算机程序产品来配置在本发明的进一步示例实施例;例如,可以在用于实施本发明示例实施例的软件中对控制装置进行编程。本发明的进一步的实施例可以包括包含可以由处理器执行的指令的非瞬态计算机可读介质,并且当被执行时,这些指令致使处理器完成在此描述的方法。应理解的是,可以用软件、硬件、固件、或将来确定的其他类似实现方式来实现在此描述的框图和流程图中的元素。此外,可以用任何方式在软件、硬件、或固件中组合或分离在此描述的框图和流程图中的元素。A computer program product may be used to configure further example embodiments of the present invention; for example, the control device may be programmed in software for implementing an example embodiment of the present invention. Further embodiments of the present invention may include a non-transient computer-readable medium containing instructions that can be executed by a processor, and when executed, these instructions cause the processor to perform the methods described herein. It should be understood that the elements in the block diagrams and flow charts described herein may be implemented in software, hardware, firmware, or other similar implementations to be determined in the future. In addition, the elements in the block diagrams and flow charts described herein may be combined or separated in software, hardware, or firmware in any manner.
应理解到,术语“在此”可转换为结合了在此介绍的教导的申请或专利,这样使得主题、定义、或数据转入进行该结合的申请或专利内。It should be understood that the term "herein" may be translated into an application or patent that incorporates the teachings introduced herein, such that the subject matter, definitions, or data is carried over into the incorporated application or patent.
如果以软件形式实现,可以用可以支持在此披露的示例实施例的任何语言来编写软件。可以用计算机可读介质的形式存储软件,如随机存取存储器(RAM)、只读存储器(ROM)、光盘只读存储器(CD-ROM)等等。在操作时,通用或专用处理器以本领域中很好理解的方式加载和执行软件。应进一步理解的是,这些框图和流程图可以包括有区别地安排或定向的、或有区别地展现的更多或更少的元素。应理解到,实现方式可以指定框图、流程图、和/或网络图和展示本发明的实施例的执行的框图和流程图的数量。If implemented in software, the software can be written in any language that can support the example embodiments disclosed herein. The software can be stored in the form of a computer-readable medium, such as a random access memory (RAM), a read-only memory (ROM), a compact disc read-only memory (CD-ROM), etc. In operation, a general-purpose or special-purpose processor loads and executes the software in a manner well understood in the art. It should be further understood that these block diagrams and flow charts may include more or fewer elements that are differently arranged or oriented, or differently presented. It should be understood that an implementation may specify the number of block diagrams, flow charts, and/or network diagrams and block diagrams and flow charts that illustrate the execution of an embodiment of the present invention.
尽管本发明已经参照其示例实施例做了具体的展示和说明,本领域技术人员将理解到通过在不偏离由所附的权利要求书涵盖的本发明的范围下可以从中做出在形式和细节上的不同的变化。While the invention has been particularly shown and described with reference to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention as encompassed by the appended claims.
Claims (67)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/015,929 | 2013-08-30 | ||
| US14/015,929 US9426166B2 (en) | 2013-08-30 | 2013-08-30 | Method and apparatus for processing finite automata |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1208103A1 HK1208103A1 (en) | 2016-02-19 |
| HK1208103B true HK1208103B (en) | 2019-09-13 |
Family
ID=
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9426166B2 (en) | Method and apparatus for processing finite automata | |
| US9426165B2 (en) | Method and apparatus for compilation of finite automata | |
| US10002326B2 (en) | Compilation of finite automata based on memory hierarchy | |
| US9419943B2 (en) | Method and apparatus for processing of finite automata | |
| US9602532B2 (en) | Method and apparatus for optimizing finite automata processing | |
| US9823895B2 (en) | Memory management for finite automata processing | |
| US9904630B2 (en) | Finite automata processing based on a top of stack (TOS) memory | |
| US8990259B2 (en) | Anchored patterns | |
| CN107122221B (en) | compiler for regular expressions | |
| US9438561B2 (en) | Processing of finite automata based on a node cache | |
| US10110558B2 (en) | Processing of finite automata based on memory hierarchy | |
| EP2560338B1 (en) | Method and apparatus for protocol parsing | |
| HK1208103B (en) | Method and apparatus for processing finite automata | |
| HK1208105B (en) | Method and apparatus for compilation of finite automata | |
| US8289854B1 (en) | System, method, and computer program product for analyzing a protocol utilizing a state machine based on a token determined utilizing another state machine | |
| Yu et al. | Fast packet pattern-matching algorithms | |
| HK1211718A1 (en) | System and method to traverse nfa generated for regular expression patterns | |
| HK1193278A (en) | Compiler for regular expressions | |
| HK1212119B (en) | Method and apparatus for processing of finite automata | |
| HK1193278B (en) | Compiler for regular expressions | |
| HK1214695B (en) | Compilation of finite automata based on memory hierarchy | |
| HK1208104B (en) | A method and computer system of compiling a pattern into a non-deterministic finite automata (nfa) graph |