CN110858823B

CN110858823B - Data packet classification method and device and computer readable storage medium

Info

Publication number: CN110858823B
Application number: CN201810972030.7A
Authority: CN
Inventors: 王磊; 刘明强
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2018-08-24
Filing date: 2018-08-24
Publication date: 2023-03-07
Anticipated expiration: 2038-08-24
Also published as: WO2020038399A1; CN110858823A

Abstract

Disclosed herein are a method, an apparatus and a computer-readable storage medium for classifying a data packet, including: acquiring prefix information, range information and accurate information from a data packet to be classified; and respectively taking prefix information, accurate information and range information as dimensions, and utilizing a pre-established prefix matching table and a rule set classification table to obtain classification results of the data packets to be classified. It can be seen from the embodiment of the invention that the classification result is obtained only by searching from three dimensions of prefix information, accurate information and range information, thereby realizing the rapid classification of the data packet and greatly improving the classification efficiency.

Description

Data packet classification method and device and computer readable storage medium

Technical Field

The present invention relates to the field of network communication technologies, and in particular, to a method and an apparatus for classifying data packets, and a computer-readable storage medium.

Background

With the development of the Internet, the demands of network applications on the performance, security and service types of the network tend to be diversified. When a Network application desires a router to provide various functions such as firewall (firewall), virtual Private Network (VPN), quality of Service (QoS), differentiated Services (Differentiated Services), traffic charging (Traffic billing), policy-based routing (Policy-based routing), and the like for support, the router is required to perform different treatments on packets with different functions, that is, a packet classification function is integrated into the router. When the classification of the data packet is realized, a classification rule set is defined, which is composed of a series of rules, each rule comprises a plurality of fields (or fields) and corresponding actions, wherein the fields in the rules correspond to the fields in the data stream, after the data packet enters the router, the information corresponding to the fields in the rule set in the data packet is extracted, then the matched rules are found in the rule set based on the information, and the data packet is processed according to the actions defined in the rules.

In the related art, packet classification algorithms are often used to classify packets, and the classification method is to search a rule meeting a field in a rule set according to the field, and then search a rule meeting a new field in the searched rule according to another field until all the fields are traversed, and an action in the finally obtained rule is a packet classification result, that is, the packet is processed according to the action in the finally obtained rule.

However, since this method performs the rule search according to fields in the rule set, it takes a long time to classify the data packet, which results in low efficiency of classification.

Disclosure of Invention

In order to solve the foregoing technical problems, embodiments of the present invention provide a method and an apparatus for classifying a data packet, and a computer-readable storage medium, which can implement fast classification of a data packet and improve classification efficiency of the data packet.

In order to achieve the object of the present invention, an embodiment of the present invention provides a method for classifying a data packet, including:

acquiring prefix information, accurate information and range information from a data packet to be classified; the prefix information is information corresponding to a prefix basic information matching field in a prefix matching field of a preset rule set, the accurate information is information corresponding to an accurate matching field in the preset rule set, and the accurate information is information corresponding to the accurate matching field in the preset rule set;

and respectively taking the prefix information, the accurate information and the range information as dimensions, and acquiring the classification result of the data packet to be classified by utilizing a pre-established prefix matching table and a rule set classification table.

An embodiment of the present invention further provides a device for classifying a data packet, including:

the acquisition module is used for acquiring prefix information, accurate information and range information from the data packet to be classified; the prefix information is information corresponding to a prefix basic information matching field in a prefix matching field of a preset rule set, the accurate information is information corresponding to an accurate matching field in the preset rule set, and the accurate information is information corresponding to the accurate matching field in the preset rule set;

and the processing module is used for acquiring the classification result of the data packet to be classified by using the prefix information, the accurate information and the range information as dimensions and utilizing a pre-established prefix matching table and a rule set classification table.

An embodiment of the present invention further provides a device for processing a hash collision, including: a processor and a memory, wherein the memory has stored therein the following instructions executable by the processor:

acquiring prefix matching information corresponding to the prefix information from a pre-established prefix matching table as target prefix matching information;

An embodiment of the present invention further provides a computer-readable storage medium, where the storage medium stores computer-executable instructions, and the computer-executable instructions are configured to perform the following steps:

Compared with the prior art, the prefix information, the range information and the accurate information are obtained from the data packet to be classified, and the classification result is obtained only by searching according to the three dimensions of the prefix information, the accurate information and the range information, so that the data packet is classified quickly, and the classification efficiency is improved to a great extent.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the example serve to explain the principles of the invention and not to limit the invention.

Fig. 1 is a flowchart illustrating a method for classifying a data packet according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a packet classifying device according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a rule structure according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an alternative packet classification apparatus according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a rule set and storage table structure according to an embodiment of the present invention;

fig. 6 is a schematic structural diagram of another rule set and storage table according to an embodiment of the present invention.

Detailed Description

To make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.

The steps illustrated in the flow charts of the figures may be performed in a computer system such as a set of computer-executable instructions. Also, while a logical order is shown in the flow diagrams, in some cases, the steps shown or described may be performed in an order different than here.

In the related art, a typical flow classification algorithm includes: the method comprises three categories of exhaustion method, rule set partition method and dimension decomposition method, wherein the exhaustion method comprises the following steps: the method comprises a linear matching method and a matching method based on Ternary Content Addressable Memory (TCAM), wherein the rule set division method comprises the following steps: the method comprises a grid dictionary tree (Trie tree) matching method, a Hicuts matching method and a tuple space matching method, wherein the dimension decomposition method comprises the following steps: bit Vector (BV) matching and Recursive Flow Classification (RFC) matching. The following description will be made in order for these packet classification methods:

1. exhaustion method

1.1 Linear matching method

The simplest linear data structure is used to store the classification rules, which are typically arranged in descending order according to a priority or cost function. When searching, the data packets are compared with the rules one by one until a matching rule is found. The algorithm is simple to implement and high in storage utilization rate, but the searching performance is poor, and matching items can be found only after all rules are traversed under the worst condition.

1.2 matching method based on TCAM

An exhaustive classification algorithm implemented in hardware employs a TCAM as the classification engine. The matching of the data packets can be completed once in one clock cycle.

2. Rule set partitioning method

2.1 Grid Trie tree matching method (Grid of Trie)

The grid Trie tree algorithm introduces a steering pointer, deletes repeated subtrees in the sub rule set, only one subtree is reserved, and a pointer is added to point to the reserved subtree at a parent node of the original deleted subtree.

2.2 Hicuts matching method

The Hicuts matching method combines two classification modes of decision tree searching and linear searching, adopts multi-level spatial decomposition, each level of decomposition is carried out on one dimension, and divides a rule set into small rule sets in each leaf node. The starting point of the algorithm division is to automatically adjust the data structure according to the characteristics of the rule set, and the optimized data structure is utilized to the maximum extent to reduce redundancy.

2.3 tuple space matching method

The Tuple space algorithm firstly calculates the number of valid bits in prefixes of all dimensions of each rule in the rule set to obtain a Tuple (Tuple), and all the tuples of the rules form a set which is called Tuple space. And dividing all rules in the rule set into a plurality of sub-rule sets according to the tuple space, wherein each tuple corresponds to one sub-rule set. And inside each sub-rule set, establishing a hash table according to the effective bits in each one-dimensional prefix of the rule to store all the rules of the sub-rule set.

3. Dimension decomposition method

3.1 BV matching method

The algorithm constructs a one-dimensional Trie tree for each dimension, and formulates a bitmap for each node with a valid prefix in the Trie tree, wherein the bitmap is an n-bit feature vector (n is the number of rules) for identifying the rule matched with the prefix corresponding to the node, for example, if the prefix of a certain node is matched with the rule R (i), the ith position in the bitmap is 1, otherwise, 0 is set. During searching, longest prefix matching is carried out on each dimension of the data packet in a corresponding Trie tree to obtain a corresponding bitmap, and finally, the bitmap of each dimension is subjected to AND operation, and finally, the rule identified by the bit 1 in the bitmap is the optimal matching rule.

3.2 RFC matching method

The algorithm divides the whole domain into a plurality of segments by utilizing the structural characteristics of a rule set, and realizes the distribution compression mapping of the space by calculating the equivalence class and the recursion operation in the segments.

The performance of a flow classification algorithm is measured mainly by the following aspects: temporal complexity, spatial complexity, update complexity, scalability, and the arbitrariness of rules. If the number of fields included in the rule set is denoted by d, the algorithm input bit width is denoted by N, the number of rules is denoted by N, the bit width of one field in the rule set used by the algorithm is denoted by W, the total bit width of a rule is denoted by W, and the number of stages in the hierarchical implementation of the RFC algorithm is denoted by P, the performance comparison of the above algorithm is as shown in table 1 below,

TABLE 1

To this end, an embodiment of the present invention provides a method for classifying a data packet, as shown in fig. 1, the method includes:

step 101, obtaining prefix information, accurate information and range information from a data packet to be classified.

The prefix information is information corresponding to a prefix basic information matching field in a prefix matching field of a preset rule set, the accurate information is information corresponding to an accurate matching field in the preset rule set, and the accurate information is information corresponding to the accurate matching field in the preset rule set.

And 102, respectively taking prefix information, accurate information and range information as dimensions, and acquiring a classification result of the data packet to be classified by utilizing a pre-established prefix matching table and a rule set classification table.

According to the data packet classification method provided by the embodiment of the invention, as the prefix information, the range information and the accurate information are obtained from the data packet to be classified, the classification result is obtained only by searching according to the three dimensions of the prefix information, the accurate information and the range information, so that the data packet is rapidly classified, and the classification efficiency is greatly improved.

Optionally, the obtaining, by using a pre-established prefix matching table and a rule set classification table, classification results of the data packets to be classified by using prefix information, precise information, and range information as dimensions respectively includes:

and 102a, acquiring prefix matching information corresponding to the prefix information from the prefix matching table as target prefix matching information.

And 102b, acquiring data records corresponding to the target prefix matching information and the accurate information from the rule set classification table.

Wherein the data record comprises: prefix matching information, exact matching information, multiple range matching information, and their corresponding action matching information.

And 102c, acquiring range matching information containing range information from the acquired data record as target range matching information.

And 102d, acquiring action matching information corresponding to the target range matching information as a classification result of the data packet to be classified.

It should be noted that the prefix matching table includes: a plurality of prefix matching information; the rule set classification table comprises: a plurality of data records.

Optionally, obtaining prefix matching information corresponding to the prefix information from the prefix matching table includes:

step 102a1, generating a first key value according to the prefix information and each prefix matching length.

The number of the first key values is plural.

Specifically, the method for generating the key value may be generated by performing a predetermined operation on the prefix information and the prefix matching length, or may be generated by intercepting the prefix information according to the prefix matching length, which is not limited in the present invention.

Step 102a2, performing hash operation on the obtained first key value by using a first preset hash function to obtain a first hash value.

It should be noted that, because the number of the first key values is multiple, performing the hash operation on the obtained first key value by using the first preset hash function means: and performing hash operation on each obtained first key value by using a first preset hash function.

And 102a3, reading a prefix matching table by taking the obtained first hash value as an address to obtain prefix matching information.

It should be noted that, reading the prefix matching table by using the obtained first hash value as an address, and obtaining prefix matching information refers to: and reading the prefix matching table by taking each obtained first hash value as an address to obtain prefix matching information.

And 102a4, determining prefix matching information which is the same as the first key value in the obtained prefix matching information as target prefix matching information.

Optionally, obtaining data records corresponding to the target prefix matching information and the precise information from the rule set classification table includes:

and 102b1, generating a second key value according to the target prefix matching information and the accurate matching information.

It should be noted that there may be multiple target prefix matching information, and when there are multiple target prefix matching information, generating the second key value according to the target prefix matching information and the exact matching information means: and generating a plurality of second key values according to each target prefix matching information and the accurate matching information.

And 102b2, performing hash operation on the second key value by using a second preset hash function to obtain a second hash value.

It should be noted that, when a plurality of second key values are provided, performing hash operation on the second key value by using a second preset hash function means: and performing hash operation on each second key value by using a second preset hash function.

And 102b3, reading the rule set classification table by taking the obtained second hash value as an address to obtain a data record.

It should be noted that, when there are a plurality of second hash values, reading the rule set classification table with the obtained second hash value as the address means: and reading the rule set classification table by taking each obtained second hash value as an address.

And step 102b4, determining data records corresponding to the target prefix matching information and the accurate information in the obtained data records.

Optionally, the number of the prefix matching tables is the same as the number of the kinds of the prefix matching lengths.

Reading a prefix matching table by taking the obtained first hash value as an address, wherein the prefix matching table comprises:

and reading a prefix matching table by taking each obtained first hash value as an address.

The prefix matching read by taking any one of the first hash values as an address is different from the prefix matching table read by taking any one of the other first hash values as an address.

It should be noted that, when the number of prefix matching tables is the same as the number of types of prefix matching lengths, the reading speed of prefix matching information can be greatly increased when each obtained first hash value is used as an address to read one prefix matching table.

Specifically, assuming that the number of the types of the prefix matching lengths is N, the number of the obtained first hash values is also N, and the number of the corresponding prefix matching tables is also N, and reading one prefix matching table with each obtained first hash value as an address means: the method comprises the steps of reading a1 st prefix matching table by taking an obtained 1 st first hash value as an address, reading a2 nd prefix matching table by taking an obtained 2 nd first hash value as the address, \8230, and reading an Nth prefix matching table by taking an obtained Nth first hash value as the address, wherein the 1 st prefix matching table and the 2 nd prefix matching table \8230arenot limited to the prefix matching table, but are only used for identifying the prefix matching table.

Optionally, determining prefix matching information that is the same as the first key value in the obtained prefix matching information includes:

step 102a4a, determining whether the obtained prefix matching information is the same as a first key value according to which the prefix matching information is based in the reading process.

And 102a4b, acquiring prefix matching information which is the same as the first key value and is obtained in the reading process and is used as target prefix matching information.

Optionally, determining, from the obtained data records, a data record corresponding to both the target prefix matching information and the precise information includes:

and step 102b4a, judging whether the prefix matching information in the obtained data record is the same as the target prefix matching information or not, and whether the accurate matching information is the same as the accurate information or not.

It should be noted that, if there are multiple obtained data records, determining whether the prefix matching information in the obtained data record is the same as the target prefix matching information, and whether the exact matching information is the same as the exact information means: and judging whether the prefix matching information in each obtained data record is the same as the target prefix matching information or not and whether the accurate matching information is the same as the accurate information or not.

And 102b4b, acquiring data records of which the prefix matching information is the same as the target prefix matching information and the accurate matching information is the same as the accurate information, and taking the data records as data records corresponding to the target prefix matching information and the accurate information.

Optionally, before obtaining prefix information corresponding to a prefix information matching field in prefix matching fields of a preset rule set, range information corresponding to a range matching field, and accurate information corresponding to an accurate matching field from a to-be-classified data packet, the method further includes:

step 103, acquiring prefix matching information, range matching information, accurate matching information and action information of the rule in the preset rule set.

Wherein, the prefix matching information includes: prefix basic matching information and a prefix matching length corresponding to the prefix basic matching information.

It should be noted that obtaining prefix matching information, range matching information, precise matching information, and action information of the rule in the preset rule set means: prefix matching information, range matching information, accurate matching information and action information of each rule in a preset rule set are obtained.

And 104, filling a pre-established first empty table according to the obtained basic prefix matching information and the prefix matching length corresponding to the basic prefix matching information to obtain a prefix matching table.

Specifically, the number of the first empty tables may be one, or may be the same number as the number of the types of prefix matching lengths. When the number of the first empty tables is the number of the same number as the type number of the prefix matching length, the reading speed of the subsequent prefix matching information can be effectively improved.

And 105, filling a pre-established second empty table according to the obtained prefix matching information, range matching information, accurate information and action information to obtain a rule set classification table.

Optionally, the step of filling a pre-established first empty table according to the obtained prefix basic matching information and the prefix matching length corresponding to the prefix basic matching information includes:

and 104a, generating a third correlation key value according to the prefix matching basic information and the prefix matching length corresponding to the prefix matching basic information.

And 104b, carrying out hash operation on the obtained third key value by using a first preset hash function to obtain a third hash value.

It should be noted that, if the number of the third key values is multiple, performing the hash operation on the obtained third key value by using the first preset hash function means: and performing hash operation on each obtained third key value by using a first preset hash function.

And step 104c, writing the third key value into the first empty table by taking the obtained third hash value as an address.

It should be noted that, if the number of the third hash values is multiple, writing the third key value into the first empty table by using the obtained third hash value as an address means: and writing the third key value into the first empty table by taking each obtained third hash value as an address.

Optionally, the filling a second empty table established in advance according to the obtained prefix matching information, range matching information, accurate information, and action information includes:

and 105a, generating a fourth key value according to the obtained prefix matching information and the corresponding accurate matching length.

And 105b, carrying out hash operation on the obtained fourth key value by using a second preset hash function to obtain a fourth hash value.

It should be noted that, if a plurality of fourth key values are obtained, performing a hash operation on the obtained fourth key value by using a second preset hash function means: and performing hash operation on each obtained fourth key value by using a second preset hash function.

And 105c, reading the second empty table by taking the fourth hash value as an address to obtain a data record.

It should be noted that, if there are multiple fourth hash values, reading the second empty table with the fourth hash value as an address means: and reading the second empty table by taking each fourth hash value as an address.

And 105d, filling a second empty table according to the obtained data record, the prefix matching information, the range matching information, the accurate information and the action information.

Optionally, populating a second empty table according to the obtained data records, the prefix matching information, the range matching information, the precision information, and the action information, including:

and 105d1, judging whether the prefix matching field and the precise matching field in the obtained data record are empty or not.

And 105d2, if the prefix matching field and the accurate matching field in the obtained data record are both empty, filling the prefix matching information, the accurate matching information, the range matching information and the action information into the prefix matching field, the accurate matching field, the range matching field and the action field of the data record.

It should be noted that, if the prefix matching field and the exact matching field in the obtained data record are both empty, it indicates that the data record has not been written with data yet. Accordingly, once the data record is written with data, the prefix match field and exact match field are necessarily not empty, even if the range match field and action field are empty.

Optionally, if the prefix matching field and the exact matching field in the obtained data record are not empty, the method further includes:

and 105d3, judging whether the information on the prefix matching field in the obtained data record is the same as the prefix matching information or not and whether the information on the precise matching field in the data record is the same as the precise matching information or not.

And 105d4, if the information on the prefix matching field in the obtained data record is the same as the prefix matching information and the information on the precise matching field in the data record is the same as the precise matching information, judging whether an empty range matching field and an empty action field exist in the obtained data record.

Step 105d 5, if there are empty scope matching fields and action fields in the obtained data record, fill the scope matching fields and action fields in the empty scope matching fields and action fields.

An embodiment of the present invention provides a device for classifying a data packet, and as shown in fig. 2, the device 2 for classifying a data packet includes:

an obtaining module 21, configured to obtain prefix information, accurate information, and range information from a data packet to be classified; the prefix information is information corresponding to a prefix basic information matching field in a prefix matching field of a preset rule set, the accurate information is information corresponding to an accurate matching field in the preset rule set, and the accurate information is information corresponding to the accurate matching field in the preset rule set.

And the processing module 22 is configured to obtain classification results of the data packets to be classified by using prefix matching tables and rule set classification tables established in advance, with the prefix information, the precision information, and the range information as dimensions.

Optionally, the processing module 22 is specifically configured to:

and acquiring prefix matching information corresponding to the prefix information from the prefix matching table as target prefix matching information.

And acquiring data records corresponding to the target prefix matching information and the accurate information from the rule set classification table. Wherein the data record comprises: prefix matching information, exact matching information, multiple range matching information and their corresponding action matching information.

Range matching information including range information is acquired from the obtained data record as target range matching information.

And acquiring action matching information corresponding to the target range matching information as a classification result of the data packet to be classified.

Optionally, the processing module 22 is specifically configured to:

and generating a first key value according to the prefix information and each prefix matching length.

And carrying out hash operation on the obtained first key value by utilizing a first preset hash function to obtain a first hash value.

And reading the prefix matching table by taking the obtained first hash value as an address to obtain prefix matching information.

And determining prefix matching information which is the same as the first key value in the obtained prefix matching information as target prefix matching information.

Optionally, the processing module 22 is further specifically configured to:

and generating a second key value according to the target prefix matching information and the accurate matching information.

And carrying out hash operation on the obtained second key value by utilizing a second preset hash function to obtain a second hash value.

And reading the rule set classification table by taking the obtained second hash value as an address to obtain a data record.

And determining the data records corresponding to the target prefix matching information and the accurate information in the obtained data records.

Optionally, the number of the prefix matching tables is the same as the number of the kinds of the prefix matching lengths. The processing module 22 is further specifically configured to read a prefix matching table by using each obtained first hash value as an address. The prefix matching read by taking any one of the first hash values as an address is different from the prefix matching table read by taking any one of the other first hash values as an address.

Optionally, the processing module 22 is further specifically configured to:

and judging whether the obtained prefix matching information is the same as a first key value according to which the prefix matching information is read.

And acquiring prefix matching information which is the same as the first key value and is obtained in the reading process as target prefix matching information.

Optionally, the processing module 22 is further specifically configured to:

and judging whether the prefix matching information in the obtained data record is the same as the target prefix matching information or not and whether the accurate matching information is the same as the accurate information or not.

And acquiring data records of which the prefix matching information is the same as the target prefix matching information and the accurate matching information is the same as the accurate information as data records corresponding to the target prefix matching information and the accurate information.

Optionally, the processing module 22 is further configured to:

prefix matching information, range matching information, accurate matching information and action information of the preset rule set rule are obtained. Wherein, the prefix matching information includes: prefix basic matching information and a prefix matching length corresponding to the prefix basic matching information.

And filling a pre-established first empty table according to the obtained prefix basic matching information and the prefix matching length corresponding to the prefix basic matching information to obtain a prefix matching table.

And filling a pre-established second empty table according to the obtained prefix matching information, the range matching information, the accurate information and the action information to obtain a rule set classification table.

Optionally, the processing module 22 is further specifically configured to:

and generating a third key value according to the prefix matching basic information and the prefix matching length corresponding to the prefix matching basic information.

And carrying out hash operation on the obtained third correlation key value by utilizing a first preset hash function to obtain a third hash value.

And writing the third key value into the first empty table by taking the obtained third hash value as an address.

Optionally, the processing module 22 is further specifically configured to:

and generating a fourth key value according to the obtained prefix matching information and the corresponding accurate matching length.

And carrying out hash operation on the obtained fourth key value by utilizing a second preset hash function to obtain a fourth hash value.

And reading the second empty table by taking the fourth hash value as an address to obtain a data record.

And filling a second empty table according to the obtained data record, the prefix matching information, the range matching information, the accurate information and the action information.

Optionally, the processing module 22 is further specifically configured to:

and judging whether the prefix matching field and the precise matching field in the obtained data record are empty or not.

And if the prefix matching field and the accurate matching field in the obtained data record are both empty, filling the prefix matching information, the accurate matching information, the range matching information and the action information into the prefix matching field, the accurate matching field, the range matching field and the action field of the data record.

Optionally, if the prefix matching field and the exact matching field in the obtained data record are not null, the processing module 22 is further specifically configured to:

and judging whether the information on the prefix matching field in the obtained data record is the same as the prefix matching information or not and whether the information on the precise matching field in the data record is the same as the precise matching information or not.

And if the information on the prefix matching field in the obtained data record is the same as the prefix matching information and the information on the precise matching field in the data record is the same as the precise matching information, judging whether an empty range matching field and an action field exist in the obtained data record.

If the obtained data record has empty scope matching fields and action fields, the scope matching fields and action fields are filled in the empty scope matching fields and action fields.

According to the data packet classification device provided by the embodiment of the invention, the prefix information, the range information and the accurate information are obtained from the data packet to be classified, and the classification result is obtained only by searching according to the three dimensions of the prefix information, the accurate information and the range information, so that the data packet is rapidly classified, and the classification efficiency is greatly improved.

In practical applications, the obtaining module 21 and the Processing module 22 may be implemented by a Central Processing Unit (CPU), a microprocessor Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like, which are located in the packet classification device.

The invention also provides a data packet classification device, and the rule set R is assumed to contain N (N = 2) ^I I is a positive integer) rules, respectively: r (1), R (2), \ 8230;, R (N). The structure diagram of the rule is shown in fig. 3, and includes 3 fields: the field device comprises an F1 field, an F2 field, an F3 field and an action ACT field, wherein the F1 field is a prefix matching field, the F2 field is a range matching field, the F3 field is an exact matching field, and the F1 field, the F2 field and the F3 field are all represented in binary.

The F1 field is composed of prefix matching basic information F1_ info having a bit width of F1_ info _ len, and a prefix size F1_ pre having a bit width of F1_ pre _ len. Where the maximum value of F1_ pre is F1_ info _ len, and the total length of F1 field F1_ len = F1_ info _ len + F1_ pre _ len.

The F2 field is composed of a lower range limit F2_ min having a bit width of F2_ min _ len, and an upper range limit F2_ max having a bit width of F2_ max _ len. Wherein F2_ len = F2_ min _ len + F2_ max _ len.

The F3 field is denoted F3_ info, and the bit width is F3_ len.

As shown in fig. 4, the packet classification apparatus includes: a rule set preprocessing module 31, a prefix matching table storage module 32, a rule set merging storage module 33, a rule set query module 34 and a rule set matching selection module 35.

And the rule set preprocessing module 31 is configured to extract information from the rule set R, perform hash, and perform read-write operation on the prefix matching table storage module 32 and the rule set and storage module 33.

And a prefix matching table storage module 32, configured to store prefix matching information in the rule set R.

And a rule set merge storage module 33 for storing the rules in the rule set R.

And a rule set query module 34 for extracting field information corresponding to the rule set R from the input data packet, and then reading the prefix matching table storage module 32 and the rule set storage module 33 after hashing the extracted information by using the same hash function as that of the rule set preprocessing module 31.

And the rule set matching selection module 35 is configured to compare the fields extracted from the data packets to be classified with the fields returned by the rule set and storage module 33, so as to obtain an optimal matching result.

A specific embodiment is provided below to explain a process of classifying a packet by the packet classification apparatus according to the embodiment of the present invention, where the process may be divided into two parts, which are: a rule preprocessing process and a packet classification process.

And (3) rule preprocessing:

step 1, let i =1, and steps 2 and 2

\ u

1 and 2 \.

Step 2.1, extracting prefix matching information from R (1), that is, extracting information corresponding to an F1 field, which is denoted as F1_ info and F1_ pre, taking KEY _ F1= { F1_ info, F1_ pre }, then hashing KEY _ F1 with Hash function Hash1, hi1= Hash1 (KEY _ F1), hi1 is a binary number with an I-bit width, and writing KEY _ F1 into M (M = F1_ info _ len) prefix matching tables at the same time, where the M prefix matching tables are: r _ F1_ table _1, R _ F1_ table _2.. R _ F1_ table _ M, M prefix matching tables are stored in the prefix matching storage module. Where their write address is Hi1.

Step 2.2, (a) extracting prefix information and accurate information from R (1), namely information corresponding to the F1 field and the F3 field, and recording as: f1_ info, F1_ pre, and F3_ info, which are grouped into KEY values KEY _ R = { F1_ info, F1_ pre, F3_ info }. (b) KEY _ R is hashed using Hash function Hash2, hi2= Hash2 (KEY _ R), hi2 being a binary number of I bits wide. (c) The Hi2 is used as an address to read a rule set merge storage table R _ table in the rule set merge storage module, and a schematic structural diagram of the rule set merge storage table is shown in FIG. 5. (d) And judging whether the F1 field and the F3 field are empty, if the F1 field and the F3 field are both empty, writing prefix information into the F1 field of the R _ table, writing accurate information into the F3 field of the R _ table, writing range information into the first F2 field of the R _ table, and writing action information into an ACT field corresponding to the first F2 field of the R _ table. If the F1 field and the F3 field are not empty, firstly judging whether the information in the F1 field is the same as prefix information or not, and whether the information in the F3 field is the same as accurate information or not, if so, continuously judging whether an empty F2 field and a corresponding ACT field exist or not, if an empty F2 field and a corresponding ACT field exist, writing range information into the empty F2 field in an R _ table, writing action information into the corresponding ACT field, and otherwise, determining that the preprocessing fails. The write address of this step is Hi2.

Step 3, let i =2, execute step 2_1 and 2_2 at the same time again until i = N, and the rule preprocessing process ends.

A data packet classification process:

the process is realized by a rule set query module 34 and a rule set matching selection module 35, and comprises the following specific steps:

step 1, the rule set query module 34 extracts information of fields corresponding to the rule set R from the data packets to be classified, and the information includes: f1_ P, F2_ P, and F3_ P.

Step 2, using F1_ P and 1, 2.. M to generate a KEY value, respectively, which is recorded as: KEY _ P _1, KEY _ P _2. The method of generating the KEY value here is: the high F1_ P _ pre bit of the KEY value is the same as the high F1_ P _ pre bit of the F1_ P, and the other bits are 0.

And step 3, using a Hash function Hash1 to Hash KEY _ P _1 and KEY _ P _2.. KEY _ P _ M at the same time to obtain M Hash results Hi1_ P _1 and Hi1_ P _2.. Hi1_ P _ M. Hi1_ P _1 and Hi1_ P _2.. Hi1_ P _ M are all binary numbers with bit width of I bits.

And 4, reading the prefix matching tables R _ F1_ table _1 and R _ F1_ table _2.. R _ F1_ table _ M by using Hi1_ P _1 and Hi1_ P _2.. Hi1_ P _ M.

Step 5, returning M results to R _ F1_ table _1 and R _ F1_ table _2.. R _ F1_ table _ M, comparing the results with KEY _ P _1 and KEY _ P _2.. KEY _ P _ M respectively, extracting KEY values with consistent comparison results, and marking the KEY values as KEY _ P _1 and KEY _ P _2.. KEY _ P _ J and J ≦ M.

Step 6, generating new KEY values by using KEY _ P _1, KEY _ P _2.. KEY _ P _ J and F3_ P respectively, and marking the new KEY values as KEY _ R _1= { KEY _ P _1, F3_ P }, KEY _ R _2= { KEY _ P _2, F3_ P }. KEY _ R _ J = { KEY _ P _ J, F3_ P }.

And 7, hashing the KEY _ R _1 and the KEY _ R _2.. KEY _ R _ J respectively by using a Hash function Hash2, and outputting Hash values of Hi2_ R _1 and Hi2_ R _2.. Hi2_ R _ J. Hi2_ R _1 and Hi2_ R _2.. Hi2_ R _ J are binary numbers with bit width of I bits.

And 8, reading the R _ table sequentially by using Hi2_ R _1 and Hi2_ R _2.

Step 9, the R _ table returns J data records, the format of each data record is as shown in fig. 5, KEY _ R _1, KEY _ R _2.. KEY _ R _ J is compared with the F1 field and the F3 field in the J returned data records, data records with consistent comparison results are left, then F2_ P is used for dematching in the range field in the left results, the range field matched with F2_ P is found out, the ACT field is determined after the range field is determined, the ACT field is output, and if no match exists finally, the current stream classification fails.

A specific embodiment is provided below to explain a process of classifying a data packet by the data packet classification apparatus according to the embodiment of the present invention. Assume that 16K rules are included in the rule set R, i.e., N = 1691, i =14. And assume that the rule set contains five-tuple information of Internet Protocol (IP), that is, the rule contains 5 fields of F1 field, F2 field, F3 field, F4 field and F5 field, the F1 field is a destination IP address prefix matching field, the bit width of F1_ info is 32 bits, and F1_ pre is 6 bits; the F2 field is a source IP address prefix matching field, the bit width of the F2_ info is 32 bits, and the F2_ pre is 6 bits; f3 is a destination port range field, F3_ min is 16 bits, and F3_ max is 16 bits; f4 is a source port range field, F4_ min is 16 bits, and F4_ max is 16 bits; f5 is the protocol field (exact match) and the bit width is 8 bits. And L is set to 5 as shown in fig. 6.

The process can be divided into two parts, respectively: a rule preprocessing process and a packet classification process.

And (3) rule preprocessing:

step 1, let i =1, and simultaneously perform steps 2 _1and 2_2;

and 2.1, extracting destination IP address prefix information of R (1), namely information corresponding to the F1 field, hashing the information as a KEY value by using a Hash function Hash1 to obtain a Hash value Hi1_ F1, and writing the extracted destination IP address prefix information into F1 fields of 32 prefix matching tables R _ F1_ table _ 1-R _ F1_ table _32 by taking the Hash value Hi1_ F1 as an address. Extracting source IP address prefix information of R (1), namely information corresponding to an F2 field, hashing the information as a KEY value by using a hashing function Hash1 to obtain a Hash value Hi1_ F2, and writing the extracted source IP address prefix information into F2 fields of 32 prefix matching tables R _ F2_ table _ 1-R _ F2_ table _32 by taking the Hash value Hi1_ F2 as an address. R _ F1_ table _1 to R _ F1_ table _32 and R _ F2_ table _1 to R _ F2_ table _32 are stored in the prefix matching table storage module 32.

Step 2.2, (a) extracting destination IP address prefix information, source IP address prefix information and protocol information of R (1), that is, information corresponding to F1, F2 and F5 fields, generating a KEY value, hashing with a Hash function Hash2 to obtain a Hash value Hi2_ R, reading a rule set and storing table R _ table in the rule set and storing module with Hi2_ R as an address, wherein a schematic structural diagram of the rule set and storing table is shown in fig. 6. (b) And returning data by the rule set merging and storing module, judging whether the F1 field, the F2 field and the F5 field are empty, if the F1 field, the F2 field and the F5 field are empty, writing the destination IP address prefix information into the F1 field of the R _ table, writing the source IP address prefix information into the F2 field of the R _ table, and writing the protocol information into the F5 field of the R _ table. If the F1 field, the F2 field and the F5 field are not empty, firstly, whether the information in the F1 field is the same as the prefix information of the destination IP address, whether the information in the F2 field is the same as the prefix information of the destination IP address, whether the information in the F5 field is the same as the protocol information, if the information in the F1 field is the same as the prefix information of the destination IP address, continuously judging whether the empty F3 field, the empty F4 field and the corresponding ACT field exist, if the empty F3 field, the empty F4 field and the corresponding ACT field exist, writing the range information of the destination port into the empty F3 field in the R _ table, writing the range information of the source port into the empty F4 field in the R _ table, writing the action information into the corresponding ACT field, and determining that the preprocessing fails under other conditions.

Step 3, let i =2, and execute

step

2 and 2\u2 at the same time again until i = N, and the rule preprocessing process ends.

A data packet classification process:

step 1, the rule set query module 34 extracts information of fields corresponding to the rule set R from the data packets to be classified, and the information includes: f1_ P, F2_ P, F3_ P, F4_ P, and F5_ P.

Step 2.1, (a) taking 1-32 as prefix size, respectively generating KEY value with F1_ P: KEY _ P _ F1_1, KEY _ P _ F1_2, KEY _ P _ F1_32. (b) The 32 KEY values are hashed simultaneously by using a Hash function Hash1, and 32 Hash values are output: hi1_ P _ F1_1, hi1_ P _ F1_2.. Hi1_ P _ F1_32, and read R _ F1_ table _1, R _ F1_ table _2.. R _ F1_ table _32 with these 32 hash values as addresses (each hash value reads only one table). (c) And comparing the data returned by each table with the respective KEY value, and storing the consistent KEY in R _ F1_ table _ KEY (u), wherein u is less than or equal to 5.

Step 2.2, (a) taking 1-32 as prefix size, respectively generating KEY value with F2_ P: KEY _ P _ F2_1, KEY _ P _ F2_2.. KEY _ P _ F2_32. (b) The 32 KEY values are hashed simultaneously by using a Hash function Hash1, and 32 Hash values are output: hi1_ P _ F2_1, hi1_ P _ F2_2.. Hi1_ P _ F2_32, and read R _ F2_ table _1, R _ F2_ table _2.. R _ F2_ table _32 with these 32 hash values as addresses (each hash value reads only one table). (c) And comparing the data returned by each table with the respective KEY value, and storing the consistent KEY in R _ F2_ table _ KEY (v), wherein v is less than or equal to 5.

And 3, generating a new KEY value according to R _ F1_ table _ KEY (u), R _ F2_ table _ KEY (v) and F5_ P, and storing the new KEY value in KEY _ P, wherein the KEY _ P has a total of 25 combinations of u x v x 1 ≦ 5 x 1.

And 4, hashing the KEY _ P by using a Hash function Hash2, reading the rule set merging storage table in the rule set merging storage module by using a Hash address, returning data with the beat of at most 25, and sending the data to the rule matching selection module 35.

And step 5, the execution rule matching selection module 35 receives the rule set and the data output by the storage module 33.

And 6, comparing the KEY _ P with the F1 field, the F2 field and the F5 field in the returned data, matching the range fields existing in the F3_ P, the F4_ P and the returned data in the consistent data, selecting the optimal matching result, and outputting the corresponding ACT.

It should be noted that, the method and the device for classifying data packets provided by the present invention do not need to perform complex repartitioning on the rule set R, and only need to perform hash according to corresponding fields of the rules and store the hash in the prefix matching table storage module and the rule set and storage module during preprocessing. When the data packet to be classified is classified, only required fields are extracted from the packet, then hash is carried out, the hash result is used for reading the prefix matching table storage table and the rule set merging storage table, and finally the fields are extracted from the data packet according to the reading result for matching. As can be seen from the above description, at most M matches can be read back from the prefix match table storage module, that is, at most M rules need to be read from the rule set and the storage table for final comparison, where in the existing rule for flow classification (except for classifying the addresses of the Internet Protocol Version 6 (ipv 6)), M is at most 32, and M is smaller than 32 in actual use. It is statistically known that for a given IP address, the number of occurrences in the rule set with different prefixes is no greater than 5, so M is no greater than 5. The temporal complexity of this patent is O (M + 1), the spatial complexity is O (2 × N), and N is the number of rules in the rule set. When the rules need to be updated, including adding or deleting the rules, the rules to be added or deleted only need to be input into the rule set preprocessing module for hash processing, and then the prefix matching table storage module and the rule set merging storage module are added or deleted, so that the rule table updating process is simplified. Other algorithms will have an influence on the existing rules when updating the rule set, and most seriously, the whole rule set needs to be rearranged, which consumes time and has a complex process. When the rule set needs to be expanded, only the prefix matching table storage module and the rule set need to be modified, the depth of the storage module and the output bit width of the used hash function need to be stored, and the rule set is easy to adapt to the rule expansion.

The embodiment of the invention also provides a data packet classification device, which comprises a memory and a processor, wherein the memory stores the following instructions which can be executed by the processor:

acquiring prefix information, accurate information and range information from a data packet to be classified; the prefix information is information corresponding to a prefix basic information matching field in a prefix matching field of a preset rule set, the accurate information is information corresponding to an accurate matching field in the preset rule set, and the accurate information is information corresponding to the accurate matching field in the preset rule set.

And respectively taking prefix information, accurate information and range information as dimensions, and acquiring classification results of the data packets to be classified by utilizing a pre-established prefix matching table and a rule set classification table.

Optionally, the memory has stored therein the following instructions executable by the processor:

Optionally, the memory further has the following instructions stored therein, which are executable by the processor:

Optionally, the number of the prefix matching tables is the same as the number of the types of the prefix matching lengths. The memory also has embodied therein the following instructions executable by the processor:

and reading a prefix matching table by taking the obtained first hash value as an address. The prefix matching read by taking any one of the first hash values as an address is different from the prefix matching table read by taking any other one of the first hash values as an address.

Optionally, the memory further stores the following instructions executable by the processor:

And acquiring prefix matching information which is the same as the first key value and the prefix matching information in the reading process and is used as target prefix matching information.

And acquiring data records of which the prefix matching information is the same as the target prefix matching information and the accurate matching information is the same as the accurate information as the data records corresponding to the target prefix matching information and the accurate information.

prefix matching information, range matching information, accurate matching information and action information of the preset rule set rule are obtained. Wherein, the prefix matching information includes: prefix basic matching information and prefix matching length corresponding to the prefix basic matching information.

and generating a third correlation key value according to the prefix matching basic information and the prefix matching length corresponding to the prefix matching basic information.

Optionally, if the prefix matching field and the exact matching field in the obtained data record are not empty, the following instructions executable by the processor are further specifically stored in the memory:

If the obtained data record has empty scope match fields and action fields, the scope match fields and action fields are filled in the empty scope match fields and action fields.

And respectively taking prefix information, accurate information and range information as dimensions, and utilizing a pre-established prefix matching table and a rule set classification table to obtain classification results of the data packets to be classified.

Optionally, the computer executable instructions specifically perform the steps of:

Optionally, the computer executable instructions further specifically perform the steps of:

Optionally, the number of the prefix matching tables is the same as the number of the kinds of the prefix matching lengths. The computer-executable instructions further specifically perform the steps of:

and reading a prefix matching table by taking the obtained first hash value as an address. The prefix matching read by taking any one of the first hash values as an address is different from the prefix matching table read by taking any one of the other first hash values as an address.

Optionally, the computer executable instructions further perform the steps of:

prefix matching information, range matching information, accurate matching information and action information of the preset rule set rules are obtained. Wherein, the prefix matching information includes: prefix basic matching information and a prefix matching length corresponding to the prefix basic matching information.

And carrying out hash operation on the obtained third key value by utilizing a first preset hash function to obtain a third hash value.

Optionally, if the prefix matching field and the exact matching field in the obtained data record are not null, the computer-executable instructions further specifically perform the steps of:

Although the embodiments of the present invention have been described above, the above description is only for the purpose of understanding the present invention, and is not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method of classifying a data packet, comprising:

acquiring prefix information, accurate information and range information from a data packet to be classified; the prefix information is information corresponding to a prefix basic information matching field in a prefix matching field of a preset rule set, and the accurate information is information corresponding to an accurate matching field in the preset rule set;

respectively taking the prefix information, the accurate information and the range information as dimensions, and acquiring a classification result of the data packet to be classified by utilizing a pre-established prefix matching table and a rule set classification table;

the method for acquiring the classification result of the data packet to be classified by using the pre-established prefix matching table and the rule set classification table by respectively using the prefix information, the accurate information and the range information as dimensions comprises the following steps:

acquiring prefix matching information corresponding to the prefix information from the prefix matching table as target prefix matching information;

acquiring data records corresponding to the target prefix matching information and the accurate information from the rule set classification table; wherein the data record comprises: prefix matching information, exact matching information, multiple range matching information and corresponding action matching information;

acquiring range matching information containing the range information from the acquired data record as target range matching information;

acquiring action matching information corresponding to the target range matching information as a classification result of the data packet to be classified;

the obtaining prefix matching information corresponding to the prefix information from the prefix matching table includes:

generating a first key value according to the prefix information and each prefix matching length;

performing hash operation on the obtained first key value by using a first preset hash function to obtain a first hash value;

reading the prefix matching table by taking the obtained first hash value as an address to obtain prefix matching information;

and determining prefix matching information which is the same as the first key value in the obtained prefix matching information as the target prefix matching information.

2. The method according to claim 1, wherein the obtaining the data record corresponding to the target prefix matching information and the precise information from the rule set classification table comprises:

generating a second key value according to the target prefix matching information and the accurate matching information;

performing hash operation on the obtained second key value by using a second preset hash function to obtain a second hash value;

reading the rule set classification table by taking the obtained second hash value as an address to obtain a data record;

3. The classification method according to claim 1, wherein the number of the prefix matching tables is the same as the number of the kinds of the prefix matching lengths;

the reading of the prefix matching table with the obtained first hash value as an address includes:

reading a prefix matching table by taking each obtained first hash value as an address; the prefix matching read by taking any one of the first hash values as an address is different from the prefix matching table read by taking any one of the other first hash values as an address.

4. The classification method according to claim 1, wherein the determining prefix matching information that is the same as the first key value from among the obtained prefix matching information includes:

judging whether the obtained prefix matching information is the same as a first key value according to which the prefix matching information is read or not;

and acquiring prefix matching information which is the same as the first key value and is obtained in the reading process as the target prefix matching information.

5. The method according to claim 2, wherein the determining, from among the obtained data records, a data record corresponding to both the target prefix matching information and the precise information comprises:

judging whether prefix matching information in the obtained data records is the same as the target prefix matching information or not and whether accurate matching information is the same as the accurate information or not;

6. The classification method according to claim 2, wherein before the obtaining prefix information corresponding to a prefix information matching field in prefix matching fields of a preset rule set, range information corresponding to a range matching field, and accurate information corresponding to an accurate matching field from the data packet to be classified, the method further comprises:

acquiring prefix matching information, range matching information, accurate matching information and action information of the preset rule set rule; wherein the prefix matching information includes: prefix basic matching information and prefix matching length corresponding to the prefix basic matching information;

filling a pre-established first empty table according to the obtained prefix basic matching information and the prefix matching length corresponding to the prefix basic matching information to obtain a prefix matching table;

and filling a pre-established second empty table according to the obtained prefix matching information, the range matching information, the accurate information and the action information to obtain the rule set classification table.

7. The method according to claim 6, wherein the step of filling a pre-established first empty table according to the obtained prefix basic matching information and the prefix matching length corresponding to the prefix basic matching information comprises:

generating a third correlation value according to the prefix matching basic information and the prefix matching length corresponding to the prefix matching basic information;

performing hash operation on the obtained third key value by using the first preset hash function to obtain a third hash value;

8. The method according to claim 6, wherein the populating a pre-established second empty table according to the obtained prefix match information, range match information, precision information, and action information includes:

generating a fourth key value according to the obtained prefix matching information and the corresponding accurate matching length;

performing hash operation on the obtained fourth key value by using the second preset hash function to obtain a fourth hash value;

reading the second empty table by taking the fourth hash value as an address to obtain a data record;

and filling the second empty table according to the obtained data record, the prefix matching information, the range matching information, the accurate information and the action information.

9. The method of claim 8, wherein populating the second empty table based on the obtained data records, prefix match information, range match information, precision information, and action information comprises:

judging whether the prefix matching field and the precise matching field in the obtained data record are empty or not;

10. The method of claim 9, wherein if the prefix match field and the exact match field in the obtained data record are not empty, further comprising:

judging whether the information on the prefix matching field in the obtained data record is the same as the prefix matching information or not and whether the information on the precise matching field in the data record is the same as the precise matching information or not;

if the information on the prefix matching field in the obtained data record is the same as the prefix matching information and the information on the precise matching field in the data record is the same as the precise matching information, judging whether an empty range matching field and an action field exist in the obtained data record;

and if the obtained data record has empty range matching fields and action fields, filling the range matching fields and the action fields into the empty range matching fields and the action fields.

11. An apparatus for classifying a packet, comprising:

the acquisition module is used for acquiring prefix information, accurate information and range information from the data packet to be classified; the prefix information is information corresponding to a prefix basic information matching field in a prefix matching field of a preset rule set, and the accurate information is information corresponding to an accurate matching field in the preset rule set;

the processing module is used for acquiring the classification result of the data packet to be classified by using a pre-established prefix matching table and a rule set classification table by respectively taking the prefix information, the accurate information and the range information as dimensions;

the method for obtaining the classification result of the data packet to be classified by using the pre-established prefix matching table and the rule set classification table by respectively using the prefix information, the accurate information and the range information as dimensions comprises the following steps:

obtaining range matching information containing the range information from the obtained data record as target range matching information;

12. An apparatus for classifying a packet, comprising: a processor and a memory, wherein the memory has stored therein the following instructions executable by the processor:

respectively taking the prefix information, the accurate information and the range information as dimensions, and utilizing a pre-established prefix matching table and a rule set classification table to obtain a classification result of the data packet to be classified;

prefix matching information corresponding to the prefix information is obtained from the prefix matching table and is used as target prefix matching information;

13. A computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a computer, perform the steps of:

carrying out hash operation on the obtained first key value by utilizing a first preset hash function to obtain a first hash value;