Disclosure of Invention
The invention provides a method and a device for representing network traffic, which address the following problem in the prior art: existing methods represent network traffic at an inappropriate granularity.
To solve the above technical problem, the present invention provides a method for representing network traffic, including: representing the semantics of the network traffic according to a predetermined event semantics method, using preset basic predicates and their arguments together with preset extended predicates and their arguments; defining semantic relationships between the network traffic and other network traffic according to the semantics of the network traffic; generating a set of network traffic according to a predetermined characteristic, based on the semantics of the network traffic and its semantic relationships with other network traffic; and determining the operating condition of the communication subject corresponding to the network traffic according to the set of network traffic.
Optionally, the basic predicates include: the five-tuple of the network traffic, the client host address, the server host address, the number of bytes contained in the network traffic, the number of packets contained in the network traffic, the occurrence time of the network traffic, connection establishment on the five-tuple of the network traffic, and the start time and end time of the network traffic, where the five-tuple comprises the source address, destination address, source port, destination port, and transport layer protocol of the network traffic; and the arguments of a basic predicate represent the communication parameters of the network traffic that the basic predicate describes.
Optionally, the extended predicates include: the inter-packet interval time sequence of the network traffic, the packet length sequence of the network traffic, a feature string of the network traffic, the geographic location of an address in the network traffic, the entropy of the network traffic in a predetermined direction, and the reset time of the network traffic; and the arguments of an extended predicate represent the communication parameters of the network traffic that the extended predicate describes.
Optionally, the predetermined event semantics method is the neo-Davidsonian event semantics method.
Optionally, the semantic relationships between network traffic include: sequential, concurrent, causal, and conditional relationships.
In addition, to achieve the above object, the present invention further provides a device for representing network traffic, including: a representing module, configured to represent the semantics of the network traffic according to a predetermined event semantics method, using preset basic predicates and their arguments together with preset extended predicates and their arguments; a defining module, configured to define semantic relationships between the network traffic and other network traffic according to the semantics of the network traffic; a set generating module, configured to generate a set of network traffic according to a predetermined characteristic, based on the semantics of the network traffic and its semantic relationships with other network traffic; and a determining module, configured to determine the operating condition of the communication subject corresponding to the network traffic according to the set of network traffic.
Optionally, in the device, the basic predicates include: the five-tuple of the network traffic, the client host address, the server host address, the number of bytes contained in the network traffic, the number of packets contained in the network traffic, the occurrence time of the network traffic, connection establishment on the five-tuple of the network traffic, and the start time and end time of the network traffic, where the five-tuple comprises the source address, destination address, source port, destination port, and transport layer protocol of the network traffic; and the arguments of a basic predicate represent the communication parameters of the network traffic that the basic predicate describes.
Optionally, in the device, the extended predicates include: the inter-packet interval time sequence of the network traffic, the packet length sequence of the network traffic, a feature string of the network traffic, the geographic location of an address in the network traffic, the entropy of the network traffic in a predetermined direction, and the reset time of the network traffic; and the arguments of an extended predicate represent the communication parameters of the network traffic that the extended predicate describes.
Optionally, in the device, the predetermined event semantics method is the neo-Davidsonian event semantics method.
Optionally, in the device, the semantic relationships between network traffic include: sequential, concurrent, causal, and conditional relationships.
The invention provides a method and a device for representing network traffic. The method represents the semantics of network traffic with preset predicates and their arguments according to a predetermined event semantics method, defines the relationships between the network traffic and other network traffic according to those semantics, generates a set of network traffic according to a predetermined characteristic based on the relationships, and finally determines the operating condition of the communication subject corresponding to the network traffic according to the set. By defining predicates and arguments for network traffic, representing the traffic semantically with a predetermined event semantics method, and forming sets of network traffic from its semantics and semantic relationships to represent the operating condition of a communication subject, the method represents network traffic accurately at an appropriate granularity and in a simple form, and thereby solves the following problem in the prior art: existing methods represent network traffic at an inappropriate granularity.
Detailed Description
To solve the following problem in the prior art, namely that existing methods represent network traffic at an inappropriate granularity, a first embodiment of the present invention provides a method for representing network traffic. A flowchart of the method is shown in fig. 1, and the method includes steps S102 to S108:
S102: representing the semantics of the network traffic according to a predetermined event semantics method, using preset basic predicates and their arguments together with preset extended predicates and their arguments.
In this embodiment, each piece of network traffic is regarded as an event. A set of predicates may be defined to describe the communication action corresponding to the network traffic, and the arguments describe the corresponding communication parameters of the network traffic.
S104: defining the semantic relationships between the network traffic and other network traffic according to the semantics of the network traffic.
After the semantics of a series of network traffic are expressed, the relationship between the network traffic and other network traffic is defined according to the expressed semantics of each network traffic.
S106: generating a set of network traffic according to a predetermined characteristic, based on the semantics of the network traffic and its semantic relationships with other network traffic.
The set of network traffic includes a series of network traffic having semantic relationships. The predetermined characteristic in this embodiment may be time or space; that is, a set may be generated from the network traffic within a certain time range, or from the network traffic within the geographic region corresponding to an IP address in the network traffic.
S108: determining the operating condition of the communication subject corresponding to the network traffic according to the set of network traffic.
The network traffic set generated in the above step may be a set formed by different network traffic generated by one communication subject under different conditions, so that the operation condition of the communication subject corresponding to the network traffic may be determined according to the generated network traffic set.
In addition, this embodiment defines the predicates and arguments of the network traffic as follows:
The basic predicates include at least: the five-tuple corresponding to the network traffic, the client host address, the server host address, the number of bytes contained in the network traffic, the number of packets contained in the network traffic, the occurrence time of the network traffic, connection establishment on the five-tuple of the network traffic, and the start time and end time of the network traffic. In this embodiment, the five-tuple of the network traffic comprises the source address, destination address, source port, destination port, and transport layer protocol of the network traffic. The arguments of a basic predicate represent the communication parameters of the network traffic that the basic predicate describes. For example, the argument of the client host address is the specific network address of the client host in the network traffic generated by that client host.
Besides the basic predicates, the extended predicates in this embodiment include at least: the inter-packet interval time sequence of the network traffic, the packet length sequence of the network traffic, a feature string of the network traffic, the geographic location of an address in the network traffic, the entropy of the network traffic in a predetermined direction, and the reset time of the network traffic. In this embodiment, the predetermined direction of the network traffic is either from the client to the server or from the server to the client, and the reset time of the network traffic is the time at which the network traffic is reset. The arguments of an extended predicate represent the communication parameters of the network traffic that the extended predicate describes.
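As a rough illustration of how the arguments of three of these extended predicates might be computed, the following Python sketch derives an interval time sequence, a length sequence, and a per-direction entropy from a list of packets. The packet layout (timestamp, direction, payload), the direction labels "c2s" and "s2c", and the choice of Shannon entropy over payload bytes are all assumptions made for this sketch; the embodiment does not specify how the entropy is calculated.

```python
import math
from collections import Counter


def inter_arrival_times(timestamps):
    """Interval time sequence: gaps between consecutive packet timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def byte_entropy(payload):
    """Shannon entropy of the payload in bits per byte (assumed definition)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def extended_arguments(packets, direction):
    """Compute hypothetical arguments for three extended predicates.

    packets: list of (timestamp, direction, payload) tuples, where the
    direction is "c2s" (client to server) or "s2c" (server to client).
    """
    timestamps = [t for t, _, _ in packets]
    directed_payload = b"".join(p for _, d, p in packets if d == direction)
    return {
        "intervals": inter_arrival_times(timestamps),
        "lengths": [len(p) for _, _, p in packets],
        "entropy": byte_entropy(directed_payload),
    }


packets = [
    (0.0, "c2s", b"abcd"),
    (0.5, "s2c", b"zzzzzz"),
    (2.0, "c2s", b"abcd"),
]
args = extended_arguments(packets, "c2s")
```

Here the interval and length sequences cover all packets of the traffic, while the entropy is restricted to the chosen direction, mirroring the wording of the predicate list.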
To give the semantic representation of network traffic a clear structure, the event semantics method adopted in this embodiment is the neo-Davidsonian event semantics method. This method corresponds to a predicate-argument structure and can clearly represent the semantics of network traffic.
Furthermore, in this embodiment, the semantic relationships between network traffic include at least: sequential, concurrent, causal, and conditional relationships.
The method for representing network traffic provided by this embodiment represents the semantics of network traffic with preset predicates and their arguments according to a predetermined event semantics method, defines the relationships between the network traffic and other network traffic according to those semantics, generates a set of network traffic according to a predetermined characteristic based on the relationships, and finally determines the operating condition of the communication subject corresponding to the network traffic according to the set. By defining predicates and arguments for network traffic, representing the traffic semantically with a predetermined event semantics method, and forming sets of network traffic from its semantics and semantic relationships to represent the operating condition of a communication subject, the method represents network traffic accurately and in a simple form, and thereby solves the following problem in the prior art: existing methods represent network traffic at an inappropriate granularity.
A second embodiment of the present invention provides an apparatus for representing network traffic. A schematic structural diagram of the apparatus is shown in fig. 2, and the apparatus includes: a representing module 10, configured to represent the semantics of network traffic according to a predetermined event semantics method, using preset basic predicates and their arguments together with preset extended predicates and their arguments; a defining module 20, coupled to the representing module 10 and configured to define the semantic relationships between the network traffic and other network traffic according to the semantics of the network traffic; a set generating module 30, coupled to the defining module 20 and configured to generate a set of network traffic according to a predetermined characteristic, based on the semantics of the network traffic and its semantic relationships with other network traffic; and a determining module 40, coupled to the set generating module 30 and configured to determine the operating condition of the communication subject corresponding to the network traffic according to the set of network traffic.
In the representing module, a predicate is a term used to describe or assert the properties, features, or relationships of objects, and a noun that combines with a predicate is called an argument.
Further, after representing the semantics of a series of network traffic, a definition module may be used to define the relationship between the network traffic and other network traffic based on the represented semantics of each network traffic.
Further, the set generating module generates a set of network traffic according to a predetermined characteristic, based on the semantics of the network traffic and its semantic relationships with other network traffic. The set of network traffic includes a series of network traffic having semantic relationships. The predetermined characteristic in this embodiment may be time or space; that is, a set may be generated from the network traffic within a certain time range, or from the network traffic within the geographic region corresponding to an IP address in the network traffic.
And finally, the determining module determines the operation condition of the communication main body corresponding to the network flow according to the set of the network flow. The network traffic set generated by the set generating module may be a set formed by different network traffic generated by a communication subject under different conditions, and therefore, according to the generated network traffic set, the operating condition of the communication subject corresponding to the network traffic may be determined.
In addition, this embodiment defines the predicates and arguments of the network traffic as follows:
The basic predicates include at least: the five-tuple corresponding to the network traffic, the client host address, the server host address, the number of bytes contained in the network traffic, the number of packets contained in the network traffic, the occurrence time of the network traffic, connection establishment on the five-tuple of the network traffic, and the start time and end time of the network traffic. In this embodiment, the five-tuple of the network traffic comprises the source address, destination address, source port, destination port, and transport layer protocol of the network traffic. The arguments of a basic predicate represent the communication parameters of the network traffic that the basic predicate describes. For example, the argument of the client host address is the specific network address of the client host in the network traffic generated by that client host.
Besides the basic predicates, the extended predicates in this embodiment include at least: the inter-packet interval time sequence of the network traffic, the packet length sequence of the network traffic, a feature string of the network traffic, the geographic location of an address in the network traffic, the entropy of the network traffic in a predetermined direction, and the reset time of the network traffic. In this embodiment, the predetermined direction of the network traffic is either from the client to the server or from the server to the client, and the reset time of the network traffic is the time at which the network traffic is reset. The arguments of an extended predicate represent the communication parameters of the network traffic that the extended predicate describes.
To give the semantic representation of network traffic a clear structure, the event semantics method adopted in this embodiment is the neo-Davidsonian event semantics method. This method corresponds to a predicate-argument structure and can clearly represent the semantics of network traffic.
Furthermore, in this embodiment, the semantic relationships between network traffic include at least: sequential, concurrent, causal, and conditional relationships.
In the apparatus for representing network traffic provided in the second embodiment of the present invention, the representing module represents the semantics of the network traffic with preset predicates and their arguments according to a predetermined event semantics method, the defining module defines the relationships between the network traffic and other network traffic according to those semantics, the set generating module generates a set of network traffic according to a predetermined characteristic based on the relationships, and the determining module determines the operating condition of the communication subject corresponding to the network traffic according to the set. By defining predicates and arguments for network traffic, representing the traffic semantically with a predetermined event semantics method, and forming sets of network traffic from its semantics and semantic relationships to represent the operating condition of a communication subject, the apparatus represents network traffic accurately and in a simple form, and thereby solves the following problem in the prior art: existing methods represent network traffic at an inappropriate granularity.
A third embodiment of the present invention provides a method for representing network traffic, where a flowchart of the method is shown in fig. 3, and includes steps S302 to S308:
S302: representing the semantics of the network traffic.
In this embodiment, the semantics of the network traffic are represented according to a predetermined event semantics method, using preset basic predicates and their arguments together with preset extended predicates and their arguments.
In this embodiment, each piece of network traffic is regarded as an event. A set of predicates may be defined to describe the communication action corresponding to the network traffic, and the arguments describe the corresponding communication parameters of the network traffic.
This embodiment defines the predicates and arguments of network traffic as shown in Table 1. The argument of a basic predicate is the part in parentheses after the predicate, and its specific content depends on the specific network traffic.
TABLE 1
Predicate (argument) | Description
IP_ADDRsource(ip1) | Source address ip1
IP_ADDRdest(ip1) | Destination address ip1
PORTsource(pt1) | Source port pt1
PORTdest(pt1) | Destination port pt1
PROTO(pr1) | Protocol pr1
HOSTclient(ip) | Client host ip
HOSTserver(ip) | Server host ip
COUNTbytes(cb1) | Byte count cb1
COUNTpkts(cp1) | Packet count cp1
TIME(t1) | The network traffic occurs at time t1
CONNECT(f1) | Network traffic f1 establishes a connection on its corresponding five-tuple
TIME(f1,st1,et1) | The start time and end time of network traffic f1 are st1 and et1, respectively
The basic predicates include at least: the five-tuple corresponding to the network traffic, the client host address, the server host address, the number of bytes contained in the network traffic, the number of packets contained in the network traffic, the occurrence time of the network traffic, connection establishment on the five-tuple of the network traffic, and the start time and end time of the network traffic. In this embodiment, the five-tuple of the network traffic comprises the source address, destination address, source port, destination port, and transport layer protocol of the network traffic. The arguments of a basic predicate represent the communication parameters of the network traffic that the basic predicate describes. For example, the argument of the client host address is the specific network address of the client host in the network traffic generated by that client host.
Besides the basic predicates, the extended predicates in this embodiment are shown in Table 2. The argument of an extended predicate is the part in parentheses after the predicate, and its specific content depends on the specific network traffic. The extended predicates include at least: the inter-packet interval time sequence of the network traffic, the packet length sequence of the network traffic, a feature string of the network traffic, the geographic location of an address in the network traffic, the entropy of the network traffic in a predetermined direction, and the reset time of the network traffic. In this embodiment, the predetermined direction of the network traffic is either from the client to the server or from the server to the client, and the reset time of the network traffic is the time at which the network traffic is reset. The arguments of an extended predicate represent the communication parameters of the network traffic that the extended predicate describes.
TABLE 2
Further, to give the semantic representation of network traffic a clear structure, the event semantics method adopted in this embodiment is the neo-Davidsonian event semantics method. This method corresponds to a predicate-argument structure and can clearly represent the semantics of network traffic.
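To make the predicate-argument structure concrete, the short Python sketch below renders a formula in the usual textbook neo-Davidsonian shape: an existentially quantified event variable, one predicate for the action, and one conjunct per participant. The predicate names follow Table 1, but attaching the event variable as the first argument of each predicate, and the helper name neo_davidsonian, are assumptions of this sketch and not necessarily the notation used in this embodiment.

```python
def neo_davidsonian(action, roles, event_var="e"):
    """Render an event formula as a string.

    action: name of the predicate describing the communication action.
    roles: list of (predicate_name, argument) pairs tied to the same event.
    """
    conjuncts = [f"{action}({event_var})"]
    conjuncts += [f"{name}({event_var}, {value})" for name, value in roles]
    return f"∃{event_var}[" + " ∧ ".join(conjuncts) + "]"


# Hypothetical event: a client host sip establishes a connection over tcp.
formula = neo_davidsonian("CONNECT", [("HOSTclient", "sip"), ("PROTO", "tcp")])
print(formula)  # ∃e[CONNECT(e) ∧ HOSTclient(e, sip) ∧ PROTO(e, tcp)]
```

Keeping each communication parameter in its own conjunct means that further predicates, basic or extended, can be added to an event without changing the arity of the existing ones.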
For example, if the network traffic corresponds to the client sip establishing a connection over the TCP protocol, then under the semantic representation method of this embodiment it may be represented as:
S304: defining the semantic relationships between network traffic according to the semantics of the network traffic.
After the semantics of a series of network traffic are expressed, the relationship between the network traffic and other network traffic is defined according to the expressed semantics of each network traffic.
In this embodiment, the semantic relationships between network traffic include at least: sequential, concurrent, causal, and conditional relationships. For events e1 and e2 represented by two pieces of network traffic, these relationships are defined in this embodiment as follows:
If e1 occurs before e2 in time order, e1 and e2 are said to satisfy the sequential relationship, recorded as e1→e2;
If e1 and e2 both occur within a certain time window W, e1 and e2 are said to satisfy the concurrency relationship, recorded as e1↑↑e2;
If e2 occurs when e1 occurs, e1 and e2 are said to satisfy the causal relationship, recorded as
If only the occurrence of e1 can lead to the occurrence of e2, e1 and e2 are said to satisfy the conditional relationship, which may be recorded as e1↗e2 in this embodiment.
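The two relationships that depend only on timing can be checked mechanically. The following Python sketch is one possible reading under stated assumptions: each event carries a start and an end time, "occurs before" is taken to mean an earlier start time, and "within a time window W" is taken to mean that the span from the earliest start to the latest end does not exceed W. The class and function names are hypothetical. The causal and conditional relationships need knowledge beyond timestamps and are not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowEvent:
    name: str
    start: float  # start time of the network traffic
    end: float    # end time of the network traffic


def is_sequential(e1, e2):
    """e1 → e2: e1 occurs before e2 (assumed: compared by start time)."""
    return e1.start < e2.start


def is_concurrent(e1, e2, window):
    """e1 ↑↑ e2: both events fall inside one window of length `window`."""
    span = max(e1.end, e2.end) - min(e1.start, e2.start)
    return span <= window


a = FlowEvent("s1", start=0.0, end=1.0)
b = FlowEvent("s2", start=0.5, end=2.0)
```

With these values, a precedes b, and the two events are concurrent for any window of at least 2.0 time units. Under this reading the two relationships are not mutually exclusive.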
For example, suppose the event represented by network traffic s1 is that the client with IP address sip initiates a DNS request to the domain name server with IP address dip, and the event represented by network traffic s2 is that the client with IP address sip receives the DNS response returned by the domain name server with IP address dip. Then network traffic s1 is expressed as:
network traffic s2 is expressed as:
and s1 and s2 satisfy the causal relationship, recorded as
As another example, suppose network traffic s1 corresponds to the client with IP address sip accessing a Web page on the Web server dip, and network traffic s2 corresponds to the client simultaneously downloading a picture from another Web server dip2. The concurrent network traffic is expressed as:
and s1 and s2 satisfy the concurrency relationship, recorded as s1↑↑s2.
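The DNS example above suggests one way candidate causal pairs might be found in practice. The Python sketch below pairs a one-way request record with the matching response record by reversing the endpoints. It is only a heuristic illustration: the record layout, the use of port 53 to recognize DNS, and the 5-second timeout are assumptions of this sketch, and the embodiment does not state how the causal relationship is detected.

```python
from dataclasses import dataclass

DNS_PORT = 53  # assumption: DNS is recognized by its well-known port


@dataclass(frozen=True)
class Flow:
    src: str
    dst: str
    sport: int
    dport: int
    proto: str
    time: float  # occurrence time of the network traffic


def causal_dns_pairs(flows, timeout=5.0):
    """Return (request, response) pairs that are candidates for the causal relation."""
    requests = [f for f in flows if f.dport == DNS_PORT]
    responses = [f for f in flows if f.sport == DNS_PORT]
    pairs = []
    for req in requests:
        for resp in responses:
            reversed_endpoints = (
                resp.src == req.dst
                and resp.dst == req.src
                and resp.dport == req.sport
                and resp.proto == req.proto
            )
            # The response must follow the request within the timeout.
            if reversed_endpoints and 0 <= resp.time - req.time <= timeout:
                pairs.append((req, resp))
                break  # take the first matching response only
    return pairs


s1 = Flow("sip", "dip", 40000, 53, "udp", 1.0)
s2 = Flow("dip", "sip", 53, 40000, "udp", 1.2)
other = Flow("sip", "dip2", 40001, 80, "tcp", 1.1)
pairs = causal_dns_pairs([s1, other, s2])
```

The unrelated Web flow is ignored, and only the request s1 and its response s2 are paired.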
S306: generating a set of network traffic according to a predetermined characteristic, based on the semantic relationships between the network traffic.
The set of network traffic includes a series of network traffic having semantic relationships. The predetermined characteristic in this embodiment may be time or space; that is, a set may be generated from the network traffic within a certain time range, or from the network traffic within the geographic region corresponding to an IP address in the network traffic.
S308: determining the operating condition of the communication subject corresponding to the network traffic according to the set of network traffic.
The network traffic set generated in the above step may be a set formed by different network traffic generated by one communication subject under different conditions, so that the operation condition of the communication subject corresponding to the network traffic may be determined according to the generated network traffic set.
For example, given a network traffic set S, there is a set
where the set T represents the network traffic generated by the server host within a certain time range.
In addition, given a network traffic set S, there is a set
where the set P represents the network traffic generated by the server host within the geographic range corresponding to its IP address.
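The two kinds of sets described above can be sketched as simple filters over S. In the Python sketch below, the record fields, the function names, and the address-to-region lookup table are hypothetical; a real deployment would presumably obtain regions from a geolocation source, which this embodiment does not name. The addresses come from documentation-only ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    server: str   # server host address
    start: float  # start time of the network traffic
    end: float    # end time of the network traffic


# Hypothetical lookup from server address to geographic region.
REGION_BY_SERVER = {
    "198.51.100.7": "region-A",
    "203.0.113.9": "region-B",
}


def flows_in_time_range(flows, server, t_begin, t_end):
    """Time-based set: traffic of one server host inside [t_begin, t_end]."""
    return {
        f for f in flows
        if f.server == server and t_begin <= f.start and f.end <= t_end
    }


def flows_in_region(flows, region, region_of=REGION_BY_SERVER):
    """Space-based set: traffic whose server address maps to `region`."""
    return {f for f in flows if region_of.get(f.server) == region}


S = [
    FlowRecord("198.51.100.7", 0.0, 4.0),
    FlowRecord("198.51.100.7", 10.0, 12.0),
    FlowRecord("203.0.113.9", 1.0, 2.0),
]
T = flows_in_time_range(S, "198.51.100.7", 0.0, 5.0)
P = flows_in_region(S, "region-A")
```

Both results are ordinary sets of records, so they can be intersected or united when a set should be bounded by time and region at once.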
The method for representing network traffic provided by this embodiment represents the semantics of network traffic with preset predicates and their arguments according to a predetermined event semantics method, defines the relationships between the network traffic and other network traffic according to those semantics, generates a set of network traffic according to a predetermined characteristic based on the relationships, and finally determines the operating condition of the communication subject corresponding to the network traffic according to the set. By defining predicates and arguments for network traffic, representing the traffic semantically with a predetermined event semantics method, and forming sets of network traffic from its semantics and semantic relationships to represent the operating condition of a communication subject, the method represents network traffic accurately and in a simple form, and thereby solves the following problem in the prior art: existing methods represent network traffic at an inappropriate granularity.
Although the preferred embodiments of the present invention have been disclosed for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, and the scope of the invention should not be limited to the embodiments described above.