Disclosure of Invention
Aiming at the defects and improvement demands of the prior art, the invention provides an anomaly detection method and an anomaly detection system based on a dynamic traceability graph, and aims to improve the detection precision and the real-time performance of anomaly detection.
In order to achieve the above object, according to one aspect of the present invention, there is provided an anomaly detection method based on a dynamic tracing graph, including:
Collecting system tracing information, converting the system tracing information into a tracing image, respectively extracting node characteristics and characteristics of edges connected with the nodes for each node in the tracing image, and splicing the node characteristics and the characteristics of edges connected with the node to serve as state characteristic vectors of corresponding nodes;
continuously monitoring system tracing information and dynamically updating a tracing graph, and executing an abnormality detection step when a new edge e t is generated in the tracing graph, wherein the abnormality detection step comprises the following steps:
Step S1, inputting a side e t, a state vector S t- of a graph structure surrounding the side e t at a time t - and a generation time t of the side e t into a graph attention network to obtain an embedded vector z of the side e t, and updating state feature vectors of a source node v src and a destination node v dst of the side e t;
the time t - represents the time before the generation time of the edge e t, and s t- is obtained by aggregating state feature vectors of a source node and a neighbor node thereof, a destination node and a neighbor node thereof of the edge e t at the time t - through graph volume lamination;
Step S2, inputting an embedded vector z of the edge e t into a decoder, predicting the probability that the edge e t belongs to each type, and obtaining a predicted vector P (e t);
the decoder comprises a long-short-term memory network and a plurality of full-connection layers which are connected in sequence;
Step S3, calculating a reconstruction error RE between a predicted vector P (e t) and an actual edge type vector of the edge e t, if RE > Th, judging that an abnormality exists, otherwise, judging that the abnormality does not exist;
Wherein Th is a preset abnormal threshold.
Further, the trace-source graph is stored in a distributed super-table of the time sequence database.
Further, in step S1, updating the state feature vectors of the source node v src and the destination node v dst of the edge e t includes:
Inputting a state vector s t-(vsrc) of the graph structure of the edge e t and the source node v src surrounding the edge e t at the time t - into the gating neural network, and updating the state feature vector of the source node v src of the edge e t;
The state vector s t-(vdst of the graph structure of the destination node v dst around the edge e t and the edge e t at the time t -) is input into the gating neural network, and the state feature vector of the destination node v dst of the edge e t is updated.
Further, for the node v in the traceability graph, the extracting method of the node characteristic Φ (v) includes:
Dividing the attribute of the node v into a plurality of substrings;
for each substring, mapping each character in the substring to a corresponding dimension of the substring feature by using a first hash function H, and hashing each character to { ±1} by using a second hash function H to serve as a feature value of the corresponding dimension in the substring feature;
the substring characteristics of the substrings are added as node characteristics Φ (v) of the node v.
Further, for the node v in the traceability graph, the extraction method of the feature ψ (v) of the edge connected with the node v comprises the following steps:
Respectively obtaining edge type vectors of all the incoming edges of the node v, and mapping all the incoming edges into integers between 0-N e -1 according to the edge type vectors;
Respectively obtaining the edge type vectors of all the outgoing edges of the node v, and mapping all the outgoing edges into integers between N e~2Ne -1 according to the edge type vectors;
sequentially splicing integers obtained by mapping the input edge and the output edge of the node v to obtain the characteristic psi (v) of the edge connected with the node v;
where N e represents the total number of edge types.
According to still another aspect of the present invention, a computer program product is provided, which includes a computer program, where the computer program is executed by a processor to implement the foregoing method for detecting an anomaly based on a dynamic tracing graph.
According to still another aspect of the present invention, there is provided an electronic device, including a stored computer program, where the computer program, when executed by a processor, controls a device in which a computer readable storage medium is located to execute the above-mentioned anomaly detection method based on a dynamic tracing map provided by the present invention.
According to still another aspect of the present invention, there is provided an anomaly detection system based on a dynamic tracing map, including:
a computer readable storage medium storing a computer program;
and a processor for reading the computer program stored in the computer readable storage medium and executing the anomaly detection method based on the dynamic tracing graph.
In general, through the above technical solutions conceived by the present invention, the following beneficial effects can be obtained:
(1) According to the method, after an initial tracing graph is established and initial state feature vectors of all nodes in the tracing graph are generated, system tracing information is continuously monitored, dynamic update is carried out on the tracing graph, anomaly detection is achieved through detection of newly generated edges in the tracing graph, due to the fact that relevant detection only involves the newly generated edges and graph structures surrounding the edges in the tracing graph, calculation amount is greatly reduced, detection instantaneity is effectively improved, when the initial state feature vectors of the nodes are generated, node features of the nodes and the features of the edges connected with the node features are fused, semantic information of the initial state feature vectors of the nodes is richer, subsequent detection accuracy is facilitated to be improved, and state feature vectors of source nodes and destination nodes of the new edges are updated each time, accordingly, the state feature vectors of the nodes can accurately reflect evolution history of the nodes in the whole graph, when the new edges are analyzed, the embedded vectors of the new edges are generated by using a graph attention network, attention weight of each node to neighbors can be dynamically adjusted, accordingly, the important relationship between the nodes in the complex graph structures can be adaptively captured, the time sequence model can be better, the time sequence model can be accurately connected with the time sequence model, the time sequence model can be further improved, the time sequence model can be accurately connected, and the time sequence model can be well improved, and the time sequence model can be well connected with the time-depended on the edges can be well, and the time-depended on the model can be well, and the time-depended on the time-frame can be well accurately detected, and the time-prolonged, and the time can be well and the time.
(2) In the preferred scheme of the invention, the tracing graph is stored by using the time sequence database, and based on the storage principle of the time sequence database, nodes and edges in the tracing graph are stored in the distributed super-table provided by the time sequence database, so that a large amount of tracing graph data are scattered in different blocks (physical tables), the burden of a single table is reduced, and the query efficiency in the subsequent detection process is improved.
(3) In the preferred scheme of the invention, the state feature vectors of the source node and the destination node of the new edge are updated by using the gate-controlled neural network, and the state vectors of the new edge and the graph structure surrounding the node at the previous moment are considered during updating, so that the history change process of the node in the traceability graph can be accurately recorded.
(4) In the preferred scheme of the invention, when the node characteristics of the node are extracted, the attribute of the node is divided into a plurality of substrings, each substring is mapped to a low dimension by utilizing a hash function and then added to serve as the node characteristic vector of the node, so that the hierarchical similarity is maintained while the high-dimension attribute is mapped to a low-dimension space, and different types of entities under the similar meshes can also have reasonable characteristic representation.
(5) In the preferred scheme of the invention, under the condition that each input edge and each output edge of the nodes are mapped into integers according to the types of the edges, the extraction of the characteristics of the edges connected with the nodes is realized, so that the types of the edges connected with the nodes can be fully considered, and the node state characteristic vector obtained by fusing the final node characteristics and the edge characteristics has richer semantic information.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
In the present invention, the terms "first," "second," and the like in the description and in the drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
Before explaining the technical scheme of the invention in detail, the tracing diagram is briefly explained as follows:
The trace information is composed of system calls reflecting user behaviors, objects such as applications, files, processes, sockets and the like involved in each system call form nodes of a trace system, the ID of each node is globally unique, the trace information records comprise node IDs, node attributes and other nodes on which the nodes depend, the trace system can be used for collecting trace information, trace graphs established based on the trace information can be used for representing the user behaviors, the trace graphs are constructed in a mode that the nodes in the trace system are correspondingly used as the nodes in the trace graphs, the attributes of the nodes in the trace system are correspondingly used as the attributes of the nodes in the trace graphs, if a dependency relationship exists between two nodes in the trace system, a directed edge exists between the corresponding two nodes in the trace graph, and the directed edge points to the depended nodes.
The data volume of the tracing information is huge, and in order to ensure the efficiency of behavior detection, the embodiment firstly extracts useful information including nodes, node attributes and dependency relations among the nodes from massive tracing information before constructing the tracing graph;
FIG. 1 shows an example of constructing a traceability graph according to traceability information, wherein the traceability information is on the left side, and the corresponding traceability graph is on the right side;
The trace information can describe node attributes, such as '2.0 NAME/usr/local/sbin/vsftpd', NAME attribute exists in 2.0 and the attribute value of NAME attribute is 'usr/local/sbin/vsftpd', and can describe dependency relationship among nodes, such as '2.0 FORKPARENT [ ANC ] 3.0', and node 2.0 depends on node 3.0;
In the constructed tracing graph, nodes in the tracing system, namely node 2.0, node 3.0, node 2.1 and the like, are correspondingly used as nodes in the tracing graph, the attribute of the node in the tracing system is correspondingly used as the attribute of the node in the tracing graph, in the embodiment, only the attribute number of each node is stored in the tracing graph, but specific attribute values are not stored, in the tracing system, the dependency relationship among the nodes is correspondingly used as a directed edge in the tracing graph, for example, in the tracing system, node 2.0 depends on node 3.0, and then the directed edge pointing from node 2.0 to node 3.0 exists in the tracing graph.
The edges in the traceability graph also reflect specific events, for normal events such as read operations, write operations and the like, the corresponding edge types in the traceability graph are known, and for unknown events which may have anomalies such as attack behaviors and the like, the types in the traceability graph can be determined through further predictive analysis. Based on the method, in order to solve the technical problems that the existing tracing-based anomaly detection method needs to be further improved in terms of semantic understanding, detection precision, instantaneity and the like, the invention provides the dynamic tracing-based anomaly detection method and system. Based on the thought, in order to improve the prediction precision of the new edge type, the invention provides a new method for generating the initial state feature vector of each node in the traceability graph so as to enable the initial state feature vector to have richer semantic information, and provides a new encoder-decoder structure for predicting the new edge type.
The following are examples.
Example 1:
an anomaly detection method based on a dynamic traceability graph, as shown in fig. 2, comprises the following steps:
Collecting system tracing information, converting the system tracing information into a tracing image, respectively extracting node characteristics and characteristics of edges connected with the nodes for each node in the tracing image, and splicing the node characteristics and the characteristics of edges connected with the node to serve as state characteristic vectors of corresponding nodes;
continuously monitoring system tracing information and dynamically updating a tracing graph, and executing an abnormality detection step when a new edge e t is generated in the tracing graph, wherein the abnormality detection step comprises the following steps:
Step S1, inputting a side e t, a state vector S t- of a graph structure surrounding the side e t at a time t - and a generation time t of the side e t into a graph attention network to obtain an embedded vector z of the side e t, and updating state feature vectors of a source node v src and a destination node v dst of the side e t;
the time t - represents the time immediately before the generation time of the edge e t, and the state vector of the graph structure at the time t - is represented by the state feature vectors of all nodes in the graph structure at the time t -, and the state feature vectors of the nodes together describe the global state of the graph at the time; specifically, in this embodiment, s t- is obtained by aggregating state feature vectors of a source node and a neighboring node thereof, a destination node and a neighboring node thereof of the edge e t at a time t - through graph roll stacking;
Step S2, inputting an embedded vector z of the edge e t into a decoder, predicting the probability that the edge e t belongs to each type, and obtaining a predicted vector P (e t);
the decoder comprises a long-short-term memory network and a plurality of full-connection layers which are connected in sequence;
Step S3, calculating a reconstruction error RE between a predicted vector P (e t) and an actual edge type vector of the edge e t, if RE > Th, judging that an abnormality exists, otherwise, judging that the abnormality does not exist;
Wherein Th is a preset abnormal threshold.
According to the embodiment, an initial tracing image is firstly established, the initial state feature vector of each node in the tracing image is generated, then the tracing information of the system is continuously monitored, the tracing image is dynamically updated, and the abnormity detection is realized through the detection of the newly generated edges in the tracing image, so that the real-time performance is effectively improved.
After the tracing graph is built, the tracing graph is stored in the distributed super table of the timing database timescaledb, specifically, for nodes such as network nodes, body nodes, file nodes, etc., the corresponding node information is stored by using the structure nodeid shown in (a) in fig. 3, where node_type represents the type (e.g. netflow, subject, file) of the node, msg stores information (e.g. socket, file path, etc.) of the node, index_id is a unique index of all nodes, and for event information, the corresponding side information is stored by using the structure shown in (b) in fig. 3, where src_node is hash_id of the node in hash ID corresponding to nodeid of the source node, and src_index_id is index_id of the node in nodeid. Similarly, dst_node is the hash_id of the destination node, dst_node_index is the index_id of the destination node, operation is the operation type between nodes, namely the type of edge (event), and timestamp of the timestamp_rec edge.
When the distributed super-table of the time sequence database timescaledb stores time sequence data, a large amount of data is dispersed in different blocks (physical tables), so that the burden of a single table is reduced. As shown in fig. 4, each block contains only data within a specific time range, and when a time range which does not exist yet is inserted, a new block is automatically created. In fig. 4, "value" represents data within a specific time range, and in this embodiment, represents various information related to the trace data, including node information and event information between nodes, which are stored in a distributed super table of TimescaleDB and organized in a time series form. In this embodiment, based on the node information and the event information stored in the distributed super-table of the time sequence database timescaledb, the distributed super-table can determine the physical table to be searched according to the query condition in the query process, and only perform the query in these tables. The efficiency of the query will be higher because each physical table contains a smaller amount of data than a single whole table. The longer the time span of the data, the larger the amount of data, and the more significant this advantage.
When the initial feature vector of the node is generated, the node features of the node and the features of the edges connected with the node are fused, so that the semantic information of the initial state feature vector of the node is richer, and the subsequent detection precision is improved.
As a preferred embodiment, in this embodiment, for a node v in the traceability graph, the extracting manner of the node characteristic Φ (v) includes:
Dividing the attribute of the node v into a plurality of substrings;
for each substring, mapping each character in the substring to a corresponding dimension of the substring feature by using a first hash function H, and hashing each character to { ±1} by using a second hash function H to serve as a feature value of the corresponding dimension in the substring feature;
the substring characteristics of the substrings are added as node characteristics Φ (v) of the node v.
The above-described relational expression for extracting the node feature Φ (v) can be expressed as follows:
Where s j denotes the j-th character in the substring s, φ i(s) denotes the i-th dimension feature value in the substring features of the substring s, and φ(s) k denotes the substring features of the k-th substring.
In the embodiment, the attribute of the node is divided into a plurality of substrings, and each substring is mapped to the low dimension by using a hash function and then added to serve as the node characteristic vector of the node, so that the hierarchical similarity is maintained while the high-dimension attribute is mapped to the low-dimension space, and different types of entities under the similar directory can also have reasonable characteristic representation.
It is easy to understand that in practical application, the partitioning manner of the attributes of the nodes should be determined according to the attribute type and format, so as to ensure that each sub-string obtained by partitioning has a corresponding physical meaning. Taking the attribute socket 192.168.11.6 of node v as an example, it may be divided into 4 substrings, "192", "192.168", "192.168.11" and "192.168.11.6", respectively.
As a preferred embodiment, in this embodiment, for a node v in the traceability graph, the extraction method of the feature ψ (v) of the edge connected to the node v includes:
Respectively obtaining edge type vectors of all the incoming edges of the node v, and mapping all the incoming edges into integers between 0-N e -1 according to the edge type vectors;
Respectively obtaining the edge type vectors of all the outgoing edges of the node v, and mapping all the outgoing edges into integers between N e~2Ne -1 according to the edge type vectors;
sequentially splicing integers obtained by mapping the input edge and the output edge of the node v to obtain the characteristic psi (v) of the edge connected with the node v;
where N e represents the total number of edge types.
The above manner of extracting the features of the edge to which the node is connected can be expressed as follows by the formula:
where In (v) represents the set of In-edges of node v, out (v) represents the set of Out-edges of node v, A type vector representing edge e; Representing a mapping function for Mapping is an integer between 0 and N e -1.
Under the condition that each input edge and each output edge of the nodes are mapped into integers according to the types of the edges, the extraction of the characteristics of the edges connected with the nodes is realized, so that the types of the edges connected with the nodes can be fully considered, and the node state characteristic vector obtained by fusing the final node characteristics and the edge characteristics has richer semantic information.
In this embodiment, when a new edge is generated, the state feature vectors of the source node and the destination node of the new edge are updated each time, so that the state feature vector of the node can accurately reflect the evolution history of the node in the whole graph, and as a preferred implementation manner, the updating of the state feature vector of the node is implemented by using a gate-controlled neural network (GRU), and correspondingly, in step S1 of this embodiment, the updating of the state feature vectors of the source node v src and the destination node v dst of the edge t includes:
The state vector s t-(vsrc) of the graph structure of the edge e t and the source node v src surrounding the edge e t at the time of t - is input into the gating neural network, the state feature vector of the source node v src of the edge e t is updated, and s t(vsrc) represents the updated state feature vector v src, and then the corresponding formula is expressed as:
st(vsrc)=GRU(et,st-(vsrc))
The state vector s t-(vdst) of the graph structure of the edge e t and the destination node v dst surrounding the edge e t at the time of t - is input into the gating neural network, the state feature vector of the destination node v dst of the edge e t is updated, and s t(vdst) represents the updated state feature vector v dst, and then the corresponding formula can be expressed as follows:
st(vdst)=GRU(et,st-(vdst))
according to the method and the device, the change history of the nodes in the graph is accurately recorded through the state feature vectors of the nodes, and the type prediction of the new edge is completed based on the state feature vectors of the nodes, so that the prediction precision of the edge type can be effectively improved.
It will be readily appreciated that when a new node appears in the graph structure, its corresponding state feature vector is initialized to all zeros because there is no history information. When a new edge appears in the graph structure changing the neighborhood of the node, the system will also update the state of the source node and the destination node.
When the anomaly detection is carried out on a new edge, an embedded vector is specifically generated by using a graph attention network, a decoder formed by sequentially connecting a long-period memory network and two full-connection layers predicts the type of the edge according to the embedded vector of the new edge, wherein the graph attention network and the decoder form an encoder-decoder model, the encoder part can dynamically adjust the attention weight of each node to the neighbors of the encoder part by introducing the graph attention network, so that the important relation between different nodes can be adaptively captured in a complex graph structure, the model has stronger expressive force in a scene with multiple node/edge types, the structure and the background information are provided for the decoder to predict the type vector of the new edge, and the decoder part adopts the long-period memory network (LSTM), so that the modeling capability of the model for time dependence is improved, and particularly in the graph structure with continuous time stamps, the time sequence evolution information of the edge can be captured by combining the two full-connection layers, and the model can be better integrated with time and structure information for accurate prediction of the edge type.
The encoding process may be formulated as follows:
z=GAT(et,st-,t)
wherein GAT represents a graph attention network;
the decoding process may be formulated as follows:
wherein LSTM represents a long and short term memory network and Linear represents a fully connected layer.
Finally, the reconstruction error of the predicted vector of the new edge type output by the decoder and the actual edge type vector of the new edge observed based on the traceability information reflects the degree of deviation of the new edge from normal behavior, the reconstruction error of benign edges is lower, and those edges which deviate from normal behavior obviously have higher reconstruction errors. In practice, normal edge type vectors may be created by constructing benign trace-source graphs (i.e., trace-source graphs that do not contain anomalies). Optionally, in this embodiment, the normal edge type vector is an N e -dimensional unique hot vector corresponding to the normal event one-to-one, N e is the total number of types of the normal event, and in the normal edge type vector, the dimension corresponding to the normal event is 1, and the remaining dimensions are 0. The specific value of the threshold Th for determining whether to be abnormal or not can be determined by statistical analysis of known normal events and abnormal events.
In practical application, the encoder-decoder structure and the gating neural network can be trained by utilizing the traceability information simultaneously containing known normal events and abnormal events and taking the reconstruction error between the edge type prediction vector output by the decoder and the actual edge type vector as a target, so that the corresponding model structure has relatively accurate prediction capability.
In order to more fully understand the attack behavior, as a preferred embodiment, the attack footprint may be accurately reconstructed by comprehensively considering a plurality of steps such as a time window queue, suspicious nodes, community discovery, graph simplification, and the like on the basis of the detection result of the step S3.
Example 2:
a computer program product, comprising a computer program, which when executed by a processor, implements the anomaly detection method based on a dynamic traceability map provided in the foregoing embodiment 1.
Example 3:
an electronic device includes a stored computer program, and when the computer program is executed by a processor, controls a device in which a computer readable storage medium is located to execute the anomaly detection method based on the dynamic tracing map provided in the above embodiment 1.
Example 4:
an anomaly detection system based on a dynamic traceability graph, comprising:
a computer readable storage medium storing a computer program;
And a processor for reading a computer program stored in a computer readable storage medium, and executing the anomaly detection method based on the dynamic tracing map provided in the above embodiment 1.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.