[go: up one dir, main page]

CN119853979B - Real-time tracing graph compression method and device for reserving causal association - Google Patents

Real-time tracing graph compression method and device for reserving causal association

Info

Publication number
CN119853979B
CN119853979B CN202411926940.3A CN202411926940A CN119853979B CN 119853979 B CN119853979 B CN 119853979B CN 202411926940 A CN202411926940 A CN 202411926940A CN 119853979 B CN119853979 B CN 119853979B
Authority
CN
China
Prior art keywords
node
information
target
initial
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411926940.3A
Other languages
Chinese (zh)
Other versions
CN119853979A (en
Inventor
董卫宇
张子龙
唐永鹤
王瑞敏
林键
王奕森
江子锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University Of Chinese People's Liberation Army Cyberspace Force
Original Assignee
Information Engineering University Of Chinese People's Liberation Army Cyberspace Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University Of Chinese People's Liberation Army Cyberspace Force filed Critical Information Engineering University Of Chinese People's Liberation Army Cyberspace Force
Priority to CN202411926940.3A priority Critical patent/CN119853979B/en
Publication of CN119853979A publication Critical patent/CN119853979A/en
Application granted granted Critical
Publication of CN119853979B publication Critical patent/CN119853979B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416Event detection, e.g. attack signature detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2463/00Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/146Tracing the source of attacks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本发明涉及网络安全技术领域,提供一种保留因果关联的实时溯源图压缩方法及装置。该方法包括:从日志信息源中获取事件流,将所述事件流解析为边以及边对应的源节点和目的节点,并获取所述源节点和所述目的节点的初始状态信息;基于所述边的信息流对所述源节点和所述目的节点的初始状态信息进行更新,生成目标状态信息;基于所述目标状态信息和所述初始状态信息判断所述边是否进行压缩,以实现所述溯源图的压缩。本发明基于节点依赖,在保留因果关系的情况下,增强了压缩算法的应用场景与压缩效果,适用于实时溯源图构建与存储场景,可减少图存储压力,有利于后续的威胁检测与攻击调查。

This invention relates to the field of network security technology, providing a method and apparatus for real-time source graph compression that preserves causal relationships. The method includes: obtaining an event stream from a log information source; parsing the event stream into edges and corresponding source and destination nodes, and obtaining initial state information of the source and destination nodes; updating the initial state information of the source and destination nodes based on the edge information stream to generate target state information; and determining whether the edges should be compressed based on the target state information and the initial state information, thereby compressing the source graph. This invention, based on node dependencies, enhances the application scenarios and compression effects of the compression algorithm while preserving causal relationships. It is suitable for real-time source graph construction and storage scenarios, reduces graph storage pressure, and is beneficial for subsequent threat detection and attack investigation.

Description

Real-time tracing graph compression method and device for reserving causal association
Technical Field
The invention relates to the technical field of network security, in particular to a real-time tracing graph compression method and device for reserving causal association.
Background
Complex network attacks represented by APT attacks are difficult to detect and completely analyze by the traditional detection method due to the characteristics of multiple steps, long period, strong concealment and the like. The traceability graph is generally constructed by a system audit log, wherein system entities in the log are represented as nodes, and interaction behaviors among the nodes are represented as edges. Because the association relation between the system entity and the behavior thereof is completely recorded, the traceability graph has great advantages in the aspect of multi-step complex network attack detection, and is often used as a evidence obtaining tool to help security personnel attack investigation and grasp the coming and going pulse of a network attack event. Therefore, in recent years, traceability graphs have been widely studied and applied to threat detection and attack investigation. The tracing graph compression is an indispensable work in the tracing graph work field and is used for relieving storage and calculation pressure caused by massive data. However, in the compression of the traceability graph, mutual exclusivity exists between the fidelity of the information and the compression effect, and the design of a compression algorithm with good compression effect and high information retention has great application value.
Existing trace-source graph compression methods can be categorized into entity-driven, event-driven, and attack-survey-driven based compression. Entity-driven compression compresses the graph by removing or merging nodes, for example, some isolated entities which do not affect analysis results can be safely deleted, and entities with the same functions and attributes can be merged to a certain extent. Event driven compression compresses the graph by removing or merging edges. For example, the event edges are summarized and compressed according to statistical information such as the types, the number and the like of the reserved events. Attack investigation driven compression is graph compression by removing nodes or edges that are not relevant to the attack event. For example, the benign event edges or nodes irrelevant to network attacks are removed according to the assumed attack pattern.
Attack survey-driven compression methods often lose much of the underlying threat behavior information because of the attack assumptions. In the construction of the traceability graph, the number of edges is much larger than the number of nodes, so that the edge compression has larger compression potential, the story-driven compression algorithm has better compression effect, and the method has more remarkable effects of relieving the dependence explosion and reducing the cost. However, existing mainstream event-driven compression limits the effectiveness of the compression algorithm due to strict reachability preservation, and is not applicable to real-time traceability graph construction scenarios.
In summary, the existing compression algorithm has the problems that the compression effect is not enough to highlight and lose the tracing image information, and is not beneficial to subsequent attack tracing analysis.
Disclosure of Invention
In order to at least solve the problems that the existing trace-source diagram compression method has no obvious compression effect and loses trace-source diagram information, the invention provides a real-time trace-source diagram compression method and device for retaining causal relation.
In a first aspect, the present invention provides a real-time tracing graph compression method for preserving causal association, including:
Acquiring an event stream from a log information source, analyzing the event stream into edges, source nodes and destination nodes corresponding to the edges, and acquiring initial state information of the source nodes and the destination nodes;
Updating the initial state information of the source node and the destination node based on the information flow of the side to generate target state information;
And judging whether the edges are compressed or not based on the target state information and the initial state information so as to realize the compression of the traceability graph.
Further, the initial state information comprises initial node total-dependence information and initial writing node information;
The initial node total dependence information records node dependence by utilizing a set, and the node writing-in node information is writing-in node information with unchanged node dependence;
correspondingly, the target state information comprises target node total-dependence information and target writing node information.
Further, updating initial state information of the source node and the destination node based on the information flow of the edge, generating target state information, including:
the target writing node information of the source node is equal to the initial writing node information of the source node plus the target node;
the target node total-dependence information of the source node is equal to the initial node total-dependence information of the source node;
the target node total-dependence information of the target node is equal to the initial node total-dependence information of the source node, the union of the initial node total-dependence of the source node and the initial node total-dependence of the target node minus the set of the target node, and the target node total-dependence information of the target node is calculated through a formula (1):
wherein, the Initial node full-dependency information representing the source node, src represents the source node,Initial node full-dependency information representing the destination node, dst representing the destination node,Target node total-dependence information representing a target node;
The target writing node information of the destination node is null when the target node total dependency information of the destination node is different from the initial node total dependency information of the destination node, and the null is equal to the initial writing node information of the destination node, and the updating of the target writing node information of the destination node is judged through a formula (2):
Further, or the initial state information includes initial node partial dependency information and initial writing node information;
The initial node part dependence information is queue information with the length L, and the initial writing node information is writing node information with unchanged node dependence;
correspondingly, the target state information comprises target node part dependent information and target writing node information.
Further, the initial node part dependency information and the target node part dependency information further include node version information;
correspondingly, updating the initial state information of the source node and the destination node based on the information flow of the edge to generate target state information, including:
the target writing node information of the source node is equal to the initial writing node information of the source node plus the target node;
The target node part dependency information of the source node is equal to the initial node part dependency information of the source node;
the target node part dependency information of the target node is updated as follows:
Firstly, using an application function to sequentially add the initial node part dependent information of the destination node, the initial node part dependent information of the source node and the source node to the tail part of a list;
then deleting repeated information close to the head of the list by using a marking function, and reserving the tail of the list with the dependency length L as the target node part dependency information of the target node;
the target node part dependency information of the target node is updated by the formula (3):
wherein, the The target node representing the destination node depends partly on the information,The initial node representing the destination node depends partly on the information,The initial node part dependency information of the source node is represented, src represents the source node, application represents an adding function, and prune represents a reducing function;
the target writing node information of the destination node is null when the target node part dependency information of the destination node is different from the initial node part dependency information of the destination node, and the same is equal to the initial writing node information of the destination node, and the updating of the target writing node information of the destination node is judged through a formula (4):
Wherein dp new represents the node part dependency information newly added by the destination node, if dp new is not null, the destination writing node information of the destination node is null, and if dp new is null, the destination writing node information of the destination node is equal to the initial writing node information of the destination node.
Further, determining whether the edge is compressed based on the target state information and the initial state information to implement the compression of the tracing graph includes:
judging whether the initial state information and the target state information of the destination node and the source node are the same, and if so, performing edge compression on the edges.
Further, determining whether the edge is compressed based on the target state information and the initial state information to implement the compression of the tracing graph includes:
Judging whether the initial state information and the target state information of the destination node and the source node are the same, and if so, performing edge compression on the edges;
And if the version of the destination node is different, updating the version of the destination node.
Further, updating the initial state information of the source node and the destination node based on the information flow of the edge to generate target state information, and further comprising:
When dp new is not null and node X e dp new, if the node part dependency information of node X includes destination node N V and depends on dp new, node X depends on version N v+1 updated by N V, and the dependent version update of node X is determined by formula (5):
Where NV represents the destination node and, Node part dependency information representing node X.
In a second aspect, the present invention provides a real-time tracing graph compression apparatus for preserving causal association, including:
The node state recording module is used for acquiring an event stream from a log information source, analyzing the event stream into edges, source nodes and destination nodes corresponding to the edges, and acquiring initial state information of the source nodes and the destination nodes;
The node state updating module is used for updating the initial state information of the source node and the destination node based on the information flow of the side to generate target state information;
and the edge compression module is used for judging whether the edge is compressed or not based on the target state information and the initial state information so as to realize the compression of the traceability graph.
In a third aspect, an electronic device comprises a memory, a processor and a computer program stored on the memory and running on the processor, the processor implementing the method as described above when executing the program.
The invention has the beneficial effects that:
The invention is based on node dependence, and enhances the application scene and compression effect of the compression algorithm under the condition of keeping causal relationship. The method is suitable for constructing and storing the real-time traceability map, can reduce the map storage pressure, and is beneficial to subsequent threat detection and attack investigation.
(1) In terms of compression effect, when the node a depends on unchanged condition, the repeated writing to the node B is not rare, such as a condition of continuously writing information into a file, and circularly reading and writing by a process and a network node. And in the traceability graph, the number of edges is often several times or even tens of times that of the nodes. In the previous research, the node has an input edge, namely the node dependence is determined to be changed, the condition that the node has information flow input and is dependent is ignored, and the scene of graph compression application is limited. According to the invention, the dependency of the node is recorded, the edge compression range keeping the causality unchanged is enlarged, and the capability of graph compression is enhanced.
(2) From the point of view of causality, if the results of the forward analysis and the backward analysis are unchanged, the causality among the nodes is kept, and the compressed traceability graph can still be used for an attack investigation scene.
(3) The invention aims at two traceability graphs to construct scenes, and divides node dependence into a node total dependence record and a node partial dependence record. For constructing a scene according to the graph without considering the overhead, a node total dependency recording method can be adopted, and node dependencies are recorded by a set. For the real-time tracing graph compression scene, a recording method of node partial dependence can be adopted, and the latest node dependence is recorded by using a list, because the scene of two or three node cyclic interaction in system activity is more common, and higher compression effect can be realized with lower expenditure.
Drawings
Fig. 1 is a schematic flow chart of a real-time tracing graph compression method for preserving causality according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a framework of a real-time trace-source diagram compression method for preserving causality according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a node-dependent state change according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a version-less node-part dependent transfer according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a versioned node partial dependency transfer provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of updating a node version in node part dependency information of a non-destination node according to an embodiment of the present invention;
FIG. 7 is a second schematic diagram of updating a node version in node part dependency information of a non-destination node according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a compression method implementation flow provided in an embodiment of the present invention;
FIG. 9 is a schematic diagram of a cyclic interactive compression scene state change provided by an embodiment of the present invention;
FIG. 10 is a schematic diagram illustrating compression of information flow between multiple nodes according to an embodiment of the present invention;
FIG. 11 is a schematic diagram illustrating the compression of the interactive information flow between two nodes according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of multi-node interactive information stream compression provided by an embodiment of the present invention;
Fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The key concept to be declared in the invention is as follows:
Events are log records of inter-system entity behavior with event stamps extracted from sources such as system logs or application logs. Each event contains time, a subject (e.g., a process entity), an object (e.g., a file entity, a socket entity, etc.), and interactions (e.g., reads, writes, etc.) between the entities.
Information flow-information flow is the flow of information between entities. The information stream includes a control stream and a data stream. If the process A for generates the sub-process B, a control flow exists between the process A and the process B, and the information transfer direction is A to B. If the process C writes node information into the file D, a data stream exists between C and D, and the information transmission direction is C to D.
The node is a representation form of an entity analyzed from the event in the traceability graph. For each edge, the parsed nodes may be divided into source and destination nodes. Wherein the information flow of the source node is directed to the destination node. The node is denoted by the letter N, the source node is denoted by Src, and the destination node is denoted by Dst.
Edges are the representation form of information flow transmission among nodes in the traceability graph. Edges in the tracing graph are directed edges, and the direction of the edges is the direction of information flow. For each event, the parsed edge has attributes such as time, interaction behavior and the like. Edges are denoted by letter E.
The traceability map is extracted from a log source and is a directed graph representation of information flow interaction between entities. The trace-source graph G is composed of nodes and edges of event resolution, and can be generally represented as g= (N, E).
Node dependency-node dependency refers to the source of the information flow of the node. If there are one or more time-increasing edges between node A and node B such that A's information flow passes to B, B depends on A and node A is the information source of node B. A node's dependencies are all sources of information for the node.
Causality that there is causality between two nodes with dependency.
And (3) forward analysis, namely, giving a detection node A, and finding all nodes which depend on the detection node A in the traceability graph according to the causality.
Backward analysis given one detection node a, the backward analysis tracks all nodes affecting detection point a (i.e., node a's dependencies) based on causality.
Attack investigation in the field of traceability graphs, attack investigation generally refers to network attack scene reconstruction according to threat points. Forward/backward analysis is often used in attack surveys to discover the source of a threat and the impact of the threat.
The compression basis of the invention is that when the dependence of the node A is unchanged, the repeated writing of the node B is redundant, only the first edge of the node A and the first edge of the node B can be reserved, and the causal relationship between the nodes required by subsequent attack investigation can be correctly reserved.
The method is mainly used for constructing and compressing the traceability graph, the log information source is from a real-time audit log of a host or an offline data set, an algorithm processes log event streams of which the data forms are arranged in time sequence, and the events can be analyzed into directed edges between Src, dst and nodes. As shown in fig. 1, the real-time tracing graph compression method for retaining causal association provided by the embodiment of the invention includes:
S1, acquiring an event stream from a log information source, analyzing the event stream into edges, source nodes and destination nodes corresponding to the edges, and acquiring initial state information of the source nodes and the destination nodes.
S2, updating initial state information of a source node and a destination node based on information flow of edges to generate target state information;
And S3, judging whether the edges are compressed or not based on the target state information and the initial state information so as to realize the compression of the traceability graph.
Specifically, as shown in fig. 2, the real-time traceability graph compression method with causal association is used in a framework that firstly, event streams are obtained from log sources (log information sources), and each event is resolved into nodes and edges of the traceability graph. And then, acquiring the source node state information and the destination node state information corresponding to the edges, and simultaneously transmitting the state information of the update nodes according to the information flow. If the source node has written to the destination node with the dependency unchanged, this indicates that the edge can be compressed. And finally, constructing a tracing graph by utilizing the nodes generated by the event and the edges reserved by the algorithm.
The method provided by the embodiment of the invention is based on node dependence, and enhances the application scene and compression effect of the compression algorithm under the condition of keeping the causal relationship. The method is suitable for constructing and storing the real-time traceability map, can reduce the map storage pressure, and assist subsequent threat detection and attack investigation.
The real-time tracing graph compression method with causal association reserved in the embodiment of the invention further comprises initial node total-dependence information and initial writing node information, wherein the initial node total-dependence information records node dependence by utilizing a set, the node writing node information is writing node information with unchanged node dependence, and the target state information comprises target node total-dependence information and target writing node information.
Correspondingly, updating the initial state information of the source node and the destination node based on the information flow of the edge to generate target state information, wherein the method comprises the following steps:
the target writing node information of the source node is equal to the initial writing node information of the source node plus the target node;
the total dependence information of the target node of the source node is equal to the initial node total dependence information of the source node;
the target node total-dependence information of the target node is equal to the initial node total-dependence information of the source node, the union of the initial node total-dependence of the source node and the initial node total-dependence of the target node minus the set of the target node, and the target node total-dependence information of the target node is calculated through the formula (1):
wherein, the Initial node full-dependency information representing the source node, src represents the source node,Initial node full-dependency information representing the destination node, dst representing the destination node,Target node total-dependence information representing a target node;
The target writing node information of the destination node is null when the target node total dependency information of the destination node is different from the initial node total dependency information of the destination node, if the target node total dependency information is the same as the initial writing node information of the destination node, the updating of the target writing node information of the destination node is judged through a formula (2):
it can be understood that the node total dependency record method and the corresponding state update method provided in this embodiment are applicable to a graph construction scenario in which the edges need to be compressed to the greatest extent, but the overhead is not considered. And the edges in the traceability graph are compressed by utilizing the change of the dependent information, so that the compression effect is enhanced.
The real-time tracing graph compression method with causal relation reserved in the embodiment of the invention further comprises the steps that or the initial state information comprises initial node part dependent information and initial writing node information, wherein the initial node part dependent information is queue information with the length L, the initial writing node information is writing node information with unchanged node dependence, and the target state information comprises target node part dependent information and target writing node information.
Correspondingly, the initial node distribution dependency information and the target node part dependency information also comprise node version information;
updating initial state information of a source node and a destination node based on information flow of edges to generate target state information, wherein the method comprises the following steps:
the target writing node information of the source node is equal to the initial writing node information of the source node plus the target node;
the target node part dependency information of the source node is equal to the initial node part dependency information of the source node;
the target node part dependency information of the destination node is updated as follows:
Firstly, using an application function to sequentially add initial node part dependent information of a destination node, initial node part dependent information of a source node and the source node to the tail part of a list;
Then deleting repeated information close to the head of the list by using a marking function, and reserving the tail of the list with the dependency length L as target node part dependency information of a target node;
The target node part dependency information of the destination node is updated by the formula (3):
wherein, the The target node representing the destination node depends partly on the information,The initial node representing the destination node depends partly on the information,The initial node part dependency information of the source node is represented, src represents the source node, application represents an adding function, and prune represents a reducing function;
the target writing node information of the destination node is null when the target node part dependent information of the destination node is different from the initial node part dependent information of the destination node, if the target node part dependent information of the destination node is the same as the initial writing node information of the destination node, the updating of the target writing node information of the destination node is judged through a formula (4):
Wherein dp new represents newly added node part dependent information of the destination node, and is used for judging whether the node part dependent information of the destination node changes, if dp new is not null, the target writing node information of the destination node is null, and if dp new is null, the target writing node information of the destination node is equal to the initial writing node information of the destination node.
It can be understood that the node part dependent information recording method provided by the embodiment is suitable for compression in the construction of the tracing graph and compression of the real-time tracing graph, because the scene of cyclic interaction of two or three nodes in system activity is more common. The method can be used for recording the occurrence of the real-time event stream of the traceability graph, and a higher compression effect is realized by using lower expenditure.
As the real-time tracing graph compression method with causal relationship reserved in the embodiment of the invention, further, whether the edges are compressed or not is judged based on the target state information and the initial state information so as to realize the compression of the tracing graph, the method comprises the following steps:
And judging whether the initial state information and the target state information of the destination node and the source node are the same, and if so, performing edge compression on the edges.
Further, if the node dependency information is node partial dependency information, the method further includes:
and if the initial state information and the target state information are different, updating the version of the destination node.
As the real-time tracing graph compression method with causal association reserved in the embodiment of the invention, further, the initial state information of the source node and the destination node is updated based on the information flow of the edge, and the method for generating the target state information further comprises the following steps:
When dp new is not null and node X e dp new, if the node part dependency information of node X includes destination node N V and depends on dp new, node X depends on version N v+1 updated by N V, and the dependency version update of node X is determined by formula (5):
Where NV represents the destination node and, Node part dependency information representing node X.
The following further explains four aspects of node state record, node state update, compression determination of edges and compression algorithm applicable scenario.
1. State record of node
The state information of the nodes is necessary node information in the algorithm implementation process. For node N, its state information mainly includes two parts. The first part is node dependency information DP of node N, which records the dependent nodes of N. The second part is writing node information WN of the node N, and records writing node information of which the N is dependent unchanged, namely all destination nodes Dst of which information flows in as the node N of the source node Src under the condition that the node N is dependent unchanged. By recording node dependency information DP of the nodes and written node information WN, whether the node dependency information is changed or not and whether edges can be compressed or not can be judged, so that the function implementation of a compression algorithm is supported.
The dependency information of the nodes is accumulated continuously along with the information flow, so that the memory and CPU overhead of the algorithm is increased continuously along with the increase of the trace-source image volume. All node dependency information DP of the nodes is recorded, and edges can be compressed to the greatest extent on the premise of ensuring that the causal relationship is unchanged, but the method is only suitable for a scene with a short event and a proper graph size. Therefore, the embodiment of the invention provides a record form for keeping the node part dependent information, ensures small storage and calculation cost, and can be used for constructing a compressed scene by a real-time tracing graph.
The dependency information record of the node can be divided into an ALL-dependency information ALL_DP record and a PART-dependency information PART_DP record of the node aiming at the two kinds of traceability graphs to construct scenes. Table 1 shows the record content and effect of node state information, wherein node total dependency information all_dp or node partial dependency information part_dp can be selected together with writing node information WN as node state information. For the graph construction scene requiring the maximum compression of edges and not considering the overhead, a recording method of node total dependency information ALL_DP can be adopted, and node dependencies are recorded by a set. For the real-time tracing graph compression scene, a recording method of node PART dependency information part_dp can be adopted, and the latest node dependency is recorded by a list. The node PART dependency information part_dp may optionally be configured to record the length L of the node dependency information, which is at least 1. If the length L is relied upon, the algorithm may compress the redundant cyclic interactions between l+1 nodes. Generally, L can be configured to be1 or 2, because the scenario of two or three nodes cycling interactions in system activity is more common, and higher compression effects can be achieved with lower overhead.
TABLE 1 node State information Table
Node state information Content Action
WN Node set And judging whether Dst is WN or not, and realizing compression judgment.
ALL_DP (optional) Node set Determination of dependency changes for transfer of dependency states
Part_DP (optional) Node queue (Length L) Determination of dependency changes for transfer of dependency states
When the event analysis discovers a new node, the written node information WN and node dependency information DP of the new node state are set to be empty, then the state information of the updated node is transmitted through the information of the edge, and finally whether the edge is compressible is judged.
2. State update of nodes
After the state information of the node is obtained, the next step is to update the state of the node according to the information flow transmission of the edge so as to realize the compression of the edge. The operation of the state update is also mainly directed to the node dependency information DP of the node and the written node information WN of the node. The node dependency information DP related operations of the nodes mainly relate to the problem of dependency transfer and updating, i.e. how the node dependency information DP of the source node Src is transferred to the destination node Dst and how the node dependency information DP of the destination node is updated for each edge. For the write node information WN of the node, when the source node Src writes to the destination node Dst, the write node information WN of the source node Src should be added with the destination node Dst, and the write node information WN of the destination node Dst needs to be emptied when its dependency changes. Therefore, the status update of the writing node information WN mainly involves a change determination problem of the node dependency information DP. Since the node total dependency information all_dp is different from the dependency representation of the node partial dependency information part_dp, the state update for both cases will be described in detail below.
2.1 State information update in case of node full dependency information ALL_DP
In the node total dependency information all_dp, the node dependency information DP is represented using a set. Each element in the set is a unique identification of a node. For each edge analyzed by the event, the formula (1) is used for transmitting node dependency information among nodes, namely, for the edge with the source node being Src and the destination node being Dst, the initial dependencies of the nodes are respectively set asAnd (3) withAfter the side information is transferred, the destination node Dst depends onAnd performing union operation on the DP of the source node Src and the DP of the Src, performing union operation on the DP of the source node Src and the DP of the Dst, and performing difference operation on the DP of the source node Src and the DP of the source node Src. Then, the formula (2) is used as a judging condition of whether the semantics are changed or not, if the dependency set of the target node is not changed before and after the event occurs, the semantics are considered to be unchanged, otherwise, the semantics are changed, and the target writing node information of the target node is updated according to whether the semantics are changed or not.
For example, as shown in fig. 3, the state change of the node dependency information DP under the condition of the node total dependency information all_dp is shown, where P1 and P2 are process nodes, F is a file node, when t=1, P1 writes into the file F, and when t=2, P2 reads the file F. The initial dependencies of P1, F, P are empty set { R } and set { R } with dependent nodes R, empty set { }, respectively. After time T1, DP of F is calculated as { R, P1} according to equation (1), DP is changed, and the WN state of F is set to null. After time T2, the dependency calculation of P2 is { R, P1, F }, DP changes, and the WN state is set to null.
2.2 State information update in case of node PART dependency information ParT_DP
In the node PART dependency information part_dp, the dependency of the node is represented by a list, the maximum length of the list is L, and at most L dependency nodes are stored. ParT_DP may cause a dependency change but undetected condition due to dependency miss problems because it cannot record all dependencies of nodes. For example, as shown in fig. 4, the P dependency at times t=1 and t=3 changes, but the dependency list appears to be dependent due to the length constraint.
In order to solve the problems, the invention adds the version information of the node as a solution, adds the version of the node whenever the node dependency change is detected, and adds the version information N v as the node identification in the dependency record, so that the node dependency change can show the difference of the versions and can be always transferred to the Dst. By way of example, as shown in FIG. 5, N-V represents node plus version, N represents node, and V represents version. Since the F-0 version is increased to F-1 by the F-dependent change when t=2, the node-dependent change of P can be accurately detected by the version-transfer node-dependent change even though the dependent record length is only 1, so as to accurately update WN.
Thus, in the node PART dependency information in PART DP, the node with version information is identified as an element of the dependency list. For the edge with the source node being Src and the destination node being Dst, the initial dependence of the node is respectively set asAnd (3) withThe node dependency information among nodes is transferred by using a formula (3), namely, firstly, using an application function to sequentially add the DP of Dst, the DP of Src and Src to the tail part of a list (the tail part represents the latest node dependency), then using a pulling function to delete the repeated nodes close to the head part of the list, and reserving the tail part of the list with the dependency length L as the destination node dependency after the state updateThe new node dependency information dp new after the Dst state update is obtained by using the formula (4) to do difference set operation, dp new is used as a judging condition for judging whether the node dependency is changed or not, if dp new is not null, the node dependency is changed, otherwise, the node dependency is not changed.
In part_dp, when node N v changes in dependence and the version increase becomes N v+1, there is a set DP new that causes a dependency change, if node X e DP new that causes N v changes in dependence depends on node N v and also on DP new, node X should also depend on N v+1, in order to prevent this dependency from being lost, we use equation (5) to make a decision and make a dependency version update of X, and this dependency reservation can eliminate redundant edges of the cyclic scene.
For example, as shown in FIG. 6, an example scenario of dependency version update (N-V represents node plus version) is illustrated in which process P performs send and recv information interaction with socket S, and when T=2, since S-1 causing the P dependency change is also dependent on S-1 (the node is considered to be always dependent on itself) and P-0 at the same time, the dependency P-0 of updating S-1 is P-1, and the subsequent process P writes S without causing S dependency change.
For example, as shown in FIG. 7, an example scenario of dependency version update is shown, when the dependency information of node B changes after an event occurs, and when dp new = { C, D, A }, then the node in each newly added dependency information dp new is judged, the node in dp new is subjected to dependency version update according to formula (5), in this example, node C depends on dp new and node B-0, and meets the conditions in formula (5), so that update B-0 is B-1, and node A and node C do not meet formula (5), and therefore, update of the node part dependency information thereof is not required.
3. Edge compression determination
And the compression judgment of the edge can be carried out through the state information record and the state information update of the node, so that a compression algorithm is realized. The compression judgment process of the edge is as follows, for the edge with the source node being Src and the destination node being Dst, if the WN of Src contains Dst, the fact that the Src node is written with Dst under the condition of unchanged dependence is indicated, and therefore the edge can be compressed without influencing causality. Since Src has not changed and Dst has been written, the information flow on this side has already been parsed, so the state update is not needed after the side compression. If the compression determination is not passed, the information flow of the edge needs to be processed, and the node state information of Src and Dst is updated.
The overall flow of the algorithm is shown in fig. 8, where for each edge Ei, the source node is Si, the destination node is Di, and the initialization information of each node includes WN and DP. Where WN is the set of nodes written with node dependency unchanged, DP records node dependency information (reference is made to the above for dependency related implementations in steps).
To facilitate the understanding of the algorithm process, fig. 9 illustrates a node state change process in a cyclic compression scenario, where a process P (node P) and a network socket S (node S) interact with each other, and node initial states DP and WN are both null. When t=1, P sends information to S, P 'S WN adds new S node, S' S DP adds new P-0, and S version changes from 0 to 1 due to the dependent transmission change, and the set of dependent unchanged write nodes WN is set to be empty. Since P does not contain S for its WN before writing, this edge may not be compressed. When t=2, P receives S information, and the dependency of P becomes S-1, the version becomes 1 from 0, and WN is set to be null. S node WN adds node P, and since node S causing P change depends on P-0 before the dependency change and S-1 causing the dependency change, the dependency version is updated to change the dependency from P-0 to P-1. When t=3, P sends information to S, WN adds a new node S, and since S dependency has not changed, there is no other change. When t=4, since WN of S contains P, this side can be compressed when P reads information from S. If the dependency of the two subsequent nodes is unchanged, the subsequent mutually read-write edges can be compressed.
4. Compression algorithm applicable scene
The compression method finally realized is applicable to several compression scenes.
4.1 Read-write scene compression between multiple nodes
In normal system activity, a scenario may occur in which a process continues to read information from a file and write to another file. For example, as shown in fig. 10, the directional edge in the figure represents the flow direction of information, and the edge T represents an event occurring at an event, where the process a writes node information to the file B when t=1, 2,4, and 6, the process C reads information from the file B when t=3, 5, and 7, and the process a receives new information input at t=5 to cause semantic change. The three sides t=1, 2, and 4 before the semantic change of t=5 in the process a only need to keep the sides corresponding to t=1, and the two sides t=3 and 5 before the semantic change of t=6 in the process B only need to keep the sides corresponding to t=1. Note that, performing a is equivalent to a node in the above method.
4.2 Interactive scene compression between multiple nodes
Read-write interaction between processes and network nodes is also a common scenario. The compression method is suitable for mutual reading and writing among a plurality of nodes, and if the number of the node dependence reservations is L, the cyclic reading and writing redundant edges among L+1 nodes can be solved. As shown in fig. 11, the network socket S performs information interaction with the process a, and the node dependency reservation length L is set to 1. When t=3, the web socket S has the same semantic information as the process a, and writes to the other node under the condition of unchanged dependency, so after t=3, since node dependency does not change, the read-write operation of both nodes can be compressed. For interaction among multiple nodes, read-write interaction of a parent-child process through a pipeline is also a common scene. For example, as shown in fig. 12, the process a and the process B perform information interaction through the pipe P, when t=6, the node dependence remains unchanged, and the event edges of the subsequent transmission of information from P to B by a and the transmission of information from P to a by B can be compressed. It should be noted that, the process a, the process B, and the pipe P are equivalent to the nodes in the above method.
In summary, the embodiment of the invention compresses the event edges under the condition that the dependency of the source node is unchanged based on the definition of the node dependency aiming at the construction and compression scene of the traceability graph. The method is suitable for compressing various trace-source graph structures, can be used for constructing real-time graphs, and improves the effect of trace-source graph compression while maintaining the capability of trace-source graph attack investigation.
From the compression effect, it is verified that the repeated writing to the node B is not uncommon when the node a depends on unchanged conditions, such as a process continuously writing information to a file, and a process and a network node cyclically read and write. And in the traceability graph, the number of edges is often several times or even tens of times that of the nodes. In the previous research, the node has an input edge, namely the node dependence is determined to be changed, the condition that the node has information flow input and is dependent is ignored, and the scene of graph compression application is limited. According to the invention, the dependency of the node is recorded, the edge compression range keeping the causality unchanged is enlarged, and the capability of graph compression is enhanced.
From the point of view of causality, if the results of the forward analysis and the backward analysis are unchanged, the causality between the nodes is maintained, and the compressed traceability graph can still be used for an attack investigation scene. The following demonstrates from forward analysis and postamble analysis, respectively:
For each node A, if an anomaly is found at time T, forward analysis traverses an outgoing edge from node A for a time greater than T. When the node A is found abnormal at the moment T, the node A has new information flow input to cause the node abnormality, namely the dependence of the node changes. According to the method provided by the invention, all the outgoing edges after node dependence change are not compressed, so that all the nodes corresponding to the outgoing edges after abnormality occurs (at the moment T) are reserved, and the forward analysis result is ensured to be fully reserved;
similarly, all incoming edges with the traversing time of the backward analysis slave node A item being smaller than T are deleted, and the source node depends on unchanged redundant edges which are written into the slave node A, so that nodes corresponding to all the incoming edges before abnormality (time T) cannot be confirmed, and the integrity of forward influence analysis is guaranteed.
The embodiment of the invention also provides a real-time tracing graph compression device for reserving causal relationship, which comprises the following steps:
the node state recording module is used for acquiring an event stream from the log information source, analyzing the event stream into edges, source nodes and destination nodes corresponding to the edges, and acquiring initial state information of the source nodes and the destination nodes;
The node state updating module is used for updating the initial state information of the source node and the destination node based on the information flow of the side to generate target state information;
And the edge compression module is used for judging whether the edges are compressed or not based on the target state information and the initial state information so as to realize the compression of the traceability graph.
As shown in fig. 13, an embodiment of the present invention further provides an electronic device, where the electronic device may include a processor 1301, a communication interface (Communications Interface) 1302, a memory 1303 and a communication bus 1304, where the processor 1301, the communication interface 1302, and the memory 1303 complete communication with each other through the communication bus 1304. Processor 1301 may invoke logic instructions in memory 1303 to perform the methods provided by the embodiments described above, including, for example:
The method comprises the steps of obtaining event streams from a log information source, analyzing the event streams into edges, source nodes and destination nodes corresponding to the edges, obtaining initial state information of the source nodes and the destination nodes, updating the initial state information of the source nodes and the destination nodes based on the edge information streams to generate target state information, and judging whether the edges are compressed based on the target state information and the initial state information so as to realize the compression of a tracing graph.
Further, the logic instructions in the memory 1303 described above may be stored in a computer readable storage medium when implemented in the form of software functional units and sold or used as a stand alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method of the embodiments of the present invention. The storage medium includes a U disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, an optical disk, or other various media capable of storing program codes.
It should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention, and not for limiting the same, and although the present invention has been described in detail with reference to the above-mentioned embodiments, it should be understood by those skilled in the art that the technical solution described in the above-mentioned embodiments may be modified or some technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the spirit and scope of the technical solution of the embodiments of the present invention.

Claims (8)

1.一种保留因果关联的实时溯源图压缩方法,其特征在于,包括:1. A real-time source graph compression method that preserves causal relationships, characterized in that it includes: 从日志信息源中获取事件流,将所述事件流解析为边以及边对应的源节点和目的节点,并获取所述源节点和所述目的节点的初始状态信息;其中,所述初始状态信息包括初始节点全依赖信息和初始写入节点信息;其中所述初始节点全依赖信息利用集合记录节点依赖,所述节点写入节点信息为节点依赖不变的写入节点信息;The event stream is obtained from the log information source, and the event stream is parsed into edges and the corresponding source nodes and destination nodes. The initial state information of the source nodes and the destination nodes is obtained. The initial state information includes initial node full dependency information and initial write node information. The initial node full dependency information uses a set to record node dependencies, and the node write node information is write node information with unchanged node dependencies. 基于所述边的信息流对所述源节点和所述目的节点的初始状态信息进行更新,生成目标状态信息;所述目标状态信息包括目标节点全依赖信息和目标写入节点信息;具体包括:The initial state information of the source node and the destination node is updated based on the information flow of the edge to generate target state information; the target state information includes target node full dependency information and target write node information; specifically including: 所述源节点的目标写入节点信息等于所述源节点的初始写入节点信息加上所述目标节点;The target write node information of the source node is equal to the initial write node information of the source node plus the target node; 所述源节点的目标节点全依赖信息等于所述源节点的初始节点全依赖信息;The target node full dependency information of the source node is equal to the initial node full dependency information of the source node; 所述目的节点的目标节点全依赖信息等于所述源节点的初始节点全依赖信息、所述源节点和所述目标节点的初始节点全依赖的并集减去所述目标节点的集合,通过公式(1)计算所述目的节点的目标节点全依赖信息:The target node full dependency information of the destination node is equal to the initial node full dependency information of the source node, the union of the initial node full dependencies of the source node and the target node, minus the set of the target node. The target node full dependency information of the destination node is calculated using formula (1): 其中,表示源节点的初始节点全依赖信息,表示源节点,表示目的节点的初始节点全依赖信息,表示目的节点,表示目的节点的目标节点全依赖信息;in, This represents the initial node's full dependency information for the source node. Indicates the source node, This indicates the initial node's full dependency information for the destination node. Indicates the destination node. This represents the target node's full dependency information; 所述目的节点的目标写入节点信息在所述目的节点的目标节点全依赖信息与所述目的节点的初始节点全依赖信息不同时为空,相同则等于所述目的节点的初始写入节点信息,所述目的节点的目标写入节点信息的更新通过公式(2)进行判断:The target write node information of the destination node is empty if the target node full dependency information of the destination node is not the same as the initial node full dependency information of the destination node, and is equal to the initial write node information of the destination node if they are the same. The update of the target write node information of the destination node is determined by formula (2): ; 基于所述目标状态信息和所述初始状态信息判断所述边是否进行压缩,以实现所述溯源图的压缩。Based on the target state information and the initial state information, it is determined whether the edges should be compressed in order to achieve the compression of the source graph. 2.根据权利要求1所述的一种保留因果关联的实时溯源图压缩方法,其特征在于,或者所述初始状态信息包括初始节点部分依赖信息和初始写入节点信息;2. The real-time source graph compression method that preserves causal relationships according to claim 1, wherein the initial state information includes initial node partial dependency information and initial write node information; 其中,所述初始节点部分依赖信息为长度L的队列信息,所述初始写入节点信息为节点依赖不变的写入节点信息;Wherein, the initial node partial dependency information is queue information of length L, and the initial write node information is write node information with unchanged node dependencies; 对应的,所述目标状态信息包括目标节点部分依赖信息和目标写入节点信息。Correspondingly, the target state information includes target node partial dependency information and target write node information. 3.根据权利要求2所述的一种保留因果关联的实时溯源图压缩方法,其特征在于,所述初始节点部分依赖信息和所述目标节点部分依赖信息还包括节点版本信息;3. The real-time source graph compression method for preserving causal relationships according to claim 2, wherein the initial node partial dependency information and the target node partial dependency information further include node version information; 对应的,基于所述边的信息流对所述源节点和所述目的节点的初始状态信息进行更新,生成目标状态信息,包括:Correspondingly, the initial state information of the source node and the destination node is updated based on the information flow of the edge to generate target state information, including: 所述源节点的目标写入节点信息等于所述源节点的初始写入节点信息加上所述目标节点;The target write node information of the source node is equal to the initial write node information of the source node plus the target node; 所述源节点的目标节点部分依赖信息等于所述源节点的初始节点部分依赖信息;The target node partial dependency information of the source node is equal to the initial node partial dependency information of the source node; 所述目的节点的目标节点部分依赖信息更新如下:The target node's dependency information is updated as follows: 首先使用append函数将所述目的节点的初始节点部分依赖信息、所述源节点的初始节点部分依赖信息和所述源节点依次添加到列表的尾部;First, the append function is used to add the initial node partial dependency information of the destination node, the initial node partial dependency information of the source node, and the source node to the end of the list in sequence; 然后利用pruning函数对靠近列表首部的重复信息进行删除,保留依赖长度为L的的列表尾部作为所述目的节点的目标节点部分依赖信息;Then, the pruning function is used to delete duplicate information near the beginning of the list, and the tail of the list with a dependency length of L is retained as the target node partial dependency information of the target node. 所述目的节点的目标节点部分依赖信息通过公式(3)更新:The target node dependency information of the target node is updated using formula (3): 其中,表示目的节点的目标节点部分依赖信息,表示目的节点的初始节点部分依赖信息,表示源节点的初始节点部分依赖信息,表示源节点,表示添加函数,表示消减函数;in, This indicates the target node's partial dependency information. This indicates the initial node's dependency information for the destination node. This represents the initial node dependency information of the source node. Indicates the source node, This indicates the addition of a function. Represents the reduction function; 所述目的节点的目标写入节点信息在所述目的节点的目标节点部分依赖信息与所述目的节点的初始节点部分依赖信息不同时为空,相同则等于所述目的节点的初始写入节点信息,所述目的节点的目标写入节点信息的更新通过公式(4)进行判断:The target write node information of the destination node is empty when the target node partial dependency information of the destination node is different from the initial node partial dependency information of the destination node; if they are the same, it is equal to the initial write node information of the destination node. The update of the target write node information of the destination node is determined by formula (4): 其中,表示目的节点新增的节点部分依赖信息,若不为空则目的节点的目标写入节点信息为空,若为空则目的节点的目标写入节点信息等于目的节点的初始写入节点信息。in, This indicates the dependency information of the newly added node in the destination node. If not empty, the target node information written to the destination node is empty. If empty, the target write node information of the destination node is equal to the initial write node information of the destination node. 4.根据权利要求1所述的一种保留因果关联的实时溯源图压缩方法,其特征在于,基于所述目标状态信息和所述初始状态信息判断所述边是否进行压缩,以实现所述溯源图的压缩,包括:4. A real-time source graph compression method that preserves causal relationships according to claim 1, characterized in that, determining whether the edges should be compressed based on the target state information and the initial state information to achieve the compression of the source graph includes: 判断所述目的节点与所述源节点的所述初始状态信息和所述目标状态信息是否相同,若相同则将所述边进行边压缩。Determine whether the initial state information and target state information of the destination node are the same as those of the source node. If they are the same, compress the edges. 5.根据权利要求3所述的一种保留因果关联的实时溯源图压缩方法,其特征在于,基于所述目标状态信息和所述初始状态信息判断所述边是否进行压缩,以实现所述溯源图的压缩,包括:5. A real-time source graph compression method that preserves causal relationships according to claim 3, characterized in that, determining whether the edges should be compressed based on the target state information and the initial state information to achieve the compression of the source graph includes: 判断所述目的节点与所述源节点的所述初始状态信息和所述目标状态信息是否相同,若相同则将所述边进行边压缩;Determine whether the initial state information and target state information of the destination node are the same as those of the source node. If they are the same, compress the edges. 若不同则进行目的节点版本更新。If they are different, then update the target node version. 6.根据权利要求5所述的一种保留因果关联的实时溯源图压缩方法,其特征在于,所述方法,还包括:6. The real-time source graph compression method for preserving causal relationships according to claim 5, characterized in that the method further includes: 当所述不为空,且节点时,若节点的节点部分依赖信息包含目的节点,且依赖于,则节点依赖于更新后的版本,通过公式(5)判断节点的依赖版本更新:When the Not empty, and node When, if node The node dependency information includes the destination node. And dependent on Then the node Depends on Updated version Nodes are determined using formula (5). Dependency version updates: 其中,表示目的节点,表示节点的节点部分依赖信息。in, Indicates the destination node. Represents a node The node part depends on information. 7.一种保留因果关联的实时溯源图压缩装置,其特征在于,包括:7. A real-time source graph compression device that preserves causal relationships, characterized in that it comprises: 节点状态记录模块,用于从日志信息源中获取事件流,将所述事件流解析为边以及边对应的源节点和目的节点,并获取所述源节点和所述目的节点的初始状态信息;其中,所述初始状态信息包括初始节点全依赖信息和初始写入节点信息;其中所述初始节点全依赖信息利用集合记录节点依赖,所述节点写入节点信息为节点依赖不变的写入节点信息;The node state recording module is used to obtain an event stream from the log information source, parse the event stream into edges and the corresponding source and destination nodes, and obtain the initial state information of the source and destination nodes; wherein, the initial state information includes initial node full dependency information and initial write node information; wherein the initial node full dependency information uses a set to record node dependencies, and the node write node information is write node information with unchanged node dependencies; 节点状态更新模块,用于基于所述边的信息流对所述源节点和所述目的节点的初始状态信息进行更新,生成目标状态信息;所述目标状态信息包括目标节点全依赖信息和目标写入节点信息;所述节点状态更新模块具体用于:A node state update module is used to update the initial state information of the source node and the destination node based on the information flow of the edge, generating target state information; the target state information includes target node full dependency information and target writing node information; the node state update module is specifically used for: 所述源节点的目标写入节点信息等于所述源节点的初始写入节点信息加上所述目标节点;The target write node information of the source node is equal to the initial write node information of the source node plus the target node; 所述源节点的目标节点全依赖信息等于所述源节点的初始节点全依赖信息;The target node full dependency information of the source node is equal to the initial node full dependency information of the source node; 所述目的节点的目标节点全依赖信息等于所述源节点的初始节点全依赖信息、所述源节点和所述目标节点的初始节点全依赖的并集减去所述目标节点的集合,通过公式(1)计算所述目的节点的目标节点全依赖信息:The target node full dependency information of the destination node is equal to the initial node full dependency information of the source node, the union of the initial node full dependencies of the source node and the target node, minus the set of the target node. The target node full dependency information of the destination node is calculated using formula (1): 其中,表示源节点的初始节点全依赖信息,表示源节点,表示目的节点的初始节点全依赖信息,表示目的节点,表示目的节点的目标节点全依赖信息;in, This represents the initial node's full dependency information for the source node. Indicates the source node, This indicates the initial node's full dependency information for the destination node. Indicates the destination node. This represents the target node's full dependency information; 所述目的节点的目标写入节点信息在所述目的节点的目标节点全依赖信息与所述目的节点的初始节点全依赖信息不同时为空,相同则等于所述目的节点的初始写入节点信息,所述目的节点的目标写入节点信息的更新通过公式(2)进行判断:The target write node information of the destination node is empty if the target node full dependency information of the destination node is not the same as the initial node full dependency information of the destination node, and is equal to the initial write node information of the destination node if they are the same. The update of the target write node information of the destination node is determined by formula (2): ; 边压缩模块,用于基于所述目标状态信息和所述初始状态信息判断所述边是否进行压缩,以实现所述溯源图的压缩。An edge compression module is used to determine whether the edge should be compressed based on the target state information and the initial state information, so as to realize the compression of the source graph. 8.一种电子设备,包括存储器、处理器及存储在所述存储器上并在所述处理器上运行的计算机程序,其特征在于,所述处理器执行所述程序时实现如权利要求1-6任一项所述的方法。8. An electronic device comprising a memory, a processor, and a computer program stored in the memory and running on the processor, characterized in that the processor, when executing the program, implements the method as described in any one of claims 1-6.
CN202411926940.3A 2024-12-25 2024-12-25 Real-time tracing graph compression method and device for reserving causal association Active CN119853979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411926940.3A CN119853979B (en) 2024-12-25 2024-12-25 Real-time tracing graph compression method and device for reserving causal association

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411926940.3A CN119853979B (en) 2024-12-25 2024-12-25 Real-time tracing graph compression method and device for reserving causal association

Publications (2)

Publication Number Publication Date
CN119853979A CN119853979A (en) 2025-04-18
CN119853979B true CN119853979B (en) 2026-02-03

Family

ID=95372049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411926940.3A Active CN119853979B (en) 2024-12-25 2024-12-25 Real-time tracing graph compression method and device for reserving causal association

Country Status (1)

Country Link
CN (1) CN119853979B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115146271A (en) * 2022-09-02 2022-10-04 浙江工业大学 A method of APT traceability research and judgment based on causal analysis

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114615063A (en) * 2022-03-14 2022-06-10 清华大学 Attack tracing method and device based on log correlation analysis
CN116318990A (en) * 2023-03-13 2023-06-23 北京天融信网络安全技术有限公司 An attack chain real-time detection method, device, electronic equipment and storage medium
CN119011160A (en) * 2023-05-16 2024-11-22 华为技术有限公司 Threat event tracing method and related equipment
CN116738413B (en) * 2023-06-05 2024-02-13 广州大学 Method, system and device for back propagation attack investigation based on traceability graph
CN116743556A (en) * 2023-06-06 2023-09-12 广州大学 Traceability graph construction and pruning methods, devices and media based on system audit logs
CN116992089A (en) * 2023-06-21 2023-11-03 中国人民解放军战略支援部队信息工程大学 A method and device for constructing a traceability graph based on system audit logs
CN118590275A (en) * 2024-05-27 2024-09-03 广州大学 An abnormal node detection method based on honey-stepping log and traceability graph attention neural network
CN118368138A (en) * 2024-06-06 2024-07-19 中国科学院信息工程研究所 Attack detection method and system based on information flow tracking
CN118780375A (en) * 2024-07-24 2024-10-15 清华大学 Explainable fault diagnosis model construction, causal-driven fault diagnosis method and device, electronic equipment
CN118827222A (en) * 2024-08-07 2024-10-22 中国人民公安大学 A method for detecting APT attacks based on multi-dimensional edge-optimized traceability graph

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115146271A (en) * 2022-09-02 2022-10-04 浙江工业大学 A method of APT traceability research and judgment based on causal analysis

Also Published As

Publication number Publication date
CN119853979A (en) 2025-04-18

Similar Documents

Publication Publication Date Title
US10733149B2 (en) Template based data reduction for security related information flow data
US8856313B2 (en) Systems and methods for using provenance information for data retention in stream-processing
CN111581057B (en) General log analysis method, terminal device and storage medium
CN118394829A (en) Data blood edge analysis method, device, equipment and readable storage medium
CN112287339B (en) APT intrusion detection method and device and computer equipment
CN116318990A (en) An attack chain real-time detection method, device, electronic equipment and storage medium
CN111290881A (en) Data recovery method, device, equipment and storage medium
CN119853979B (en) Real-time tracing graph compression method and device for reserving causal association
CN114185884A (en) Streaming data processing method and system based on column storage data
CN112347102A (en) Multi-table splicing method and multi-table splicing device
CN113271234B (en) Adaptive event aggregation
WO2023078097A1 (en) Blockchain attack interception method and apparatus
CN110825784B (en) Detection method and device for low-efficiency business process
US11232121B2 (en) Method, apparatus, and computer-readable medium for data transformation pipeline optimization
GB2611965A (en) Metadata based data replication
CN119668895A (en) Event notification method, device, computer equipment and readable storage medium
CN116915455A (en) Traceability map construction methods, devices, computer equipment, storage media and products
KR20220115859A (en) Edge table representation of the process
US20070192384A1 (en) Computer implemented method for automatically managing stored checkpoint data
CN115905149B (en) Method and device for analyzing weblog, electronic equipment and readable medium
CN116846594A (en) A hypergraph-based attack detection and traceability method and system
CN108924002A (en) A kind of analytic method of performance data files, device and equipment
CN115525621A (en) Log processing method, system, electronic device and storage medium
CN114168119A (en) Code file editing method and device, electronic equipment and storage medium
CN114615036A (en) Abnormal behavior detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant