WO2017061893A1 - Method and system for automatic discovery of network usage patterns - Google Patents
- Publication number
- WO2017061893A1 (PCT/RU2015/000657)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- matrix
- network
- network address
- determining
- respect
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/142—Network analysis or design using statistical or mathematical methods
Definitions
- the present disclosure relates to a method and a system for automatic discovery of network usage patterns, in particular with respect to the field of automatic analysis of telecommunication networks.
- Modern tools used for automatic analysis of telecommunication networks should be able to perform actions in real time mode on the basis of traffic processing; they should be able to perform multi-parametric analysis of telecommunication networks.
- The use of multi-parametric analysis is necessary for adapting the granularity of analysis. Such tools should be able to process data streams at different levels of the hierarchy of the analyzed networks. Simultaneous processing of data streams at different levels of the network hierarchy may be necessary to fully understand events in the controlled network.
- Modern tools should further be able to perform the extraction of models of network resources usage in automatic mode. Extraction of patterns in automatic mode in real time may be necessary for timely prediction of technical problems or fraudulent behavior.
- Existing methods for network traffic analysis suffer from a lack of scalability, a lack of universality, the presence of a manual data processing stage, a lack of adaptability, and the inability to detect individual and group types of network resource usage simultaneously.
- A significant number of methods are intended for analysis at one definite level of the hierarchy of a telecommunication network, for example host-level or network-level analysis.
- A method is typically developed to solve one definite problem of network traffic analysis, for example to analyze the dependence of traffic on time, and the tools implementing it cannot be reconfigured for another type of analysis. This leads to a quite narrow field of application for such methods.
- a further object of the invention is to provide low-complexity tools for the analysis of events in telecommunications networks. This object is achieved by the features of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.
- the disclosure has its scope in the development of scalable and universal tools which may be applied both for host-level analysis and network-level analysis.
- The ability to reconfigure makes it possible to use similar network analysis tools to solve quite different practical problems.
- The same tools may be used, for example, both for the detection of fraudulent use of network resources and for the identification of behavior patterns that characterize similar use of network resources by large groups of users.
- Implementation of the current invention will increase the degree of automation of the tools used by the personnel responsible for network management.
- the concept as described in this disclosure is to provide an automatic identification of patterns characterizing the current state of the use of resources which are available in the analyzed segment of network and the detailing of information about these patterns.
- the concept consists in calculation of a decomposition matrix which describes main patterns of the usage of network resources. This matrix is calculated as a result of the analysis of network traffic filtered on the basis of preliminary choice of two fields of IP packet header.
- a matrix of behavior decomposition is used for the clustering of IP packets and final identification of groups. Each group has the unique pattern of usage of network resources.
- the disclosure is based on the following basic assumptions: Network usage analysis is performed on the basis of processing of data fields included in the header of network packets. Input data to be processed is represented in the form of matrix: Src versus Dst.
- Src is a set of instances of type 1 and Dst is the set of instances of type 2.
- Each type of instances is based on the use of the mask: SrcName, DstName.
- Each mask characterizes one field of the header of network packet.
- Input data is processed with the aim of constructing the set of basic patterns of the use of network resources. This set of patterns completely characterizes the analyzed data set. Each pattern consists of a set of weight coefficients. Each coefficient characterizes the participation of a definite Dst instance in the above-mentioned usage pattern.
- Numerical analysis is carried out by the calculation of a usage decomposition matrix. This matrix describes the degree of similarity in the behavior of different instances of Src, which is represented in input data. In order to describe the invention in detail, the following terms, abbreviations and notations will be used:
- IP: Internet Protocol
- Src: Source network address, e.g. source IP address
- Dst: Destination network address, e.g. destination IP address
- a network usage pattern is a pattern characterizing the current state of the use of resources which are available in the analyzed segment of the network and the detailing of information about this pattern.
- the systems, devices and methods as described in this disclosure can be used in a very wide range of network analysis applications, for example: Automatic identification of patterns characterizing network behavior of users. Automatic profiling of these patterns; Automatic detection of situations which characterize high risk of network attacks; Automatic detection of unauthorized intruders in the network; Automatic detection of the cases characterizing the fraudulent use of hardware or software tools; and Automatic detection of situations which characterize high risk of failure in the monitored network segment.
- The invention relates to a method for automatic discovery of network usage patterns, the method comprising: parsing an input stream of network packets from a communication network according to a predetermined source network address and a predetermined destination network address; determining a set of source network address packet instances based on the parsed input stream of network packets with respect to the predetermined source network address and a set of destination network address packet instances based on the parsed input stream of network packets with respect to the predetermined destination network address; determining a matrix of statistics based on statistical evaluation of the set of source network address packet instances versus the set of destination network address packet instances; determining a network usage decomposition matrix based on statistical evaluation of the matrix of statistics with respect to service port information indicated in the parsed input stream of network packets; determining a set of support groups based on statistical evaluation of the network usage decomposition matrix; and assigning the set of source network address packet instances to a respective support group of the set of support groups based on a distance of the network usage decomposition matrix with respect to the respective support group.
- Such a method allows creation of new means for controlling communication networks, in particular telecommunication networks. These new means allow online automatic extraction of models of network resources usage at different levels of the hierarchy of the analyzed telecommunications networks; Multi-parameter analysis of telecommunication networks, carried out in real time on the basis of traffic processing; and application of adaptive methods for automatic control and management of telecommunication networks.
- the method provides an efficient tool for network analysis of low complexity which can be used for the analysis of events in telecommunications networks.
- The method comprises: determining a port activity matrix based on statistical evaluation of the matrix of statistics with respect to the service port information indicated in the parsed input stream of network packets.
- the method comprises: determining the network usage decomposition matrix based on a singular value decomposition of the port activity matrix into a first ancillary matrix, a decomposed matrix and a second ancillary matrix.
- the method comprises: determining a projection matrix based on a projection of the matrix of statistics by using the second ancillary matrix. This provides the advantage that by using the projection matrix the projection of the input matrix of statistics can be efficiently performed. Thus, the method has reduced complexity.
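For reference, the singular value decomposition named above has the standard form shown in the first line below. The second line is only an assumed form of the projection: the source states that the projection uses the second ancillary matrix but does not give the formula here, and the truncation rank k is a hypothetical parameter.

```latex
\mathrm{PAM} = U \, S \, V^{T}, \qquad U^{T} U = I, \quad V^{T} V = I, \quad
S = \operatorname{diag}(\sigma_1, \sigma_2, \dots), \quad \sigma_1 \ge \sigma_2 \ge \dots \ge 0

F = T \, V_k V_k^{T} \qquad \text{(assumed form; } V_k \text{ holds the first } k \text{ columns of } V \text{)}
```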
- the method comprises: determining a matrix of distances based on a distance of the matrix of statistics with respect to the projection matrix.
- The distance is according to the following relation: where T(i,j) is the matrix of statistics, F(i,j) is the projection matrix, i is the index of the source network address packet instance and j is the index of the destination network address packet instance.
- the method comprises: determining the network usage decomposition matrix based on the matrix of distances.
- The method comprises: determining the network usage decomposition matrix according to the following relation:
- T is the matrix of statistics
- D is the distance
- i is the index of source network address packet instance
- j is the index of destination network address packet instance.
- determining the set of support groups comprises: determining a covariance matrix based on the network usage decomposition matrix.
- the method comprises: determining a number of support groups in the set of support groups based on a sum of the covariance matrix with respect to a number of destination network address packet instances.
- The method comprises: determining a number of support groups in the set of support groups based on the following relation: wherein the number of support groups is equal to the number of different values of a support group array simCoeff_i with respect to a number i of source network address packet instances. In other words, the number of support groups is equal to the number of different values of simCoeff. This provides the advantage that the number of support groups can be efficiently determined by computing a sum over the covariance matrix.
- The method comprises: assigning the set of source network address packet instances to a respective support group based on a distance of the support group array simCoeff_i with respect to an array of support group markers Mark_j.
- The assigning is based on the following relation:
- Mark is the array of support group markers and ε is the value of a calculation error.
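The marker-based assignment can be illustrated with a minimal sketch. The relation itself is not reproduced in the text, so the rule used here (an instance joins the group whose marker lies within a tolerance `eps` of its similarity coefficient) is an assumption, as are the function names and the sample values.

```python
from typing import List


def support_group_markers(sim_coeff: List[float], eps: float) -> List[float]:
    """Collect the distinct values of sim_coeff, treating values that lie
    within eps of the current marker as the same value (assumed rule)."""
    markers: List[float] = []
    for value in sorted(sim_coeff):
        if not markers or abs(value - markers[-1]) > eps:
            markers.append(value)
    return markers


def assign_to_groups(sim_coeff: List[float], markers: List[float], eps: float) -> List[int]:
    """Return, for each instance, the index of the nearest marker,
    or -1 when no marker lies within eps (assumed rule)."""
    groups: List[int] = []
    for value in sim_coeff:
        j = min(range(len(markers)), key=lambda m: abs(value - markers[m]))
        groups.append(j if abs(value - markers[j]) <= eps else -1)
    return groups


# Hypothetical similarity coefficients for five Src instances.
sim_coeff = [0.50, 0.51, 2.00, 0.49, 2.02]
markers = support_group_markers(sim_coeff, eps=0.05)
groups = assign_to_groups(sim_coeff, markers, eps=0.05)
```

In this sketch the number of support groups is simply `len(markers)`, which mirrors the statement that the number of groups equals the number of different values of simCoeff.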
- the method comprises: determining a center of a respective support group as a vector of mean values of coordinates of the source network address packet instances included in the respective support group; and checking if source network address packet instances assigned to a respective support group have a predetermined distance to the center of the respective support group.
- This provides the advantage that by using a reference to the center, an assignment to a respective support group can be accurately performed.
- The invention relates to a system for automatic discovery of network usage patterns, the system comprising: a preprocessing subsystem for parsing an input stream of network packets from a communication network according to a predetermined source network address and a predetermined destination network address; an input buffer processing subsystem for determining a set of source network address packet instances based on the parsed input stream of network packets with respect to the predetermined source network address and a set of destination network address packet instances based on the parsed input stream of network packets with respect to the predetermined destination network address; a statistical data processing subsystem for determining a matrix of statistics based on statistical evaluation of the set of source network address packet instances versus the set of destination network address packet instances; and for determining a network usage decomposition matrix based on statistical evaluation of the matrix of statistics with respect to service port information indicated in the parsed input stream of network packets; and a usage patterns identification subsystem for determining a set of support groups based on statistical evaluation of the network usage decomposition matrix; and for assigning the set of source network address packet instances to a respective support group of the set of support groups based on a distance of the network usage decomposition matrix with respect to the respective support group.
- Such a system allows creation of new means for controlling communication networks, in particular telecommunication networks. These new means allow online automatic extraction of models of network resources usage at different levels of the hierarchy of the analyzed telecommunications networks; Multi-parameter analysis of telecommunication networks, carried out in real time on the basis of traffic processing; and application of adaptive methods for automatic control and management of telecommunication networks.
- the system provides an efficient tool for network analysis of low complexity which can be used for the analysis of events in telecommunications networks.
- The invention relates to a computer-implemented method for automatic discovery of network usage patterns comprising the steps of processing an input stream of network packets, statistical data processing, and network usage patterns identification.
- The processing of the input stream of network packets includes: parsing of network packets according to the predefined masks of data fields; and processing of the input buffer, including buffering of instances and calculating the matrix of statistics on the basis of the buffering results.
- statistical data processing includes: calculating Port Activity Matrix; calculating Singular Value Decomposition for Port Activity Matrix; calculating the projection of the matrix of statistics into the space of basic usage models; calculating the matrix of distances;
- The identification of network usage patterns includes: calculating the vector of similarity coefficients; calculating the set of markers for support groups; assigning instances to support groups and calculating the coordinates of the center for each support group; purifying support groups by moving inappropriate instances into secondary groups;
- The invention relates to a system for automatic discovery of network usage patterns comprising: an input subsystem which includes the means for capturing the network packet stream, where the stream of network packets to be analyzed is the output of this subsystem; a subsystem of data stream preprocessing which includes the means for parsing network packets, where the stream of network packets to be analyzed is the input of this subsystem and the stream of pairs (Src instance, Dst instance) is the output of this subsystem; a subsystem of input buffer processing which includes the means for filling the buffer to be analyzed, where the stream of pairs (Src instance, Dst instance) is the input of this subsystem and the stream of matrices of statistics Src vs Dst is the output of this subsystem; a subsystem of statistical data processing which includes the means for calculating the usage decomposition matrix, where the stream of matrices of statistics Src vs Dst is the input of this subsystem and the stream of usage decomposition matrices is the output of this subsystem;
- A first scenario is the development of a scalable network monitoring system which may be adapted for various levels of granularity of analysis: Host-level analysis tools may be used for automated identification of models characterizing software applications that are running on a definite hardware unit.
- Network-level analysis tools may be used for automated identification of models characterizing the usage of hardware units present in the definite segments of the wired or wireless networks.
- A second scenario is the development of scalable software and/or hardware tools applicable to automated analysis of traffic streams.
- These tools may be used for the following set of purposes: Automatic identification of patterns characterizing network behavior of users, automatic profiling of these patterns; Automatic detection of situations which characterize high risk of network attacks; Automatic detection of unauthorized intruders in the network; Automatic detection of the cases characterizing the fraudulent use of hardware or software tools; Automatic detection of situations which characterize high risk of failure in the monitored network segment;
- The field of application may be the analysis of resource usage patterns in both wired and wireless networks.
- the purpose is searching for anomalies in network traffic between two specific IP addresses.
- a typical workflow is as follows: Setting the value of mask SrcName to IPSrc (IP address of the sender); Setting the value of mask DstName to IPDst (IP address of the receiver); Setting the IP addresses of two network nodes to be controlled; Training phase: automatic creation of a statistical model that characterizes the transfer of information between the selected nodes of telecommunication network; Stage of traffic control: the use of the created model for the analysis in order to detect anomalies in traffic between two network nodes.
- the purpose is monitoring of unauthorized use of network resources by specific software.
- a typical workflow is as follows: Creation of a testbed for the use in training phase; Setting the value of mask SrcName to IPSrc (IP address of the sender); Setting the value of mask DstName to SrcPort (Port used by sender software); Training phase: automatic creation of a statistical model that characterizes network interaction of controlled software; Stage of traffic control: the use of the created model to detect unauthorized use of network resources by controlled software.
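The two workflows above differ only in which header fields the masks SrcName and DstName select. A minimal sketch of that idea follows; the dictionary representation of a packet header, the `extract_pair` helper and the sample values are hypothetical, and only the mask names IPSrc, IPDst and SrcPort come from the text.

```python
from typing import Dict, Tuple

# Hypothetical representation of an already decoded packet header.
Header = Dict[str, str]


def extract_pair(header: Header, src_name: str, dst_name: str) -> Tuple[str, str]:
    """Return the (Src instance, Dst instance) pair selected by the two masks."""
    return header[src_name], header[dst_name]


packet: Header = {
    "IPSrc": "10.0.0.5",
    "IPDst": "192.0.2.7",
    "SrcPort": "51514",
    "DstPort": "443",
}

# Workflow 1: anomalies between two nodes (SrcName = IPSrc, DstName = IPDst).
pair_nodes = extract_pair(packet, "IPSrc", "IPDst")

# Workflow 2: monitoring of specific software (SrcName = IPSrc, DstName = SrcPort).
pair_software = extract_pair(packet, "IPSrc", "SrcPort")
```

Keeping the masks as plain parameters is one way to reflect the reconfigurability the disclosure emphasizes: the same downstream processing runs unchanged on either kind of pair.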
- the methods, systems and devices as described in this disclosure can bring three (or more) kinds of effects as described in the following.
- the systems, devices and methods according to the disclosure facilitate to develop the set of network monitoring tools being able to automatically extract the models of network resources usage. This procedure of extraction may be realized at different levels of hierarchy of analyzed telecommunication network.
- Automatic multi-parameter analysis of the data stream may be realized as a procedure carried out in real time.
- Implementation of the methods, systems and devices according to the disclosure facilitates development of a fundamentally new set of software and hardware tools.
- The main prospective result of implementing such methods, systems and devices is the creation of a fundamentally new class of tools intended for monitoring the traffic of wired and wireless networks.
- the use of adaptive control for network traffic may realize Smart Networks on the basis of the methods, systems and devices described hereinafter.
- Fig. 1 shows a schematic diagram illustrating a method 100 for automatic discovery of network usage patterns according to an implementation form.
- Fig. 2 shows a schematic diagram illustrating a system 200 for automatic discovery of network usage patterns according to an implementation form.
- Fig. 3 shows a flowchart illustrating an exemplary embodiment of a system 300 for automatic discovery of network usage patterns according to an implementation form.
- Fig. 4 shows a flow diagram illustrating an example of network traffic stream preprocessing 400 according to an implementation form.
- Fig. 5 shows a flow diagram illustrating an example of input buffer processing 500 according to an implementation form.
- Fig. 6 shows a flow diagram illustrating an example of statistical data processing 600 according to an implementation form.
- Fig. 7 shows a flow diagram illustrating an example of usage patterns identification 700 according to an implementation form.
- Fig. 1 shows a schematic diagram illustrating a method 100 for automatic discovery of network usage patterns according to an implementation form.
- the method 100 includes parsing 101 an input stream of network packets from a communication network according to a predetermined source network address and a predetermined destination network address.
- the method 100 includes determining 102 a set of source network address packet instances based on the parsed input stream of network packets with respect to the predetermined source network address and a set of destination network address packet instances based on the parsed input stream of network packets with respect to the predetermined destination network address.
- the method 100 includes determining 103 a matrix of statistics T(i,j) based on statistical evaluation of the set of source network address packet instances versus the set of destination network address packet instances.
- The method 100 includes determining 104 a network usage decomposition matrix Z(i,j) based on statistical evaluation of the matrix of statistics T(i,j) with respect to service port information indicated in the parsed input stream of network packets.
- The method 100 includes determining 105 a set of support groups based on statistical evaluation of the network usage decomposition matrix Z(i,j).
- the method 100 includes assigning 106 the set of source network address packet instances to a respective support group of the set of support groups based on a distance of the network usage decomposition matrix Z(i,j) with respect to the respective support group.
- the method 100 may further include: determining a port activity matrix PAM based on statistical evaluation of the matrix of statistics T(i,j) with respect to the service port information indicated in the parsed input stream of network packets; and determining 104 the network usage decomposition matrix Z(i,j) based on a decomposition of the port activity matrix PAM.
- The method 100 may further include: determining 104 the network usage decomposition matrix Z(i,j) based on a singular value decomposition, e.g. a singular value decomposition 603 as described with respect to Fig. 6, of the port activity matrix PAM into a first ancillary matrix U, a decomposed matrix S and a second ancillary matrix V^T.
- The method 100 may further include: determining 604 a projection matrix F(i,j) based on a projection of the matrix of statistics T(i,j) by using the second ancillary matrix V^T, e.g. as described below with respect to Fig. 6.
- the method 100 may further include: determining 605 a matrix of distances D(i,j), e.g. as described below with respect to Fig. 6, based on a distance of the matrix of statistics T(i,j) with respect to the projection matrix F(i,j).
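The chain of singular value decomposition, projection and distance calculation can be sketched with NumPy. Several parts are assumptions made only for illustration: the port activity matrix is taken to be the row-normalized matrix of statistics, the projection keeps the first `k` right singular vectors, and the distance is the element-wise absolute difference. The source leaves each of these relations unspecified in this text.

```python
import numpy as np


def project_and_measure(T: np.ndarray, pam: np.ndarray, k: int):
    """Project T onto the space spanned by the top-k right singular
    vectors of pam and return the projection F and the distances D."""
    # pam = U @ diag(s) @ Vt; s is sorted in descending order.
    U, s, Vt = np.linalg.svd(pam, full_matrices=False)
    Vk = Vt[:k].T                 # shape (n_dst, k)
    F = T @ Vk @ Vk.T             # assumed form of the projection matrix F(i,j)
    D = np.abs(T - F)             # assumed form of the distance D(i,j)
    return F, D


# Hypothetical matrix of statistics: 3 Src instances versus 3 Dst instances.
T = np.array([[2.0, 1.0, 0.0],
              [4.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

# Assumption: PAM as the row-normalized matrix of statistics.
pam = T / T.sum(axis=1, keepdims=True)

F, D = project_and_measure(T, pam, k=2)
```

In this toy input the first two rows follow the same usage pattern, so two basic usage models are enough to reproduce T and the distances are numerically zero. A smaller `k` would leave a nonzero residual for instances that the retained models do not describe.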
- the distance may be according to the following relation:
- the method 100 may further include: determining 104 the network usage decomposition matrix Z(i,j) based on the matrix of distances D(i,j), e.g. as described with respect to Fig. 6.
- the method 100 may further include: determining 104 the network usage decomposition matrix Z(i,j) according to the following relation:
- Determining 105 the set of support groups may include: determining 702 a covariance matrix covZ(i,j) based on the network usage decomposition matrix Z(i,j), e.g. as described below with respect to Fig. 7.
- The method 100 may further include: determining a number of support groups in the set of support groups based on a sum of the covariance matrix covZ(i,j) with respect to a number of destination network address packet instances (j).
- The method 100 may further include: determining 703 a number of support groups in the set of support groups based on the following relation: e.g. as described below with respect to Fig. 7, wherein the number of support groups is equal to the number of different values of a support group array simCoeff_i with respect to a number i of source network address packet instances.
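A minimal sketch of how such a count could be obtained is given below. It assumes that covZ is the sample covariance between the rows of Z and that simCoeff_i is the sum of row i of covZ; the relation itself is not reproduced in the text, so both choices, the function names and the sample matrix are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def covariance_matrix(Z: Matrix) -> Matrix:
    """Sample covariance between the rows of Z (one row per Src instance)."""
    n_obs = len(Z[0])
    means = [sum(row) / n_obs for row in Z]
    centered = [[x - m for x in row] for row, m in zip(Z, means)]
    return [
        [sum(a * b for a, b in zip(ri, rj)) / (n_obs - 1) for rj in centered]
        for ri in centered
    ]


def similarity_coefficients(cov: Matrix) -> List[float]:
    """Assumed definition: simCoeff_i is the sum of row i of the covariance matrix."""
    return [sum(row) for row in cov]


# Hypothetical decomposition matrix: the first two Src instances behave identically.
Z = [[1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 2.0]]

cov_z = covariance_matrix(Z)
sim_coeff = similarity_coefficients(cov_z)

# Rounding stands in for a tolerance when counting different values.
n_groups = len({round(v, 9) for v in sim_coeff})
```

Instances with identical rows in Z necessarily receive identical coefficients under this definition, which is why counting distinct coefficients yields a group count.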
- The method 100 may further include: assigning 106 the set of source network address packet instances to a respective support group based on a distance of the support group array simCoeff_i with respect to an array of support group markers (Mark_j), e.g. as described below with respect to Fig. 7.
- Mark_j is an array of support group markers.
- The assigning 106 may be based on the following relation:
- the method 100 may further include: determining 707 a center of a respective support group as a vector of mean values of coordinates of the source network address packet instances included in the respective support group, e.g. as described below with respect to Fig. 7; and checking 708 if source network address packet instances assigned to a respective support group have a predetermined distance to the center of the respective support group, e.g. as described below with respect to Fig. 7.
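The center calculation and the subsequent distance check can be sketched as follows. The use of Euclidean distance, the threshold parameter `max_distance`, and the choice to return the rejected instances as a separate list are assumptions; the text only states that the center is the vector of mean coordinates and that a predetermined distance is checked.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def group_center(points: List[Point]) -> List[float]:
    """Center of a support group: the mean of each coordinate over its members."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def purify(points: List[Point], max_distance: float) -> Tuple[List[float], List[Point], List[Point]]:
    """Split a group into members within max_distance of the center
    and members to be moved elsewhere (assumed Euclidean distance)."""
    center = group_center(points)
    kept: List[Point] = []
    moved: List[Point] = []
    for p in points:
        (kept if math.dist(p, center) <= max_distance else moved).append(p)
    return center, kept, moved


# Hypothetical coordinates of four Src instances assigned to one support group.
members = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (1.0, 9.0)]
center, kept, moved = purify(members, max_distance=3.0)
```

Note that the center is computed before any instance is removed, so a single outlier shifts it; whether the center should be recomputed after purification is not stated in the text.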
- Fig. 2 shows a schematic diagram illustrating a system 200 for automatic discovery of network usage patterns according to an implementation form.
- the system 200 includes a preprocessing subsystem 201 (e.g., a network packet parser) for parsing an input stream 210 of network packets from a communication network according to a predetermined source network address and a predetermined destination network address.
- The system 200 includes an input buffer processing subsystem 202 for determining a set of source network address packet instances 212 based on the parsed input stream 211 of network packets with respect to the predetermined source network address and a set of destination network address packet instances based on the parsed input stream of network packets with respect to the predetermined destination network address.
- The system 200 further includes a statistical data processing subsystem 203 for determining a matrix of statistics T(i,j) based on statistical evaluation of the set of source network address packet instances versus the set of destination network address packet instances; and for determining a network usage decomposition matrix Z(i,j) based on statistical evaluation of the matrix of statistics T(i,j) with respect to service port information indicated in the parsed input stream 211 of network packets.
- the system 200 further includes a usage patterns identification subsystem 204 for determining a set of support groups based on statistical evaluation of the network usage decomposition matrix Z(i,j) and for assigning 214 the set of source network address packet instances to a respective support group of the set of support groups based on a distance of the network usage decomposition matrix Z(i,j) with respect to the respective support group.
- the preprocessing subsystem 201 may process step 101 of the method 100 described above with respect to Fig. 1.
- the input buffer processing subsystem 202 may process the step 102 of the method 100.
- the statistical data processing subsystem 203 may process the steps 103 and 104 of the method 100.
- the usage patterns identification subsystem 204 may process the steps 105, 106 of the method 100.
- Fig. 3 shows a flowchart illustrating an exemplary embodiment of a system 300 for automatic discovery of network usage patterns according to an implementation form.
- The system 300 includes an input subsystem 301, a subsystem of data stream preprocessing 302 that may correspond to the preprocessing subsystem 201 according to Fig. 2, a subsystem of input buffer processing 303 that may correspond to the input buffer processing subsystem 202 according to Fig. 2, a subsystem of statistical data processing 304 that may correspond to the statistical data processing subsystem 203 according to Fig. 2, a subsystem of network usage pattern identification 305 that may correspond to the usage patterns identification subsystem 204 according to Fig. 2 and an output subsystem 306 for outputting data.
- FIG. 3 illustrates an example embodiment of the System of Automatic Discovery of Network Usage Patterns.
- the masks of two fields of network packets which will be used at analysis will be defined as: SrcName and DstName.
- Fig. 4 shows a flow diagram illustrating an example of network traffic stream preprocessing 400 according to an implementation form, i.e. an example of the subsystem 302 shown in Fig. 3.
- a third block 403 is processed to perform the parsing of the input packet and a fourth block 404 is processed to output the results of packet parsing. Then a jump to the second block 402 is performed. If the answer of second block 402 is no, a fifth block 405 is performed for checking if a variable "FlagExit" equals 1. If the answer is yes, the flow diagram goes to an exit block 406, if the answer is no, a jump to the second block 402 is performed.
- the diagram represented in Figure 4 illustrates the example of network traffic stream preprocessing.
- System input data stream comes to a System from the Input Subsystem.
- this subsystem is used as an interface between the analyzed network and System of Automatic Identification of Network Usage Patterns.
- Main purpose of data preprocessing is to prepare the data which will be used at automatic analysis of network usage.
- Block 405 illustrates the check of the state of FlagExit flag. This flag is handled by some external procedure to stop the analysis of the network usage.
- The input packet is parsed (block 403).
- The header of the network packet is subdivided into a set of separate fields. The predefined masks SrcName and DstName determine the logic of the parsing.
- The output of the parsing procedure includes two instances: the instance of the field SrcName and the instance of the field DstName, as represented in the header of the network packet.
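The preprocessing loop of Fig. 4 can be sketched as below. The block numbers in the comments follow the description; the meaning of block 402 (checking whether an input packet is waiting) is an assumption, as are the queue-based input, the `flag_exit` callable and the dictionary form of a header.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


def preprocess(
    packets: Deque[Dict[str, str]],
    src_name: str,
    dst_name: str,
    flag_exit: Callable[[], bool],
) -> List[Tuple[str, str]]:
    """Parse queued packet headers into (Src instance, Dst instance) pairs
    until the queue is empty and the external exit flag is set."""
    output: List[Tuple[str, str]] = []
    while True:
        if packets:                                       # block 402 (assumed check)
            header = packets.popleft()
            pair = (header[src_name], header[dst_name])   # block 403: parse by masks
            output.append(pair)                           # block 404: output the result
        elif flag_exit():                                 # block 405: FlagExit set?
            return output                                 # block 406: exit
        # Otherwise loop back to block 402; a real system would wait for input here.


queue: Deque[Dict[str, str]] = deque([
    {"IPSrc": "10.0.0.5", "IPDst": "192.0.2.7"},
    {"IPSrc": "10.0.0.6", "IPDst": "192.0.2.7"},
])
pairs = preprocess(queue, "IPSrc", "IPDst", flag_exit=lambda: True)
```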
- Fig. 5 shows a flow diagram illustrating an example of input buffer processing 500 according to an implementation form, i.e. an example of the subsystem 303 shown in Fig. 3.
- FIG. 5 illustrates the example of input buffer processing.
- the Subsystem of Input Buffer Processing is used for the following set of actions: the accumulation of the data of the input buffer; the calculation of the output arrays of unique Src and Dst instances; the calculation of the matrix of statistics Src vs Dst.
- the output of the workflow of this subsystem includes the arrays of unique names and the matrix of statistics T.
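The input buffer processing described above might be sketched as follows. This is a minimal Python sketch under the assumption that T(i,j) counts the packets observed from the i-th unique Src instance to the j-th unique Dst instance; the disclosure only calls T a matrix of statistics, and the function name `build_statistics` is hypothetical.

```python
from typing import List, Sequence, Tuple


def build_statistics(buffer: Sequence[Tuple[str, str]]
                     ) -> Tuple[List[str], List[str], List[List[int]]]:
    """Return the unique Src names, the unique Dst names and the matrix T."""
    src_names = sorted({src for src, _ in buffer})
    dst_names = sorted({dst for _, dst in buffer})
    src_index = {name: i for i, name in enumerate(src_names)}
    dst_index = {name: j for j, name in enumerate(dst_names)}

    # T[i][j] (assumed meaning): number of packets from Src i to Dst j.
    T = [[0] * len(dst_names) for _ in src_names]
    for src, dst in buffer:
        T[src_index[src]][dst_index[dst]] += 1
    return src_names, dst_names, T


buffer = [("a", "x"), ("b", "x"), ("a", "y"), ("a", "x")]
src_names, dst_names, T = build_statistics(buffer)
```

Sorting the unique names only makes the row and column order reproducible; any stable ordering would serve.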
- Fig. 6 shows a flow diagram illustrating an example of statistical data processing 600 according to an implementation form, i.e. an example of the subsystem 304 shown in Fig. 3. In the statistical data processing 600, the following eight blocks 601, 602, 603, 604, 605, 606, 607, 608 are sequentially processed.
- a second block 602 is processed to calculate the Port Activity Matrix (PAM) on the basis of the matrix of statistics T(i,j).
- a fifth block 605 is processed to calculate the matrix of distances from the current usage models to the set of basic usage models: D(i,j).
- a sixth block 606 is processed to calculate the matrix of usage decompositions Z(i,j) on the basis of the matrix of distances D(i,j).
- a seventh block 607 is processed to output the results and an eighth block 608 indicates the end.
- the diagram represented in Figure 6 illustrates an example of statistical data processing. This stage of data processing is performed by the Subsystem of statistical data processing. In the represented embodiment this subsystem is intended for the calculation of the network usage decomposition matrix. This calculation is performed on the basis of the matrix of statistics T.
- the Port Activity Matrix (PAM) may be calculated on the basis of a numerical method, e.g. as proposed in the paper "E. Sharafuddin, Y. Jin, N. Jiang, Z. Zhang. Know Your Enemy, Know Yourself: Block-Level Network Behavior Profiling and Tracking // Global Telecommunications Conference (GLOBECOM 2010), IEEE, 2010".
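Blocks 605 and 606 might be illustrated by the following Python sketch. The text above does not specify the distance measure or the rule that derives Z(i,j) from D(i,j), so both are assumptions here: the sketch uses the Euclidean distance and a normalized inverse-distance weighting purely as placeholders. The function names and the `eps` parameter are hypothetical.

```python
import math
from typing import List, Sequence


def distance_matrix(current_models: Sequence[Sequence[float]],
                    basic_models: Sequence[Sequence[float]]) -> List[List[float]]:
    """D[i][j]: distance from current usage model i to basic usage model j."""
    # Assumption: Euclidean distance; the disclosure does not name the metric.
    return [[math.dist(c, b) for b in basic_models] for c in current_models]


def decomposition_matrix(D: Sequence[Sequence[float]],
                         eps: float = 1e-12) -> List[List[float]]:
    """Z[i][j]: share of basic usage model j in current usage model i."""
    # Assumption: closer basic models receive larger weights, rows sum to 1.
    Z = []
    for row in D:
        weights = [1.0 / (d + eps) for d in row]
        total = sum(weights)
        Z.append([w / total for w in weights])
    return Z


basic = [[1.0, 0.0], [0.0, 1.0]]
current = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
D = distance_matrix(current, basic)
Z = decomposition_matrix(D)
```

With these placeholder choices, a current model that coincides with a basic model is attributed almost entirely to it, and a model equidistant from both is split evenly.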
- Fig. 7 shows a flow diagram illustrating an example of usage patterns identification 700 according to an implementation form, i.e. an example of the subsystem 305 shown in Fig. 3.
- a second block 702 is processed to calculate the matrix of covariance covZ(i,j) on the basis of the matrix Z(i,j).
- a third block 703 is processed to calculate the values of the
- a fourth block 704 is processed to define the values of the components of the vector of markers of support groups.
- a fifth block 705 is processed to form the support groups on the basis of the group markers.
- a sixth block 706 is processed to assign the Src instances to the support groups.
- a seventh block 707 is processed to calculate the
- an eighth block 708 is processed to check the relevance of the assignment of each instance to its group, and to move each excluded Src instance from the support group into a separate secondary group.
- a ninth block 709 is processed to output the results of clustering and to clear all data arrays responsible for the processing of time window data.
- a tenth block 710 indicates the end.
- the diagram represented in Figure 7 illustrates an example of usage patterns identification.
- groups may be equal to the number of different values of simCoeff. Assignment of Src instances into the corresponding group may be based on the following rule. If the condition ⁇ s is true then /-th Src instance is included into the ' -th
- Coordinates of the center of each group may be calculated as a vector of mean values of coordinates of Src instances included into that group.
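The calculation of group centers as mean coordinate vectors might be sketched in Python as follows. The sketch assumes that each Src instance is represented by a coordinate vector and that the group assignment is already known as one group label per instance; the function name `group_centers` is hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def group_centers(coords: Sequence[Sequence[float]],
                  assignment: Sequence[Hashable]) -> Dict[Hashable, List[float]]:
    """Return, for each group, the mean of the coordinates of its Src instances."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = {}
    for point, group in zip(coords, assignment):
        acc = sums.setdefault(group, [0.0] * len(point))
        for k, value in enumerate(point):
            acc[k] += value
        counts[group] = counts.get(group, 0) + 1
    # Divide each accumulated coordinate by the number of group members.
    return {g: [v / counts[g] for v in acc] for g, acc in sums.items()}


coords = [[0.0, 0.0], [2.0, 2.0], [10.0, 0.0]]
assignment = [0, 0, 1]  # group label of each Src instance
centers = group_centers(coords, assignment)
```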
- the check of the relevance of the assignment of Src instances into the definite group may be based on the following rule. If the condition ⁇ (covZ v -cJ 2 false then / ' -th instance may be j
- the output block 709 or the Output subsystem 306 according to Fig. 3 may produce the output of the results of the clustering of Src instances.
- the output block 709 may produce the output of the set of data structures UGD.
- each element UGD includes information about the instances of Src associated with a particular group. This information includes the index of the cluster; the value of the marker of the current group of Src instances; the set of indexes of the Src instances associated with this cluster; and the usage model characterizing the current group of instances.
- the output block 709 may further produce the output of the set of the unique names of Src instances present in the current time window.
- the output block 709 may further produce the output of the set of the unique names of Dst instances present in the current time window.
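The outputs of block 709 listed above might be represented, for illustration, by the following Python data structures. Only the name UGD and the four kinds of information it carries come from the text; the field names, the types, the container `ClusteringOutput` and the helper `names_in_group` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UGD:
    """Information about the Src instances associated with one group."""
    cluster_index: int                  # index of the cluster
    group_marker: float                 # value of the marker of the group
    src_indexes: List[int] = field(default_factory=list)    # member Src indexes
    usage_model: List[float] = field(default_factory=list)  # model of the group


@dataclass
class ClusteringOutput:
    """Everything block 709 outputs for one time window."""
    groups: List[UGD]
    src_names: List[str]   # unique Src names in the current time window
    dst_names: List[str]   # unique Dst names in the current time window

    def names_in_group(self, cluster_index: int) -> List[str]:
        """Resolve the Src indexes of one group to their unique names."""
        for group in self.groups:
            if group.cluster_index == cluster_index:
                return [self.src_names[i] for i in group.src_indexes]
        return []


output = ClusteringOutput(
    groups=[UGD(0, 1.0, [0, 2], [0.9, 0.1]), UGD(1, 2.0, [1], [0.2, 0.8])],
    src_names=["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    dst_names=["10.0.0.9"],
)
```

Storing indexes in each UGD and the names once per time window mirrors the separation between the UGD set and the arrays of unique names in the description.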
- the present disclosure also supports a computer program product including computer executable code or computer executable instructions that, when executed, causes at least one computer to execute the performing and computing steps described herein, in particular the methods 100, 300, 400, 500, 600 as described above with respect to Figs. 1 and 3-6 or the system 200 described above with respect to Fig. 2.
- a computer program product may include a readable storage medium storing program code thereon for use by a computer.
- the program code may perform the method 100, 300, 400, 500, 600 as described above with respect to Figs. 1 and 3-6 or the system 200 described above with respect to Fig. 2.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Pure & Applied Mathematics (AREA)
- Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- General Physics & Mathematics (AREA)
- Algebra (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The present invention relates to a method for automatic discovery of network usage patterns (100), comprising: parsing (101) an input stream of network packets from a communication network based on a predetermined source network address and a predetermined target network address; determining (102) a set of source network address packet instances based on the parsed input stream of network packets with respect to the predetermined source network address and a set of target network address packet instances based on the parsed input stream of network packets with respect to the predetermined target network address; determining (103) a matrix of statistics (T(i,j)) based on a statistical evaluation of the set of source network address packet instances with respect to the set of target network address packet instances; determining (104) a network usage decomposition matrix (Z(i,j)) based on a statistical evaluation of the matrix of statistics (T(i,j)) with respect to service port information indicated in the parsed input stream of network packets; determining (105) a set of support groups based on a statistical evaluation of the network usage decomposition matrix (Z(i,j)); and assigning (106) the set of source network address packet instances to a respective support group of the set of support groups based on a distance of the network usage decomposition matrix (Z(i,j)) to the respective support group.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/RU2015/000657 WO2017061893A1 (fr) | 2015-10-09 | 2015-10-09 | Procédé et système de découverte automatique de motifs d'utilisation de réseau |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/RU2015/000657 WO2017061893A1 (fr) | 2015-10-09 | 2015-10-09 | Procédé et système de découverte automatique de motifs d'utilisation de réseau |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2017061893A1 (fr) | 2017-04-13 |
Family
ID=55971172
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/RU2015/000657 Ceased WO2017061893A1 (fr) | 2015-10-09 | 2015-10-09 | Procédé et système de découverte automatique de motifs d'utilisation de réseau |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2017061893A1 (fr) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11108795B2 (en) | 2018-05-25 | 2021-08-31 | At&T Intellectual Property I, L.P. | Intrusion detection using robust singular value decomposition |
| CN120494297A (zh) * | 2025-07-15 | 2025-08-15 | 中交第一航务工程勘察设计院有限公司 | 集装箱卸货的堆位规划方法、装置、设备及存储介质 |
-
2015
- 2015-10-09 WO PCT/RU2015/000657 patent/WO2017061893A1/fr not_active Ceased
Non-Patent Citations (4)
| Title |
|---|
| E. SHARAFUDDIN; Y. JIN; N. JIANG; Z. ZHANG: "Know Your Enemy, Know Yourself: Block-Level Network Behavior Profiling and Tracking", Global Telecommunications Conference (GLOBECOM 2010), IEEE, 2010 |
| ESAM SHARAFUDDIN ET AL: "Know Your Enemy, Know Yourself: Block-Level Network Behavior Profiling and Tracking", GLOBECOM 2010, 2010 IEEE GLOBAL TELECOMMUNICATIONS CONFERENCE, IEEE, PISCATAWAY, NJ, USA, 6 December 2010 (2010-12-06), pages 1 - 6, XP031846647, ISBN: 978-1-4244-5636-9 * |
| MEDINA A: "TRAFFIC MATRIX ESTIMATION: EXISTING TECHNIQUES AND NEW DIRECTIONS", COMPUTER COMMUNICATION REVIEW, ACM, NEW YORK, NY, US, vol. 32, no. 4, 1 October 2002 (2002-10-01), pages 161 - 174, XP001162285, ISSN: 0146-4833, DOI: 10.1145/964725.633041 * |
| ZHE WANG ET AL: "Structural Analysis of Network Traffic Matrix via Relaxed Principal Component Pursuit", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 12 April 2011 (2011-04-12), XP080549121 * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11108795B2 (en) | 2018-05-25 | 2021-08-31 | At&T Intellectual Property I, L.P. | Intrusion detection using robust singular value decomposition |
| US12301598B2 (en) | 2018-05-25 | 2025-05-13 | At&T Intellectual Property I, L.P. | Intrusion detection using robust singular value decomposition |
| CN120494297A (zh) * | 2025-07-15 | 2025-08-15 | 中交第一航务工程勘察设计院有限公司 | 集装箱卸货的堆位规划方法、装置、设备及存储介质 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP3407562B1 (fr) | Procédé et système de reconnaissance de flux commun, et procédé d'utilisation de serveur | |
| EP3841730B1 (fr) | Identification de types de dispositifs basée sur des attributs comportementaux | |
| US20170272344A1 (en) | Real-Time Detection of Abnormal Network Connections in Streaming Data | |
| US11196670B2 (en) | System and method for identifying devices behind network address translators | |
| US20190065738A1 (en) | Detecting anomalous entities | |
| CN112615888B (zh) | 一种网络攻击行为的威胁评估方法及装置 | |
| US9692779B2 (en) | Device for quantifying vulnerability of system and method therefor | |
| CN110855576A (zh) | 应用识别方法及装置 | |
| US10164908B2 (en) | Filtration of network traffic using virtually-extended ternary content-addressable memory (TCAM) | |
| CN110717551A (zh) | 流量识别模型的训练方法、装置及电子设备 | |
| CN109474691A (zh) | 一种物联网设备识别的方法及装置 | |
| CN104333483A (zh) | 互联网应用流量识别方法、系统及识别装置 | |
| CN111953552A (zh) | 数据流的分类方法和报文转发设备 | |
| CN111835681B (zh) | 一种大规模流量异常主机检测方法和装置 | |
| US20220413947A1 (en) | Systems and Methods for Detecting Partitioned and Aggregated Novel Network, User, Device and Application Behaviors | |
| CN112866175A (zh) | 一种异常流量类型保留方法、装置、设备及存储介质 | |
| CN104333461A (zh) | 互联网应用流量识别方法、系统及识别装置 | |
| CN109361618B (zh) | 数据流量标记方法、装置、计算机设备及存储介质 | |
| WO2017061893A1 (fr) | Procédé et système de découverte automatique de motifs d'utilisation de réseau | |
| CN114363212B (zh) | 一种设备检测方法、装置、设备和存储介质 | |
| CN116582305A (zh) | 电力业务交互行为的持续信任评估方法及相关设备 | |
| CN114205146B (zh) | 一种多源异构安全日志的处理方法及装置 | |
| CN108063814B (zh) | 一种负载均衡方法及装置 | |
| US20150058466A1 (en) | Device for server grouping | |
| CN110855602B (zh) | 物联网云平台事件识别方法及系统 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15860032; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 15860032; Country of ref document: EP; Kind code of ref document: A1 |