
CN115604090B - Network anomaly root cause positioning method and system based on overfitting - Google Patents

Info

Publication number
CN115604090B
Authority
CN
China
Prior art keywords
service
data
network node
network
service identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211132605.7A
Other languages
Chinese (zh)
Other versions
CN115604090A (en
Inventor
朱文进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Digital Intelligence Technology Co Ltd
Original Assignee
China Telecom Digital Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Digital Intelligence Technology Co Ltd filed Critical China Telecom Digital Intelligence Technology Co Ltd
Priority to CN202211132605.7A priority Critical patent/CN115604090B/en
Publication of CN115604090A publication Critical patent/CN115604090A/en
Application granted granted Critical
Publication of CN115604090B publication Critical patent/CN115604090B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0677: Localisation of faults
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852: Delays

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a network anomaly root cause positioning method and system based on overfitting, and belongs to the technical field of network operation and maintenance. The method comprises: collecting network node data between routers and generating a route service identifier and a network node service identifier; testing the network delay between routers and taking the time at which the network delay is abnormal as the abnormal time; importing the associated route data and associated network node data collected at the abnormal time into a fault source positioning analysis model for analysis and determining an overfitting value; and sorting the service data according to the overfitting value to locate the fault source. The method can rapidly locate the source of a network anomaly even when information about the network switches traversed between routing hops cannot be obtained and those switches may be causing invisible network delay, thereby improving the efficiency of network operation and maintenance.

Description

Network anomaly root cause positioning method and system based on overfitting
Technical Field
The invention belongs to the technical field of network operation and maintenance, and particularly relates to a network anomaly root cause positioning method and system based on overfitting.
Background
As digitalization deepens, the number of online devices in local area networks worldwide keeps growing and is now 10 to 100 times what it was a decade ago. Even though operation and maintenance has evolved from manual work to tool-based and platform-based operation and maintenance, it still cannot meet the monitoring requirements of today's ultra-large local area networks. At this scale, monitoring network equipment through manual experience and automated operation and maintenance has become a technical bottleneck that restricts operation and maintenance work. In the prior art, route tracing cannot obtain information about the network switches traversed between routing hops, so the network delay these switches may cause is difficult to detect. Therefore, a more intelligent and efficient optimization method based on TR069 protocol monitoring is introduced to improve network operation and maintenance monitoring capability.
In the prior art, machine-room operation and maintenance scenarios suffer from large service scale, complex application relationships, many dependency layers, and difficult troubleshooting. In addition, because information about the network switches traversed between routing hops cannot be acquired during route tracing, the network delay between routes is ignored, which may leave switch-induced network delay invisible.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art by providing a network anomaly root cause positioning method and system based on overfitting, which can rapidly locate the root cause of a network anomaly even when information about the network switches traversed between routing hops cannot be acquired and those switches may cause invisible network delay, thereby improving the efficiency of network operation and maintenance.
According to one aspect of the invention, the invention provides a network anomaly root cause positioning method based on overfitting, which comprises the following steps:
S1, collecting network node data between routers, parsing the associated service identifiers from logs, generating route service identifiers and network node service identifiers, and creating an initial data pool to store data from the routers and the network nodes;
S2, testing the network delay between routers, taking the time at which the network delay is abnormal as the abnormal time, importing the associated route data and the associated network node data collected at the abnormal time into a fault source positioning analysis model for analysis, and determining the abnormal data corresponding to the overfitting value;
S3, matching and classifying the IP addresses of the abnormal data corresponding to the overfitting value against the route service identifiers and the network node service identifiers to obtain the overfitting value service data ordering for the abnormal time, and locating the fault source according to that ordering.
Preferably, generating the route service identifier and the network node service identifier includes:
The formats of the route service identifier and the network node service identifier are as follows:
Route service identifier: router A# # service id1, router A# # service id2
Network node service identifier: network node#service id1,service id2
Multiple services are separated by commas, and multiple routes or network nodes are separated by the # character.
Preferably, generating the route service identifier and the network node service identifier includes:
Performing service classification according to the service identifiers associated with the network node IP addresses corresponding to the collected data, dividing the data according to service weights, generating the route service identifiers and the network node service identifiers, calculating a thread pool load index, analyzing the thread pool occupancy rate, and scheduling threads according to the thread pool occupancy rate, wherein the thread pool load index is as follows:
[formula not preserved in the source text]
where N is the number of worker threads running in the thread pool, N_max is the configured maximum number of threads, T_cur is the number of tasks in the current acquisition time window, T_pre is the number of tasks in the previous acquisition time window, Q is the size of the task buffer queue, and ξ1, ξ2 and ξ3 are weight coefficients.
Preferably, creating the initial data pool to store data from the routers and the network nodes includes:
Analyzing the data source type through the route service identifier and the network node service identifier, and creating a text data pool, an analog signal data pool and an application data pool.
Preferably, the fault source location analysis model is:
||Xθ - y||^2 + ||Γθ||^2
θ(a) = (X^T X + aI)^(-1) X^T y
Regularization is added to the computation, where X denotes the input, y denotes the output prediction result, || || denotes the norm (regularization) operation, I denotes the identity matrix, θ is the fitting hyper-parameter, Γ is a weight constant, a is the weight of the identity matrix, and θ(a) denotes the value of θ for a given a.
According to another aspect of the present invention, there is also provided a network anomaly root-cause positioning system based on overfitting, the system comprising:
The generation module is used for collecting network node data between routers, analyzing the associated service identifiers through logs, generating a route service identifier and a network node service identifier, and creating an initial data pool to store data from the routers and the network nodes;
The determining module is used for testing the network delay between routers, taking the time at which the network delay is abnormal as the abnormal time, importing the associated route data and the associated network node data collected at the abnormal time into the fault source positioning analysis model for analysis, and determining the abnormal data corresponding to the overfitting value;
The positioning module is used for matching and classifying the IP addresses of the abnormal data corresponding to the overfitting value against the route service identifiers and the network node service identifiers, obtaining the overfitting value service data ordering for the abnormal time, and locating the fault source according to that ordering.
Preferably, the generating module generating the route service identifier and the network node service identifier includes:
The formats of the route service identifier and the network node service identifier are as follows:
Route service identifier: router A# # service id1, router A# # service id2
Network node service identifier: network node#service id1,service id2
Multiple services are separated by commas, and multiple routes or network nodes are separated by the # character.
Preferably, the generating module generating the route service identifier and the network node service identifier includes:
Performing service classification according to the service identifiers associated with the network node IP addresses corresponding to the collected data, dividing the data according to service weights, generating the route service identifiers and the network node service identifiers, calculating a thread pool load index, analyzing the thread pool occupancy rate, and scheduling threads according to the thread pool occupancy rate, wherein the thread pool load index is as follows:
[formula not preserved in the source text]
where N is the number of worker threads running in the thread pool, N_max is the configured maximum number of threads, T_cur is the number of tasks in the current acquisition time window, T_pre is the number of tasks in the previous acquisition time window, Q is the size of the task buffer queue, and ξ1, ξ2 and ξ3 are weight coefficients.
Preferably, the generating module creating the initial data pool to store data from the routers and the network nodes includes:
Analyzing the data source type through the route service identifier and the network node service identifier, and creating a text data pool, an analog signal data pool and an application data pool.
Preferably, the fault source location analysis model is:
||Xθ - y||^2 + ||Γθ||^2
θ(a) = (X^T X + aI)^(-1) X^T y
Regularization is added to the computation, where X denotes the input, y denotes the output prediction result, || || denotes the norm (regularization) operation, I denotes the identity matrix, θ is the fitting hyper-parameter, Γ is a weight constant, a is the weight of the identity matrix, and θ(a) denotes the value of θ for a given a.
The beneficial effects of the invention are as follows: it can effectively detect the invisible network delay that may arise because network switch information is missing when route tracing is performed with commands such as Traceroute and ping; at the same time, the network topology graph gives a more intuitive view of the network delay and packet loss of the switches and network nodes between routers, so that the topology graph more closely reflects the actual state of the network.
Features and advantages of the present invention will become apparent by reference to the following drawings and detailed description of embodiments of the invention.
Drawings
FIG. 1 is a flow chart of a network anomaly root cause positioning method based on overfitting;
FIG. 2 is a schematic diagram of a network anomaly root-cause positioning system based on overfitting.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
FIG. 1 is a flow chart of a network anomaly root cause positioning method based on overfitting. As shown in fig. 1, the invention provides a network anomaly root-cause positioning method based on overfitting, which comprises the following steps:
S1, collecting network node data between routers, analyzing associated service identifiers through logs, generating route service identifiers and network node service identifiers, and creating an initial data pool to store data from the routers and the network nodes.
Specifically, the network delay between routers is tested with the Traceroute command (tracert on Windows systems), which uses the ICMP protocol to locate all routers between the local computer and the target computer. The TTL value reflects the number of routers or gateways a packet passes through; by manipulating the TTL value of individual ICMP echo request packets and observing the messages returned when those packets are discarded, the Traceroute command can traverse all routers on the packet's transmission path.
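As a rough illustration of the delay test described above, the following sketch parses text in the style of Unix traceroute output and flags hops whose mean round-trip time exceeds a threshold. The sample text, the function names parse_hops and slow_hops, and the 50 ms threshold are assumptions made for illustration only; the patent does not specify how the command output is processed.

```python
import re

# Hypothetical sample in the style of Unix traceroute output (illustrative only).
SAMPLE = """\
 1  192.168.1.1  1.2 ms  1.1 ms  1.3 ms
 2  10.0.0.1  8.0 ms  9.0 ms  10.0 ms
 3  * * *
 4  10.0.5.9  120.0 ms  130.0 ms  140.0 ms
"""

HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(.*)$")
RTT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*ms")


def parse_hops(text):
    """Return a list of (hop_number, address, mean_rtt_ms).

    Hops that did not answer (shown as '*') get address None and RTT None.
    """
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue
        hop, addr, rest = int(match.group(1)), match.group(2), match.group(3)
        rtts = [float(v) for v in RTT_RE.findall(rest)]
        if addr == "*" or not rtts:
            hops.append((hop, None, None))
        else:
            hops.append((hop, addr, sum(rtts) / len(rtts)))
    return hops


def slow_hops(hops, threshold_ms=50.0):
    """Return the hops whose mean RTT exceeds the (assumed) threshold."""
    return [h for h in hops if h[2] is not None and h[2] > threshold_ms]
```

In this sketch the silent hop 3 stands in for a device that traceroute cannot see, which is the situation the method targets; the time at which slow_hops returns a non-empty list would play the role of the abnormal time.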
Preferably, generating the route service identifier and the network node service identifier includes:
The formats of the route service identifier and the network node service identifier are as follows:
Route service identifier: router A# # service id1, router A# # service id2
Network node service identifier: network node#service id1,service id2
Multiple services are separated by commas, and multiple routes or network nodes are separated by the # character.
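The separator rules above can be sketched as a small parser. This is only one possible reading of the format: it assumes that the last #-separated field holds the comma-separated service IDs and that every earlier field names a router or network node. The function name parse_service_identifier and the sample strings in the usage note are hypothetical.

```python
def parse_service_identifier(identifier):
    """Split an identifier into (devices, services).

    Assumed layout: device[#device...]#service[,service...]
    Empty fields (for example from a doubled '#') are ignored.
    """
    parts = [p.strip() for p in identifier.split("#")]
    parts = [p for p in parts if p]
    if len(parts) < 2:
        raise ValueError("expected at least one device and one service field")
    *devices, service_field = parts
    services = [s.strip() for s in service_field.split(",") if s.strip()]
    return devices, services
```

For example, parse_service_identifier("nodeX#id1,id2") yields the device list ["nodeX"] and the service list ["id1", "id2"].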
Preferably, generating the route service identifier and the network node service identifier includes:
Performing service classification according to the service identifiers associated with the network node IP addresses corresponding to the collected data, dividing the data according to service weights, generating the route service identifiers and the network node service identifiers, calculating a thread pool load index, analyzing the thread pool occupancy rate, and scheduling threads according to the thread pool occupancy rate, wherein the thread pool load index is as follows:
[formula not preserved in the source text]
where N is the number of worker threads running in the thread pool, N_max is the configured maximum number of threads, T_cur is the number of tasks in the current acquisition time window, T_pre is the number of tasks in the previous acquisition time window, Q is the size of the task buffer queue, and ξ1, ξ2 and ξ3 are weight coefficients.
Specifically, service classification is performed according to the service IDs associated with the network node IP addresses corresponding to the collected data, the data is divided according to service weights, and service identifiers are generated. Different thread pools are created for different data sources, the occupancy rate of each thread pool is analyzed algorithmically, and idle thread pools are scheduled first so that collected data with a large service weight is stored first.
The thread pool load index ω is calculated from runtime data such as the number of worker threads, the maximum number of threads, and the task buffer queue size; combining these with different weight proportions yields a percentage value.
In the formula, one term describes the saturation of the worker threads, one describes the saturation of the current tasks, and one describes the growth rate of the task buffer queue. The computed load ω is compared with the preset thread pool load ω'; if ω is greater than ω', the adaptive parameter adjustment calculation is triggered; otherwise, the current acquisition time window is skipped. Then the current thread pool occupancy rate is obtained and analyzed; an occupancy rate below 50 percent means the thread pool is idle.
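Because the load formula itself is not preserved in this text, the following sketch is only a guess at its shape, built from the three terms named above. The specific ratios (N / N_max, T_cur / Q, and (T_cur - T_pre) / T_pre), the default weights, and the function names are all assumptions and should not be read as the patented formula.

```python
def thread_pool_load(n, n_max, t_cur, t_pre, q, xi=(0.4, 0.4, 0.2)):
    """Weighted load index in [0, 1] (assumed form, not the patent's formula).

    n      -- worker threads currently running
    n_max  -- configured maximum number of threads
    t_cur  -- tasks in the current acquisition time window
    t_pre  -- tasks in the previous acquisition time window
    q      -- task buffer queue size
    xi     -- weight coefficients (xi1, xi2, xi3); values here are placeholders
    """
    worker_saturation = n / n_max
    task_saturation = t_cur / q
    queue_growth = (t_cur - t_pre) / t_pre if t_pre else 0.0
    load = (xi[0] * worker_saturation
            + xi[1] * task_saturation
            + xi[2] * queue_growth)
    return max(0.0, min(1.0, load))  # clamp so the result reads as a percentage


def should_adjust(load, preset_load):
    """Trigger adaptive parameter adjustment only above the preset load."""
    return load > preset_load


def is_idle(occupancy):
    """A pool below 50 percent occupancy is treated as idle, as in the text."""
    return occupancy < 0.5
```

With these placeholder weights, 8 of 10 threads busy, 50 tasks now against 40 in the previous window, and a queue of 100 gives a load of 0.57.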
Preferably, creating the initial data pool to store data from the routers and the network nodes includes:
Analyzing the data source type through the route service identifier and the network node service identifier, and creating a text data pool, an analog signal data pool and an application data pool.
Specifically, an initial data pool is created and the data from the routes and network nodes is stored in order; the data source type is analyzed through the route service identifier and the network node service identifier, and a text data pool, an analog signal data pool and an application data pool are created.
Text data pool: data collected through the TR069 protocol is transferred in XML file format, so a text data pool is created to store it.
Analog signal data pool: the data returned by the Traceroute command is of the acquisition type and is stored in the analog signal data pool.
Application data pool: the TR069 protocol collects a large amount of numerical data tagged with service IDs, and this data is stored in the application data pool.
Compared with a database, a data pool can uniformly integrate data sources with different data structures. In addition, because three pools (text type, application type, and acquisition type) are set up according to the data characteristics of the different sources and data is stored pool by pool, mass data storage efficiency is improved.
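The three-pool arrangement can be pictured with a minimal in-memory sketch. The source-type labels ("tr069_xml", "traceroute", "tr069_numeric"), the pool names, and the DataPools class are hypothetical; the patent does not describe a concrete storage interface.

```python
from collections import defaultdict

# Assumed mapping from data source type to pool name (illustrative labels).
POOL_BY_SOURCE = {
    "tr069_xml": "text",            # XML documents transferred over TR069
    "traceroute": "analog_signal",  # acquisition-type data from Traceroute
    "tr069_numeric": "application", # numeric values tagged with a service ID
}


class DataPools:
    """Keeps one list of records per pool."""

    def __init__(self):
        self.pools = defaultdict(list)

    def store(self, source_type, record):
        """Append the record to the pool for its source type; return the pool name."""
        if source_type not in POOL_BY_SOURCE:
            raise ValueError("unknown source type: " + source_type)
        pool = POOL_BY_SOURCE[source_type]
        self.pools[pool].append(record)
        return pool
```

Routing by source type keeps each pool homogeneous, which is the property the text credits for the storage efficiency gain.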
S2, testing the network delay between routers, taking the time at which the network delay is abnormal as the abnormal time, importing the associated route data and the associated network node data collected at the abnormal time into a fault source positioning analysis model for analysis, and determining the abnormal data corresponding to the overfitting value.
Specifically, network node delay data between routers is collected through the TR069 protocol, and the associated service IDs are parsed from logs to generate service identifiers. TR069, in full "Technical Report 069", is a technical specification developed by the DSL Forum (a non-profit global industry alliance dedicated to developing broadband network standards, whose members include leading vendors in communications, equipment, computing, networking, and service provision; it has since been renamed the "Broadband Forum"). It is an application-layer management protocol named the "CPE WAN Management Protocol" (CWMP). TR069 defines a new network management architecture, including a management model, interaction interfaces and basic management parameters, which can effectively manage home network equipment. In TR-069, the network management server is called the ACS (Auto Configuration Server) and has a dedicated IP address and URL. The managed device obtains the ACS URL from a DHCP server and, after obtaining a management IP address, establishes an HTTP session to the ACS URL. After the session is established, initialization is performed to authenticate the device, so that the ACS can verify the legitimacy of the managed device. After initialization is complete, the network management server can obtain various monitoring information from the CPE.
Preferably, the fault source location analysis model is:
||Xθ - y||^2 + ||Γθ||^2
θ(a) = (X^T X + aI)^(-1) X^T y
Regularization is added to the computation, where X denotes the input, y denotes the output prediction result, || || denotes the norm (regularization) operation, I denotes the identity matrix, θ is the fitting hyper-parameter, Γ is a weight constant, a is the weight of the identity matrix, and θ(a) denotes the value of θ for a given a.
The least squares method commonly used in regression analysis is an unbiased estimator. For a well-posed problem, X is usually of full column rank, and the system to be solved is
Xθ = y.
Using the least squares method, the loss function is defined as the square of the residual, and this loss function is minimized:
||Xθ - y||^2.
This optimization problem can be solved by gradient descent, or directly with the following formula:
θ = (X^T X)^(-1) X^T y.
When X is not of full column rank, or when some columns are strongly linearly correlated, the determinant of X^T X is close to 0, that is, X^T X is close to singular. The problem then becomes ill-posed: the error in computing (X^T X)^(-1) is large, and the conventional least squares method lacks stability and reliability.
To solve this problem, the ill-posed problem must be converted into a well-posed one. A regularization term is added to the loss function above, which becomes
||Xθ - y||^2 + ||Γθ||^2
Defining Γ = sqrt(a)·I, so that ||Γθ||^2 = a||θ||^2, gives:
θ(a) = (X^T X + aI)^(-1) X^T y
In the above formula, I is the identity matrix.
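The closed form θ(a) = (X^T X + aI)^(-1) X^T y can be checked numerically with a small dependency-free sketch. The helper names (transpose, matmul, solve, ridge_fit) are hypothetical, and solving the linear system by Gauss-Jordan elimination instead of forming the inverse is a choice made here, not something the patent prescribes.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ridge_fit(X, y, a):
    """Return theta(a) = (X^T X + a I)^(-1) X^T y for a list-of-rows matrix X."""
    Xt = transpose(X)
    gram = matmul(Xt, X)
    for i in range(len(gram)):
        gram[i][i] += a  # add the ridge term a * I
    Xty = [sum(x * v for x, v in zip(row, y)) for row in Xt]
    return solve(gram, Xty)
```

With two identical columns, X^T X is singular and plain least squares (a = 0) fails in this sketch with a ValueError, while any a > 0 makes the system solvable, which is the stabilizing effect described above.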
Specifically, a ridge regression algorithm is used to construct the fault source positioning analysis model. The time at which the Traceroute test of a route shows an anomaly is taken as the abnormal time; the route service identifier and the network node service identifier are analyzed to obtain the associated routes; and the data of these routes and of the other network nodes is fed into the model to obtain fitting values, which are compared for differences. Data showing larger differences is aggregated and sorted by router. The more such data there is, the greater the probability that the root cause of the fault has been located. This completes fault source positioning for the invisible network delay problem. The specific steps are as follows:
First, the time at which the Traceroute command executed by the router returns a network delay anomaly is taken as the abnormal time.
Second, the route service identifier and the network node service identifier are analyzed to obtain the other routers associated with the router, together with the data of the network nodes or switches between the routers that are associated with them.
Then, at the abnormal time, the Traceroute command is executed to obtain data for the other routes associated with the router, such as route C and route D, and the TR069 protocol is used to obtain data for the network nodes and switches associated with the other services.
Finally, the associated route data and the associated network node data collected at the abnormal time are fed into the fault source positioning analysis model to obtain a route fitting value and a network node fitting value. The two values are compared, and a difference of 10% or more is regarded as overfitting. This indicates that, at the abnormal time, the associated route data and the associated network node data differ greatly, and the difference indicates that more abnormal data was generated.
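The 10% comparison in the final step can be written out as follows. The text does not say how the difference is normalized; this sketch assumes a relative difference taken against the larger of the two magnitudes, and the function names are hypothetical.

```python
def relative_difference(route_fit, node_fit):
    """Relative gap between two fitting values (assumed normalization)."""
    base = max(abs(route_fit), abs(node_fit))
    if base == 0:
        return 0.0
    return abs(route_fit - node_fit) / base


def is_overfit(route_fit, node_fit, threshold=0.10):
    """True when the two fitting values differ by 10% or more."""
    return relative_difference(route_fit, node_fit) >= threshold
```

Under this assumption, fitting values of 100 and 80 count as overfitting, while 100 and 95 do not.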
S3, matching and classifying the IP addresses of the abnormal data corresponding to the overfitting value against the route service identifiers and the network node service identifiers to obtain the overfitting value service data ordering for the abnormal time, and locating the fault source according to that ordering.
Specifically, the IP addresses of the abnormal data corresponding to the overfitting value are extracted and matched and classified against the route service identifiers and the network node service identifiers, which yields the overfitting value service data ordering, for the abnormal time, of the routers and of the network nodes between the routers. The more abnormal data an entry has, the higher the probability that it is the fault source; this completes fault source positioning for the invisible network delay problem.
The ridge regression method keeps the model itself from overfitting, whereas the traditional least squares method lacks stability and reliability. To address this, the ill-posed problem must be converted into a well-posed one, which can be done by adding a regularization term to the loss function.
The network anomaly root cause positioning method of this embodiment can rapidly locate the root cause of a network anomaly even when information about the network switches traversed between routing hops cannot be acquired and those switches may cause invisible network delay, which improves network operation and maintenance efficiency.
This embodiment can effectively detect the invisible network delay that may arise because network switch information is missing when route tracing is performed with commands such as Traceroute and ping. At the same time, the network topology gives a more intuitive view of the network delay and packet loss of the switches and network nodes between routers, so that the topology more closely reflects the actual state of the network.
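The matching and ordering step can be sketched as a simple count per service identifier. The lookup table from IP address to identifier, the sample values in the usage note, and the function name rank_fault_sources are hypothetical; the patent does not define these data structures.

```python
from collections import Counter


def rank_fault_sources(anomalous_ips, ip_to_identifier):
    """Count anomalous records per service identifier, most frequent first.

    anomalous_ips    -- IP address of each abnormal data record
    ip_to_identifier -- assumed lookup from IP to route or node service identifier
    IPs without a known identifier are skipped.
    """
    counts = Counter(
        ip_to_identifier[ip] for ip in anomalous_ips if ip in ip_to_identifier
    )
    return counts.most_common()
```

For instance, if three abnormal records map to "routerA#id1" and two to "nodeB#id2", the first entry of the result is ("routerA#id1", 3), making that router the most likely fault source under the rule stated above.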
Example 2
FIG. 2 is a schematic diagram of a network anomaly root-cause positioning system based on overfitting. As shown in fig. 2, the present invention further provides a network anomaly root-cause positioning system based on overfitting, the system comprising:
The generating module 201 is configured to collect network node data between routers, analyze associated service identifiers through logs, generate a route service identifier and a network node service identifier, and create an initial data pool to store data from the routers and the network nodes;
The determining module 202 is configured to test the network delay between routers, take the time at which the network delay is abnormal as the abnormal time, import the associated route data and the associated network node data collected at the abnormal time into the fault source positioning analysis model for analysis, and determine the abnormal data corresponding to the overfitting value;
The positioning module 203 is configured to match and classify the IP addresses of the abnormal data corresponding to the overfitting value against the route service identifiers and the network node service identifiers, obtain the overfitting value service data ordering for the abnormal time, and locate the fault source according to that ordering.
Preferably, the generating module 201 generating the route service identifier and the network node service identifier includes:
The formats of the route service identifier and the network node service identifier are as follows:
Route service identifier: router A# # service id1, router A# # service id2
Network node service identifier: network node#service id1,service id2
Multiple services are separated by commas, and multiple routes or network nodes are separated by the # character.
Preferably, the generating module 201 generating the route service identifier and the network node service identifier includes:
Performing service classification according to the service identifiers associated with the network node IP addresses corresponding to the collected data, dividing the data according to service weights, generating the route service identifiers and the network node service identifiers, calculating a thread pool load index, analyzing the thread pool occupancy rate, and scheduling threads according to the thread pool occupancy rate, wherein the thread pool load index is as follows:
[formula not preserved in the source text]
where N is the number of worker threads running in the thread pool, N_max is the configured maximum number of threads, T_cur is the number of tasks in the current acquisition time window, T_pre is the number of tasks in the previous acquisition time window, Q is the size of the task buffer queue, and ξ1, ξ2 and ξ3 are weight coefficients.
Preferably, the generating module 201 creating the initial data pool to store data from the routers and the network nodes includes:
Analyzing the data source type through the route service identifier and the network node service identifier, and creating a text data pool, an analog signal data pool and an application data pool.
Preferably, the fault source location analysis model is:
||Xθ - y||^2 + ||Γθ||^2
θ(a) = (X^T X + aI)^(-1) X^T y
Regularization is added to the computation, where X denotes the input, y denotes the output prediction result, || || denotes the norm (regularization) operation, I denotes the identity matrix, θ is the fitting hyper-parameter, Γ is a weight constant, a is the weight of the identity matrix, and θ(a) denotes the value of θ for a given a.
The implementation process of the functions implemented by each module in this embodiment 2 is the same as the implementation process of each step in embodiment 1, and will not be described here again.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structural changes made by the specification and drawings of the present invention or direct/indirect application in other related technical fields are included in the scope of the present invention.

Claims (10)

1. A network anomaly root cause positioning method based on overfitting, characterized by comprising the following steps:
S1, collecting network node data between routers, analyzing associated service identifiers through logs, generating route service identifiers and network node service identifiers, and creating an initial data pool to store data from the routers and the network nodes;
S2, testing network delay among routers, determining the time at which the network delay is abnormal as the abnormal time, importing the associated route data and the associated network node data acquired during the abnormal time into a fault source positioning analysis model for analysis, and determining the abnormal data corresponding to the overfitting value;
S3, matching and classifying the IP addresses of the abnormal data corresponding to the overfitting value against the route service identifiers and the network node service identifiers to obtain the overfitting value service data sequence of the abnormal time, and locating the fault source according to the overfitting value service data sequence.
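Steps S2 and S3 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the claim: the fixed delay threshold used to flag abnormal time windows, the function names `find_abnormal_windows` and `rank_services`, the per-record overfitting score supplied as an input, and the choice to order services by their summed score.

```python
from collections import defaultdict


def find_abnormal_windows(delays_ms, threshold_ms):
    """Return the indices of time windows whose measured delay exceeds the threshold.

    Assumption: a simple fixed threshold stands in for whatever rule decides
    that the network delay is abnormal.
    """
    return [i for i, delay in enumerate(delays_ms) if delay > threshold_ms]


def rank_services(abnormal_records, ip_to_services):
    """Match abnormal records to services by IP and order services by total score.

    abnormal_records: iterable of (ip, overfitting_score) pairs (hypothetical shape).
    ip_to_services:   dict mapping an IP address to the service ids bound to it,
                      as would be derived from the route / network node identifiers.
    """
    totals = defaultdict(float)
    for ip, score in abnormal_records:
        for service in ip_to_services.get(ip, []):
            totals[service] += score
    # Highest accumulated score first: the most suspicious service leads the sequence.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    windows = find_abnormal_windows([10, 250, 12, 300], threshold_ms=100)
    print(windows)
    records = [("10.0.0.1", 0.9), ("10.0.0.2", 0.4), ("10.0.0.1", 0.3)]
    mapping = {"10.0.0.1": ["svc1"], "10.0.0.2": ["svc1", "svc2"]}
    print(rank_services(records, mapping))
```

IP addresses with no known service binding are silently skipped in this sketch; a real system would more likely log them for follow-up.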
2. The method of claim 1, wherein generating the route service identifier and the network node service identifier comprises:
the formats of the route service identifier and the network node service identifier are as follows:
Route service identifier: router A##service id1, router A##service id2
Network node service identifier: network node#service id1, service id2
wherein multiple services are separated by commas, and multiple routes or network nodes are separated by the # sign.
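The identifier formats above could be parsed along the following lines. The claim leaves the exact grammar open, so this Python sketch makes its own assumptions: entries are comma-separated, an entry containing `#` starts with a router or node name followed by one or more `#` characters and a service id, and an entry without `#` is a further service id of the most recently named router or node. The function name `parse_service_identifier` is hypothetical.

```python
def parse_service_identifier(text):
    """Map each router or network node name to the list of its service ids.

    Assumed grammar (not confirmed by the source):
      "routerA##svc1,routerA##svc2"  -> {"routerA": ["svc1", "svc2"]}
      "node1#svc1,svc2"              -> {"node1": ["svc1", "svc2"]}
    """
    owners = {}
    current = None  # name of the router/node the next bare service id belongs to
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        if "#" in token:
            # Split at the last '#': the left part is the name (possibly with
            # extra '#' characters), the right part is the service id.
            name, _, service = token.rpartition("#")
            current = name.rstrip("#").strip()
        else:
            service = token
        if current is None:
            raise ValueError("service id appears before any router or node name")
        owners.setdefault(current, []).append(service.strip())
    return owners


if __name__ == "__main__":
    print(parse_service_identifier("routerA##svc1,routerA##svc2"))
    print(parse_service_identifier("node1#svc1,svc2"))
```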
3. The method of claim 2, wherein generating the route service identifier and the network node service identifier comprises:
collecting data, performing service classification according to the service identifiers associated with the network node IP addresses of the collected data, dividing the data according to service weights, and generating the route service identifiers and network node service identifiers; calculating a thread pool load index, analyzing the thread pool occupancy rate, and scheduling threads according to the thread pool occupancy rate, wherein the thread pool load index is as follows:
wherein $N$ is the number of worker threads running in the thread pool, $N_{max}$ is the configured maximum number of threads, $T_{cur}$ is the number of tasks in the current acquisition time window, $T_{pre}$ is the number of tasks in the previous acquisition time window, $Q$ is the size of the task buffer queue, and $\xi_1$, $\xi_2$ and $\xi_3$ are weight coefficients.
4. The method of claim 1, wherein creating an initial data pool to store data from the routers and the network nodes comprises:
analyzing the data source type by means of the route service identifier and the network node service identifier, and creating a text data pool, an analog signal data pool and an application data pool.
5. The method of claim 1, wherein the fault source positioning analysis model is:
$$\lVert X\theta - y\rVert^{2} + \lVert \Gamma\theta\rVert^{2}$$
$$\theta(a) = (X^{T}X + aI)^{-1}X^{T}y$$
wherein regularization is added to the computation, $X$ represents the input, $y$ represents the output prediction result, $\lVert\cdot\rVert$ denotes the regularization (norm) operation, $I$ is the identity matrix, $\theta$ is the fitting hyperparameter, $\Gamma$ is a weight constant, $a$ is the weight of the identity matrix, and $\theta(a)$ is the value of $\theta$ for a given $a$.
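The closed form $\theta(a) = (X^{T}X + aI)^{-1}X^{T}y$ can be evaluated without forming an explicit inverse. The following Python sketch solves the equivalent linear system $(X^{T}X + aI)\theta = X^{T}y$ by Gaussian elimination with partial pivoting. The function name `ridge_fit`, the list-of-lists data layout and the singularity tolerance are illustrative assumptions, not part of the claim.

```python
def ridge_fit(X, y, a):
    """Return theta solving (X^T X + a I) theta = X^T y.

    X: list of n rows, each a list of p floats; y: list of n floats; a: float >= 0.
    """
    n, p = len(X), len(X[0])
    # Augmented matrix [X^T X + a I | X^T y], one row per parameter.
    A = [
        [sum(X[k][i] * X[k][j] for k in range(n)) + (a if i == j else 0.0)
         for j in range(p)]
        + [sum(X[k][i] * y[k] for k in range(n))]
        for i in range(p)
    ]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("system is singular; a larger a may be needed")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    theta = [0.0] * p
    for i in range(p - 1, -1, -1):
        known = sum(A[i][j] * theta[j] for j in range(i + 1, p))
        theta[i] = (A[i][p] - known) / A[i][i]
    return theta


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [2.0, 4.0]
    print(ridge_fit(X, y, a=0.0))  # unregularized fit
    print(ridge_fit(X, y, a=1.0))  # regularized fit, coefficients shrink
```

With $X$ equal to the 2×2 identity, $y = (2, 4)$ and $a = 1$, the system reduces to $2\theta = y$, so the coefficients are halved relative to the unregularized fit; this shrinkage is the effect the weight $a$ controls.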
6. A network anomaly root-cause positioning system based on overfitting, the system comprising:
a generating module, configured to collect network node data between routers, analyze the associated service identifiers through logs, generate route service identifiers and network node service identifiers, and create an initial data pool to store data from the routers and the network nodes;
a determining module, configured to test network delay between routers, determine the time at which the network delay is abnormal as the abnormal time, import the associated route data and the associated network node data acquired during the abnormal time into a fault source positioning analysis model for analysis, and determine the abnormal data corresponding to the overfitting value;
and a positioning module, configured to match and classify the IP addresses of the abnormal data corresponding to the overfitting value against the route service identifiers and the network node service identifiers to obtain the overfitting value service data sequence of the abnormal time, and to locate the fault source according to the overfitting value service data sequence.
7. The system of claim 6, wherein the generating module generating the route service identifier and the network node service identifier comprises:
the formats of the route service identifier and the network node service identifier are as follows:
Route service identifier: router A##service id1, router A##service id2
Network node service identifier: network node#service id1, service id2
wherein multiple services are separated by commas, and multiple routes or network nodes are separated by the # sign.
8. The system of claim 7, wherein the generating module generating the route service identifier and the network node service identifier comprises:
collecting data, performing service classification according to the service identifiers associated with the network node IP addresses of the collected data, dividing the data according to service weights, and generating the route service identifiers and network node service identifiers; calculating a thread pool load index, analyzing the thread pool occupancy rate, and scheduling threads according to the thread pool occupancy rate, wherein the thread pool load index is as follows:
wherein $N$ is the number of worker threads running in the thread pool, $N_{max}$ is the configured maximum number of threads, $T_{cur}$ is the number of tasks in the current acquisition time window, $T_{pre}$ is the number of tasks in the previous acquisition time window, $Q$ is the size of the task buffer queue, and $\xi_1$, $\xi_2$ and $\xi_3$ are weight coefficients.
9. The system of claim 6, wherein the generating module creating an initial data pool to store data from the routers and the network nodes comprises:
analyzing the data source type by means of the route service identifier and the network node service identifier, and creating a text data pool, an analog signal data pool and an application data pool.
10. The system of claim 6, wherein the fault source positioning analysis model is:
$$\lVert X\theta - y\rVert^{2} + \lVert \Gamma\theta\rVert^{2}$$
$$\theta(a) = (X^{T}X + aI)^{-1}X^{T}y$$
wherein regularization is added to the computation, $X$ represents the input, $y$ represents the output prediction result, $\lVert\cdot\rVert$ denotes the regularization (norm) operation, $I$ is the identity matrix, $\theta$ is the fitting hyperparameter, $\Gamma$ is a weight constant, $a$ is the weight of the identity matrix, and $\theta(a)$ is the value of $\theta$ for a given $a$.
CN202211132605.7A 2022-09-17 2022-09-17 Network anomaly root cause positioning method and system based on overfitting Active CN115604090B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211132605.7A CN115604090B (en) 2022-09-17 2022-09-17 Network anomaly root cause positioning method and system based on overfitting

Publications (2)

Publication Number Publication Date
CN115604090A CN115604090A (en) 2023-01-13
CN115604090B true CN115604090B (en) 2025-05-23

Family

ID=84843979

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211132605.7A Active CN115604090B (en) 2022-09-17 2022-09-17 Network anomaly root cause positioning method and system based on overfitting

Country Status (1)

Country Link
CN (1) CN115604090B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668693A (en) * 2020-12-03 2021-04-16 支付宝(杭州)信息技术有限公司 Method and device for detecting abnormal business index
CN113098723A (en) * 2021-06-07 2021-07-09 新华三人工智能科技有限公司 Fault root cause positioning method and device, storage medium and equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7583667B2 (en) * 2004-03-19 2009-09-01 Avaya Inc. Automatic determination of connectivity problem locations or other network-characterizing information in a network utilizing an encapsulation protocol
CN113556258B (en) * 2020-04-24 2022-12-27 西安华为技术有限公司 Anomaly detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant