CN119728386A

CN119728386A - Alarm scene processing method, device, electronic device and storage medium

Info

Publication number: CN119728386A
Application number: CN202411752578.2A
Authority: CN
Inventors: 谢绍航; 邓锦烨; 莫华森; 段云涌
Original assignee: China Telecom Cloud Technology Co Ltd
Current assignee: China Telecom Cloud Technology Co Ltd
Priority date: 2024-12-02
Filing date: 2024-12-02
Publication date: 2025-03-28

Abstract

The embodiment of the present invention provides a processing method, device, electronic device and storage medium for an alarm scenario, and relates to the technical field of alarm data processing. The method includes: obtaining an alarm strategy through a policy configuration platform, and sending the alarm strategy to an alarm execution engine; receiving the alarm strategy sent by the alarm execution engine through a multiplexing layer, performing semantic level analysis on the alarm strategy, obtaining basic conditions corresponding to the alarm strategy, and abstracting the basic conditions to obtain target basic conditions, and then classifying the basic conditions based on the occurrence frequency corresponding to each target basic condition to obtain hotspot conditions and non-hotspot conditions, and then querying the hotspot conditions and non-hotspot conditions from a database to obtain a first observation value of the hotspot condition and a second observation value of the non-hotspot condition; executing the alarm processing process for the alarm strategy according to the first observation value and the second observation value through the alarm execution engine, thereby ensuring the stability of system operation and service availability.

Description

Alarm scene processing method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the technical field of alarm data processing, and in particular, to a method for processing an alarm scene, a device for processing an alarm scene, an electronic device, and a computer readable storage medium.

Background

In a CDN service system, operators need to monitor the performance indicators and monitored states of the nodes in each region in real time, so that faulty nodes can be investigated and repaired promptly and the traffic distribution and scheduling strategy can be optimized to guarantee service quality. A CDN service sits on the edge side and typically comprises a large number of edge node servers that carry a large volume of traffic, so the objects to be monitored are complex. In practice, monitoring and alarming require corresponding alarm strategies to be configured. The number of alarm strategies is huge and keeps growing, and evaluating each alarm strategy requires its own query. This imposes a heavy performance burden, and the pressure is passed on to the underlying database, affecting the stability of system operation and service availability.

Disclosure of Invention

The embodiment of the invention provides a processing method, a device, electronic equipment and a computer readable storage medium of an alarm scene, which are used for solving or partially solving the problems that an alarm strategy brings great performance burden to a CDN service system and affects the stability of system operation and service availability.

The embodiment of the invention discloses a processing method of an alarm scene, which relates to a content distribution network. The content distribution network comprises a plurality of nodes and caches service contents corresponding to a source station to each node. The nodes, the source station, a first dimension corresponding to the nodes and a second dimension corresponding to the source station are each provided with an independent alarm strategy, wherein the first dimension is an analysis dimension for analyzing the node performance of the nodes, and the second dimension is an analysis dimension for analyzing the source station performance of the source station. The method is applied to a monitoring analysis system, which at least comprises a strategy configuration platform, an alarm execution engine in communication connection with the strategy configuration platform, a multiplexing layer in communication connection with the alarm execution engine and a database in communication connection with the multiplexing layer, and the method comprises:

Acquiring the alarm strategy through the strategy configuration platform and sending the alarm strategy to the alarm execution engine;

The multiplexing layer receives the alarm strategy sent by the alarm execution engine, performs semantic analysis on the alarm strategy to obtain basic conditions corresponding to the alarm strategy, and performs abstract processing on the basic conditions to obtain target basic conditions;

Classifying the basic conditions based on the occurrence frequency corresponding to each target basic condition through the multiplexing layer to obtain hot spot conditions and non-hot spot conditions;

Inquiring the hot spot condition and the non-hot spot condition from the database through the multiplexing layer to obtain a first observation value corresponding to the hot spot condition and a second observation value corresponding to the non-hot spot condition, wherein the first observation value and the second observation value are the basis for triggering an alarm strategy;

And executing an alarm processing process aiming at the alarm strategy according to the first observed value and the second observed value by the alarm execution engine.

In some possible implementations, the performing semantic analysis on the alarm policy to obtain a basic condition corresponding to the alarm policy includes:

Acquiring a logic operator aiming at the alarm strategy;

and cutting the alarm strategy according to the logic operator to obtain corresponding basic conditions.

In some possible implementations, the abstracting the basic condition to obtain the target basic condition includes:

acquiring a source station description corresponding to the source station and a node description corresponding to the node;

And removing the source station description and/or the node description in each basic condition to obtain a target basic condition.

In some possible implementations, the classifying the basic conditions based on occurrence frequencies corresponding to the target basic conditions to obtain hot spot conditions and non-hot spot conditions includes:

acquiring the occurrence frequency corresponding to each target basic condition;

taking a first basic condition, to which a target basic condition whose occurrence frequency is greater than or equal to a preset threshold belongs, as a hot spot condition, and taking a second basic condition, to which a target basic condition whose occurrence frequency is less than the preset threshold belongs, as a non-hot spot condition.

In some possible implementations, the querying the database for the hotspot condition and the non-hotspot condition, to obtain a first observed value corresponding to the hotspot condition and a second observed value corresponding to the non-hotspot condition, includes:

Determining input parameters corresponding to the database, wherein the input parameters at least comprise a first input parameter used for specifying a query statement to be executed and a second input parameter used for specifying a timestamp of the query;

And calling an interface corresponding to the database, transmitting the first input parameter and the second input parameter, and executing query operation for the hot spot condition to obtain a source station observed value of the hot spot condition on each source station and a node observed value of the hot spot condition on each node.

In some possible implementations, the querying the database for the hotspot condition and the non-hotspot condition, obtaining a first observed value corresponding to the hotspot condition and a second observed value corresponding to the non-hotspot condition, further includes:

And directly transmitting the non-hot spot condition to the database through the multiplexing layer to query, and obtaining a second observation value corresponding to the non-hot spot condition.

In some possible implementations, the performing an alarm processing procedure for the alarm policy according to the first observation value and the second observation value includes:

And if the first observed value meets a first preset condition and/or the second observed value meets a second preset condition, generating alarm prompt information corresponding to the alarm strategy.

The embodiment of the invention also discloses a processing device of the alarm scene, which relates to a content distribution network, wherein the content distribution network comprises a plurality of nodes, the content distribution network caches service contents corresponding to a source station to each node, wherein the nodes, the source station and a first dimension corresponding to the nodes and a second dimension corresponding to the source station are respectively provided with independent alarm strategies, the first dimension is an analysis dimension for analyzing the node performance of the nodes, and the second dimension is an analysis dimension for analyzing the source station performance of the source station, and the device is applied to a monitoring analysis system, and at least comprises a strategy configuration platform, an alarm execution engine in communication connection with the strategy configuration platform, a multiplexing layer in communication connection with the alarm execution engine and a database in communication connection with the multiplexing layer, and comprises:

the strategy acquisition module is positioned on the strategy configuration platform and is used for acquiring the alarm strategy and sending the alarm strategy to the alarm execution engine;

The condition decomposition module is positioned at the multiplexing layer and is used for receiving the alarm strategy sent by the alarm execution engine, analyzing the alarm strategy in a semantic level to obtain basic conditions corresponding to the alarm strategy, and abstracting the basic conditions to obtain target basic conditions;

the classifying module is positioned at the multiplexing layer and is used for classifying the basic conditions based on the occurrence frequency corresponding to each target basic condition to obtain hot spot conditions and non-hot spot conditions;

The query module is located at the multiplexing layer and is used for querying the database for the hot spot condition and the non-hot spot condition to obtain a first observation value corresponding to the hot spot condition and a second observation value corresponding to the non-hot spot condition, wherein the first observation value and the second observation value are the basis of triggering an alarm strategy;

and the alarm processing module is positioned in the alarm execution engine and is used for executing an alarm processing process aiming at the alarm strategy according to the first observation value and the second observation value.

In some possible implementations, the condition decomposition module is specifically configured to:

Acquiring a logic operator aiming at the alarm strategy;

In some possible implementations, the classification module is specifically configured to:

In some possible implementations, the query module is specifically configured to:

In some possible implementations, the query module is specifically further configured to:

In some possible implementations, the alarm processing module is specifically configured to:

The embodiment of the invention also discloses electronic equipment, which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;

the memory is used for storing a computer program;

The processor is configured to implement the method according to the embodiment of the present invention when executing the program stored in the memory.

Embodiments of the present invention also disclose a computer-readable storage medium having instructions stored thereon, which when executed by one or more processors, cause the processors to perform the method according to the embodiments of the present invention.

The embodiment of the invention has the following advantages:

In the embodiment of the invention, the monitoring analysis system used to process monitoring alarm scenes of a content distribution network is optimized. The optimized monitoring analysis system may comprise a strategy configuration platform, an alarm execution engine in communication connection with the strategy configuration platform, a multiplexing layer in communication connection with the alarm execution engine, and a database in communication connection with the multiplexing layer. The system acquires an alarm strategy through the strategy configuration platform and sends it to the alarm execution engine. The multiplexing layer receives the alarm strategy sent by the alarm execution engine, analyzes it at the semantic level to obtain the corresponding basic conditions, abstracts the basic conditions to obtain target basic conditions, classifies the basic conditions into hot spot conditions and non-hot spot conditions based on the occurrence frequency of each target basic condition, and then queries the database for the hot spot conditions and the non-hot spot conditions to obtain a first observation value corresponding to the hot spot conditions and a second observation value corresponding to the non-hot spot conditions. Finally, the alarm execution engine executes the alarm processing process for the alarm strategy according to the first observation value and the second observation value, which are the basis for triggering the alarm strategy. By cutting the alarm strategy into basic conditions at the semantic level and identifying hot spot conditions, redundancy in the alarm strategies is effectively reduced and the processing efficiency of the system is improved.

Drawings

FIG. 1 is a flow chart of steps of a method for processing an alert scene provided in an embodiment of the present invention;

FIG. 2 is a schematic diagram of a CDN monitoring architecture provided in an embodiment of the present invention;

FIG. 3 is a software flow diagram of a multiplexing layer provided in an embodiment of the invention;

FIG. 4 is a block diagram of a processing device for an alarm scenario according to an embodiment of the present invention.

Detailed Description

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.

As an example, in the actual monitoring and alarming process, corresponding alarm strategies need to be configured. The number of alarm strategies is huge and keeps growing, and evaluating each alarm strategy requires one query to be executed. This imposes a heavy performance burden, and the pressure is passed on to the underlying database, affecting the stability of system operation and service availability.

In view of this, the invention optimizes the monitoring analysis system. The optimized monitoring analysis system may comprise a strategy configuration platform, an alarm execution engine in communication connection with the strategy configuration platform, a multiplexing layer in communication connection with the alarm execution engine, and a database in communication connection with the multiplexing layer. The system acquires the alarm strategy through the strategy configuration platform and sends it to the alarm execution engine. The multiplexing layer receives the alarm strategy sent by the alarm execution engine, analyzes it at the semantic level to obtain the corresponding basic conditions, abstracts the basic conditions to obtain target basic conditions, classifies the basic conditions into hot spot conditions and non-hot spot conditions based on the occurrence frequency of each target basic condition, and queries the database for the hot spot conditions and the non-hot spot conditions to obtain a first observation value corresponding to the hot spot conditions and a second observation value corresponding to the non-hot spot conditions. Finally, the alarm execution engine executes the alarm processing process for the alarm strategy according to the first observation value and the second observation value, which are the basis for triggering the alarm strategy. By cutting the alarm strategy into basic conditions at the semantic level and identifying hot spot conditions, redundancy in the alarm strategies is effectively reduced and the processing efficiency of the system is improved.

In order to enable those skilled in the art to better understand the technical solutions in the embodiments of the present invention, the following explains and describes some technical features related to the embodiments of the present invention:

CDN (Content Distribution Network): a content distribution network stores copies of content on servers distributed across different regions and serves each requesting client from a nearby server, making content delivery faster and more stable.

Engine: a combination of closely related software operations that accomplishes a particular goal, offering users an easy-to-use way of invoking it while hiding the internal details.

Lexical analyzer (lexer): a program tool that extracts valid tokens from a character sequence, commonly used in compiler front ends.

Referring to FIG. 1, a flow chart of the steps of a processing method of an alarm scenario provided in an embodiment of the present invention is shown. The method relates to a content distribution network, where the content distribution network includes a plurality of nodes and caches service content corresponding to a source station onto each node. The nodes, the source station, a first dimension corresponding to the nodes and a second dimension corresponding to the source station are each provided with an independent alarm policy, where the first dimension is an analysis dimension for analyzing node performance of the nodes, and the second dimension is an analysis dimension for analyzing source station performance of the source station. The method is applied to a monitoring analysis system, which at least includes a strategy configuration platform, an alarm execution engine communicatively connected to the strategy configuration platform, a multiplexing layer communicatively connected to the alarm execution engine, and a database communicatively connected to the multiplexing layer. The method may specifically include the following steps:

Step 101, acquiring the alarm strategy through the strategy configuration platform and sending the alarm strategy to the alarm execution engine;

Optionally, a content delivery network (CDN) may be a network of servers distributed across different geographic locations, whose aim is to improve the loading speed and performance of websites by serving content from nearby. For example, the CDN may cache the content of the source station and deliver it to nodes around the world, enabling users to obtain content from the nearest node and thereby reducing latency and bandwidth consumption.

The CDN network may include a plurality of nodes. A node may be a server in the CDN network, usually located in a data center or a network access point close to the user, that caches the content of the source station and serves it directly when a user requests it, which reduces direct access to the source station. The source station may be a server that stores the original content, typically the back-end server of a website or an application, and is responsible for generating or storing the original content of the website, such as HTML, CSS, JavaScript, pictures and videos.

For the CDN network, a dimension is a viewpoint or classification criterion used when analyzing and monitoring CDN performance, for example request response status codes, abnormal bandwidth variation, or download speed. Alarm policies may be set for nodes, for source stations and for individual indicators. In actual production, an independent alarm policy is set for each node, each source station and each dimension, so the final number of alarm policies is the product of the numbers of nodes, source stations and dimensions. With these alarm policies in place, it can be effectively detected whether each node, source station and dimension meets its performance requirements. The alarm policies are sent to the alarm execution engine at a specified time interval, and the execution engine queries the massive monitoring data and computes the performance indicators described by each alarm policy to decide whether to send alarm information to the relevant personnel.
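The growth described above can be made concrete with a small worked example. The following is only an illustrative sketch: the function name and the deployment sizes are hypothetical and are not taken from the embodiment.

```python
def policy_count(num_nodes: int, num_sources: int, num_dimensions: int) -> int:
    """Number of independent alarm policies when every combination of
    node, source station and dimension gets its own policy."""
    return num_nodes * num_sources * num_dimensions


# Hypothetical deployment sizes, chosen only for illustration.
print(policy_count(num_nodes=500, num_sources=200, num_dimensions=10))  # 1000000
```

Even a moderately sized deployment therefore yields on the order of a million policies, each of which would trigger its own query if evaluated independently.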

In the embodiment of the invention, the monitoring analysis system at least comprises a strategy configuration platform, an alarm execution engine in communication connection with the strategy configuration platform, a multiplexing layer in communication connection with the alarm execution engine, and a database in communication connection with the multiplexing layer. The strategy configuration platform can be used for configuring alarm policies. The alarm execution engine can be used for receiving the alarm policies periodically sent by the strategy configuration platform and executing them, for example by querying the corresponding data from the database. The multiplexing layer can be used for processing the alarm policies received by the alarm execution engine, optimizing their content and reducing redundancy among them, which improves the efficiency and performance of system operation. The database can be used for storing the monitoring data, such as data obtained by monitoring source stations, nodes and the corresponding dimensions.

In a specific implementation, a user may configure an alarm policy for a source station, for a node, and for a corresponding dimension on an alarm policy configuration platform, and send the alarm policy to an alarm execution engine, so that the alarm execution engine queries and calculates performance indexes in a huge amount of monitoring data according to a related description of the alarm policy, so as to determine whether to send alarm information to related personnel based on a processing result.

Step 102, receiving the alarm strategy sent by the alarm execution engine through the multiplexing layer, analyzing the alarm strategy at the semantic level to obtain the basic conditions corresponding to the alarm strategy, and abstracting the basic conditions to obtain target basic conditions;

If the alarm execution engine executed the operation corresponding to an alarm policy immediately after receiving it, that is, querying the massive monitoring data, computing the performance indicators described by the policy and deciding whether to send an alarm message to the relevant personnel, a problem would readily arise. Because the number of alarm policies is huge and keeps growing, and evaluating each alarm policy requires one query against the database, the alarm execution engine would bear considerable performance pressure, and this pressure would be passed on to the underlying database, affecting the stability of system operation and service availability.

In this regard, in the embodiment of the present invention, for the alert execution engine, in order to reduce the query computation pressure of the alert execution engine, a multiplexing layer is introduced between the alert execution engine and the database, and the alert policy received by the alert execution engine is processed through the multiplexing layer, so as to optimize the content related to the alert policy, reduce redundancy in the alert policy, and improve the efficiency and performance of the system operation.

In a specific implementation, the alarm execution engine can send the received alarm strategy to the multiplexing layer, after the multiplexing layer receives the alarm strategy sent by the alarm execution engine, the analysis of the semantic layer can be performed on the alarm strategy to obtain the basic condition corresponding to the alarm strategy, and then the basic condition is subjected to abstract processing to obtain the target basic condition, so that the alarm strategy is cut at the semantic layer to obtain the corresponding basic condition, redundancy in the alarm strategy is effectively reduced, and the processing efficiency of the system is improved.

In some possible implementations, the multiplexing layer may acquire the logic operators for the alarm policy and cut the alarm policy at those operators to obtain the corresponding basic conditions. It may then acquire the source station description corresponding to the source station and the node description corresponding to the node, and remove the source station description and/or the node description from each basic condition to obtain the target basic conditions. Optionally, an alarm policy may be composed of at least two basic conditions that are combined by logic operators such as and, or, && and ||, so the alarm policy can be decomposed into its basic conditions at these operators. For example, an alarm policy for a sudden drop in the number of requests may be decomposed into the two basic conditions "the rate of decrease in the number of requests is greater than 80%" and "the number of requests before the drop is greater than 20,000". The invention is not limited in this respect.

In a specific implementation, the multiplexing layer can analyze the received alarm strategy at the semantic level first, decompose the alarm strategy into combinations of basic conditions, abstract all the basic conditions, remove filter descriptions related to specific source stations and nodes so as to combine the basic conditions, and obtain corresponding target basic conditions so as to further process the alarm strategy based on the target basic conditions, thereby cutting the alarm strategy at the semantic level to obtain corresponding basic conditions, effectively reducing redundancy in the alarm strategy and improving the processing efficiency of the system.

In one example, after receiving the alarm policy sent by the alarm execution engine, the multiplexing layer may acquire the logical operators, such as and, or, && and ||, and then cut the alarm policy at those operators to obtain the corresponding basic conditions, such as cpu_usage{instance="node1"}>80, memory_usage{instance="node1"}<20 and network_latency{instance="node2"}>100. The multiplexing layer may then acquire the source station descriptions and node descriptions and use them to remove the specific service information, for example instance="node1" and instance="node2", obtaining the corresponding target basic conditions, for example cpu_usage>80, memory_usage<20 and network_latency>100. Further processing is then based on these target basic conditions. Cutting the alarm policy at the semantic level in this way effectively reduces redundancy in the alarm policies and improves the processing efficiency of the system.
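The decomposition and abstraction in this example can be sketched roughly as follows. This is a minimal illustration rather than the embodiment's implementation: the function names, the regular expressions and the assumption that `instance` is the only service-specific label are all hypothetical, and the naive splitting would mis-handle operators or braces that appear inside quoted label values.

```python
import re

# Assumed operator set: the word forms "and"/"or" and the symbol forms "&&"/"||".
_OPERATORS = re.compile(r"\s+(?:and|or)\s+|\s*(?:&&|\|\|)\s*", re.IGNORECASE)
_LABEL_BLOCK = re.compile(r"\{([^}]*)\}")
_MATCH_OP = re.compile(r"=~|!~|!=|=")


def split_policy(expression: str) -> list[str]:
    """Cut an alarm policy expression into basic conditions at logical operators."""
    return [part.strip() for part in _OPERATORS.split(expression) if part.strip()]


def abstract_condition(condition: str, service_labels=("instance",)) -> str:
    """Drop label matchers that name a specific node or source station."""

    def _filter(match: re.Match) -> str:
        kept = []
        for matcher in match.group(1).split(","):
            matcher = matcher.strip()
            if not matcher:
                continue
            # The label name is whatever precedes the first match operator.
            name = _MATCH_OP.split(matcher, maxsplit=1)[0].strip()
            if name not in service_labels:
                kept.append(matcher)
        # Remove the braces entirely when no matcher survives.
        return "{" + ",".join(kept) + "}" if kept else ""

    return _LABEL_BLOCK.sub(_filter, condition).strip()


policy = (
    'cpu_usage{instance="node1"}>80 and '
    'memory_usage{instance="node1"}<20 || '
    'network_latency{instance="node2"}>100'
)
basic = split_policy(policy)
targets = [abstract_condition(c) for c in basic]
print(targets)
```

Keeping the list of service-specific label names as a parameter means that labels carrying real filtering semantics (a status code, for instance) survive the abstraction, so two conditions are merged only when they differ solely in which node or source station they name.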

Step 104, classifying the basic conditions based on the occurrence frequency corresponding to each target basic condition through the multiplexing layer to obtain hot spot conditions and non-hot spot conditions;

After the target basic conditions are obtained through segmentation of the semantic level and removal of the corresponding service information, the occurrence frequency corresponding to each target basic condition can be further counted, the basic conditions are classified based on the occurrence frequency, the first basic condition to which the target basic condition with the occurrence frequency being greater than or equal to a preset threshold belongs is used as a hot spot condition, and the second basic condition to which the target basic condition with the occurrence frequency being smaller than the preset threshold belongs is used as a non-hot spot condition, so that the alarm strategy is cut at the semantic level, the corresponding basic condition is obtained, and the redundancy in the alarm strategy is effectively reduced and the processing efficiency of the system is improved by combining the mode of identifying the hot spot condition.

In a specific implementation, the multiplexing layer can cut the expression of each alarm policy into basic conditions at the operators and, or, && and ||, forming condition set 1. It then removes specific business information such as accid from the metric label set according to the grammar of PromQL statements, forming condition set 2. Next, it merges and counts the entries of condition set 2, and marks the basic conditions whose share of the total exceeds 10% as hot spot conditions. Finally, using the mapping between condition set 2 and condition set 1, it selects from condition set 1 the remaining conditions that were not marked as hot spot conditions, which form the non-hot spot conditions. For example, after the service information is removed, identical target basic conditions can be merged and counted. The basic conditions belonging to target basic conditions whose share exceeds 10% of the total are marked as hot spot conditions, and those belonging to target basic conditions whose share does not exceed 10% are marked as non-hot spot conditions. Cutting the alarm policy at the semantic level and identifying hot spot conditions in this way effectively reduces redundancy in the alarm policies and improves the processing efficiency of the system.
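The merge-and-count step can be illustrated with a short sketch. It assumes a simplified abstraction that strips the entire label block, and it uses the 10% share from this paragraph as a default; the function names and the sample conditions are hypothetical.

```python
import re
from collections import Counter


def strip_labels(condition: str) -> str:
    """Simplified abstraction (an assumption): remove the whole {...} label block."""
    return re.sub(r"\{[^}]*\}", "", condition).strip()


def classify(condition_set_1, hot_ratio=0.10):
    """Return (hot target conditions, non-hot original conditions)."""
    # Condition set 2: the abstracted form of every entry in condition set 1.
    condition_set_2 = [strip_labels(c) for c in condition_set_1]
    counts = Counter(condition_set_2)
    total = len(condition_set_2)
    # A target condition is hot when its share of the total exceeds the ratio.
    hot = {t for t, n in counts.items() if total and n / total > hot_ratio}
    # Map back to condition set 1 to recover the non-hot originals.
    non_hot = [o for o, t in zip(condition_set_1, condition_set_2) if t not in hot]
    return hot, non_hot


conditions = (
    [f'cpu_usage{{instance="node{i}"}}>80' for i in range(1, 13)]
    + [f'memory_usage{{instance="node{i}"}}<20' for i in range(1, 7)]
    + ['disk_io{instance="node1"}>500', 'network_latency{instance="node2"}>100']
)
hot, non_hot = classify(conditions)
print(sorted(hot))
print(non_hot)
```

In this sample of 20 conditions, the CPU condition accounts for 60% and the memory condition for 30%, so both are hot, while the two remaining conditions at 5% each stay non-hot and keep their original, node-specific form.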

Step 105, inquiring the hot spot condition and the non-hot spot condition from the database through the multiplexing layer to obtain a first observation value corresponding to the hot spot condition and a second observation value corresponding to the non-hot spot condition;

After the basic conditions are classified, the multiplexing layer can query the database for the hot spot conditions and the non-hot spot conditions obtained after the classification, so as to extract corresponding target data from mass monitoring data stored in the database according to descriptions corresponding to the corresponding basic conditions, and calculate corresponding performance indexes according to the target data to obtain a first observation value corresponding to the hot spot conditions and a second observation value corresponding to the non-hot spot conditions. The first observation value and the second observation value are the basis for triggering the alarm strategy, so that whether the alarm strategy is triggered or not is judged, and whether corresponding personnel are notified or not is determined.

In some possible implementations, for a hot spot condition, the multiplexing layer may determine input parameters corresponding to the database, where the input parameters include at least a first input parameter for specifying the query statement to be executed and a second input parameter for specifying the timestamp of the query. The multiplexing layer then calls the interface corresponding to the database, passes in the first input parameter and the second input parameter, and performs a query operation for the hot spot condition to obtain the source station observation value of the hot spot condition on each source station and the node observation value of the hot spot condition on each node. For a non-hot spot condition, the multiplexing layer can pass the condition through directly to the database for query, so as to obtain the second observation value corresponding to the non-hot spot condition.

In a specific implementation, the database may be a Prometheus database. The multiplexing layer may call an interface such as Prometheus /api/v1/query through an HTTP POST request and pass the query parameter (i.e., the first input parameter) and the time parameter (i.e., the second input parameter) through the interface, so as to query the Prometheus database for the hot spot conditions and obtain the observation values of the hot spot conditions on all source stations and nodes. The obtained observation values are stored in a hot spot file, and a high-speed hard disk may be used to increase access speed during storage. For non-hot spot conditions, the decomposed basic conditions are passed through directly to the back-end Prometheus database for query, and the obtained observation values are stored directly in memory. In this way, extracting the unchanged basic conditions in the alarm strategy greatly reduces the total number of basic conditions, and the hot spot condition merging algorithm of the basic condition aggregation module then reduces the number of queries to the back-end database.
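As an illustration of the query step, the following Python sketch builds and sends an instant query to the Prometheus HTTP API using only the standard library. The form fields `query` and `time` are the documented parameters of `/api/v1/query`; the function names, the base URL in the usage note and the error handling are assumptions for illustration, not part of the embodiment.

```python
import json
import urllib.parse
import urllib.request


def build_instant_query(base_url, promql, timestamp):
    """Build an HTTP POST request for the Prometheus instant-query endpoint.

    promql is the first input parameter (query statement) and timestamp is
    the second input parameter (evaluation time, Unix seconds).
    """
    body = urllib.parse.urlencode(
        {"query": promql, "time": f"{timestamp:.3f}"}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/query",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def run_instant_query(request, timeout=10.0):
    """Send the request and return the list of result samples."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return payload["data"]["result"]
```

For a hot spot condition, a single call such as `run_instant_query(build_instant_query("http://prometheus.example:9090", "cpu_usage > 80", 1700000000))` (hypothetical host and metric) would return one sample per matching label set, which corresponds to obtaining the observation values for all source stations and nodes in one query.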

Step 106, executing, through the alarm execution engine, an alarm processing process for the alarm strategy according to the first observation value and the second observation value.

In the embodiment of the invention, after the multiplexing layer has optimized the alarm strategy, reduced the redundancy in it, and queried the corresponding observation values from the database based on the decomposed basic conditions, it can return the first observation value corresponding to the hot spot condition and the second observation value corresponding to the non-hot spot condition to the alarm execution engine. The alarm execution engine then executes the alarm processing process for the alarm strategy according to the first observation value and the second observation value. Because the alarm strategy is cut at the semantic level into its basic conditions, redundancy in the alarm strategy is effectively reduced and the processing efficiency of the system is improved; the structure of the lexical analyzer under the content distribution network is simplified, system performance is improved, and the stability of system operation and service availability are ensured.

In a specific implementation, the alarm execution engine may compare the corresponding observation value with a preset alarm condition to determine whether an alarm is triggered: if the first observation value satisfies the first preset condition, and/or the second observation value satisfies the second preset condition, alarm prompt information corresponding to the alarm strategy is generated.

In one example, assume the observations are as follows:

cpu_usage:75

memory_usage:15

network_latency:90

Comparison results:

cpu_usage > 80: 75 <= 80, so the first preset condition is not satisfied.

memory_usage < 20: 15 < 20, so the second preset condition is satisfied.

network_latency > 100: 90 <= 100, so the third preset condition is not satisfied.

Triggering condition:

if (cpu_usage > 80 and memory_usage < 20) or (network_latency > 100) then alert

Since cpu_usage > 80 is not satisfied and network_latency > 100 is not satisfied, no alarm is triggered.

Alarm prompt information:

Alarm strategy: if (cpu_usage > 80 and memory_usage < 20) or (network_latency > 100) then alert

Observations: cpu_usage=75, memory_usage=15, network_latency=90

Alarm prompt: no alarm is triggered.

In another example, assume that the observations are as follows:

cpu_usage:85

memory_usage:25

network_latency:120

Comparison results:

cpu_usage > 80: 85 > 80, so the first preset condition is satisfied.

memory_usage < 20: 25 >= 20, so the second preset condition is not satisfied.

network_latency > 100: 120 > 100, so the third preset condition is satisfied.

Triggering condition:

if (cpu_usage > 80 and memory_usage < 20) or (network_latency > 100) then alert

Since network_latency > 100 is satisfied, an alarm is triggered.

Alarm prompt information:

Alarm strategy: if (cpu_usage > 80 and memory_usage < 20) or (network_latency > 100) then alert

Observations: cpu_usage=85, memory_usage=25, network_latency=120

Alarm prompt: network delay exceeds 100ms, and an alarm is triggered.
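The two examples above can be reproduced with a short Python sketch. The metric names and thresholds come from the examples; the table layout and the function name `evaluate` are assumptions made for illustration.

```python
import operator

# The three preset conditions from the examples: (metric, comparison, threshold).
CONDITIONS = {
    "first": ("cpu_usage", operator.gt, 80),
    "second": ("memory_usage", operator.lt, 20),
    "third": ("network_latency", operator.gt, 100),
}


def evaluate(observations):
    """Return (triggered, per-condition results) for one set of observation values."""
    results = {
        name: compare(observations[metric], threshold)
        for name, (metric, compare, threshold) in CONDITIONS.items()
    }
    # Triggering condition: (first and second) or third.
    triggered = (results["first"] and results["second"]) or results["third"]
    return triggered, results
```

Calling `evaluate` with the first set of observation values (75, 15, 90) yields no alarm, because neither the combined CPU and memory branch nor the latency branch holds; with the second set (85, 25, 120) the latency branch alone triggers the alarm.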

It should be noted that the embodiments of the present invention include, but are not limited to, the foregoing examples, and it will be understood that those skilled in the art may also set the embodiments according to actual requirements under the guidance of the concepts of the embodiments of the present invention, which are not limited thereto.

In order to enable those skilled in the art to better understand the technical solutions according to the embodiments of the present invention, the following exemplary descriptions are provided by way of corresponding examples:

Referring to fig. 2, a schematic diagram of a CDN monitoring architecture provided in an embodiment of the present invention is shown, which may include an alarm policy configuration platform, an alarm execution engine, a multiplexing layer, and a Prometheus database. An administrator can configure the corresponding alarm strategies on the alarm policy configuration platform, which periodically sends them to the alarm execution engine. After receiving an alarm strategy, the alarm execution engine can send it to the multiplexing layer, and in each alarm period the multiplexing layer can perform the following:

1. First, analyze the received alarm strategies at the semantic level and decompose each into a combination of basic conditions;

2. Second, abstract all basic conditions by removing the filter descriptions related to specific source stations and nodes, so that the conditions can be merged;

3. Then, identify the basic conditions with a higher occurrence frequency, which become the hot spot conditions;

4. Call the Prometheus /api/v1/query interface through an HTTP POST request, pass the query and time parameters, query the Prometheus database for the hot spot conditions to obtain the observation values of those conditions on all source stations and nodes, and store the observation values in a hot spot file, using a high-speed hard disk to improve access speed;

5. For non-hot spot conditions, pass the basic conditions decomposed in step 1 through directly to the back-end Prometheus database for query, and store the observation values in memory;

6. Traverse each alarm strategy, combine the observation values of the non-hot spot conditions and the hot spot conditions based on the decomposition relation of step 1, and return them to the alarm execution engine.

In a specific implementation, referring to fig. 3, a software flow chart of a multiplexing layer provided in an embodiment of the present invention is shown, specifically, in each alarm determination period, the multiplexing layer has the following operations:

The semantic segmentation module cuts the expression of the alarm strategy into basic conditions according to logical operator characters such as 'and', 'or', '&&' and '|' to form condition set 1;

The semantic segmentation module removes specific business information such as accid from the metric label set according to the grammar of PromQL statements to form condition set 2;

The basic condition aggregation module merges and counts condition set 2, marks the basic conditions whose proportion exceeds 10% of the total number as hot spot conditions, and selects the remaining conditions, which are not marked as hot spot conditions, from condition set 1 according to the mapping relation between condition set 2 and condition set 1 to form the non-hot spot conditions;

The query module calls the Prometheus /api/v1/query interface through HTTP POST, passes the query and time parameters, and queries the hot spot conditions and the non-hot spot conditions separately to form result sets, where the query result of a hot spot condition is a compact dictionary-type data structure that contains the mapping relation between the original business information and the query result;

The reorganization output module traverses all alarm strategies, selects the correct results from the non-hot spot condition query result set and the hot spot condition query result set according to the mapping relations of the above steps, combines them, and returns them to the alarm execution engine.
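The reorganization step can be sketched in Python as follows. The data layout is an assumption made for illustration: each alarm strategy is represented by a list of (original basic condition, target basic condition, business key) tuples recorded during decomposition, the hot spot result set is a dictionary keyed first by target basic condition and then by business key, and the non-hot spot result set is keyed by the original basic condition.

```python
def recombine(policies, hot_results, non_hot_results):
    """Attach the matching observation value to every basic condition of every strategy.

    policies:        {policy_id: [(original, target, business_key), ...]}
    hot_results:     {target: {business_key: value}}  (compact dictionary)
    non_hot_results: {original: value}
    """
    combined = {}
    for policy_id, conditions in policies.items():
        observations = {}
        for original, target, business_key in conditions:
            if target in hot_results:
                # Hot spot condition: look up the value for this business key.
                observations[original] = hot_results[target].get(business_key)
            else:
                # Non-hot spot condition: the result is keyed by the original condition.
                observations[original] = non_hot_results.get(original)
        combined[policy_id] = observations
    return combined
```

A missing entry is returned as `None` rather than raising an error, so that the caller can decide how to treat a strategy whose observation value could not be found; this is a design choice of the sketch, not something the embodiment specifies.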

Through the above process, the alarm strategy is cut into basic conditions according to its semantics and hot spot conditions are identified. Compared with the prior art, this makes it possible to find and reduce redundancy in the alarm strategies and improves system efficiency. In addition, because the alarm strategy semantic segmentation method exploits the general rules followed by alarm strategy expressions in the CDN monitoring scenario, the structure of the lexical analyzer in this scenario is simplified and its performance is improved.

It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.

Referring to fig. 4, a structural block diagram of a processing device for an alarm scenario provided in an embodiment of the present invention is shown. The device relates to a content distribution network that includes a plurality of nodes and caches the service content corresponding to a source station onto each node. The node, the source station, a first dimension corresponding to the node, and a second dimension corresponding to the source station are all provided with independent alarm strategies, where the first dimension is an analysis dimension for analyzing the node performance of the node and the second dimension is an analysis dimension for analyzing the source station performance of the source station. The device is applied to a monitoring analysis system that includes at least a policy configuration platform, an alarm execution engine communicatively connected to the policy configuration platform, a multiplexing layer communicatively connected to the alarm execution engine, and a database communicatively connected to the multiplexing layer. The device may specifically include the following modules:

the policy acquisition module 401 is located on the policy configuration platform, and is configured to acquire the alarm policy and send the alarm policy to the alarm execution engine;

The condition decomposition module 402, located at the multiplexing layer, is configured to receive the alarm policy sent by the alarm execution engine, perform semantic analysis on the alarm policy, obtain a basic condition corresponding to the alarm policy, and abstract the basic condition to obtain a target basic condition;

the classification module 403, located at the multiplexing layer, is configured to classify the basic conditions based on occurrence frequencies corresponding to the target basic conditions, so as to obtain hot spot conditions and non-hot spot conditions;

The query module 404, located at the multiplexing layer, is configured to query the database for the hot spot condition and the non-hot spot condition, and obtain a first observed value corresponding to the hot spot condition and a second observed value corresponding to the non-hot spot condition, where the first observed value and the second observed value are the basis for triggering an alarm policy;

And the alarm processing module 405 is located in the alarm execution engine, and is configured to execute an alarm processing procedure for the alarm policy according to the first observation value and the second observation value.

In some possible implementations, the conditional decomposition module 402 is specifically configured to:

Acquiring a logic operator aiming at the alarm strategy;

In some possible implementations, the classification module 403 is specifically configured to:

In some possible implementations, the query module 404 is specifically configured to:

In some possible implementations, the query module 404 is specifically further configured to:

In some possible implementations, the alarm processing module 405 is specifically configured to:

For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.

In addition, the embodiment of the invention also provides an electronic device, which comprises a processor, a memory, and a computer program stored in the memory and capable of running on the processor. When executed by the processor, the computer program implements the processes of the above embodiment of the alarm scenario processing method and can achieve the same technical effects; to avoid repetition, the details are not described here again.

The embodiment of the invention also provides a computer readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the processes of the above embodiment of the alarm scenario processing method and can achieve the same technical effects; to avoid repetition, the details are not described here again. The computer readable storage medium is, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

In this specification, the embodiments are described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.

It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Moreover, embodiments of the invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, EEPROM, flash, eMMC, etc.) having computer-usable program code embodied therein.

Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.

Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or terminal device comprising the element.

The foregoing describes in detail a method for processing an alarm scenario and a device for processing an alarm scenario, wherein specific examples are applied to illustrate the principles and embodiments of the present invention, and the description of the foregoing examples is only for aiding in understanding the method and core concept of the present invention, and meanwhile, to those skilled in the art, according to the concept of the present invention, there are variations in the specific embodiments and application ranges, so that the disclosure should not be construed as limiting the present invention.

Claims

1. A method for processing an alarm scenario, characterized in that it involves a content distribution network, the content distribution network includes a plurality of nodes, and the content distribution network caches the business content corresponding to the source station on each of the nodes; wherein the node, the source station, and the first dimension corresponding to the node and the second dimension corresponding to the source station are all provided with independent alarm strategies, the first dimension is an analysis dimension for analyzing the node performance of the node, and the second dimension is an analysis dimension for analyzing the source station performance of the source station; wherein, applied to a monitoring and analysis system, the monitoring and analysis system at least includes a policy configuration platform, an alarm execution engine communicatively connected to the policy configuration platform, a multiplexing layer communicatively connected to the alarm execution engine, and a database communicatively connected to the multiplexing layer, and the method includes:

Acquiring the alarm strategy through the strategy configuration platform, and sending the alarm strategy to the alarm execution engine;

Receiving the alarm strategy sent by the alarm execution engine through the multiplexing layer, performing semantic level analysis on the alarm strategy to obtain basic conditions corresponding to the alarm strategy, and abstracting the basic conditions to obtain target basic conditions;

Classifying the basic conditions based on the occurrence frequencies corresponding to the target basic conditions through the multiplexing layer to obtain hotspot conditions and non-hotspot conditions;

Querying the hotspot condition and the non-hotspot condition from the database through the multiplexing layer to obtain a first observation value corresponding to the hotspot condition and a second observation value corresponding to the non-hotspot condition, wherein the first observation value and the second observation value are the basis for triggering an alarm strategy;

The alarm execution engine executes an alarm processing process for the alarm strategy according to the first observation value and the second observation value.

2. The method according to claim 1, wherein the semantic level analysis of the alarm strategy to obtain the basic conditions corresponding to the alarm strategy comprises:

Obtaining a logical operator for the alarm strategy;

The alarm strategy is segmented according to the logical operator to obtain corresponding basic conditions.

3. The method according to claim 2, characterized in that the abstracting of the basic conditions to obtain the target basic conditions comprises:

Obtaining a source station description corresponding to the source station and a node description corresponding to the node;

The source station description and/or the node description in each of the basic conditions are removed to obtain the target basic conditions.

4. The method according to claim 1, characterized in that the basic conditions are classified based on the occurrence frequency corresponding to each of the target basic conditions to obtain hotspot conditions and non-hotspot conditions, comprising:

Obtaining the occurrence frequency corresponding to each of the target basic conditions;

The first basic condition to which the target basic condition belongs whose occurrence frequency is greater than or equal to the preset threshold is taken as the hotspot condition, and the second basic condition to which the target basic condition belongs whose occurrence frequency is less than the preset threshold is taken as the non-hotspot condition.

5. The method according to claim 1, characterized in that the querying the database about the hotspot condition and the non-hotspot condition to obtain a first observation value corresponding to the hotspot condition and a second observation value corresponding to the non-hotspot condition comprises:

Determining input parameters corresponding to the database, the input parameters including at least a first input parameter for specifying a query statement to be executed, and a second input parameter for specifying a timestamp of the query;

Calling the interface corresponding to the database, passing in the first input parameter and the second input parameter, and executing a query operation for the hotspot condition to obtain the source station observation value of the hotspot condition on each of the source stations and the node observation value of the hotspot condition on each of the nodes.

6. The method according to claim 5, characterized in that the querying the database for the hotspot condition and the non-hotspot condition to obtain a first observation value corresponding to the hotspot condition and a second observation value corresponding to the non-hotspot condition further comprises:

The non-hotspot condition is directly transmitted to the database through the multiplexing layer for query, so as to obtain a second observation value corresponding to the non-hotspot condition.

7. The method according to claim 1, wherein the step of executing an alarm processing process for the alarm strategy according to the first observation value and the second observation value comprises:

If the first observation value satisfies a first preset condition, and/or the second observation value satisfies a second preset condition, then alarm prompt information corresponding to the alarm strategy is generated.

8. A processing device for an alarm scenario, characterized in that it involves a content distribution network, the content distribution network includes a plurality of nodes, and the content distribution network caches the business content corresponding to the source station on each of the nodes; wherein the node, the source station, and the first dimension corresponding to the node and the second dimension corresponding to the source station are all provided with independent alarm strategies, the first dimension is an analysis dimension for analyzing the node performance of the node, and the second dimension is an analysis dimension for analyzing the source station performance of the source station; wherein, it is applied to a monitoring and analysis system, the monitoring and analysis system at least includes a policy configuration platform, an alarm execution engine communicatively connected to the policy configuration platform, a multiplexing layer communicatively connected to the alarm execution engine, and a database communicatively connected to the multiplexing layer, and the device includes:

A policy acquisition module located in the policy configuration platform, used to acquire the alarm policy and send the alarm policy to the alarm execution engine;

The condition decomposition module located at the multiplexing layer is used to receive the alarm strategy sent by the alarm execution engine, perform semantic analysis on the alarm strategy, obtain basic conditions corresponding to the alarm strategy, and perform abstract processing on the basic conditions to obtain target basic conditions;

A classification module located at the multiplexing layer, used to classify the basic conditions based on the occurrence frequencies corresponding to the target basic conditions, and obtain hotspot conditions and non-hotspot conditions;

A query module located at the multiplexing layer, used to query the database for the hotspot condition and the non-hotspot condition, obtain a first observation value corresponding to the hotspot condition and a second observation value corresponding to the non-hotspot condition, wherein the first observation value and the second observation value are the basis for triggering an alarm strategy;

The alarm processing module located in the alarm execution engine is used to execute the alarm processing process for the alarm strategy according to the first observation value and the second observation value.

9. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;

The memory is used to store computer programs;

The processor is used to implement the method according to any one of claims 1 to 7 when executing the program stored in the memory.

10. A computer-readable storage medium having instructions stored thereon, which, when executed by one or more processors, cause the processors to perform the method according to any one of claims 1 to 7.