WO2019080249A1 - Alarm processing method and apparatus, computer device, and storage medium - Google Patents

Alarm processing method and apparatus, computer device, and storage medium

Info

Publication number
WO2019080249A1
Authority
WO
WIPO (PCT)
Prior art keywords
alarm
preset
event
processing
alarm event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/113234
Other languages
French (fr)
Chinese (zh)
Inventor
高泗俊
李渊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Publication of WO2019080249A1
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance

Definitions

  • the present application relates to the field of communications technologies, and in particular, to an alarm processing method, apparatus, computer device, and storage medium.
  • the application provides an alarm processing method, device, computer device and storage medium, which can reduce invalid alarms and reduce false alarm rate.
  • the application provides an alarm processing method, which is applied to a monitoring system, and includes:
  • sending the alarm event that has undergone the preset processing to the user terminal.
  • an alarm processing apparatus including:
  • a data acquisition unit configured to acquire monitoring data of the monitored object
  • a data determining unit configured to determine alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule;
  • An event generating and recording unit configured to generate an alarm event according to the alarm data, and record the alarm event in a preset message queue
  • An event obtaining unit configured to acquire an alarm event in the preset message queue according to a preset acquisition rule
  • a preset processing unit configured to perform preset processing on the alarm event based on a machine learning rule and a preset knowledge base
  • an event sending unit configured to send the alarm event that has undergone the preset processing to the user terminal.
  • The present application further provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the alarm processing method of any embodiment provided by the present application.
  • The present application also provides a storage medium storing a computer program, where the computer program includes program instructions that, when executed by a processor, cause the processor to execute the alarm processing method of any embodiment provided by the present application.
  • the application provides an alarm processing method, device, computer device and storage medium.
  • The method determines the alarm data in the monitoring data of the monitoring object according to the preset alarm rule, generates an alarm event from the alarm data, and saves the alarm event in the preset message queue. It then acquires alarm events from the preset message queue according to the preset acquisition rule, performs preset processing on them based on the machine learning rule and the preset knowledge base, and sends the processed alarm events to the user terminal for display to the operation and maintenance manager.
  • The alarm processing method can not only eliminate alarm storms but also improve alarm accuracy, thereby reducing invalid alarms.
  • FIG. 1 is a schematic flowchart of an alarm processing method according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of an alarm processing method according to another embodiment of the present application.
  • FIG. 3 is a schematic flow chart of the sub-steps of step S205 in Figure 2;
  • FIG. 4 is a schematic block diagram of an alarm processing apparatus according to an embodiment of the present application.
  • FIG. 5 is a schematic block diagram of an alarm processing apparatus according to another embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of an alarm processing method according to an embodiment of the present application.
  • the alarm processing method is applied to a monitoring system, and the monitoring system can be run in a server, where the server can be a stand-alone server or a server cluster composed of multiple servers.
  • the alarm processing method includes steps S101 to S106.
  • The monitoring object includes a host, a container, a network device, middleware, and the like. The middleware is, for example, a WebLogic, Tomcat, Kafka, or ZooKeeper component.
  • The monitoring data of the monitoring object includes information such as the usage status of the monitoring object.
  • For example, when the monitoring object is a host, the corresponding monitoring data includes information such as the CPU usage, memory usage, disk usage, and network traffic of the host.
  • S102 Determine alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule.
  • The preset alarm rule is an alarm rule configured in advance. When the alarm center of the monitoring system is initialized, a set of common alarm rules is automatically configured as the preset alarm rules. A user with personalized requirements can define custom alarm rules or modify the preset alarm rules.
  • The preset alarm rule uses a preset threshold range check: it determines whether a value in the monitoring data is within a preset threshold range. If the value is within the preset threshold range, the preset alarm rule is not triggered; if the value is not within the preset threshold range, the preset alarm rule is triggered.
  • The monitoring data that triggers the preset alarm rule is the alarm data.
  • For example, the CPU usage in the monitoring data is determined to be alarm data if its value exceeds the preset threshold range for CPU usage in the preset alarm rule.
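  • The threshold range check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `ThresholdRule` and `find_alarm_data`, the metric keys, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical preset alarm rule: a metric name and its allowed range."""
    metric: str
    low: float
    high: float

    def is_triggered(self, value: float) -> bool:
        # The rule is triggered when the value lies outside [low, high].
        return not (self.low <= value <= self.high)


def find_alarm_data(monitoring_data: dict, rules: list) -> dict:
    """Return the monitoring data entries that trigger at least one rule."""
    alarm_data = {}
    for rule in rules:
        value = monitoring_data.get(rule.metric)
        if value is not None and rule.is_triggered(value):
            alarm_data[rule.metric] = value
    return alarm_data


# Assumed thresholds, chosen only for illustration.
rules = [
    ThresholdRule("cpu_usage", 0.0, 90.0),
    ThresholdRule("memory_usage", 0.0, 85.0),
]
sample = {"cpu_usage": 97.5, "memory_usage": 40.0, "disk_usage": 99.0}
alarm_data = find_alarm_data(sample, rules)
```

  • In this sketch, `disk_usage` is not reported because no rule is configured for it; only metrics covered by a rule can become alarm data.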
  • Generating an alarm event according to the alarm data includes: generating the alarm event from the alarm data according to a preset event format.
  • The preset event format standardizes alarm events to facilitate subsequent parsing.
  • The preset event format may be defined in advance so that the values of the corresponding fields in the alarm event can later be parsed to obtain the desired information. The alarm event generated according to the preset event format is pushed to each user system that subscribes to alarm events.
  • The user system parses the alarm event format defined by the alarm center to obtain the related information in the alarm event. This design reduces the complexity of the alarm center interface design.
  • the preset event format for example, adopts a preset JSON format, as follows:
  • the alarm event is sent and saved to the preset message queue, where the preset message queue is saved in a preset database.
  • the use of queues can eliminate alarm storms.
  • the preset message queue includes multiple alarm events, so the alarm event in the preset message queue is obtained according to the preset acquisition rule.
  • the alarm event may further include priority information or alarm severity information.
  • The preset acquisition rule may be configured to acquire alarm events from the preset message queue according to priority or alarm severity, so that the alarm events in the preset message queue are processed in a reasonable and orderly manner and important alarms are handled first, thereby indirectly improving the efficiency of alarm processing.
  • the acquiring an alarm event in the preset message queue according to the preset acquisition rule includes: acquiring a preset number of alarm events in the preset message queue according to a preset sequence.
  • the preset sequence may be a priority order of the alarm events, or may be a time sequence corresponding to the generation time of the alarm events, and a certain number of alarm events are acquired each time according to the preset sequence. Therefore, the preset quantity can ensure that the alarm events in the preset message queue are processed in batches, thereby avoiding the occurrence of an alarm storm.
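  • The batch acquisition described above can be sketched with a priority queue. This is only an illustration: the class `AlarmQueue`, the `priority` field, and the convention that a lower number means a more urgent alarm are assumptions, and an in-memory heap stands in for the database-backed preset message queue.

```python
import heapq
import itertools


class AlarmQueue:
    """Hypothetical in-memory stand-in for the preset message queue."""

    def __init__(self):
        self._heap = []
        # Monotonic counter: preserves arrival order among equal priorities
        # and keeps the heap from ever comparing two event dicts.
        self._counter = itertools.count()

    def put(self, event: dict) -> None:
        # Assumption: a lower "priority" number means a more urgent alarm.
        heapq.heappush(self._heap, (event["priority"], next(self._counter), event))

    def get_batch(self, preset_number: int) -> list:
        """Pop at most `preset_number` events in priority order."""
        batch = []
        while self._heap and len(batch) < preset_number:
            _, _, event = heapq.heappop(self._heap)
            batch.append(event)
        return batch


queue = AlarmQueue()
for event_id, priority in [("a", 3), ("b", 1), ("c", 2), ("d", 1)]:
    queue.put({"id": event_id, "priority": priority})

first_batch = [e["id"] for e in queue.get_batch(2)]
second_batch = [e["id"] for e in queue.get_batch(2)]
```

  • Ordering by generation time instead of priority would only require pushing the event timestamp as the first tuple element.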
  • Performing the preset processing on the alarm events obtained from the preset message queue based on the machine learning rule and the preset knowledge base specifically includes: performing preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm events obtained from the preset message queue, based on the machine learning rule and the preset knowledge base.
  • The machine learning rule is mainly a rule model obtained by training an algorithm on a large number of alarm events; the rule model is then applied to analyze and process current alarm events. The algorithm is a machine learning algorithm, for example: exponential-smoothing-based double and triple exponential smoothing algorithms; decomposition-based Fourier decomposition and wavelet decomposition algorithms; or deep-learning-based feedforward neural networks and recurrent neural networks (RNN). These algorithms need to be trained on a large amount of online historical data (historical alarm events) to obtain a relatively accurate alarm strategy, which is the rule model used to analyze and process current alarm events.
  • Multiple algorithms can be trained at the same time. Because each algorithm performs differently in different scenarios, the weight of each algorithm is adjusted according to historical results, and a common alarm strategy is finally obtained.
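  • The text does not say how the per-algorithm weights are adjusted. One simple possibility, shown here purely as an assumption, is to weight each algorithm inversely to its historical error and combine the predictions as a weighted average. The function names and the algorithm labels below are hypothetical.

```python
def adjust_weights(historical_errors: dict) -> dict:
    """Assumed scheme: weight each algorithm by the inverse of its past error.

    `historical_errors` maps an algorithm name to its mean absolute error
    on historical alarm data (lower is better).
    """
    eps = 1e-9  # avoids division by zero for an algorithm with zero error
    inverse = {name: 1.0 / (err + eps) for name, err in historical_errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(predictions: dict, weights: dict) -> float:
    """Weighted average of the per-algorithm predictions."""
    return sum(weights[name] * predictions[name] for name in predictions)


# Hypothetical historical errors for two trained algorithms.
weights = adjust_weights({"smoothing": 1.0, "rnn": 3.0})
combined = combine({"smoothing": 80.0, "rnn": 100.0}, weights)
```

  • With these inputs the more accurate algorithm receives about three quarters of the weight, so the combined prediction sits closer to its output.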
  • the preset knowledge base is used for storing alarm events, alarm event processing result information, and analysis result information, etc., to help analyze and process the alarm events.
  • The analysis methods used by the preset alarm analysis processing include methods based on historical data statistics, methods that assume a normal distribution, the 3-sigma strategy, and the like. A statistical alarm method can automatically calculate a reasonable alarm threshold; when a performance indicator rises above or falls below this alarm threshold, an alarm is triggered. These calculated alarm thresholds are also written back as the preset threshold ranges in the preset alarm rules, forming a closed control loop that improves alarm accuracy.
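  • As a hedged illustration of the 3-sigma strategy mentioned above, the following sketch derives a threshold range from historical values as the mean plus or minus three standard deviations. The function names and the sample history are hypothetical; the choice of the population standard deviation is an assumption.

```python
import statistics


def three_sigma_range(history: list) -> tuple:
    """Return (low, high) as mean -/+ 3 population standard deviations."""
    mean = statistics.fmean(history)
    std = statistics.pstdev(history)
    return mean - 3 * std, mean + 3 * std


def is_alarm(value: float, low: float, high: float) -> bool:
    # An alarm is triggered when the indicator leaves the computed range.
    return value < low or value > high


# Hypothetical historical CPU usage samples (percent).
history = [50, 52, 48, 51, 49]
low, high = three_sigma_range(history)
```

  • The resulting `(low, high)` pair could then be written back as the preset threshold range of the corresponding alarm rule, which is the closed loop the text describes.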
  • The preset alarm convergence processing includes: merging alarm events according to a preset time window, for example merging the alarm events generated within a certain period of time into one; or merging the corresponding alarm events according to the same monitoring policy, for example merging multiple alarm events with a CPU usage of more than 90% into one alarm event; or merging the corresponding alarm events according to the same alarm object, for example merging the CPU, memory, and disk alarms issued by host A within a certain time window into one alarm event corresponding to host A.
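  • A minimal sketch of convergence by alarm object and time window follows. The event fields (`object`, `metric`, `timestamp`), the fixed-window bucketing, and the shape of the merged event are assumptions made for illustration.

```python
from collections import defaultdict


def converge(events: list, window_seconds: int) -> list:
    """Merge events that share an alarm object and a fixed time window."""
    groups = defaultdict(list)
    for event in events:
        # Integer division assigns each timestamp to a fixed-size window.
        bucket = event["timestamp"] // window_seconds
        groups[(event["object"], bucket)].append(event)

    merged = []
    for (obj, bucket), members in sorted(groups.items()):
        merged.append({
            "object": obj,
            "window_start": bucket * window_seconds,
            "metrics": sorted({m["metric"] for m in members}),
            "count": len(members),
        })
    return merged


events = [
    {"object": "host-a", "metric": "cpu", "timestamp": 10},
    {"object": "host-a", "metric": "memory", "timestamp": 20},
    {"object": "host-a", "metric": "disk", "timestamp": 50},
    {"object": "host-b", "metric": "cpu", "timestamp": 15},
    {"object": "host-a", "metric": "cpu", "timestamp": 70},
]
merged = converge(events, window_seconds=60)
```

  • Here the three host-a alarms in the first 60-second window become one merged event, while the later host-a alarm and the host-b alarm stay separate.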
  • the preset alarm aggregation process includes: an association mining process and an exception dependency process.
  • The merging strategy of the association mining processing merges multiple alarm events into one alarm event by mining the associations among alarm events and among multiple time series.
  • Exception dependency processing handles cases in which one exception depends on another. For example, a disk failure may cause the host to crash. If alarm events for a disk failure and a lost host are received at the same time, the exception dependency can be used to converge the two alarm events into one alarm event.
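  • The disk-failure example above can be sketched with a static dependency table. This is an assumption-laden illustration: the table `DEPENDENCIES`, the event type names, and the rule that cause and effect must be reported for the same object are all hypothetical, and a real system would likely learn or configure these dependencies.

```python
# Hypothetical dependency table: effect event type -> cause event type.
DEPENDENCIES = {"host_lost": "disk_failure"}


def aggregate_by_dependency(events: list) -> list:
    """Fold an effect event into its cause when both are present."""
    present = {(e["object"], e["type"]) for e in events}
    result = []
    for event in events:
        cause = DEPENDENCIES.get(event["type"])
        if cause is not None and (event["object"], cause) in present:
            continue  # the root-cause event will represent this one
        result.append(dict(event))

    # Record on each kept event which dependent events it absorbed.
    for kept in result:
        kept["effects"] = sorted(
            e["type"]
            for e in events
            if e["object"] == kept["object"]
            and DEPENDENCIES.get(e["type"]) == kept["type"]
        )
    return result


events = [
    {"object": "host-a", "type": "disk_failure"},
    {"object": "host-a", "type": "host_lost"},
    {"object": "host-b", "type": "host_lost"},
]
aggregated = aggregate_by_dependency(events)
```

  • The host-b event is kept as is because no disk failure was reported for host-b, so there is no cause to fold it into.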
  • the preset knowledge base and the preset message queue are saved in the same database, so that the alarm event is loaded into the database for preset processing of these alarm events; wherein the preset knowledge base stores historical alarms. Events and related information.
  • The user terminal includes a terminal corresponding to the system administrator or a terminal corresponding to a user system, where the system administrator is the person responsible for managing the monitoring system.
  • A system administrator can be notified by email, short message, telephone, and the like. For a user system, the alarm event can be pushed to the user's own system for further processing; for example, some users develop their own tool platforms that further process alarm events and make judgments in combination with business logic.
  • the user system can subscribe to the alarms of the monitoring system. When a new alarm is generated in the alarm center of the monitoring system, the directional push is implemented according to the corresponding topic subscribed by the user system, thereby improving the user experience.
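  • The topic-based push described above follows the publish/subscribe pattern, which can be sketched as below. The class `AlarmCenter`, its methods, and the topic names are hypothetical, and an in-process callback stands in for whatever transport a real deployment would use.

```python
from collections import defaultdict


class AlarmCenter:
    """Hypothetical alarm center that pushes events to topic subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event: dict) -> int:
        """Push the event to every subscriber of the topic; return the count."""
        delivered = 0
        for callback in self._subscribers.get(topic, []):
            callback(event)
            delivered += 1
        return delivered


received = []
center = AlarmCenter()
# A user system subscribes only to the topic it cares about.
center.subscribe("host.cpu", received.append)

cpu_delivered = center.publish("host.cpu", {"object": "host-a", "metric": "cpu"})
disk_delivered = center.publish("host.disk", {"object": "host-a", "metric": "disk"})
```

  • Only the `host.cpu` event reaches the subscriber; the `host.disk` event has no subscribers and is delivered to no one.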
  • The alarm processing method of the foregoing embodiment determines the alarm data in the monitoring data of the monitoring object according to the preset alarm rule, generates an alarm event from the alarm data, and saves the alarm event in the preset message queue. It then acquires alarm events from the preset message queue according to the preset acquisition rule, performs preset processing on the acquired alarm events based on the machine learning rule and the preset knowledge base, and sends the processed alarm events to the user terminal for display to the operation and maintenance manager.
  • the alarm processing method can not only eliminate the alarm storm, but also improve the alarm accuracy rate, thereby reducing the invalid alarm.
  • FIG. 2 is a schematic flowchart of an alarm processing method according to another embodiment of the present application.
  • the alarm processing method is applied to a monitoring system, and the monitoring system can be run in a server, where the server can be a stand-alone server or a server cluster composed of multiple servers.
  • the alarm processing method includes steps S201 to S208.
  • In addition to the host, the container, the network device, and the middleware, the monitoring object includes the monitoring system itself.
  • acquiring the monitoring data of the monitoring object further includes: acquiring monitoring data of the monitoring system.
  • S202 Determine, according to a preset alarm rule, alarm data in the monitoring data, where the alarm data is monitoring data that triggers the preset alarm rule.
  • The preset alarm rule is an alarm rule configured in advance. It may use the preset threshold range check or other similar checks, which are not limited herein.
  • An alarm event is generated from the alarm data according to a preset event format, and the generated alarm event is saved in a preset message queue.
  • The preset message queue stores multiple alarm events, which effectively prevents an alarm storm when the server generates many alarm events at the same time.
  • the preset number of alarm events in the preset message queue are obtained in a preset order.
  • the preset sequence may be a priority order of the alarm events, or may be a time sequence corresponding to the generation time of the alarm events, and a certain number of alarm events are acquired each time according to the preset sequence. Therefore, the preset quantity can ensure that the alarm events in the preset message queue are processed in batches, thereby avoiding the occurrence of an alarm storm.
  • For alarm events whose monitoring object is not the monitoring system, the preset processing method of the foregoing embodiment is used and is not described in detail again here.
  • the monitoring alarm event is filtered out from the alarm event, where the monitoring alarm event is an alarm event of the monitoring system.
  • The alarm events include not only alarm events of the monitoring system but also alarm events of the host, the container, or the middleware. Therefore, the monitoring alarm events, which are the alarm events of the monitoring system, need to be filtered out from the alarm events.
  • the identification information of the monitoring object may be obtained by associating the monitoring data of the monitoring object with the identification information. Therefore, the alarm data is also associated with the identification information, and the identification information can be used to filter out the monitoring alarm event from the alarm event.
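  • Filtering by identification information can be sketched as a simple partition of the acquired events. The field name `object_id` and the identifier value `monitoring-system` are hypothetical; the text only states that alarm data is associated with identification information of the monitoring object.

```python
# Hypothetical identifier attached to data collected from the monitoring system.
MONITORING_SYSTEM_ID = "monitoring-system"


def split_events(events: list, monitoring_id: str = MONITORING_SYSTEM_ID) -> tuple:
    """Partition events into (monitoring alarm events, all other events)."""
    monitoring, other = [], []
    for event in events:
        if event.get("object_id") == monitoring_id:
            monitoring.append(event)
        else:
            other.append(event)
    return monitoring, other


events = [
    {"object_id": "host-a", "metric": "cpu"},
    {"object_id": "monitoring-system", "metric": "queue_depth"},
    {"object_id": "kafka-1", "metric": "lag"},
]
monitoring_events, other_events = split_events(events)
```

  • In this sketch the first list would go to the self-healing system, and the second list would follow the ordinary preset processing path.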
  • S205b Send the monitoring alarm event to the self-healing system, so that the self-healing system processes the fault corresponding to the monitoring alarm event.
  • the self-healing system and the monitoring system can be installed in different servers, and communication connections are established between the servers to complete data interaction.
  • the self-healing system is configured to automatically process faults corresponding to the monitoring alarm events, such as expanding capacity, restarting services, or limiting traffic.
  • The alarm event that has undergone the preset processing is sent to the user terminal so that the user can handle it accordingly and the alarm can be cleared in time.
  • S207 Receive feedback information sent by the user terminal, where the feedback information is processing result information generated by the user terminal by performing preset label processing on the alarm event that has undergone the preset processing.
  • The preset label processing includes: marking an alarm event as invalid, leaving it unprocessed, and feeding this back to the alarm center of the server; or marking the event as an alarm and handling the event.
  • User terminals may have automated event processing systems. After receiving the event, the user terminal starts processing it and notifies the alarm center of the server of the final processing result, which can be sent through an API callback.
  • The server saves the processing result information to the preset knowledge base and uses the alarm event and the corresponding processing result information as historical alarm events for analyzing and processing current alarm events. This forms a closed-loop mechanism that allows the entire process to be continuously improved, thereby improving alarm accuracy.
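  • The feedback loop can be sketched as a callback handler that stores each labelled result in the knowledge base. Everything here is an assumption for illustration: the class `KnowledgeBase`, the label values `valid` and `invalid`, and the `invalid_rate` statistic are hypothetical, and a dictionary stands in for the preset database.

```python
class KnowledgeBase:
    """Hypothetical in-memory stand-in for the preset knowledge base."""

    ALLOWED_LABELS = {"valid", "invalid"}

    def __init__(self):
        self._records = {}

    def save_feedback(self, event_id: str, label: str, result: str) -> None:
        """Store the processing result reported by the user terminal."""
        if label not in self.ALLOWED_LABELS:
            raise ValueError(f"unknown label: {label}")
        self._records[event_id] = {"label": label, "result": result}

    def invalid_rate(self) -> float:
        """Share of recorded alarm events that users marked as invalid."""
        if not self._records:
            return 0.0
        invalid = sum(1 for r in self._records.values() if r["label"] == "invalid")
        return invalid / len(self._records)


kb = KnowledgeBase()
# Each call represents one API callback from a user terminal.
kb.save_feedback("evt-1", "invalid", "not processed")
kb.save_feedback("evt-2", "valid", "service restarted")
kb.save_feedback("evt-3", "valid", "capacity expanded")
rate = kb.invalid_rate()
```

  • A statistic such as the invalid rate is one possible signal that later analysis could use to tune thresholds; the text itself does not prescribe a particular use of the stored feedback.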
  • The alarm processing method provided by the foregoing embodiment can not only eliminate alarm storms but also perform self-healing processing on the faults corresponding to alarms of the monitoring system, ensuring the normal operation of the monitoring system. It can also receive the feedback information sent by the user terminal and save the processing result information about the alarm event in the feedback information to the preset database for use in analyzing and processing subsequent alarm events. This forms a closed-loop control mechanism for analysis, through which the alarm analysis and processing capability of the monitoring system is continuously improved and the alarm accuracy is increased.
  • the embodiment of the present application further provides an alarm processing apparatus, which is used to execute the foregoing alarm processing method.
  • FIG. 4 is a schematic block diagram of an alarm processing apparatus according to an embodiment of the present application.
  • the alarm processing device 300 can be installed in a server.
  • the alarm processing apparatus 300 includes a data acquisition unit 301, a data determination unit 302, a generation recording unit 303, an event acquisition unit 304, a preset processing unit 305, and an event transmission unit 306.
  • the data obtaining unit 301 is configured to acquire monitoring data of the monitoring object.
  • The monitoring object includes a host, a container, a network device, and middleware. The middleware is, for example, a WebLogic, Tomcat, Kafka, or ZooKeeper component.
  • The monitoring data of the monitoring object includes information such as the usage status of the monitoring object.
  • For example, when the monitoring object is a host, the corresponding monitoring data includes information such as the CPU usage, memory usage, disk usage, and network traffic of the host.
  • the data determining unit 302 is configured to determine alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule.
  • The preset alarm rule is an alarm rule configured in advance.
  • A set of common alarm rules is automatically configured as the preset alarm rules. A user with personalized requirements can define custom alarm rules or modify the preset alarm rules.
  • The preset alarm rule uses a preset threshold range check: it determines whether a value in the monitoring data is within a preset threshold range. If the value is within the preset threshold range, the preset alarm rule is not triggered; if the value is not within the preset threshold range, the preset alarm rule is triggered.
  • The monitoring data that triggers the preset alarm rule is the alarm data.
  • the generating record unit 303 is configured to generate an alarm event according to the alarm data, and record the alarm event in a preset message queue.
  • the generating an alarm event according to the alarm data includes: generating an alarm event by using the alarm data according to a preset event format.
  • the preset event format can standardize alarm events to facilitate subsequent parsing processing.
  • the alarm event is sent and saved to the preset message queue, where the preset message queue is saved in a preset database.
  • the use of queues can eliminate alarm storms.
  • the event obtaining unit 304 is configured to acquire an alarm event in the preset message queue according to a preset acquisition rule.
  • the preset message queue includes multiple alarm events, so the alarm event in the preset message queue is obtained according to the preset acquisition rule.
  • The alarm event may further include priority information, alarm severity information, and the like. Correspondingly, the preset acquisition rule may be configured to acquire alarm events from the preset message queue according to priority or alarm severity, so that the alarm events in the preset message queue are processed in a reasonable and orderly manner and important alarms are handled first, thereby indirectly improving the efficiency of alarm processing.
  • the event obtaining unit 304 is configured to: acquire a preset number of alarm events in the preset message queue according to a preset sequence. Specifically, a certain number of alarm events are acquired each time. Therefore, the preset quantity can ensure that the alarm events in the preset message queue are processed in batches, thereby avoiding the occurrence of an alarm storm.
  • the preset processing unit 305 is configured to perform preset processing on the alarm event acquired from the preset message queue based on the machine learning rule and the preset knowledge base.
  • Specifically, the preset processing unit 305 performs preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm events obtained from the preset message queue, based on the machine learning rule and the preset knowledge base.
  • the preset processing unit 305 can include an alarm analysis sub-unit 3051, an alarm convergence sub-unit 3052, and an alarm aggregation sub-unit 3053.
  • the alarm analysis sub-unit 3051 is configured to automatically calculate a reasonable alarm threshold by using a statistical method based on historical data statistics, a method of normal distribution, a 3-sigma strategy, and the like. When the performance indicator exceeds or falls below this alarm threshold, an alarm is triggered. At the same time, these reasonable alarm thresholds are set to the preset threshold ranges in the preset alarm rules, thereby forming a closed loop control, thereby improving the accuracy of the alarm.
  • The alarm convergence sub-unit 3052 is specifically configured to: merge alarm events according to a preset time window; or merge the corresponding alarm events according to the same monitoring policy; or merge the corresponding alarm events according to the same alarm object.
  • For example, the alarm events generated during a certain period of time are merged into one; multiple alarm events with a CPU usage exceeding 90% are merged into one alarm event; or the CPU, memory, and disk alarms sent by host A within a certain time window are merged into one alarm event corresponding to host A.
  • The alarm aggregation sub-unit 3053 is specifically configured to perform association mining processing and exception dependency processing.
  • The merging strategy of the association mining processing merges multiple alarm events into one alarm event by mining the associations among alarm events and among multiple time series.
  • Exception dependency processing handles cases in which one exception depends on another. For example, a disk failure may cause the host to crash. If alarm events for a disk failure and a lost host are received at the same time, the exception dependency can be used to converge the two alarm events into one alarm event.
  • the event sending unit 306 is configured to send the alarm event that has undergone the preset processing to the user terminal.
  • The user terminal includes a terminal corresponding to the system administrator or a terminal corresponding to a user system, and different sending modes may be adopted for different user terminals.
  • A system administrator can be notified by email, short message, telephone, and the like. For a user system, the alarm event can be pushed to the user system for further processing; for example, some users develop their own tool platforms that further process alarm events and make judgments in combination with business logic.
  • the user system can subscribe to the alarm of the monitoring system. When a new alarm is generated in the alarm center of the monitoring system, the directional push is implemented according to the corresponding topic subscribed by the user system, thereby improving the user experience.
  • FIG. 5 is a schematic block diagram of an alarm processing apparatus according to another embodiment of the present application.
  • the alarm processing device 400 can be installed in a server.
  • the alarm processing apparatus 400 includes a data acquisition unit 401, a data determination unit 402, a generation recording unit 403, an event acquisition unit 404, a preset processing unit 405, an event transmission unit 406, an information reception unit 407, and an information storage unit 408.
  • the data obtaining unit 401 is configured to acquire monitoring data of the monitoring object.
  • In addition to the host, the container, the network device, and the middleware, the monitoring object includes the monitoring system itself.
  • acquiring the monitoring data of the monitoring object further includes: acquiring monitoring data of the monitoring system.
  • the data determining unit 402 is configured to determine alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule.
  • The preset alarm rule is an alarm rule configured in advance. It may use the preset threshold range check or other similar checks, which are not limited herein.
  • the generating record unit 403 is configured to generate an alarm event according to the alarm data, and record the alarm event in a preset message queue.
  • An alarm event is generated from the alarm data according to a preset event format, and the generated alarm event is saved in a preset message queue.
  • The preset message queue stores multiple alarm events, which effectively prevents an alarm storm when the server generates many alarm events at the same time.
  • the event obtaining unit 404 is configured to acquire an alarm event in the preset message queue according to a preset acquisition rule.
  • Specifically, the event obtaining unit 404 acquires a preset number of alarm events from the preset message queue according to a preset sequence, acquiring a certain number of alarm events each time in the preset order. The preset number ensures that the alarm events in the preset message queue are processed in batches, thereby avoiding an alarm storm.
  • the preset processing unit 405 is configured to perform preset processing on the alarm event based on the machine learning rule and the preset knowledge base.
  • Alarm events whose monitoring object is the monitoring system itself require a different preset processing mode.
  • Accordingly, the preset processing unit 405 includes an event screening sub-unit 4051 and a self-healing sub-unit 4052.
  • the event screening sub-unit 4051 is configured to filter out a monitoring alarm event from the alarm event, where the monitoring alarm event is an alarm event of the monitoring system.
  • The alarm events include not only alarm events of the monitoring system but also alarm events of the host, the container, or the middleware. Therefore, the monitoring alarm events, which are the alarm events of the monitoring system, need to be filtered out from the alarm events.
  • the identification information of the monitoring object may be obtained by associating the monitoring data of the monitoring object with the identification information. Therefore, the alarm data is also associated with the identification information, and the identification information can be used to filter out the monitoring alarm event from the alarm event.
  • the self-healing sub-unit 4052 is configured to send the monitoring alarm event to the self-healing system to cause the self-healing system to process the fault corresponding to the monitoring alarm event.
  • the self-healing system and the monitoring system can be installed in different servers, and communication connections are established between the servers to complete data interaction.
  • the self-healing system is configured to automatically process faults corresponding to the monitoring alarm events, such as expanding capacity, restarting services, or limiting traffic.
  • the event sending unit 406 is configured to send the alarm event that has undergone the preset processing to the user terminal.
  • the alarm event that has undergone the preset processing is sent to the user terminal so that the user can handle it accordingly and the alarm event is cleared in time.
  • the information receiving unit 407 is configured to receive the feedback information sent by the user terminal, where the feedback information is processing result information generated by the user terminal by performing preset label processing on the preset-processed alarm event.
  • the preset label processing includes: marking the alarm event as invalid, leaving it unprocessed, and feeding this back to the alarm center of the server; or marking the alarm event as a valid alarm and handling the event.
  • User terminals may have automated event processing systems. After receiving the event, the user terminal starts processing and notifies the alarm center of the server of the final processing result, which may specifically be sent through an API callback.
  • the information saving unit 408 saves the processing result information to the preset knowledge base.
  • the server saves the processing result information to the preset knowledge base, and uses the alarm event and the corresponding processing result information as historical alarm events to analyze and process the current alarm event, thereby forming a closed loop mechanism.
  • the entire process is continuously improved, thereby improving the accuracy of the alarm.
  • the above apparatus may be embodied in the form of a computer program that can be run on a computer device as shown in FIG.
  • FIG. 6 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 700 can be a terminal.
  • the terminal can be a communication-enabled electronic device such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, or a wearable device.
  • the computer device 700 includes a processor 720, a network interface 750, and a memory connected by a system bus 710, wherein the memory can include a non-volatile storage medium 730 and an internal memory 740.
  • the non-volatile storage medium 730 can store an operating system 731 and a computer program 732.
  • when the computer program 732 is executed, the processor 720 can be caused to perform an alarm processing method.
  • the processor 720 is used to provide computing and control capabilities to support the operation of the entire computer device 700.
  • the internal memory 740 provides an environment for the operation of the computer program 732 in the non-volatile storage medium 730.
  • the network interface 750 is used for network communication, such as sending assigned tasks. It will be understood by those skilled in the art that the structure shown in FIG. 6 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device 700 to which the solution is applied; a specific computer device 700 may include more or fewer components than shown, may combine some components, or may have a different component arrangement.
  • the processor 720 is configured to run program code stored in the memory to implement the following functions:
  • acquiring the monitoring data of the monitoring object; determining the alarm data in the monitoring data according to the preset alarm rule, wherein the alarm data is monitoring data that triggers the preset alarm rule; generating an alarm event according to the alarm data, and recording the alarm event in a preset message queue; acquiring the alarm event from the preset message queue according to a preset acquisition rule; performing preset processing on the alarm event acquired from the preset message queue based on the machine learning rule and the preset knowledge base; and sending the preset-processed alarm event to the user terminal.
  • when executing the program, the processor 720 further performs: receiving feedback information sent by the user terminal, where the feedback information is processing result information generated by the user terminal by performing preset label processing on the preset-processed alarm event; and saving the processing result information to the preset knowledge base.
  • when executing the program to perform the preset processing, the processor 720 specifically performs: filtering out a monitoring alarm event from the alarm events, where the monitoring alarm event is an alarm event of the monitoring system; and sending the monitoring alarm event to the self-healing system so that the self-healing system processes the fault corresponding to the monitoring alarm event.
  • when executing the program to generate an alarm event according to the alarm data, the processor 720 specifically performs: generating the alarm event from the alarm data according to a preset event format.
  • when executing the program to perform the preset processing, the processor 720 specifically performs: preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm event acquired from the preset message queue, based on the machine learning rule and the preset knowledge base.
  • when executing the program to acquire the alarm event according to the preset acquisition rule, the processor 720 specifically performs: acquiring a preset number of alarm events from the preset message queue according to a preset order.
  • the processor 720 may be a central processing unit (CPU); the processor 720 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the architecture of the computer device 700 illustrated in FIG. 6 does not limit the computer device 700, which may include more or fewer components than illustrated, may combine certain components, or may have a different component arrangement.
  • a storage medium comprising a computer readable storage medium storing a computer program, wherein the computer program comprises program instructions.
  • When executed by a processor, the program instructions implement:
  • acquiring the monitoring data of the monitoring object; determining the alarm data in the monitoring data according to the preset alarm rule, wherein the alarm data is monitoring data that triggers the preset alarm rule; generating an alarm event according to the alarm data, and recording the alarm event in a preset message queue; acquiring the alarm event from the preset message queue according to a preset acquisition rule; performing preset processing on the alarm event acquired from the preset message queue based on the machine learning rule and the preset knowledge base; and sending the preset-processed alarm event to the user terminal.
  • when executed by the processor, the program instructions further implement: receiving feedback information sent by the user terminal, where the feedback information is processing result information generated by the user terminal by performing preset label processing on the preset-processed alarm event; and saving the processing result information to the preset knowledge base.
  • when the program instructions are executed by the processor to perform preset processing, based on the machine learning rule and the preset knowledge base, on the alarm event acquired from the preset message queue, the specific implementation is: filtering out a monitoring alarm event from the alarm events, wherein the monitoring alarm event is an alarm event of the monitoring system; and sending the monitoring alarm event to the self-healing system so that the self-healing system processes the fault corresponding to the monitoring alarm event.
  • when the program instructions are executed by the processor to generate the alarm event according to the alarm data, the specific implementation is: generating the alarm event from the alarm data according to a preset event format.
  • when the program instructions are executed by the processor to perform preset processing, based on the machine learning rule and the preset knowledge base, on the alarm event acquired from the preset message queue, the specific implementation is: performing preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm event acquired from the preset message queue, based on the machine learning rule and the preset knowledge base.
  • when the program instructions are executed by the processor to acquire the alarm event from the preset message queue according to the preset acquisition rule, the specific implementation is: acquiring a preset number of alarm events from the preset message queue according to a preset order.
  • the computer readable storage medium may be a USB flash drive, a removable hard drive, or a read-only memory (ROM, Read-Only Memory).
  • the disclosed alarm processing apparatus and method may be implemented in other manners.
  • the alarm processing device embodiments described above are merely illustrative.
  • the division of each unit is only a logical function division, and there may be another division manner in actual implementation.
  • multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the steps in the method of the embodiment of the present application may be sequentially adjusted, merged, and deleted according to actual needs.
  • the units in the apparatus of the embodiment of the present application may be combined, divided, and deleted according to actual needs.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, can be stored in a computer readable storage medium.
  • the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • the software product includes several instructions that cause a computer device (which may be a personal computer, a terminal, a network device, or the like) to perform all or part of the steps of the method described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Computing Systems (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Alarm Systems (AREA)

Abstract

The present application discloses an alarm processing method, apparatus, computer device, and storage medium. The method includes: acquiring monitoring data of a monitoring object; determining alarm data in the monitoring data and generating an alarm event from it, and recording the alarm event in a preset message queue; acquiring the alarm event from the preset message queue; performing preset processing on the alarm event based on a machine learning rule and a preset knowledge base; and sending the preset-processed alarm event to a user terminal.

Description

Alarm processing method, apparatus, computer device, and storage medium

This application claims priority to Chinese Patent Application No. 2017110017553, filed with the Chinese Patent Office on October 24, 2017 and entitled "Alarm Processing Method, Apparatus, Computer Device, and Storage Medium", the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of communications technologies, and in particular to an alarm processing method, apparatus, computer device, and storage medium.

Background

At present, there are many open-source monitoring systems on the market, such as the Zabbix monitoring system and the Open-falcon monitoring system. These monitoring systems each include an "alarm center", which displays alarm events to operation and maintenance personnel to prompt them to handle the events accordingly. However, most existing alarm centers merely display alarm events. This approach produces a large number of invalid alarms, which prolongs fault recovery time, hinders timely handling by operation and maintenance personnel, and causes greater losses.

Summary of the invention

The present application provides an alarm processing method, apparatus, computer device, and storage medium that can reduce invalid alarms and lower the false alarm rate.

In a first aspect, the present application provides an alarm processing method, applied to a monitoring system, which includes:

acquiring monitoring data of a monitoring object;

determining alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule;

generating an alarm event according to the alarm data, and recording the alarm event in a preset message queue;

acquiring the alarm event from the preset message queue according to a preset acquisition rule;

performing preset processing on the alarm event acquired from the preset message queue based on a machine learning rule and a preset knowledge base; and

sending the preset-processed alarm event to a user terminal.

In a second aspect, the present application provides an alarm processing apparatus, which includes:

a data acquisition unit, configured to acquire monitoring data of a monitoring object;

a data determining unit, configured to determine alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule;

an event generating and recording unit, configured to generate an alarm event according to the alarm data and record the alarm event in a preset message queue;

an event obtaining unit, configured to acquire the alarm event from the preset message queue according to a preset acquisition rule;

a preset processing unit, configured to perform preset processing on the alarm event based on a machine learning rule and a preset knowledge base; and

an event sending unit, configured to send the preset-processed alarm event to a user terminal.

In a third aspect, the present application further provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements any of the alarm processing methods provided by the present application.

In a fourth aspect, the present application also provides a storage medium storing a computer program, where the computer program includes program instructions that, when executed by a processor, cause the processor to perform any of the alarm processing methods provided by the present application.

The present application provides an alarm processing method, apparatus, computer device, and storage medium. The method determines the alarm data in the monitoring data of a monitoring object according to a preset alarm rule; after an alarm event is generated from the alarm data and saved in a preset message queue, the alarm event is acquired from the preset message queue according to a preset acquisition rule, preset processing is performed on it based on a machine learning rule and a preset knowledge base, and the processed alarm event is sent to a user terminal for display to operation and maintenance personnel. The alarm processing method can eliminate alarm storms and also improve alarm accuracy, thereby reducing invalid alarms.

Brief description of the drawings

To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of an alarm processing method according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of an alarm processing method according to another embodiment of the present application;

FIG. 3 is a schematic flowchart of the sub-steps of step S205 in FIG. 2;

FIG. 4 is a schematic block diagram of an alarm processing apparatus according to an embodiment of the present application;

FIG. 5 is a schematic block diagram of an alarm processing apparatus according to another embodiment of the present application;

FIG. 6 is a schematic block diagram of a computer device according to an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.

It should be understood that, when used in this specification and the appended claims, the terms "comprise" and "include" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or sets thereof.

It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.

Referring to FIG. 1, FIG. 1 is a schematic flowchart of an alarm processing method according to an embodiment of the present application. The alarm processing method is applied to a monitoring system, and the monitoring system can run on a server, where the server may be a stand-alone server or a server cluster composed of multiple servers. As shown in FIG. 1, the alarm processing method includes steps S101 to S106.

S101. Acquire monitoring data of a monitoring object.

In this embodiment, the monitoring objects include hosts, containers, network devices, middleware, and the like; the middleware is, for example, a weblogic, tomcat, kafka, or zookeeper component. The monitoring data of a monitoring object includes information such as its usage status. For example, if the monitoring object is a host, the corresponding monitoring data includes information such as the host's CPU usage, memory usage, disk usage, and network traffic.

S102. Determine alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule.

In this embodiment, the preset alarm rule is an alarm rule set in advance. When the alarm center of the monitoring system is initialized, a set of general alarm rules is automatically configured as the preset alarm rule; users with individual requirements can customize or modify the preset alarm rule.

Specifically, the preset alarm rule uses a preset threshold range determination method: it is determined whether a value in the monitoring data is within a preset threshold range. If the value is within the preset threshold range, the preset alarm rule is determined not to be triggered; if the value is not within the preset threshold range, the preset alarm rule is determined to be triggered. The monitoring data that triggers the preset alarm rule is the alarm data.

For example, for the CPU usage in the monitoring data, if the value of the usage falls outside the preset threshold range for CPU usage in the preset alarm rule, that CPU usage is determined to be alarm data.
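The threshold range check described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the `ThresholdRule` class, the sample dictionary layout, and the metric name and bounds are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical preset alarm rule: a metric with an allowed value range."""
    metric: str
    low: float
    high: float

    def triggered(self, value: float) -> bool:
        # The rule is triggered when the value lies outside [low, high].
        return not (self.low <= value <= self.high)


def select_alarm_data(samples, rules):
    """Return the monitoring samples that trigger one of the preset rules."""
    by_metric = {rule.metric: rule for rule in rules}
    alarm_data = []
    for sample in samples:
        rule = by_metric.get(sample["metric"])
        # Metrics without a configured rule never produce alarm data.
        if rule is not None and rule.triggered(sample["value"]):
            alarm_data.append(sample)
    return alarm_data


# Assumed example: CPU usage is acceptable between 0% and 90%.
rules = [ThresholdRule("cpu.used.percent", 0.0, 90.0)]
samples = [
    {"host": "host-a", "metric": "cpu.used.percent", "value": 42.0},
    {"host": "host-b", "metric": "cpu.used.percent", "value": 97.5},
]
alarms = select_alarm_data(samples, rules)
```

Here only the sample from `host-b` is selected as alarm data, because its value lies outside the configured range.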

S103. Generate an alarm event according to the alarm data, and record the alarm event in a preset message queue.

In this embodiment, generating an alarm event according to the alarm data includes: generating the alarm event from the alarm data according to a preset event format. The preset event format standardizes alarm events, which simplifies subsequent parsing.

Specifically, the preset event format may be defined in advance so that the values of the corresponding fields in an alarm event can later be parsed to obtain the desired information. After an alarm event generated according to the preset event format is pushed to each user system that has subscribed to alarm events, the user system parses it according to the alarm event format defined by the alarm center to obtain the relevant information. This design reduces the complexity of the alarm center's interface design.

In an embodiment, the preset event format is, for example, a preset JSON format, as follows:

{Detail:[P3#1/1]net_monitor_1cpu.used.percentall(#3)100>=30,Entity:net_monitor_1,Status:1,Url:abc.com,Group:app_type=APP,app_name=NET-MONITOR >}

In the above JSON format, within [P3#1/1], P3 denotes the alarm priority and #1/1 denotes the current alarm count/maximum alarm count; within net_monitor_1cpu.used.percentall(#3)100>=30, net_monitor_1 denotes the system name corresponding to the monitoring object, and the expression concerns CPU usage; Entity:net_monitor_1 indicates an entity network; Status:1 denotes the status of the alarm event; Url:abc.com denotes the URL address of the monitoring object; Group:app_type=APP denotes the category of the monitoring object; and app_name=NET-MONITOR denotes the subcategory of the monitoring object.
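As a rough illustration of how a subscriber might work with such an event, the sketch below builds an event with the field names from the example (Detail, Entity, Status, Url, Group), serializes it as strict JSON, and parses the `[P3#1/1]` prefix of the Detail field. The helper names `build_event` and `parse_detail`, the quoting of keys and values, and the regular expression are assumptions made here; the source does not specify the exact wire syntax.

```python
import json
import re

# Assumed shape of the Detail prefix: [P<priority>#<current>/<maximum>]
DETAIL_PREFIX = re.compile(
    r"^\[P(?P<priority>\d+)#(?P<current>\d+)/(?P<maximum>\d+)\]"
)


def build_event(priority, current, maximum, expression,
                entity, status, url, group):
    """Assemble an alarm event using the field names from the example."""
    return {
        "Detail": f"[P{priority}#{current}/{maximum}]{expression}",
        "Entity": entity,
        "Status": status,
        "Url": url,
        "Group": group,
    }


def parse_detail(detail):
    """Split the Detail field into priority, alarm counts, and expression."""
    match = DETAIL_PREFIX.match(detail)
    if match is None:
        raise ValueError(f"unexpected Detail format: {detail!r}")
    return {
        "priority": int(match.group("priority")),
        "current": int(match.group("current")),
        "maximum": int(match.group("maximum")),
        "expression": detail[match.end():],
    }


event = build_event(
    3, 1, 1, "net_monitor_1cpu.used.percentall(#3)100>=30",
    "net_monitor_1", 1, "abc.com", "app_type=APP,app_name=NET-MONITOR",
)
wire = json.dumps(event)      # what the alarm center would push
decoded = json.loads(wire)    # what a subscribing user system would read
info = parse_detail(decoded["Detail"])
```

Because every subscriber parses the same fixed set of fields, the alarm center only needs to expose one event shape, which is the interface simplification the text refers to.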

After the alarm event is generated from the alarm data according to the preset event format, the alarm event is also sent to and saved in the preset message queue. The preset message queue is stored in a preset database, and its use can eliminate alarm storms.

S104. Acquire the alarm event from the preset message queue according to a preset acquisition rule.

In this embodiment, because the preset message queue contains multiple alarm events, the alarm events are acquired from it according to a preset acquisition rule. For example, an alarm event may also include priority level information or alarm severity information; correspondingly, the preset acquisition rule may be to acquire alarm events in order of priority level or alarm severity. The alarm events in the preset message queue can thus be processed in a reasonable and orderly way, with important alarms resolved first, which indirectly improves alarm processing efficiency.

In an embodiment, acquiring the alarm event from the preset message queue according to the preset acquisition rule includes: acquiring a preset number of alarm events from the preset message queue according to a preset order. Specifically, the preset order may be the priority order of the alarm events or the chronological order of their generation times, and a certain number of alarm events is acquired each time in the preset order. The preset number thus ensures that the alarm events in the preset message queue are processed in batches, which avoids an alarm storm.
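The ordered, batched acquisition described above can be sketched with an in-memory priority queue. This is only an illustration under stated assumptions: the source stores the queue in a preset database, whereas the hypothetical `AlarmQueue` below uses Python's `heapq`; the `priority` and `created_at` fields, and the convention that a smaller priority number is more urgent, are also assumptions.

```python
import heapq
import itertools


class AlarmQueue:
    """Hypothetical in-memory stand-in for the preset message queue."""

    def __init__(self):
        self._heap = []
        # Unique tie-breaker so that event dicts are never compared directly.
        self._counter = itertools.count()

    def put(self, event):
        # Order by priority first, then by generation time.
        key = (event["priority"], event["created_at"], next(self._counter))
        heapq.heappush(self._heap, key + (event,))

    def take_batch(self, batch_size):
        """Remove and return at most batch_size events in the preset order."""
        batch = []
        while self._heap and len(batch) < batch_size:
            batch.append(heapq.heappop(self._heap)[-1])
        return batch


queue = AlarmQueue()
queue.put({"id": "e1", "priority": 3, "created_at": 10})
queue.put({"id": "e2", "priority": 1, "created_at": 20})
queue.put({"id": "e3", "priority": 2, "created_at": 30})

first_batch = [e["id"] for e in queue.take_batch(2)]
second_batch = [e["id"] for e in queue.take_batch(2)]
```

With a batch size of 2, the first batch holds the two most urgent events and the second batch holds the remaining one, so downstream processing never receives more than the preset number of events at once.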

S105. Perform preset processing on the alarm event acquired from the preset message queue based on a machine learning rule and a preset knowledge base.

In this embodiment, performing preset processing on the alarm event acquired from the preset message queue based on the machine learning rule and the preset knowledge base specifically includes: performing preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm event acquired from the preset message queue, based on the machine learning rule and the preset knowledge base.

The machine learning rule is mainly a rule model obtained by using an algorithm to learn from a large number of alarm events; the rule model is then applied to analyze and process current alarm events. The algorithm is a machine learning algorithm, for example exponential-smoothing-based double or triple smoothing, decomposition-based Fourier or wavelet decomposition, or deep-learning-based feedforward neural networks or recurrent neural networks (RNN). These algorithms need to be trained on a large amount of online historical data (historical alarm events) to derive a relatively accurate alarm strategy; this alarm strategy is the rule model used to analyze and process current alarm events. In addition, multiple algorithms may be trained at the same time. Because each algorithm performs differently in different scenarios, the weight of each algorithm is adjusted by comparison with historical results, finally yielding a common alarm strategy.
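To make the exponential-smoothing family mentioned above concrete, the following sketch uses Holt's linear trend method, one common form of double exponential smoothing, to forecast the next value of a metric and flag a large deviation. The choice of Holt's method, the smoothing parameters `alpha` and `beta`, and the fixed `tolerance` are assumptions made here; the source does not state which variant or parameters are used.

```python
def holt_forecast(series, alpha=0.5, beta=0.3):
    """One-step-ahead forecast using Holt's linear (double) smoothing."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Smooth the level, then smooth the trend from the level change.
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + trend


def deviates(history, actual, tolerance):
    """True when the actual value is further than tolerance from the forecast."""
    return abs(actual - holt_forecast(history)) > tolerance


history = [1.0, 2.0, 3.0, 4.0, 5.0]
forecast = holt_forecast(history)      # close to 6.0 for this linear series
normal = deviates(history, 6.5, 2.0)   # small deviation from the forecast
spike = deviates(history, 20.0, 2.0)   # large deviation from the forecast
```

A weighted combination of several such predictors, with weights tuned against historical results, would correspond to the common alarm strategy described in the text.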

The preset knowledge base stores alarm events, alarm event processing result information, analysis result information, and the like, to assist in analyzing and processing the alarm events.

Specifically, the analysis methods used by the preset alarm analysis processing include methods based on historical data statistics, methods that assume a normal distribution, the 3-sigma strategy, and the like. With these standard statistical methods a reasonable alarm threshold can be calculated automatically, and an alarm is triggered when a performance indicator rises above or falls below that threshold. These calculated thresholds are also set as the preset threshold range in the preset alarm rule, which forms a closed control loop and thereby improves alarm accuracy.
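As a minimal sketch of the 3-sigma strategy, the threshold range can be derived from the mean and standard deviation of historical samples, assuming the indicator is roughly normally distributed. The function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def three_sigma_range(history):
    """Return (low, high) as mean minus/plus three population standard deviations."""
    mu = mean(history)
    sigma = pstdev(history)
    return mu - 3 * sigma, mu + 3 * sigma


def triggers_alarm(value, threshold_range):
    """An alarm is triggered when the value falls outside the threshold range."""
    low, high = threshold_range
    return value < low or value > high


if __name__ == "__main__":
    cpu_history = [48, 50, 52, 50, 49, 51]  # illustrative CPU usage samples (%)
    preset_range = three_sigma_range(cpu_history)
    print(preset_range, triggers_alarm(60, preset_range))
```

Recomputing the range periodically from fresh history and writing it back into the alarm rule is what closes the loop described above.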

Specifically, the preset alarm convergence processing includes: merging alarm events according to a preset time window, for example merging the alarm events generated within a certain period into a single event before sending; or merging alarm events that come from the same monitoring policy, for example merging multiple alarm events reporting CPU usage above 90% into one alarm event; or merging alarm events that concern the same alarm object, for example merging the CPU, memory, and disk alarms raised for host A within a certain time window into one alarm event for host A.
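The merge by alarm object within a time window could look like the following sketch. It assumes each event is a dictionary with hypothetical `target`, `metric`, and `timestamp` (seconds) fields and uses fixed, non-overlapping windows; the source does not prescribe the event layout or the window type.

```python
from collections import defaultdict


def converge_by_object(events, window_seconds):
    """Merge events that share a target and fall into the same fixed time window."""
    groups = defaultdict(list)
    for event in events:
        bucket = event["timestamp"] // window_seconds
        groups[(event["target"], bucket)].append(event)

    merged = []
    for (target, bucket), members in sorted(groups.items()):
        merged.append({
            "target": target,
            "window_start": bucket * window_seconds,
            "metrics": sorted({m["metric"] for m in members}),
            "count": len(members),
        })
    return merged


if __name__ == "__main__":
    events = [
        {"target": "host-A", "metric": "cpu", "timestamp": 10},
        {"target": "host-A", "metric": "memory", "timestamp": 20},
        {"target": "host-A", "metric": "disk", "timestamp": 70},
        {"target": "host-B", "metric": "cpu", "timestamp": 15},
    ]
    for merged_event in converge_by_object(events, window_seconds=60):
        print(merged_event)
```

Grouping by monitoring policy instead of alarm object only changes the grouping key.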

Specifically, the preset alarm aggregation processing includes association mining processing and exception dependency processing. The merging strategy of association mining processing mines the associations between alarm events, including associations across multiple time series, and merges multiple alarm events into one alarm event before sending. Exception dependency processing covers the case in which one exception depends on another. For example, a disk failure may bring a host down; if a disk failure alarm event and a host-unreachable alarm event are received at the same time, the two can be converged into one alarm event through this dependency.
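Exception dependency processing can be sketched as folding a symptom event into its root-cause event when both are present for the same target. The dependency table, the event fields `target` and `type`, and the type names are assumptions introduced for this example.

```python
def aggregate_by_dependency(events, dependencies):
    """Fold symptom events into their root-cause event when both occur together.

    `dependencies` maps a symptom type to the root-cause type it depends on.
    """
    present = {(e["target"], e["type"]) for e in events}
    kept = {}
    for e in events:
        cause = dependencies.get(e["type"])
        if cause is not None and (e["target"], cause) in present:
            key = (e["target"], cause)      # symptom: attach to the root cause
        else:
            key = (e["target"], e["type"])  # standalone event or root cause
        kept.setdefault(key, {"target": key[0], "type": key[1], "related": []})
        if key[1] != e["type"]:
            kept[key]["related"].append(e["type"])
    return list(kept.values())


if __name__ == "__main__":
    deps = {"host_unreachable": "disk_failure"}
    events = [
        {"target": "host-A", "type": "host_unreachable"},
        {"target": "host-A", "type": "disk_failure"},
        {"target": "host-B", "type": "host_unreachable"},
    ]
    for event in aggregate_by_dependency(events, deps):
        print(event)
```

The host-unreachable alarm for host-B is kept as its own event because no disk failure was reported for that host.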

It should be noted that the preset knowledge base and the preset message queue are both stored in the same database, so that alarm events can conveniently be loaded into the database for preset processing. The preset knowledge base stores historical alarm events and their related information.

S106. Send the alarm events that have undergone the preset processing to a user terminal.

In this embodiment, the user terminal includes a terminal of a system administrator, a terminal of a user system, or the like, where the system administrator is the person responsible for managing the monitoring system. Different sending methods may be used for different user terminals. Specifically, a system administrator may be notified by email, SMS, telephone call, or the like. For a user system, the alarm event may be pushed to the user's own system for further processing. For example, some users develop their own tool platforms that process alarm events further and make decisions in combination with business logic; to meet this need, the user system can subscribe to the alarms of the monitoring system. When the alarm center of the monitoring system produces a new alarm, the alarm is pushed to the user systems that subscribed to the corresponding topic, which improves the user experience.
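The topic-based push to user systems can be pictured as a small in-process publish/subscribe sketch. The class name `AlarmCenter`, the topic strings, and the callback interface are hypothetical; a real deployment would more likely deliver events over a network transport.

```python
from collections import defaultdict


class AlarmCenter:
    """Minimal in-memory alarm center with topic subscriptions."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a user-system callback for one topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Push the event to every subscriber of the topic; return how many were notified."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


if __name__ == "__main__":
    center = AlarmCenter()
    received = []
    center.subscribe("host.cpu", received.append)
    center.publish("host.cpu", {"target": "host-A", "value": 95})
    center.publish("host.disk", {"target": "host-A", "value": 99})  # no subscriber
    print(received)
```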

The alarm processing method of the above embodiment determines the alarm data in the monitoring data of the monitored object according to the preset alarm rule. After generating alarm events from the alarm data and saving them in the preset message queue, it obtains the alarm events from the preset message queue according to the preset acquisition rule, performs preset processing on them based on the machine learning rule and the preset knowledge base, and sends the processed alarm events to the user terminal for display to operation and maintenance personnel. The method can eliminate alarm storms and also improve alarm accuracy, thereby reducing invalid alarms.

Refer to FIG. 2, which is a schematic flowchart of an alarm processing method provided by another embodiment of the present application. The alarm processing method is applied in a monitoring system that can run on a server, where the server may be a standalone server or a server cluster composed of multiple servers. As shown in FIG. 2, the alarm processing method includes steps S201 to S208.

S201. Acquire monitoring data of the monitored object.

In addition to hosts, containers, network devices, middleware, and the like, the monitored objects also include the monitoring system itself.

In this embodiment, acquiring the monitoring data of the monitored object further includes acquiring monitoring data of the monitoring system.

S202. Determine the alarm data in the monitoring data according to a preset alarm rule, where the alarm data is the monitoring data that triggers the preset alarm rule.

The preset alarm rule is an alarm rule set in advance. It specifically uses a preset-threshold-range determination method; other similar determination methods may also be used, which is not limited here.

S203. Generate an alarm event according to the alarm data, and record the alarm event in a preset message queue.

In this embodiment, the alarm event is likewise generated from the alarm data according to a preset event format, and the generated alarm event is saved in the preset message queue.

The preset message queue is used to store multiple alarm events, which effectively prevents the alarm storm that would otherwise arise when the server has to generate multiple alarm events at the same time.

S204. Obtain the alarm events in the preset message queue according to a preset acquisition rule.

Specifically, a preset number of alarm events are obtained from the preset message queue in a preset order. The preset order may be the priority order of the alarm events or the chronological order of their generation times, and a certain number of alarm events are obtained each time in that order. The preset number therefore ensures that the alarm events in the preset message queue are processed in batches, which avoids alarm storms.
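Batch retrieval in priority order could be sketched with a heap-backed queue. The class `AlarmQueue`, the convention that a lower number means higher priority, and the event fields are assumptions for this example; the source leaves the queue implementation open.

```python
import heapq
import itertools


class AlarmQueue:
    """In-memory queue that hands out alarm events in priority order, in batches."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def put(self, event):
        # Lower "priority" value is served first (assumed convention).
        heapq.heappush(self._heap, (event["priority"], next(self._counter), event))

    def get_batch(self, batch_size):
        """Return up to batch_size events, highest priority first."""
        batch = []
        while self._heap and len(batch) < batch_size:
            batch.append(heapq.heappop(self._heap)[2])
        return batch


if __name__ == "__main__":
    queue = AlarmQueue()
    for name, priority in [("a", 2), ("b", 1), ("c", 3), ("d", 1)]:
        queue.put({"name": name, "priority": priority})
    print([e["name"] for e in queue.get_batch(2)])
    print([e["name"] for e in queue.get_batch(5)])
```

Ordering by generation time instead would only require using the event timestamp as the heap key.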

S205. Perform preset processing on the alarm events obtained from the preset message queue, based on the machine learning rule and the preset knowledge base.

For alarm events whose monitored object is not the monitoring system, the preset processing method of the above embodiment is still used and is not described in detail here.

In this embodiment, for alarm events whose monitored object is the monitoring system itself, the method steps shown in FIG. 3 are used, specifically steps S205a and S205b.

S205a. Filter the monitoring alarm events out of the alarm events, where a monitoring alarm event is an alarm event of the monitoring system.

Because the alarm events include not only alarm events of the monitoring system but also alarm events of hosts, containers, or middleware, the monitoring alarm events, that is, the alarm events of the monitoring system, need to be filtered out of the alarm events.

Specifically, identification information of each monitored object may be obtained and associated with the monitoring data of that monitored object. The alarm data therefore also carries the identification information, which can be used to filter the monitoring alarm events out of the alarm events.
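Filtering by identification information can be sketched as a simple partition of the event list. The field name `source_id` and the identifier value are hypothetical; the source only states that the alarm data carries the identification information of the monitored object.

```python
def split_monitoring_events(events, monitoring_system_id):
    """Separate the monitoring system's own alarm events from all other events."""
    own, others = [], []
    for event in events:
        if event["source_id"] == monitoring_system_id:
            own.append(event)
        else:
            others.append(event)
    return own, others


if __name__ == "__main__":
    events = [
        {"source_id": "monitor-1", "metric": "queue_lag"},
        {"source_id": "host-A", "metric": "cpu"},
    ]
    own, others = split_monitoring_events(events, "monitor-1")
    print(own, others)
```

The first list would go to the self-healing path, and the second would follow the ordinary preset processing.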

S205b. Send the monitoring alarm event to a self-healing system, so that the self-healing system handles the fault corresponding to the monitoring alarm event.

The self-healing system and the monitoring system may be installed on different servers, with a communication connection established between the servers for data exchange. The self-healing system automatically handles the fault corresponding to the monitoring alarm event, for example by expanding capacity, restarting a service, or limiting traffic.

S206. Send the alarm events that have undergone the preset processing to the user terminal.

The alarm events that have undergone the preset processing are sent to the user terminal so that the user can handle them accordingly and clear them in time.

S207. Receive feedback information sent by the user terminal, where the feedback information is the processing result information generated when the user terminal performs preset marking processing on the alarm event that has undergone the preset processing.

In this embodiment, the preset marking processing includes: marking the alarm event as invalid, taking no further action, and reporting this back to the alarm center of the server; or marking the alarm event as valid and handling the event. The user terminal may have an automated event handling system. After receiving an event, the user terminal starts handling it and notifies the alarm center of the server of the final result, which may specifically be sent through an API callback.

S208. Save the processing result information to the preset knowledge base.

In this embodiment, the server saves the processing result information to the preset knowledge base, and uses the alarm events together with their processing result information as historical alarm events when analyzing and processing current alarm events. This forms a closed-loop mechanism through which the whole process is continuously refined, thereby improving alarm accuracy.
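The feedback loop of steps S207 and S208 can be pictured with a small in-memory knowledge base that records whether each alarm was marked valid or invalid. The class, the `rule_id` key, and the idea of tracking an invalid-alarm ratio per rule are assumptions made for this sketch; the source does not say how the stored feedback is used numerically.

```python
from collections import defaultdict


class KnowledgeBase:
    """In-memory stand-in for the preset knowledge base."""

    def __init__(self):
        self._records = defaultdict(list)

    def save_feedback(self, rule_id, event_id, valid):
        """Store the user terminal's marking result for one alarm event."""
        self._records[rule_id].append({"event_id": event_id, "valid": valid})

    def invalid_ratio(self, rule_id):
        """Share of alarms from this rule that users marked as invalid."""
        records = self._records.get(rule_id, [])
        if not records:
            return 0.0
        invalid = sum(1 for record in records if not record["valid"])
        return invalid / len(records)


if __name__ == "__main__":
    kb = KnowledgeBase()
    kb.save_feedback("cpu-high", "evt-1", valid=True)
    kb.save_feedback("cpu-high", "evt-2", valid=False)
    kb.save_feedback("cpu-high", "evt-3", valid=True)
    print(kb.invalid_ratio("cpu-high"))
```

A rule with a high invalid ratio would be a natural candidate for having its threshold range recalculated.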

The alarm processing method provided by the above embodiment can eliminate alarm storms and can also perform self-healing on the faults behind the monitoring system's own alarms, ensuring that the monitoring system runs normally. It can further receive the feedback information sent by the user terminal and save the processing result information about the alarm events contained in that feedback in the preset database, for use in analyzing and processing subsequent alarm events. This forms a closed-loop analysis mechanism that continuously improves the alarm analysis and processing capability of the monitoring system, and thus improves alarm accuracy.

An embodiment of the present application further provides an alarm processing apparatus for executing the foregoing alarm processing method. Specifically, refer to FIG. 4, which is a schematic block diagram of an alarm processing apparatus provided by an embodiment of the present application. The alarm processing apparatus 300 may be installed in a server.

As shown in FIG. 4, the alarm processing apparatus 300 includes a data acquisition unit 301, a data determination unit 302, a generation and recording unit 303, an event acquisition unit 304, a preset processing unit 305, and an event sending unit 306.

The data acquisition unit 301 is configured to acquire monitoring data of the monitored object.

The monitored objects include hosts, containers, network devices, middleware, and the like; the middleware is, for example, a weblogic, tomcat, kafka, or zookeeper component. The monitoring data of a monitored object includes information such as its usage status. For example, when the monitored object is a host, the monitoring data includes the host's CPU usage, memory usage, disk usage, network traffic, and similar information.

The data determination unit 302 is configured to determine the alarm data in the monitoring data according to a preset alarm rule, where the alarm data is the monitoring data that triggers the preset alarm rule.

The preset alarm rule is an alarm rule set in advance. When the alarm center of the monitoring system is initialized, a general set of alarm rules is configured automatically as the preset alarm rules; users with individual requirements may customize or modify the preset alarm rules.

Specifically, the preset alarm rule uses a preset-threshold-range determination method, which checks whether a value in the monitoring data lies within the preset threshold range. If the value lies within the preset threshold range, the preset alarm rule is determined not to be triggered; if the value lies outside the preset threshold range, the preset alarm rule is determined to be triggered. The monitoring data that triggers the preset alarm rule is the alarm data.

The generation and recording unit 303 is configured to generate an alarm event according to the alarm data and record the alarm event in a preset message queue.

Generating an alarm event according to the alarm data includes generating the alarm event from the alarm data according to a preset event format. The preset event format standardizes alarm events and facilitates subsequent parsing.
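A preset event format might be represented as a fixed record type that is serialized before being placed in the message queue. The field set below (`target`, `metric`, `value`, `priority`, `timestamp`) and the choice of JSON are assumptions for illustration; the source does not define the format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AlarmEvent:
    """Hypothetical preset event format."""
    target: str
    metric: str
    value: float
    priority: int
    timestamp: int


def to_message(event):
    """Serialize an event into the string stored in the message queue."""
    return json.dumps(asdict(event), sort_keys=True)


def from_message(message):
    """Parse a queued message back into an AlarmEvent."""
    return AlarmEvent(**json.loads(message))


if __name__ == "__main__":
    event = AlarmEvent("host-A", "cpu", 95.0, 1, 1508889600)
    message = to_message(event)
    print(message)
    print(from_message(message) == event)
```

Because every event shares the same fields, downstream convergence and aggregation steps can parse events without special cases.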

After the alarm event is generated from the alarm data according to the preset event format, it also needs to be sent to and saved in the preset message queue. The preset message queue is stored in a preset database, and using it can eliminate alarm storms.

The event acquisition unit 304 is configured to obtain the alarm events in the preset message queue according to a preset acquisition rule.

Because the preset message queue contains multiple alarm events, the alarm events are obtained from it according to the preset acquisition rule. For example, an alarm event may also include priority information, alarm severity information, or the like; correspondingly, the preset acquisition rule may be to obtain the alarm events from the preset message queue in order of priority or alarm severity. The alarm events in the queue can thus be processed in a reasonable and orderly manner, with important alarms resolved first, which indirectly improves alarm processing efficiency.

In an embodiment, the event acquisition unit 304 is specifically configured to obtain a preset number of alarm events from the preset message queue in a preset order, that is, to obtain a certain number of alarm events each time. The preset number therefore ensures that the alarm events in the preset message queue are processed in batches, which avoids alarm storms.

The preset processing unit 305 is configured to perform preset processing on the alarm events obtained from the preset message queue, based on the machine learning rule and the preset knowledge base.

Specifically, the preset processing unit 305 is configured to perform preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm events obtained from the preset message queue, based on the machine learning rule and the preset knowledge base. Accordingly, the preset processing unit 305 may include an alarm analysis subunit 3051, an alarm convergence subunit 3052, and an alarm aggregation subunit 3053.

Specifically, the alarm analysis subunit 3051 is configured to automatically calculate a reasonable alarm threshold using standard statistical methods, such as methods based on historical data statistics, methods that assume a normal distribution, and the 3-sigma strategy, and to trigger an alarm when a performance indicator rises above or falls below that threshold. These calculated thresholds are also set as the preset threshold range in the preset alarm rule, which forms a closed control loop and thereby improves alarm accuracy.

Specifically, the alarm convergence subunit 3052 is configured to: merge alarm events according to a preset time window; or merge alarm events that come from the same monitoring policy; or merge alarm events that concern the same alarm object. For example, the alarm events generated within a certain period are merged into a single event before sending; multiple alarm events reporting CPU usage above 90% are merged into one alarm event; or the CPU, memory, and disk alarms raised for host A within a certain time window are merged into one alarm event for host A.

Specifically, the alarm aggregation subunit 3053 is configured to perform association mining processing and exception dependency processing. The merging strategy of association mining processing mines the associations between alarm events, including associations across multiple time series, and merges multiple alarm events into one alarm event before sending. Exception dependency processing covers the case in which one exception depends on another. For example, a disk failure may bring a host down; if a disk failure alarm event and a host-unreachable alarm event are received at the same time, the two can be converged into one alarm event through this dependency.

The event sending unit 306 is configured to send the alarm events that have undergone the preset processing to a user terminal.

The user terminal includes a terminal of a system administrator, a terminal of a user system, or the like, and different sending methods may be used for different user terminals. Specifically, a system administrator may be notified by email, SMS, telephone call, or the like. For a user system, the alarm event may be pushed to the user system for further processing. For example, some users develop their own tool platforms that process alarm events further and make decisions in combination with business logic; to meet this need, the user system can subscribe to the alarms of the monitoring system. When the alarm center of the monitoring system produces a new alarm, the alarm is pushed to the user systems that subscribed to the corresponding topic, which also improves the user experience.

An embodiment of the present application further provides another alarm processing apparatus for executing the foregoing alarm processing method. Specifically, refer to FIG. 5, which is a schematic block diagram of an alarm processing apparatus provided by an embodiment of the present application. The alarm processing apparatus 400 may be installed in a server.

As shown in FIG. 5, the alarm processing apparatus 400 includes a data acquisition unit 401, a data determination unit 402, a generation and recording unit 403, an event acquisition unit 404, a preset processing unit 405, an event sending unit 406, an information receiving unit 407, and an information saving unit 408.

The data acquisition unit 401 is configured to acquire monitoring data of the monitored object.

In addition to hosts, containers, network devices, middleware, and the like, the monitored objects also include the monitoring system itself. In this embodiment, acquiring the monitoring data of the monitored object further includes acquiring monitoring data of the monitoring system.

The data determination unit 402 is configured to determine the alarm data in the monitoring data according to a preset alarm rule, where the alarm data is the monitoring data that triggers the preset alarm rule.

The preset alarm rule is an alarm rule set in advance. It specifically uses a preset-threshold-range determination method; other similar determination methods may also be used, which is not limited here.

The generation and recording unit 403 is configured to generate an alarm event according to the alarm data and record the alarm event in a preset message queue.

Here too, the alarm event is generated from the alarm data according to a preset event format, and the generated alarm event is saved in the preset message queue. The preset message queue is used to store multiple alarm events, which effectively prevents the alarm storm that would otherwise arise when the server has to generate multiple alarm events at the same time.

The event acquisition unit 404 is configured to obtain the alarm events in the preset message queue according to a preset acquisition rule.

The event acquisition unit 404 is specifically configured to obtain a preset number of alarm events from the preset message queue in a preset order, that is, to obtain a certain number of alarm events each time in that order. The preset number therefore ensures that the alarm events in the preset message queue are processed in batches, which avoids alarm storms.

The preset processing unit 405 is configured to perform preset processing on the alarm events based on the machine learning rule and the preset knowledge base.

Alarm events whose monitored object is not the monitoring system and alarm events whose monitored object is the monitoring system itself require different preset processing. Accordingly, the preset processing unit 405 includes an event filtering subunit 4051 and a self-healing sending subunit 4052.

The event filtering subunit 4051 is configured to filter the monitoring alarm events out of the alarm events, where a monitoring alarm event is an alarm event of the monitoring system.

Because the alarm events include not only alarm events of the monitoring system but also alarm events of hosts, containers, or middleware, the monitoring alarm events, that is, the alarm events of the monitoring system, need to be filtered out of the alarm events.

Specifically, identification information of each monitored object may be obtained and associated with the monitoring data of that monitored object. The alarm data therefore also carries the identification information, which can be used to filter the monitoring alarm events out of the alarm events.

The self-healing sending subunit 4052 is configured to send the monitoring alarm event to a self-healing system, so that the self-healing system handles the fault corresponding to the monitoring alarm event.

The self-healing system and the monitoring system may be installed on different servers, with a communication connection established between the servers for data exchange. The self-healing system automatically handles the fault corresponding to the monitoring alarm event, for example by expanding capacity, restarting a service, or limiting traffic.

The event sending unit 406 is configured to send the alarm events that have undergone the preset processing to the user terminal.

The alarm events that have undergone the preset processing are sent to the user terminal so that the user can handle them accordingly and clear them in time.

The information receiving unit 407 is configured to receive feedback information sent by the user terminal, where the feedback information is the processing result information generated when the user terminal performs preset marking processing on the alarm event that has undergone the preset processing.

The preset marking processing includes: marking the alarm event as invalid, taking no further action, and reporting this back to the alarm center of the server; or marking the alarm event as valid and handling the event. The user terminal may have an automated event handling system. After receiving an event, the user terminal starts handling it and notifies the alarm center of the server of the final result, which may specifically be sent through an API callback.

The information saving unit 408 is configured to save the processing result information to the preset knowledge base.

The server saves the processing result information to the preset knowledge base, and uses the alarm events together with their processing result information as historical alarm events when analyzing and processing current alarm events. This forms a closed-loop mechanism through which the whole process is continuously refined, thereby improving alarm accuracy.

The above apparatus may be implemented in the form of a computer program that can run on a computer device as shown in FIG. 6.

Referring to FIG. 6, FIG. 6 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 700 may be a terminal. The terminal may be an electronic device with a communication function, such as a smartphone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, or a wearable device.

As shown in FIG. 6, the computer device 700 includes a processor 720, a network interface 750, and a memory connected via a system bus 710, where the memory may include a non-volatile storage medium 730 and an internal memory 740.

The non-volatile storage medium 730 may store an operating system 731 and a computer program 732. When executed, the computer program 732 causes the processor 720 to perform an alarm processing method.

The processor 720 provides computing and control capabilities and supports the operation of the entire computer device 700.

The internal memory 740 provides an environment for running the computer program 732 stored in the non-volatile storage medium 730.

The network interface 750 is used for network communication, such as sending assigned tasks. Those skilled in the art will understand that the structure shown in FIG. 6 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device 700 to which the solution is applied. A specific computer device 700 may include more or fewer components than shown, combine certain components, or have a different arrangement of components.

The processor 720 is configured to run program code stored in the memory to implement the following functions:

acquiring monitoring data of a monitored object; determining alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule; generating an alarm event according to the alarm data and recording the alarm event in a preset message queue; acquiring alarm events from the preset message queue according to a preset acquisition rule; performing preset processing on the alarm events acquired from the preset message queue based on a machine learning rule and a preset knowledge base; and sending the alarm events that have undergone the preset processing to a user terminal.
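The sequence of functions listed above can be read as a small pipeline. The following sketch assumes, purely for illustration, that monitoring samples are dictionaries, that alarm rules are predicates over a sample, and that the standard-library `queue.Queue` stands in for the preset message queue; `run_pipeline`, `process`, and `send` are hypothetical names, and the machine-learning step is left as a pluggable function.

```python
import queue
from typing import Callable, Iterable, List


def run_pipeline(samples: Iterable[dict],
                 rules: List[Callable[[dict], bool]],
                 process: Callable[[List[dict]], List[dict]],
                 send: Callable[[dict], None]) -> int:
    """Run one pass of the alarm flow and return the number of events sent."""
    q = queue.Queue()  # stands in for the preset message queue

    # Determine alarm data (samples that trigger any rule) and record
    # one alarm event per triggering sample in the queue.
    for sample in samples:
        if any(rule(sample) for rule in rules):
            q.put({"type": "alarm", "data": sample})

    # Acquire the events from the queue (here: simple FIFO drain).
    events = []
    while not q.empty():
        events.append(q.get())

    # Preset processing (analysis, convergence, aggregation) is delegated.
    processed = process(events)

    # Send the processed events to the user terminal.
    for event in processed:
        send(event)
    return len(processed)
```

For example, calling it with one rule `lambda s: s["cpu"] > 90` and an identity `process` function forwards only the samples whose CPU reading exceeds 90.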

In an embodiment, after sending the alarm events that have undergone the preset processing to the user terminal, the processor 720 further performs the following: receiving feedback information sent by the user terminal, where the feedback information is processing result information generated by the user terminal by performing preset marking processing on the alarm events that have undergone the preset processing; and saving the processing result information to the preset knowledge base.

In an embodiment, the processor 720 specifically performs the following: selecting monitoring alarm events from the alarm events, where a monitoring alarm event is an alarm event of the monitoring system; and sending the monitoring alarm events to a self-healing system so that the self-healing system handles the faults corresponding to the monitoring alarm events.
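A rough sketch of this selection-and-forwarding step is given below. It assumes each alarm event is a dictionary carrying a `source` field and that monitoring alarm events are those whose source equals `"monitoring"`; both the field name and the value are hypothetical, as is the `heal` callable standing in for the self-healing system.

```python
from typing import Callable, List


def route_to_self_healing(events: List[dict],
                          heal: Callable[[dict], None],
                          source_field: str = "source",
                          monitoring_source: str = "monitoring") -> List[dict]:
    """Select monitoring alarm events and hand each one to the self-healing system."""
    # Keep only events that originate from the monitoring system.
    monitoring_events = [
        e for e in events if e.get(source_field) == monitoring_source
    ]
    for event in monitoring_events:
        heal(event)  # the self-healing system handles the corresponding fault
    return monitoring_events
```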

In an embodiment, the processor 720 specifically performs the following: generating the alarm event from the alarm data according to a preset event format.
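The application does not spell out the preset event format here, so the sketch below simply assumes one: a fixed set of fields serialized as JSON. The `AlarmEvent` fields, the input keys (`host`, `metric`, `value`, `ts`), and the `to_event` function are all hypothetical and serve only to show what normalizing alarm data into a uniform event might look like.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AlarmEvent:
    """Assumed uniform event layout; not taken from the application."""
    source: str
    metric: str
    value: float
    level: str
    timestamp: float


def to_event(alarm_data: dict, level: str = "warning") -> str:
    """Convert raw alarm data into a JSON string in the assumed event format."""
    event = AlarmEvent(
        source=alarm_data["host"],
        metric=alarm_data["metric"],
        value=float(alarm_data["value"]),
        level=level,
        timestamp=float(alarm_data["ts"]),
    )
    # sort_keys gives a stable field order for downstream consumers.
    return json.dumps(asdict(event), sort_keys=True)
```

A single format of this kind lets every downstream consumer of the message queue parse events the same way, whichever rule produced them.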

In an embodiment, the processor 720 specifically performs the following: performing preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm events acquired from the preset message queue, based on the machine learning rule and the preset knowledge base.
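The claims of this application name a 3-sigma strategy as one possible analysis method. A minimal sketch of that idea, assuming the history is a plain list of past metric values, is shown below; the function name `is_anomalous` and the parameter `k` are illustrative choices rather than anything defined in the application.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous(history: Sequence[float], value: float, k: float = 3.0) -> bool:
    """Return True when value lies more than k standard deviations from the mean."""
    if len(history) < 2:
        # Not enough history to estimate spread.
        return False
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation of the history
    if sigma == 0:
        # Flat history: any deviation at all is treated as anomalous.
        return value != mu
    return abs(value - mu) > k * sigma
```

Under a roughly normal distribution about 99.7% of values fall within three standard deviations of the mean, which is why values outside that band are commonly treated as outliers.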

In an embodiment, the processor 720 specifically performs the following: acquiring a preset number of alarm events from the preset message queue in a preset order.
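Fetching a preset number of events in a preset order might look like the sketch below, which assumes first-in-first-out order and uses the standard-library `queue.Queue` in place of the preset message queue. The `fetch_batch` name and the `batch_size` parameter are hypothetical.

```python
import queue
from typing import Any, List


def fetch_batch(q: "queue.Queue", batch_size: int) -> List[Any]:
    """Take up to batch_size events from the queue in FIFO order."""
    batch = []
    for _ in range(batch_size):
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            # Fewer events than requested are available; return what we have.
            break
    return batch
```

If a different order were required, for example by severity, a `queue.PriorityQueue` could be substituted without changing the function.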

It should be understood that, in the embodiments of the present application, the processor 720 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.

Those skilled in the art will understand that the structure of the computer device 700 shown in FIG. 6 does not limit the computer device 700, which may include more or fewer components than illustrated, combine certain components, or have a different arrangement of components.

Another embodiment of the present application provides a storage medium. The storage medium includes a computer-readable storage medium storing a computer program, where the computer program includes program instructions. When executed by a processor, the program instructions implement the following:

acquiring monitoring data of a monitored object; determining alarm data in the monitoring data according to a preset alarm rule, where the alarm data is monitoring data that triggers the preset alarm rule; generating an alarm event according to the alarm data and recording the alarm event in a preset message queue; acquiring alarm events from the preset message queue according to a preset acquisition rule; performing preset processing on the alarm events acquired from the preset message queue based on a machine learning rule and a preset knowledge base; and sending the alarm events that have undergone the preset processing to a user terminal.

In an embodiment, after the program instructions are executed by the processor to send the alarm events that have undergone the preset processing to the user terminal, they further implement: receiving feedback information sent by the user terminal, where the feedback information is processing result information generated by the user terminal by performing preset marking processing on the alarm events that have undergone the preset processing; and saving the processing result information to the preset knowledge base.

In an embodiment, when the program instructions are executed by the processor to perform the preset processing on the alarm events acquired from the preset message queue based on the machine learning rule and the preset knowledge base, they specifically implement: selecting monitoring alarm events from the alarm events, where a monitoring alarm event is an alarm event of the monitoring system; and sending the monitoring alarm events to a self-healing system so that the self-healing system handles the faults corresponding to the monitoring alarm events.

In an embodiment, when the program instructions are executed by the processor to generate the alarm event according to the alarm data, they specifically implement: generating the alarm event from the alarm data according to a preset event format.

In an embodiment, when the program instructions are executed by the processor to perform the preset processing on the alarm events acquired from the preset message queue based on the machine learning rule and the preset knowledge base, they specifically implement: performing preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm events acquired from the preset message queue, based on the machine learning rule and the preset knowledge base.
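For the convergence option, the claims state that alarm events are merged according to a preset time window. The sketch below shows one plausible reading, under the assumption that events are `(timestamp, key)` pairs and that events sharing a key are merged when they fall within `window_seconds` of the first event in their group; the `converge` function and the shape of its output are hypothetical.

```python
from typing import Dict, List, Tuple


def converge(events: List[Tuple[float, str]], window_seconds: float) -> List[dict]:
    """Merge events that share a key and fall inside the same time window."""
    merged: List[dict] = []
    open_groups: Dict[str, int] = {}  # key -> index of its latest group in merged

    for ts, key in sorted(events):
        idx = open_groups.get(key)
        if idx is not None and ts - merged[idx]["first_ts"] <= window_seconds:
            # Same key, still inside the window: fold into the existing group.
            merged[idx]["count"] += 1
            merged[idx]["last_ts"] = ts
        else:
            # No group yet, or the window has elapsed: start a new group.
            open_groups[key] = len(merged)
            merged.append({"key": key, "first_ts": ts, "last_ts": ts, "count": 1})
    return merged
```

With a 60-second window, two alarms with the same key at t=0 and t=30 collapse into one merged event with a count of 2, while a third at t=100 starts a new group.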

In an embodiment, when the program instructions are executed by the processor to acquire the alarm events from the preset message queue according to the preset acquisition rule, they specifically implement: acquiring a preset number of alarm events from the preset message queue in a preset order.

The computer-readable storage medium may be any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.

In the several embodiments provided by the present application, it should be understood that the disclosed alarm processing apparatus and method may be implemented in other ways. For example, the alarm processing apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed.

The steps in the methods of the embodiments of the present application may be reordered, merged, or deleted according to actual needs. The units in the apparatus of the embodiments of the present application may be merged, divided, or deleted according to actual needs.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.

The above are only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the present application, and such modifications or substitutions shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be determined by the scope of protection of the claims.

Claims (20)

1. An alarm processing method, applied to a monitoring system, comprising:
acquiring monitoring data of a monitored object;
determining alarm data in the monitoring data according to a preset alarm rule, wherein the alarm data is monitoring data that triggers the preset alarm rule;
generating an alarm event according to the alarm data, and recording the alarm event in a preset message queue;
acquiring alarm events from the preset message queue according to a preset acquisition rule;
performing preset processing on the alarm events acquired from the preset message queue based on a machine learning rule and a preset knowledge base; and
sending the alarm events that have undergone the preset processing to a user terminal.

2. The alarm processing method according to claim 1, wherein, after sending the alarm events that have undergone the preset processing to the user terminal, the method further comprises:
receiving feedback information sent by the user terminal, wherein the feedback information is processing result information generated by the user terminal by performing preset marking processing on the alarm events that have undergone the preset processing; and
saving the processing result information to the preset knowledge base.

3. The alarm processing method according to claim 1, wherein performing preset processing on the alarm events acquired from the preset message queue based on the machine learning rule and the preset knowledge base comprises:
selecting monitoring alarm events from the alarm events, wherein a monitoring alarm event is an alarm event of the monitoring system; and
sending the monitoring alarm events to a self-healing system so that the self-healing system handles the faults corresponding to the monitoring alarm events.

4. The alarm processing method according to claim 1, wherein generating an alarm event according to the alarm data comprises: generating the alarm event from the alarm data according to a preset event format.

5. The alarm processing method according to claim 1, wherein acquiring alarm events from the preset message queue according to the preset acquisition rule comprises:
acquiring a preset number of alarm events from the preset message queue in a preset order.

6. The alarm processing method according to claim 1, wherein performing preset processing on the alarm events acquired from the preset message queue based on the machine learning rule and the preset knowledge base comprises:
performing preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm events acquired from the preset message queue, based on the machine learning rule and the preset knowledge base.

7. The alarm processing method according to claim 6, wherein the analysis method corresponding to the preset alarm analysis processing comprises a method based on historical data statistics, a method assuming a normal distribution, or a 3-sigma strategy method.

8. The alarm processing method according to claim 6, wherein the preset alarm convergence processing comprises: merging the alarm events according to a preset time window.

9. The alarm processing method according to claim 6, wherein the preset alarm aggregation processing comprises: association mining processing and abnormal dependency processing.

10. An alarm processing apparatus, comprising:
a data acquisition unit, configured to acquire monitoring data of a monitored object;
a data determining unit, configured to determine alarm data in the monitoring data according to a preset alarm rule, wherein the alarm data is monitoring data that triggers the preset alarm rule;
a generating and recording unit, configured to generate an alarm event according to the alarm data and record the alarm event in a preset message queue;
an event acquisition unit, configured to acquire alarm events from the preset message queue according to a preset acquisition rule;
a preset processing unit, configured to perform preset processing on the alarm events acquired from the preset message queue based on a machine learning rule and a preset knowledge base; and
an event sending unit, configured to send the alarm events that have undergone the preset processing to a user terminal.

11. The alarm processing apparatus according to claim 10, further comprising:
an information receiving unit, configured to receive feedback information sent by the user terminal, wherein the feedback information is processing result information generated by the user terminal by performing preset marking processing on the alarm events that have undergone the preset processing; and
an information saving unit, configured to save the processing result information to the preset knowledge base.

12. The alarm processing apparatus according to claim 10, wherein the preset processing unit comprises:
an event screening subunit, configured to select monitoring alarm events from the alarm events, wherein a monitoring alarm event is an alarm event of the monitoring system; and
a self-healing sending subunit, configured to send the monitoring alarm events to a self-healing system so that the self-healing system handles the faults corresponding to the monitoring alarm events.

13. The alarm processing apparatus according to claim 10, wherein the generating and recording unit is specifically configured to generate the alarm event from the alarm data according to a preset event format.

14. The alarm processing apparatus according to claim 10, wherein the preset processing unit is specifically configured to perform preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm events acquired from the preset message queue, based on the machine learning rule and the preset knowledge base.

15. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements:
acquiring monitoring data of a monitored object;
determining alarm data in the monitoring data according to a preset alarm rule, wherein the alarm data is monitoring data that triggers the preset alarm rule;
generating an alarm event according to the alarm data, and recording the alarm event in a preset message queue;
acquiring alarm events from the preset message queue according to a preset acquisition rule;
performing preset processing on the alarm events acquired from the preset message queue based on a machine learning rule and a preset knowledge base; and
sending the alarm events that have undergone the preset processing to a user terminal.

16. The computer device according to claim 15, wherein, after sending the alarm events that have undergone the preset processing to the user terminal, the processor, when executing the computer program, further implements:
receiving feedback information sent by the user terminal, wherein the feedback information is processing result information generated by the user terminal by performing preset marking processing on the alarm events that have undergone the preset processing; and
saving the processing result information to the preset knowledge base.

17. The computer device according to claim 15, wherein, when the processor executes the computer program to perform the preset processing on the alarm events acquired from the preset message queue based on the machine learning rule and the preset knowledge base, the processor specifically implements:
selecting monitoring alarm events from the alarm events, wherein a monitoring alarm event is an alarm event of the monitoring system; and
sending the monitoring alarm events to a self-healing system so that the self-healing system handles the faults corresponding to the monitoring alarm events.

18. The computer device according to claim 15, wherein, when the processor executes the computer program to generate the alarm event according to the alarm data, the processor specifically implements: generating the alarm event from the alarm data according to a preset event format.

19. The computer device according to claim 15, wherein, when the processor executes the computer program to perform the preset processing on the alarm events acquired from the preset message queue based on the machine learning rule and the preset knowledge base, the processor specifically implements: performing preset alarm analysis processing, preset alarm convergence processing, or preset alarm aggregation processing on the alarm events acquired from the preset message queue, based on the machine learning rule and the preset knowledge base.

20. A storage medium, wherein the storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform:
acquiring monitoring data of a monitored object;
determining alarm data in the monitoring data according to a preset alarm rule, wherein the alarm data is monitoring data that triggers the preset alarm rule;
generating an alarm event according to the alarm data, and recording the alarm event in a preset message queue;
acquiring alarm events from the preset message queue according to a preset acquisition rule;
performing preset processing on the alarm events acquired from the preset message queue based on a machine learning rule and a preset knowledge base; and
sending the alarm events that have undergone the preset processing to a user terminal.
PCT/CN2017/113234 2017-10-24 2017-11-28 Alarm processing method and apparatus, computer device, and storage medium Ceased WO2019080249A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711001755.3 2017-10-24
CN201711001755.3A CN107832200A (en) 2017-10-24 2017-10-24 Alert processing method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2019080249A1 true WO2019080249A1 (en) 2019-05-02

Family

ID=61649105

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/113234 Ceased WO2019080249A1 (en) 2017-10-24 2017-11-28 Alarm processing method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN107832200A (en)
WO (1) WO2019080249A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113504969A (en) * 2021-07-07 2021-10-15 北京汇钧科技有限公司 Container event alarm method and device and electronic equipment

Families Citing this family (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846484A (en) * 2018-04-11 2018-11-20 北京百度网讯科技有限公司 Fault self-recovery system, method, computer equipment and storage medium
CN108810142A (en) * 2018-06-13 2018-11-13 平安科技(深圳)有限公司 Monitoring method, device, computer equipment and the storage medium of Zookeeper
CN108964987A (en) * 2018-06-15 2018-12-07 平安科技(深圳)有限公司 Data monitoring method, device, computer equipment and storage medium
CN108959034A (en) * 2018-07-05 2018-12-07 北京木瓜移动科技股份有限公司 A kind of monitoring alarm method, device, electronic equipment and storage medium
CN109086189A (en) * 2018-07-23 2018-12-25 郑州云海信息技术有限公司 A kind of physical infrastructure manager PIM alert processing method and equipment
CN109039727B (en) * 2018-07-24 2021-08-06 中国银行股份有限公司 Deep learning-based message queue monitoring method and device
CN109144825A (en) * 2018-07-27 2019-01-04 阿里巴巴集团控股有限公司 A kind of alert data monitoring method, device and equipment
CN108763038B (en) * 2018-08-08 2022-04-12 平安科技(深圳)有限公司 Alarm data management method and device, computer equipment and storage medium
CN109669836B (en) * 2018-09-25 2023-04-28 平安普惠企业管理有限公司 Intelligent IT operation and maintenance analysis method, device, equipment and readable storage medium
CN111049664A (en) * 2018-10-11 2020-04-21 中兴通讯股份有限公司 A network alarm processing method, device and storage medium
CN109558298B (en) * 2018-10-12 2022-07-19 平安科技(深圳)有限公司 Alarm execution frequency optimization method based on deep learning model and related equipment
CN109639456B (en) * 2018-11-09 2022-08-16 网宿科技股份有限公司 Improvement method for automatic alarm and automatic processing platform for alarm data
CN109672556A (en) * 2018-11-20 2019-04-23 珠海许继芝电网自动化有限公司 A kind of event alarm system
CN111294218B (en) * 2018-12-06 2022-07-26 云智慧(北京)科技有限公司 Information processing method, device, system and storage medium
CN109636364A (en) * 2018-12-29 2019-04-16 江苏满运软件科技有限公司 The method, system, equipment and the medium that distribute are grouped for electronics red packet
CN111756778B (en) * 2019-03-26 2024-06-18 京东科技控股股份有限公司 Method, device and storage medium for pushing server disk cleaning script
CN109992483A (en) * 2019-04-11 2019-07-09 苏州浪潮智能科技有限公司 A kind of temperature monitoring method, device, equipment and readable storage medium storing program for executing
CN110363381B (en) * 2019-05-31 2023-12-22 创新先进技术有限公司 An information processing method and device
CN110362455B (en) * 2019-07-15 2022-12-20 北京奇艺世纪科技有限公司 A data processing method and data processing device
CN110532152A (en) * 2019-08-05 2019-12-03 北明云智(武汉)网软有限公司 A kind of monitoring alarm processing method and system based on Kapacitor computing engines
CN110661659B (en) * 2019-09-23 2022-06-21 上海艾融软件股份有限公司 Alarm method, device and system and electronic equipment
CN110719207A (en) * 2019-10-23 2020-01-21 北京数制科技有限公司 Alarm message transmission method and device, industrial data acquisition platform and storage medium
CN110865921A (en) * 2019-11-08 2020-03-06 拉扎斯网络科技(上海)有限公司 Data monitoring method, apparatus, readable storage medium and electronic device
CN110708204B (en) * 2019-11-18 2023-03-31 上海维谛信息科技有限公司 Abnormity processing method, system, terminal and medium based on operation and maintenance knowledge base
CN111061616B (en) * 2019-11-25 2024-03-29 京信网络系统股份有限公司 Alarm management method, device, communication equipment and storage medium
CN111259629A (en) * 2020-01-10 2020-06-09 深圳前海环融联易信息科技服务有限公司 Alarm method, device, equipment and storage medium of task scheduling system
CN111352808B (en) * 2020-03-03 2023-04-25 腾讯云计算(北京)有限责任公司 Alarm data processing method, device, equipment and storage medium
CN111522704B (en) * 2020-03-04 2025-03-25 平安科技(深圳)有限公司 Alarm information processing method, device, computer device and storage medium
CN113065884B (en) * 2020-03-31 2024-09-27 中国移动通信集团贵州有限公司 Method and device for processing ticket file and electronic equipment
CN111769977A (en) * 2020-06-17 2020-10-13 广州嘉为科技有限公司 Processing method based on enterprise monitoring alarm event
CN111538643B (en) * 2020-07-07 2020-10-16 宝信软件(成都)有限公司 Alarm information filtering method and system for monitoring system
CN111865691B (en) * 2020-07-22 2022-11-04 平安证券股份有限公司 Alarm file distribution method, device, equipment and medium based on artificial intelligence
US12380342B2 (en) 2020-08-06 2025-08-05 International Business Machines Corporation Alert management in data processing systems
CN112199207A (en) * 2020-09-03 2021-01-08 浙江大华技术股份有限公司 A method, device, system, equipment and medium for pushing alarm information
CN112182367A (en) * 2020-09-18 2021-01-05 佳都新太科技股份有限公司 Management and control alarm method and device
CN114469021A (en) * 2020-10-27 2022-05-13 深圳迈瑞生物医疗电子股份有限公司 Method for reviewing alarm event, monitoring equipment and monitoring system
CN112650642B (en) * 2020-12-07 2024-08-20 深圳前海微众银行股份有限公司 Alarm processing method and device, equipment and storage medium
CN112685247B (en) * 2020-12-24 2024-01-12 京东方科技集团股份有限公司 Alarm suppression method and monitoring system based on Zabbix monitoring system
CN113360292B (en) * 2021-06-01 2024-03-15 北京百度网讯科技有限公司 Message processing methods, devices, electronic equipment, storage media and program products
CN113282420A (en) * 2021-06-07 2021-08-20 新奥数能科技有限公司 Method and device for edge service alarm
CN113434366A (en) * 2021-06-28 2021-09-24 中国建设银行股份有限公司 Event processing method and system
CN113468025B (en) * 2021-07-28 2025-06-24 浙江大华技术股份有限公司 A data alarm method, system, device and storage medium
CN113608839A (en) * 2021-08-10 2021-11-05 曙光信息产业(北京)有限公司 Cluster alarm method, device, computer equipment and storage medium
CN115712646A (en) * 2021-08-18 2023-02-24 腾讯科技(深圳)有限公司 Alarm strategy generation method, device and storage medium
CN113724100B (en) * 2021-08-27 2024-05-10 广东电网有限责任公司 Power grid monitoring alarm message processing method of distributed cluster
CN113704065A (en) * 2021-08-31 2021-11-26 平安普惠企业管理有限公司 Monitoring method, device, equipment and computer storage medium
CN113923327A (en) * 2021-09-08 2022-01-11 深圳市安软慧视科技有限公司 Method and system for displaying camera alarm in three-dimensional map and related equipment
CN113794597B (en) * 2021-09-15 2023-05-30 中国联合网络通信集团有限公司 Alarm information processing method, system, electronic equipment and storage medium
CN113849383B (en) * 2021-09-27 2024-07-05 广州华多网络科技有限公司 Alarm notification control method and device, equipment, medium and product thereof
CN113886182B (en) * 2021-09-29 2025-03-21 深圳市金蝶天燕云计算股份有限公司 Alarm convergence method, device, electronic device and storage medium
CN114066162A (en) * 2021-10-19 2022-02-18 中通服中睿科技有限公司 Intelligent management method and system for alarm event
CN114172785B (en) * 2021-10-21 2023-10-03 广州市百果园信息技术有限公司 Alarm information processing method, device, equipment and storage medium
CN114090412B (en) * 2022-01-20 2022-06-28 北京安帝科技有限公司 Distributed alarm processing method and system
CN114710390A (en) * 2022-02-18 2022-07-05 联通沃悦读科技文化有限公司 Monitoring and alarming method and system, equipment and medium for Internet system
CN114697318A (en) * 2022-06-01 2022-07-01 深圳市华曦达科技股份有限公司 Method and device for pushing alarm snapshot picture of terminal equipment
CN115643164A (en) * 2022-09-19 2023-01-24 杭州浮云网络科技有限公司 Alarm method, alarm device, electronic device and storage medium
CN115664937B (en) * 2022-10-26 2026-01-30 海看网络科技(山东)股份有限公司 A middleware alarm and intelligent recovery system
CN115865619A (en) * 2022-11-24 2023-03-28 浙江中控技术股份有限公司 Alarm processing method, system, device and equipment
CN115865622A (en) * 2022-11-25 2023-03-28 南方电网数字平台科技(广东)有限公司 Multi-cloud monitoring and alarming method and device
CN115827398B (en) * 2023-02-24 2023-06-23 天翼云科技有限公司 Calculation method, device, electronic equipment and storage medium of alarm information component value
CN116827749A (en) * 2023-07-25 2023-09-29 中国电信股份有限公司技术创新中心 Alarm storm processing method, device, computer equipment and storage medium
CN120296077A (en) * 2025-06-12 2025-07-11 中国电建集团西北勘测设计研究院有限公司 Control method and system for safety monitoring and early warning process of deep foundation pit

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2341434A1 (en) * 2008-06-17 2011-07-06 Hitachi Ltd. Method and apparatus for performing root cause analysis
CN103414581A (en) * 2013-07-24 2013-11-27 佳都新太科技股份有限公司 Equipment fault alarm, prediction and processing mechanism based on data mining
CN105743220A (en) * 2016-03-21 2016-07-06 国网天津静海供电有限公司 Dispatching automation monitoring information analyzing and processing system and method
CN106940677A (en) * 2017-02-13 2017-07-11 咪咕音乐有限公司 Application log data alarm method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103905533A (en) * 2014-03-13 2014-07-02 广州杰赛科技股份有限公司 Distributed type alarm monitoring method and system based on cloud storage
CN106649055A (en) * 2017-01-10 2017-05-10 山东浪潮云服务信息科技有限公司 Domestic CPU (central processing unit) and operating system based software and hardware fault alarming system and method

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113504969A (en) * 2021-07-07 2021-10-15 北京汇钧科技有限公司 Container event alarm method and device and electronic equipment

Also Published As

Publication number Publication date
CN107832200A (en) 2018-03-23

Similar Documents

Publication Publication Date Title
WO2019080249A1 (en) Alarm processing method and apparatus, computer device, and storage medium
WO2019041406A1 (en) Indecent picture recognition method, terminal and device, and computer-readable storage medium
WO2015127859A1 (en) Sensitive text detecting method and apparatus
WO2022025393A1 (en) Bwp allocation method, apparatus, electronic device and computer readable storage medium
WO2018233367A1 (en) Filing method, device, terminal and computer readable storage medium
WO2018058919A1 (en) Identification information generating method, apparatus, device, and computer readable storage medium
WO2019019374A1 (en) Method, apparatus, and system for controlling household appliance with intelligent voice device
WO2019205280A1 (en) Server testing method, apparatus, and device, and computer readable storage medium
WO2018120457A1 (en) Data processing method, apparatus, device, and computer readable storage medium
WO2019019378A1 (en) Service processing method and apparatus, adapter and computer-readable storage medium
WO2019205323A1 (en) Air conditioner and parameter adjusting method and device therefor, and readable storage medium
WO2019019351A1 (en) User behaviour data processing method and apparatus, and computer readable storage medium
WO2019037395A1 (en) Key management method, device and readable storage medium
WO2019051890A1 (en) Terminal control method and device, and computer-readable storage medium
WO2019024336A1 (en) Data query method and device, and computer readable storage medium
WO2017084337A1 (en) Identity verification method, apparatus and system
WO2018149191A1 (en) Method, apparatus and device for underwriting insurance policy, and computer-readable storage medium
WO2017148037A1 (en) Method and device for diagnosing terminal fault
WO2019041851A1 (en) Home appliance after-sales consulting method, electronic device and computer-readable storage medium
WO2018166107A1 (en) Hybrid-based compatibility method, adapter, operating apparatus and system, and computer-readable storage medium
WO2019051866A1 (en) Right and interest information management method, device, and apparatus, and computer-readable storage medium
WO2019000466A1 (en) Face recognition method and apparatus, storage medium, and electronic device
WO2017161702A1 (en) Method and system indicating washing machine status
WO2017032122A1 (en) Method and apparatus for detecting digital television set
WO2019104874A1 (en) Financial product purchasing method, apparatus and device, and readable storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17929762

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23.09.2020)

122 EP: PCT application non-entry in European phase

Ref document number: 17929762

Country of ref document: EP

Kind code of ref document: A1