TWM655004U - System for monitoring multiple abnormal events - Google Patents
- Publication number
- TWM655004U (TW112214145U)
- Authority
- TW
- Taiwan
- Prior art keywords
- trend
- monitoring
- module
- indicator data
- machine learning
Landscapes
- Debugging And Monitoring (AREA)
Description
The present utility model relates to a monitoring system, and in particular to a system for monitoring abnormal events in the various performance or capability indicators of the different kinds of hardware and software in a service system.
In a telecommunications service system, many different kinds of hardware and software work together to provide communication services. Stable and efficient service requires that abnormal events in the hardware or software be detected promptly, so the hardware and software must be monitored to maintain service quality. Monitoring different hardware and software raises several major challenges:
1. Compatibility: each kind of software (for example operating systems, applications, databases) and hardware (storage devices, network devices) uses its own technologies and standards, so anomaly monitoring across several kinds of software or hardware requires separately customized monitoring criteria. For example, a storage device may need to be checked on both performance and capacity indicators. Its performance indicators may be processor usage and the per-second transfer volume of the input/output ports (I/O per second); its capacity indicators may be read/write (R/W) throughput and latency time.
2. Data heterogeneity: the data formats and types produced by each piece of software or hardware may differ. Unifying this data for analysis and monitoring becomes a complex problem.
3. Timeliness and performance: telecommunications systems usually need real-time or near-real-time monitoring so that problems can be handled quickly. Processing and analyzing large volumes of data from different devices may itself affect system performance.
4. Accuracy of fault detection: abnormal behavior may look different on different devices. Monitoring and identifying these abnormal states accurately requires deep domain expertise and advanced analysis techniques.
In addition, traditional methods of monitoring abnormal events in hardware or software mostly set a separate monitoring threshold for each indicator and issue a warning only when that threshold is reached. With this approach, the service capacity of the service system may drop suddenly once the abnormal event has occurred, or service may become unavailable altogether. Yet before the abnormal event occurs, the performance or capability of some software or hardware may already have been declining gradually. If this decline could be detected, a warning could be raised earlier and the loss of service capacity or availability avoided. No existing platform or technical solution addresses the challenges and the problem described above. How to solve them with a suitable technical solution, one that need not be limited to monitoring the hardware or software of a telecommunications system, is a problem the industry hopes to see solved.
In view of the problems of the prior art, the purpose of the present utility model is to apply data analysis and machine learning techniques to the different indicators of the different software and hardware of a service system in order to detect anomalies, in particular by analyzing the trend of each indicator so that a warning can be issued before an abnormal event occurs. The service system includes, but is not limited to, a telecommunications service system.
According to this purpose, a system for monitoring multiple abnormal events is provided, comprising a receiving module, a conversion module, an induction module, a plurality of trend monitoring modules and a warning module. The receiving module receives multiple indicator data, each of which represents one performance or capability of a different piece of software or hardware. The conversion module is connected to the receiving module, receives the indicator data, and converts each indicator data into its own change trend data that varies over time. The induction module is connected to the conversion module and classifies each change trend data into one of a plurality of different trend types. The trend monitoring modules are each connected to the induction module; each monitors the change trend data of its corresponding trend type and has its own change trend standard, which it uses to produce its own identification result. The warning module is connected to every trend monitoring module. When the identification result of any trend monitoring module is abnormal, the warning module issues an early warning message notifying that one or more of the indicator data are abnormal.
One of the trend types is a fixed trend type, defined as indicator data whose fluctuation over time varies in a cycle that repeats over time.
The trend monitoring module corresponding to the fixed trend type is built by training a first machine learning model on past indicator data of the fixed trend type. Training produces the change trend standard, and the first machine learning model produces the identification result according to that standard.
The first machine learning model is an autoencoder model.
One of the trend types is a non-fixed trend type, defined as indicator data whose fluctuation over time changes by an unexpectedly large amount at unpredictable times.
The trend monitoring module corresponding to the non-fixed trend type is built by training a second machine learning model on past indicator data of the non-fixed trend type. Training produces the change trend standard, and the second machine learning model produces the identification result according to that standard.
The second machine learning model is an isolation forest model.
One of the trend types is a multiple non-fixed trend type, defined as multiple indicator data, corresponding to different performances or capabilities of multiple pieces of software or hardware, each of whose fluctuation over time changes by an unexpectedly large amount at unpredictable times.
The trend monitoring module corresponding to the multiple non-fixed trend type feeds the indicator data of the same kind of performance or capability within that type to a single third machine learning model to produce an intermediate result, then applies at least one explanation factor of an inference explanation algorithm to the intermediate result to produce the identification result. The third machine learning model is built by training on past indicator data of the same kind of performance or capability of the multiple non-fixed trend type. Training produces an intermediate standard, which the third machine learning model uses to produce the intermediate result. The intermediate standard together with the at least one explanation factor constitutes the change trend standard.
The third machine learning model is an isolation forest model, and the inference explanation algorithm is the Shapley Additive Explanations (SHAP) algorithm.
One of the trend types is a multiple different fixed trend type, defined as indicator data corresponding to the same performance or capability of multiple pieces of software or hardware, whose fluctuations over time follow different repeating cycles.
The trend monitoring module corresponding to the multiple different fixed trend type first applies a clustering algorithm to the indicator data of that type to produce multiple clusters. The indicator data belonging to each cluster is then passed to that cluster's own fourth machine learning model, which produces its own identification result. Each fourth machine learning model is built by training on the past indicator data of its cluster.
The clustering algorithm is the K-means clustering algorithm, and the fourth machine learning model is an autoencoder model.
In summary, the different types of change trend data are monitored by different trend monitoring modules. Because each trend monitoring module watches the trend of change, it can issue an early warning message before the performance or capability of the corresponding software or hardware becomes abnormal, which avoids a drop in the service capacity of the service system or a loss of service.
1: System for monitoring multiple abnormal events
10: Receiving module
12: Conversion module
14: Induction module
16: Trend monitoring module
18: Warning module
2: Service system
S101~S106: Steps
Figure 1 is a schematic diagram of the system architecture of the present utility model.
Figure 2 is a flow chart of the present utility model.
Figure 3 is a schematic diagram of the times at which abnormal identification results occurred, as presented in a human-machine interface.
Figure 4 is a schematic diagram of the indicator data used for identification and the indicator data used for training for the multiple different fixed trend type.
Figure 5 is a schematic diagram of one year of training indicator data for the multiple different fixed trend type.
Figure 6 is a schematic diagram of the number of abnormal logins detected by commercial database monitoring software.
Figure 7 is a schematic diagram of the number of abnormal logins detected by the present utility model.
Embodiments of the present utility model are explained below with reference to the drawings. Wherever possible, the same reference numerals denote the same or similar components in the drawings and the description. Shapes and thicknesses in the drawings may be exaggerated for simplicity and ease of labeling. Components not specifically shown in the drawings or described in the description take forms known to persons of ordinary skill in the art, who may make various changes and modifications based on the content of the present utility model.
As shown in Figure 1, the present utility model is a system for monitoring multiple abnormal events. The system 1 comprises a receiving module 10, a conversion module 12, an induction module 14, a plurality of trend monitoring modules 16 and a warning module 18. The receiving module 10 receives multiple indicator data from a service system 2; each indicator data represents one performance or capability of a different piece of software or hardware of the service system 2. The conversion module 12 is connected to the receiving module 10, receives the indicator data, and converts each indicator data into its own change trend data that varies over time. The induction module 14 is connected to the conversion module 12 and classifies each change trend data into one of a plurality of different trend types. The trend monitoring modules 16 are each connected to the induction module 14; each monitors the change trend data of its corresponding trend type and has its own change trend standard, which it uses to produce its own identification result. The warning module 18 is connected to every trend monitoring module 16. When the identification result of any trend monitoring module 16 is abnormal, the warning module 18 issues an early warning message notifying that one or more of the indicator data are abnormal.
In some embodiments, the system 1 for monitoring multiple abnormal events may be a desktop computer, a laptop computer, a server or any other electronic computing device. Any such device that has the receiving module 10, the conversion module 12, the induction module 14, the plurality of trend monitoring modules 16 and the warning module 18 is a system 1 for monitoring multiple abnormal events in the sense of the present utility model. The device may connect to the service system locally or remotely. The system 1 may also connect to the service system in a cloud computing form.
Referring to Figure 2, the present utility model also provides a method for monitoring multiple abnormal events. The method is applied to the system 1 for monitoring multiple abnormal events and performs the following steps:
(S101) The receiving module 10 receives multiple indicator data, each of which represents one performance or capability of a different piece of software or hardware.
(S102) The conversion module 12 converts each indicator data into change trend data that varies over time.
(S103) The induction module 14 classifies each change trend data into one of a plurality of trend types. The trend types may include, but are not limited to, the fixed trend type, the non-fixed trend type, the multiple non-fixed trend type and the multiple different fixed trend type.
(S104) The trend monitoring modules 16 each monitor the change trend data of their corresponding trend type. Each trend monitoring module 16 has its own change trend standard, which it uses to produce an identification result.
(S105) The warning module 18 determines whether the identification result is normal or abnormal. If the change trend data meets the change trend standard, the identification result is normal and the method returns to step (S101).
(S106) If the change trend data does not meet the change trend standard, the identification result is abnormal and the warning module 18 issues an early warning message.
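The steps S101 to S106 can be sketched as a single pass over the indicators. This is a minimal illustration under stated assumptions, not the patented implementation: the names `run_cycle`, `classify` and `monitors`, the two toy trend types and the threshold rules standing in for trained models are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A trend series is a list of (timestamp, value) pairs.
Series = List[Tuple[float, float]]
# A monitor returns True when a series violates its change trend standard.
Monitor = Callable[[Series], bool]


def run_cycle(
    indicators: Dict[str, Series],
    classify: Callable[[str, Series], str],
    monitors: Dict[str, Monitor],
) -> List[str]:
    """One pass of S101-S106: return a warning for every abnormal indicator."""
    warnings = []
    for name, raw in indicators.items():            # S101: receive indicator data
        series = sorted(raw)                        # S102: order samples by time
        trend_type = classify(name, series)         # S103: assign a trend type
        is_abnormal = monitors[trend_type](series)  # S104: apply that type's monitor
        if is_abnormal:                             # S105: normal or abnormal?
            warnings.append(f"early warning: {name} ({trend_type})")  # S106
    return warnings


# Hypothetical stand-ins for the induction module and the trained models.
def classify(name: str, series: Series) -> str:
    return "fixed" if name.startswith("cpu") else "non_fixed"


monitors: Dict[str, Monitor] = {
    "fixed": lambda s: max(v for _, v in s) > 90.0,
    "non_fixed": lambda s: s[-1][1] > 3 * s[0][1],
}

indicators = {
    "cpu_usage": [(2.0, 40.0), (1.0, 35.0), (3.0, 42.0)],
    "active_sessions": [(1.0, 10.0), (2.0, 12.0), (3.0, 50.0)],
}
result = run_cycle(indicators, classify, monitors)
print(result)
```

In a running system this pass would be repeated on a schedule, which corresponds to the return to step (S101) after a normal result.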
By repeating this cycle, the system can identify over time whether any indicator data shows an abnormal trend, and can issue an early warning message before the monitoring threshold used in the prior art is exceeded. Moreover, because each indicator data is collected over time, an abnormality can be traced back to the specific performance or capability indicator in which it arose, so that the root cause can be addressed.
In some embodiments, each indicator data represents one performance or capability of a different piece of software or hardware. Taking a telecommunications service system as an example, different hardware devices (storage devices, network devices and so on) and different software (operating systems, applications, databases and so on) each have their own indicators of performance and capacity. An operating system, for example, is evaluated only for performance, so its performance indicators include processor usage, hard disk input/output transfer rate, memory usage and the per-second transfer volume of the input/output ports (I/O per second). A storage device has both kinds of indicator: its performance indicators include processor usage and the per-second transfer speed of the input/output ports, and its capacity indicators include read/write (R/W) throughput and latency time.
On the Windows operating system, the performance indicators correspond to those that the Task Manager application can display. The "typeperf" command can write the performance data for each indicator to its own performance log file, in CSV or plain-text format. Other monitoring software, such as Performance Co-Pilot (PCP) or OpManager, can also monitor the individual software or hardware of the service system and generate a performance log file for each performance indicator or a capability log file for each capability indicator. Actual implementations are not limited to these tools; any means of obtaining the performance or capability indicators of different software or hardware may be used.
In some embodiments, each indicator data is converted into change trend data that varies over time as follows. Each of the performance log files described above contains multiple performance records, and every record carries the time at which it was recorded, so a log file can be converted into change trend data over time. The conversion can be done with a custom program or with an existing general-purpose data analysis and monitoring platform such as Splunk, ELK Stack, Grafana, Prometheus or Zabbix.
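The custom-program route for this conversion might look like the following sketch. It assumes a simplified log layout that is not taken from the source: a CSV file with a `timestamp` column in ISO 8601 format and a numeric `value` column. Real log files, including those written by `typeperf`, use their own headers and timestamp formats and would need a matching parser. The function name `log_to_trend` is hypothetical.

```python
import csv
import io
from datetime import datetime
from typing import List, Tuple


def log_to_trend(csv_text: str) -> List[Tuple[datetime, float]]:
    """Parse 'timestamp,value' records into a time-ordered trend series."""
    series = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            ts = datetime.fromisoformat(row["timestamp"])
            value = float(row["value"])
        except (KeyError, TypeError, ValueError):
            continue  # skip records that are malformed or incomplete
        series.append((ts, value))
    series.sort(key=lambda point: point[0])  # records may arrive out of order
    return series


# Assumed sample log: out-of-order records and one unparsable value.
sample = """timestamp,value
2024-01-01T00:02:00,41.5
2024-01-01T00:00:00,38.0
2024-01-01T00:01:00,not-a-number
2024-01-01T00:03:00,44.0
"""
trend = log_to_trend(sample)
for ts, value in trend:
    print(ts.isoformat(), value)
```

Skipping bad records silently is a design choice for the sketch; a production converter would more likely count or log them.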
In some embodiments, the fixed trend type is defined as indicator data whose fluctuation over time varies in a cycle that repeats over time. Within a fixed time interval, data of the fixed trend type follows a fixed trend. For example, performance indicators of a storage device such as processor utilization and the per-second transfer speed of the input/output ports, and capability indicators of a storage device such as read or write speed and latency time, belong to the fixed trend type.
The non-fixed trend type is defined as indicator data whose fluctuation over time changes by an unexpectedly large amount at unpredictable times. Data of the non-fixed trend type has no fixed trend, and its changes are usually caused by a sudden burst of jobs or processes. Examples are performance indicators such as the database service response time (SQL Service Response Time) and the number of active sessions (Active Session), as well as the number of database processes that are running or being handled and the number of database sessions.
The multiple non-fixed trend type is defined as multiple indicator data, corresponding to different performances or capabilities of multiple pieces of software or hardware, each of whose fluctuation over time changes by an unexpectedly large amount at unpredictable times. In the performance log files of this type, each performance record carries not only the time at which it was recorded but also its source information, which may be an Internet Protocol address (IP address) or a hardware device name. Examples are the information security anomaly detection log that security monitoring software keeps for multiple servers, or the failed-login log (login fail) of multiple databases that store data of the same nature.
The multiple different fixed trend type is defined as indicator data corresponding to the same performance or capability of multiple pieces of software or hardware, whose fluctuations over time differ from one another. For example, take the number of open files on a server as the performance indicator. The open-file counts of many servers can be grouped into a few performance indicators, each with its own fixed trend. Suppose there are 100 servers: 30 of them usually have about 30 open files over a given period, 20 usually have about 20, and 50 usually have about 50. This is what the present utility model calls the multiple different fixed trend type.
In some embodiments, the trend monitoring module corresponding to the fixed trend type is built by training a first machine learning model on past indicator data of the fixed trend type. Training produces the change trend standard, and the first machine learning model produces the identification result according to that standard. The first machine learning model for the fixed trend type is an autoencoder model. Actual implementations are not limited to this; the model may also be a Long Short-Term Memory (LSTM) network, a Convolutional Neural Network (CNN), a Gated Recurrent Unit (GRU) or a similar machine learning model.
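The idea of learning a change trend standard from past periodic data and flagging departures from it can be illustrated without a neural network. The sketch below is deliberately not an autoencoder: it swaps in a much simpler per-phase mean profile, which plays the analogous role of reconstructing the expected cycle so that a large reconstruction error marks an abnormal point. The cycle length, the `k` multiplier and all function names are assumptions made for the illustration.

```python
from statistics import mean, pstdev
from typing import List


def fit_profile(history: List[float], period: int) -> List[float]:
    """Learn the average value at each position of the repeating cycle."""
    return [mean(history[p::period]) for p in range(period)]


def error_threshold(history: List[float], profile: List[float], k: float = 3.0) -> float:
    """Derive a tolerance from the errors seen on the training history."""
    period = len(profile)
    errors = [abs(v - profile[i % period]) for i, v in enumerate(history)]
    return mean(errors) + k * pstdev(errors)


def abnormal_points(window: List[float], profile: List[float], threshold: float) -> List[int]:
    """Indexes in the window whose error against the learned cycle is too large."""
    period = len(profile)
    return [i for i, v in enumerate(window)
            if abs(v - profile[i % period]) > threshold]


# Six past cycles of an assumed 4-sample pattern, each shifted by a small offset.
cycle = [10, 20, 30, 20]
offsets = [-1, 1, -2, 2, 0, 0]
history = [c + offsets[n] for n in range(6) for c in cycle]

profile = fit_profile(history, period=4)
threshold = error_threshold(history, profile)
flagged = abnormal_points([10, 20, 45, 20], profile, threshold)
print(profile, round(threshold, 2), flagged)
```

Here the learned profile and the threshold together act as the change trend standard. The window is assumed to start at the beginning of a cycle.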
In some embodiments, the trend monitoring module corresponding to the non-fixed trend type is built by training a second machine learning model on past indicator data of the non-fixed trend type. Training produces the change trend standard, and the second machine learning model produces the identification result according to that standard. The second machine learning model is an isolation forest model. Actual implementations are not limited to this; the model may also be a Random Forest algorithm, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) or the K-Nearest Neighbors (KNN) algorithm.
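The principle behind an isolation forest is that an unusual value can be separated from the rest of the data with very few random splits. The following is a simplified one-dimensional sketch of that principle only: it omits the subsampling and score normalization of the full algorithm, and the session counts, tree count and function names are hypothetical.

```python
import random
from typing import Dict, List


def path_length(x: float, sample: List[float], rng: random.Random,
                depth: int = 0, max_depth: int = 8) -> int:
    """Count the random splits needed to isolate x from the other values."""
    if depth >= max_depth or len(sample) <= 1:
        return depth
    lo, hi = min(sample), max(sample)
    if lo == hi:
        return depth  # only duplicates remain, no further split is possible
    split = rng.uniform(lo, hi)
    # Keep only the values that fall on the same side of the split as x.
    same_side = [v for v in sample if (v < split) == (x < split)]
    return path_length(x, same_side, rng, depth + 1, max_depth)


def average_path(x: float, data: List[float], trees: int = 200, seed: int = 0) -> float:
    """Average isolation depth over many random trees; shorter means more unusual."""
    rng = random.Random(seed)
    return sum(path_length(x, data, rng) for _ in range(trees)) / trees


# Assumed active-session counts with one sudden burst at index 9.
sessions = [12, 11, 13, 12, 14, 10, 13, 12, 11, 95, 12, 13]
paths: Dict[int, float] = {i: average_path(v, sessions) for i, v in enumerate(sessions)}
most_isolated = min(paths, key=paths.get)
print(most_isolated, round(paths[most_isolated], 2))
```

A cut-off on the average path length would then serve as the change trend standard; how that cut-off is chosen is not specified in the source.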
In some embodiments, the trend monitoring module corresponding to the multiple non-fixed trend type feeds the indicator data of the same kind of performance or capability within that type to a single third machine learning model to produce an intermediate result, then applies the explanation factors of an inference explanation algorithm to the intermediate result to produce the identification result. The third machine learning model is built by training on past indicator data of the same kind of performance or capability of the multiple non-fixed trend type. Training produces an intermediate standard, according to which the third machine learning model produces the intermediate result. The intermediate standard together with the explanation factors constitutes the change trend standard. The third machine learning model is an isolation forest model, and the inference explanation algorithm is the Shapley Additive Explanations (SHAP) algorithm. The third machine learning model may also be a Random Forest algorithm, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) or the K-Nearest Neighbors (KNN) algorithm, and the inference explanation algorithm may also be a global explanation model (Global Explanations).
In some embodiments of the present invention, the multiple different fixed trend type is defined as follows: multiple indicator data correspond to the same performance or capability of multiple pieces of software or hardware, and the fluctuation of each over time is a repeating periodic variation that differs from one to another. The trend monitoring module corresponding to the multiple different fixed trend type first applies a clustering algorithm to the multiple indicator data of that type to produce multiple clustering results. The indicator data belonging to each clustering result are then passed to that cluster's own machine learning model, which generates its own identification result. Each machine learning model is built by training on the past indicator data corresponding to its clustering result. Here, the clustering algorithm is the K-means clustering algorithm and the machine learning model is an autoencoder model. The clustering algorithm may alternatively be hierarchical clustering (Hierarchical Clustering), spectral clustering (Spectral Clustering), or Gaussian mixture models (GMM), among others, and the machine learning model may alternatively be a long short-term memory network (LSTM), a convolutional neural network (CNN), or a gated recurrent unit (GRU).
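The cluster-then-model flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the server names and open-file counts are invented, the clustering is a one-dimensional K-means on each server's mean level, and a simple per-cluster mean and standard deviation baseline stands in for the autoencoder, which is too large to show here.

```python
from statistics import mean, pstdev


def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalars; initial centers are taken at evenly spaced ranks."""
    ordered = sorted(values)
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]

    def assign(cs):
        return [min(range(k), key=lambda j: abs(v - cs[j])) for v in values]

    for _ in range(iters):
        labels = assign(centers)
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign(centers), centers


# Hypothetical past indicator data: open-file counts per server.
history = {
    "srv-1": [9900, 10100, 10000, 10200],
    "srv-2": [10050, 9950, 10150, 9850],
    "srv-3": [10000, 10100, 9900, 10000],
    "srv-4": [2000, 2100, 1900, 2000],
    "srv-5": [2050, 1950, 2000, 2100],
    "srv-6": [1900, 2000, 2100, 2000],
}

# Step 1: cluster servers by their typical level.
servers = list(history)
labels, _ = kmeans_1d([mean(history[s]) for s in servers], k=2)
cluster_of = dict(zip(servers, labels))

# Step 2: one model per cluster, trained on that cluster's pooled history.
# (Stand-in for the per-cluster autoencoder: a mean/std baseline.)
models = {}
for label in set(labels):
    pooled = [v for s in servers if cluster_of[s] == label for v in history[s]]
    models[label] = (mean(pooled), pstdev(pooled))


def is_abnormal(server, value, z_limit=3.0):
    """Judge a new reading against the model of the server's own cluster."""
    mu, sigma = models[cluster_of[server]]
    return abs(value - mu) > z_limit * sigma


normal_on_busy_server = is_abnormal("srv-1", 10100)
abnormal_on_quiet_server = is_abnormal("srv-4", 10100)
```

The same reading of 10,100 open files is ordinary for a server in the busy cluster and abnormal for one in the quiet cluster, which is the reason for training one model per clustering result rather than a single model for all servers.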
In some embodiments of the present invention, a system health value can further be calculated over a predetermined statistical interval. Each piece of software or hardware is assigned a health weight value. For each piece of software or hardware, the total number of identification results and the total number of anomalies of its performance indicators or capability indicators are accumulated over the predetermined statistical interval. The total number of anomalies is divided by the total number of identification results and multiplied by the health weight value to obtain that item's health deduction value:

$$h_i = (A_i \div T_i) \times W_i$$

where $h_i$ is the health deduction value of one piece of software or hardware, $T_i$ is the total number of identification results corresponding to $h_i$, $A_i$ is the total number of anomalies corresponding to $h_i$, and $W_i$ is the health weight value corresponding to $h_i$. The health deduction values are then subtracted from the initial health value to obtain the system health value:

$$H_s = H_I - (h_{i=1} + h_{i=2} + \dots + h_{i=n})$$

where $H_s$ is the system health value and $H_I$ is the initial health value, for example 100 or 100%; $i=1$ denotes the first piece of hardware or software, $i=2$ the second, and $i=n$ the last.
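For example, in a certain service system, the operating system, database, storage device, application, and network device are the objects used to evaluate the system health value of the service system. Each health weight value is 20%, and the predetermined statistical interval is once per hour. Within the current predetermined statistical interval, the operating system has 72 identification results in total and 4 anomalies, so its health deduction value is 1.11. The database and the storage device each have 48 identification results in total and 4 anomalies, so each has a health deduction value of 1.67. The application has 192 identification results in total and 3 anomalies, so its health deduction value is 0.31. The network device has 36 identification results in total and 1 anomaly, so its health deduction value is 0.56, as shown in the following table:

| Software or hardware | Total identification results | Total anomalies | Health weight value | Health deduction value |
|---|---|---|---|---|
| Operating system | 72 | 4 | 20% | 1.11 |
| Database | 48 | 4 | 20% | 1.67 |
| Storage device | 48 | 4 | 20% | 1.67 |
| Application | 192 | 3 | 20% | 0.31 |
| Network device | 36 | 1 | 20% | 0.56 |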
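The arithmetic of this example can be reproduced with a short Python sketch of the two formulas $h_i = (A_i \div T_i) \times W_i$ and $H_s = H_I - \sum_i h_i$. The class and function names are illustrative, and the sketch assumes that a 20% health weight counts as 20 points against an initial health value of 100, which is the reading that yields the deduction values quoted above.

```python
from dataclasses import dataclass


@dataclass
class ComponentStats:
    name: str
    total_results: int    # T_i: identification results in the interval
    total_anomalies: int  # A_i: results flagged as abnormal
    weight: float         # W_i: health weight, in points (assumed: 20% -> 20)


def health_deduction(c):
    """h_i = (A_i / T_i) * W_i; a component with no results deducts nothing."""
    if c.total_results == 0:
        return 0.0
    return c.total_anomalies / c.total_results * c.weight


def system_health(components, initial=100.0):
    """H_s = H_I - (h_1 + h_2 + ... + h_n)."""
    return initial - sum(health_deduction(c) for c in components)


components = [
    ComponentStats("operating system", 72, 4, 20.0),
    ComponentStats("database", 48, 4, 20.0),
    ComponentStats("storage device", 48, 4, 20.0),
    ComponentStats("application", 192, 3, 20.0),
    ComponentStats("network device", 36, 1, 20.0),
]

deductions = {c.name: round(health_deduction(c), 2) for c in components}
health = system_health(components)
```

With these inputs the unrounded deductions sum to 5.3125, so the system health value for the interval works out to 94.6875 under the stated assumption.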
In some embodiments, as shown in FIG. 3, the time at which each identification result of each piece of hardware or software was found to be abnormal is presented on a human-machine interface. The time axis can be adjusted as needed to the period to be viewed, for example 1 hour, 2 hours, or 24 hours. FIG. 3 is drawn using the anomaly occurrence times, within the predetermined statistical interval (1 hour), of the software and hardware mentioned in the preceding paragraph.
Take next the number of open files, which belongs to the multiple different fixed trend type. Suppose there are 100 servers, of which 10 usually have about 10K open files over a given period (K denotes 1,000). On one of them, the number of open files begins to show an unusual, repeatedly increasing trend between 12/15 and 12/30, and after 12/30 it suddenly rises above 20K (the dashed line in FIG. 4). A traditional warning mechanism issues a warning message only once the number of open files exceeds 20K. In the present invention, by contrast, suppose a fourth machine learning model (for example, an autoencoder model) is trained on the indicator data of each month of the past year (as shown in FIG. 5). The monthly changing trend standard of the fourth machine learning model then corresponds to the solid line in FIG. 4. When the fourth machine learning model performs identification, it therefore issues five early-warning messages during the unusual, repeatedly increasing trend that begins between 12/15 and 12/30.
Consider next indicator data of the multiple non-fixed trend type. Suppose the Data Management Center (DMC) of a commercial database-monitoring product provides log files of the various performance and capability indicators of the monitored databases, and the product's Search, Analytics & Report Center (SARC) detects an average of 200 abnormal logins (as shown in FIG. 6). In the present invention, the performance and capability log files of the different databases are first structured into indicator data. Each performance indicator and capability indicator of the same type is then identified by its own third machine learning model (for example, the isolation forest algorithm), after which the Shapley additive explanations algorithm computes Shapley values for each type of intermediate result to confirm whether it is truly abnormal. On the same data as the commercial software, the final identification result is about 20 anomalies (as shown in FIG. 7). This approach treats the erroneous or excessive judgments of the original commercial database-monitoring software as noise and filters them out, which greatly reduces the number of anomaly alerts and in turn the manpower needed to rule out non-abnormal conditions.
As another example of indicator data of the multiple non-fixed trend type, suppose login failures are to be monitored for anomalies across 200 servers. Because each server has its own normal level of login failures, a single third machine learning model is trained on the combined data, and the Shapley additive explanations algorithm is then used to compute each of the 200 servers' Shapley values for the current anomaly result. The servers with the highest Shapley contribution scores are taken as the ones actually behaving abnormally. This avoids having to train and run a separate machine learning model for each server.
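To make the attribution step concrete, the following Python sketch computes exact Shapley values for a joint anomaly score over a handful of servers and picks the top contributor. Everything specific in it is an assumption for illustration: the four server names, their baseline failure rates, and the joint score (the Euclidean norm of per-server z-scores), which stands in for the output of the third machine learning model.

```python
from itertools import permutations
from math import factorial, sqrt


def shapley_values(players, value_fn):
    """Exact Shapley values: average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        previous = value_fn(coalition)
        for p in order:
            coalition = coalition | {p}
            current_value = value_fn(coalition)
            totals[p] += current_value - previous
            previous = current_value
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


# Hypothetical baselines: (mean, std) of hourly failed logins per server.
baseline = {
    "web-01": (5.0, 2.0),
    "web-02": (50.0, 10.0),
    "db-01": (2.0, 1.0),
    "batch-01": (200.0, 40.0),
}
# Hypothetical failed-login counts in the hour being examined.
current = {"web-01": 6, "web-02": 55, "db-01": 12, "batch-01": 210}


def z_score(server):
    mu, sigma = baseline[server]
    return (current[server] - mu) / sigma


def joint_anomaly_score(coalition):
    """Stand-in for the shared model's score when only these servers are considered."""
    return sqrt(sum(z_score(s) ** 2 for s in coalition))


contributions = shapley_values(list(baseline), joint_anomaly_score)
top_server = max(contributions, key=contributions.get)
```

Here `db-01` receives the largest share even though `batch-01` has far more failed logins in absolute terms, because the attribution is relative to each server's own normal level. Exact enumeration grows factorially with the number of servers, so for 200 servers an approximate method would be needed in place of this loop.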
In summary, the present invention monitors each type of changing trend data with a different trend monitoring module. Because the trend monitoring modules monitor changing trends, each one can issue an early-warning message before the performance or capability of the corresponding software or hardware becomes abnormal, which avoids a decline in the service system's capacity or a loss of service. It can also rule out false anomaly detections, reducing the manpower needed for system debugging.
The foregoing merely illustrates preferred embodiments of the present invention and does not limit the scope of implementation. All simple substitutions and equivalent changes made in accordance with the claims and the specification of the present invention fall within the scope of its patent application.
1: System for monitoring multiple abnormal events
10: Receiving module
12: Conversion module
14: Induction module
16: Trend monitoring module
18: Warning module
2: Service system
Claims (15)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW112214145U TWM655004U (en) | 2023-12-25 | 2023-12-25 | System for monitoring multiple abnormal events |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| TWM655004U true TWM655004U (en) | 2024-05-01 |
Family
ID=92074546
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW112214145U TWM655004U (en) | 2023-12-25 | 2023-12-25 | System for monitoring multiple abnormal events |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWM655004U (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120572548A (en) * | 2025-08-01 | 2025-09-02 | 南京深度智控科技有限公司 | Distribution room operation and maintenance robot based on semantic analysis model |