TWM655004U - System for monitoring multiple abnormal events - Google Patents

Info

Publication number
TWM655004U
Authority
TW
Taiwan
Prior art keywords
trend
monitoring
module
indicator data
machine learning
Prior art date
Application number
TW112214145U
Other languages
Chinese (zh)
Inventor
吳盈宜
蔡恆萍
葉泰辰
張根琥
鄭國卿
莊婉君
陳宏宇
張智凱
林哲聖
Original Assignee
遠傳電信股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 遠傳電信股份有限公司
Priority to TW112214145U
Publication of TWM655004U

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The present utility model relates to a system for monitoring multiple abnormal events, comprising a receiving module, a conversion module, an induction module, a plurality of trend monitoring modules, and an alert module. The receiving module receives multiple indicator data. The conversion module is connected to the receiving module and converts each indicator data into trend data that vary over time. The induction module is connected to the conversion module and categorizes each of the trend data into one of several different trend types. The trend monitoring modules are each connected to the induction module, and each trend monitoring module monitors the trend data of its corresponding trend type to generate a recognition result. The alert module is connected to each trend monitoring module and issues a warning message when any recognition result indicates an anomaly.

Description

System for monitoring multiple abnormal events

The present utility model relates to monitoring systems, and in particular to a system for monitoring abnormal events in the various performance or capacity characteristics of the different kinds of hardware or software in a service system.

In a telecommunications service system, many different kinds of hardware and software work together to provide communication services. Delivering stable and efficient service requires detecting abnormal events in that hardware or software promptly, so the hardware and software must be monitored to maintain service quality. Monitoring such different hardware and software raises several major challenges:

1. Compatibility: Each kind of software (for example, operating systems, applications, and databases) and hardware (for example, storage devices and network devices) uses its own technologies and standards, so monitoring several kinds of software or hardware for anomalies requires separately customized monitoring criteria for each. For example, a storage device may need to be checked on both performance and capacity indicators. Performance indicators may include processor usage and the per-second throughput of the input/output ports (I/O per second); capacity indicators may include read/write (R/W) throughput and latency time.

2. Data heterogeneity: The data produced by each piece of software or hardware may differ in format and type, and unifying these data for analysis and monitoring becomes a fairly complex problem.

3. Timeliness and performance: Telecommunications systems usually need real-time or near-real-time monitoring so that problems can be handled quickly, yet processing and analyzing large volumes of data from different devices may affect system performance.

4. Accuracy of fault detection: Abnormal behavior may manifest differently on different devices, so accurately monitoring and identifying these abnormal states requires deep domain expertise and advanced analysis techniques.

In addition, conventional methods for monitoring abnormal events in hardware or software mostly set a separate monitoring threshold for each indicator and issue a warning only once a threshold is reached. With this approach, the service capacity of the service system may drop suddenly once an abnormal event occurs, and the system may even become unable to provide service. Before the abnormal event occurs, however, the performance or capacity of some software or hardware may already have been degrading gradually. If that degradation could be detected, a warning could be issued early enough to avoid the drop in service capacity or the outage. No existing platform or technical solution addresses the challenges and problems above. How to solve them with a suitable technical solution, one that need not be limited to monitoring the hardware or software of a telecommunications system, is therefore a problem the industry hopes to see solved.

In view of the problems of the prior art, the purpose of the present utility model is to detect anomalies by applying data analysis and machine learning techniques to the different indicators of the different software and hardware in a service system, and in particular to analyze the trends of those indicators so that a warning can be issued before an abnormal event occurs. The service system includes, but is not limited to, a telecommunications service system.

According to this purpose, a system for monitoring multiple abnormal events is provided, comprising a receiving module, a conversion module, an induction module, a plurality of trend monitoring modules, and an alert module. The receiving module receives multiple indicator data, each of which represents one performance or capacity characteristic of a different piece of software or hardware. The conversion module is connected to the receiving module, receives the indicator data, and converts each indicator data into its own trend data that vary over time. The induction module is connected to the conversion module and categorizes each trend data into one of a plurality of different trend types. The trend monitoring modules are each connected to the induction module; each trend monitoring module monitors the trend data of its corresponding trend type and has its own trend criterion, which it uses to generate a recognition result. The alert module is connected to each trend monitoring module. When the recognition result of any trend monitoring module is abnormal, the alert module issues a warning message indicating that one or more of the indicator data are abnormal.

One of the plurality of trend types is a fixed trend type, defined as indicator data whose fluctuation over time follows a recurring, periodic pattern of change.

The trend monitoring module corresponding to the fixed trend type is built by training a first machine learning model on multiple past indicator data of the fixed trend type to produce the trend criterion, and the first machine learning model generates the recognition result according to that trend criterion.

The first machine learning model is an autoencoder model.

One of the plurality of trend types is a non-fixed trend type, defined as indicator data whose fluctuation over time changes by an unexpectedly large amount at unpredictable times.

The trend monitoring module corresponding to the non-fixed trend type is built by training a second machine learning model on multiple past indicator data of the non-fixed trend type to produce the trend criterion, and the second machine learning model generates the recognition result according to that trend criterion.

The second machine learning model is an isolation forest model.

One of the plurality of trend types is a multiple non-fixed trend type, defined as multiple indicator data, corresponding to different performance or capacity characteristics of multiple pieces of software or hardware, each of whose fluctuation over time changes by an unexpectedly large amount at unpredictable times.

For the trend monitoring module corresponding to the multiple non-fixed trend type, the multiple indicator data of the same kind of performance or capacity within that trend type are processed by a single third machine learning model to produce an intermediate result, and at least one explanation factor of an explanation algorithm is then applied to the intermediate result to generate the recognition result. The third machine learning model is built by training on multiple past indicator data of the same kind of performance or capacity of the multiple non-fixed trend type, which produces an intermediate criterion; the third machine learning model generates the intermediate result according to that intermediate criterion. The intermediate criterion together with the at least one explanation factor constitutes the trend criterion.

The third machine learning model is an isolation forest model, and the explanation algorithm is the Shapley additive explanations (SHAP) algorithm.

One of the plurality of trend types is a multiple different fixed trend type, defined as indicator data, corresponding to the same performance or capacity characteristic of multiple pieces of software or hardware, whose fluctuations over time follow different recurring, periodic patterns of change.

The trend monitoring module corresponding to the multiple different fixed trend type applies a clustering algorithm to the multiple indicator data of that trend type to produce multiple clusters. All indicator data belonging to each cluster are then processed by that cluster's own fourth machine learning model to generate its own recognition result. Each fourth machine learning model is built by training on the multiple past indicator data of its cluster.

The clustering algorithm is the K-means clustering algorithm, and the fourth machine learning model is an autoencoder model.

As described above, the different types of trend data can each be monitored by a different trend monitoring module. Because the trend monitoring modules monitor trends, each of them can issue a warning message before the performance or capacity of the corresponding software or hardware becomes abnormal, avoiding a drop in the service capacity of the service system or an outage.

1: System for monitoring multiple abnormal events

10: Receiving module

12: Conversion module

14: Induction module

16: Trend monitoring module

18: Alert module

2: Service system

S101~S106: Steps

FIG. 1 is a schematic diagram of the system architecture of the present utility model.
FIG. 2 is a flowchart of the present utility model.
FIG. 3 is a schematic diagram of a human-machine interface showing the times at which recognition results were abnormal.
FIG. 4 is a schematic diagram of the indicator data used for recognition and the indicator data used for training for the multiple different fixed trend type.
FIG. 5 is a schematic diagram of one year of training indicator data for the multiple different fixed trend type.
FIG. 6 is a schematic diagram of the number of abnormal logins detected by commercial database-monitoring software.
FIG. 7 is a schematic diagram of the number of abnormal logins detected by the present utility model.

Embodiments of the present utility model are explained further below with reference to the drawings. Wherever possible, the same reference numerals denote the same or similar components in the drawings and the description. Shapes and thicknesses in the drawings may be exaggerated for simplicity and ease of labeling. Components not specifically shown in the drawings or described in the description take forms known to a person of ordinary skill in the art, who may make various changes and modifications based on the content of the present utility model.

As shown in FIG. 1, the present utility model is a system for monitoring multiple abnormal events. The system 1 comprises a receiving module 10, a conversion module 12, an induction module 14, a plurality of trend monitoring modules 16, and an alert module 18. The receiving module 10 receives multiple indicator data from a service system 2, each of which represents one performance or capacity characteristic of a different piece of software or hardware of the service system 2. The conversion module 12 is connected to the receiving module 10, receives the indicator data, and converts each indicator data into its own trend data that vary over time. The induction module 14 is connected to the conversion module 12 and categorizes each trend data into one of a plurality of different trend types. The trend monitoring modules 16 are each connected to the induction module 14; each trend monitoring module 16 monitors the trend data of its corresponding trend type and has its own trend criterion, which it uses to generate a recognition result. The alert module 18 is connected to each trend monitoring module 16. When the recognition result of any trend monitoring module 16 is abnormal, the alert module 18 issues a warning message indicating that one or more of the indicator data are abnormal.

In some embodiments, the system 1 may be a desktop computer, a notebook computer, a server, or any electronic computing device that has the receiving module 10, the conversion module 12, the induction module 14, the plurality of trend monitoring modules 16, and the alert module 18. Such a device may be connected to the service system through a local or remote connection. The system 1 may also be connected to the service system in the form of cloud computing.

Referring to FIG. 2, the present utility model also provides a method for monitoring multiple abnormal events. The method is applied to the system 1 and performs the following steps:

(S101) The receiving module 10 receives multiple indicator data, each of which represents one performance or capacity characteristic of a different piece of software or hardware.

(S102) The conversion module 12 converts each indicator data into trend data that vary over time.

(S103) The induction module 14 categorizes each trend data into one of a plurality of trend types. The trend types may include, but are not limited to, the fixed trend type, the non-fixed trend type, the multiple non-fixed trend type, and the multiple different fixed trend type.

(S104) The trend monitoring modules 16 each monitor the trend data of their corresponding trend type. Each trend monitoring module 16 has its own trend criterion, which it uses to generate a recognition result.

(S105) The alert module 18 determines whether the recognition result is normal or abnormal. If the trend data meet the trend criterion, the recognition result is normal and the method returns to step (S101).

(S106) If the trend data do not meet the trend criterion, the recognition result is abnormal and the alert module 18 issues a warning message.
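The loop of steps S101 to S106 can be sketched roughly as follows. This is only an illustrative outline, not the patented implementation: the function names, the name-to-type lookup table, and the `max_jump` placeholder criterion are all assumptions introduced here, and the description itself uses trained machine learning models as the per-type criteria.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Series = List[Tuple[float, float]]  # (timestamp, value) pairs, ordered by time


@dataclass
class Alert:
    indicator: str
    trend_type: str


def monitor_once(
    indicators: Dict[str, Series],
    categorize: Callable[[str, Series], str],
    monitors: Dict[str, Callable[[Series], bool]],
) -> List[Alert]:
    """One pass of S101-S106: alert on every indicator whose trend data
    fail the criterion of the monitor assigned to its trend type."""
    alerts = []
    for name, series in indicators.items():       # S101/S102: time-ordered data
        trend_type = categorize(name, series)     # S103: choose one trend type
        is_normal = monitors[trend_type](series)  # S104: per-type criterion
        if not is_normal:                         # S105/S106: warn if abnormal
            alerts.append(Alert(name, trend_type))
    return alerts


# Hypothetical stand-ins, for illustration only.
TYPE_BY_NAME = {"cpu_usage": "fixed", "sql_response_time": "non_fixed"}


def categorize(name: str, series: Series) -> str:
    return TYPE_BY_NAME[name]


def max_jump(limit: float) -> Callable[[Series], bool]:
    """Placeholder criterion: normal while consecutive values change by less than limit."""
    def check(series: Series) -> bool:
        values = [v for _, v in series]
        return all(abs(b - a) < limit for a, b in zip(values, values[1:]))
    return check


monitors = {"fixed": max_jump(20.0), "non_fixed": max_jump(1.0)}
indicators = {
    "cpu_usage": [(0, 35.0), (60, 40.0), (120, 38.0)],
    "sql_response_time": [(0, 0.2), (60, 0.3), (120, 5.1)],
}
alerts = monitor_once(indicators, categorize, monitors)
print(alerts)
```

Keeping the monitors in a dictionary keyed by trend type mirrors the one-monitor-per-type structure of the description, so a trained model can replace a placeholder without touching the loop.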

Through this loop, whether each indicator data shows an abnormal trend can be checked repeatedly over time, and a warning message can be issued before the monitoring threshold used in the prior art is exceeded. Moreover, because each indicator data is collected over time, the anomaly can be traced back to the specific performance or capacity indicator data in which it arose, so that the root cause can be addressed.

In some embodiments, each indicator data represents one performance or capacity characteristic of a different piece of software or hardware. Taking a telecommunications service system as an example, different hardware devices (storage devices, network devices, and so on) and different software (operating systems, applications, databases, and so on) each have their own performance and capacity indicators. An operating system, for example, is evaluated on performance only, so its performance indicators include processor usage, hard disk I/O transfer rate, memory usage, and the per-second throughput of the input/output ports (I/O per second). A storage device has both performance and capacity indicators: the performance indicators include processor usage and the per-second transfer rate of the input/output ports, and the capacity indicators include read/write (R/W) throughput and latency time.

For the Windows operating system, the performance indicators correspond to those that the Task Manager application can display. The "typeperf" command can write the data for each performance indicator to its own performance log file, which may be in CSV format or plain text. Other monitoring software, such as Performance Co-Pilot (PCP) or OpManager, can also monitor the different software or hardware of the service system and generate a performance log file for each performance indicator or a capacity log file for each capacity indicator. Actual implementations are not limited to these tools; any means of obtaining the performance or capacity indicators of the different software or hardware may be used.

In some embodiments, each indicator data is converted into trend data that vary over time as follows. Each of the performance log files described above contains multiple performance records, and each record carries the time at which it was recorded, so a performance log file can be converted into trend data that vary over time. The conversion can be done by a custom-written program or by an existing general-purpose data analysis and monitoring platform, such as Splunk, ELK Stack, Grafana, Prometheus, or Zabbix.
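A custom conversion program of the kind mentioned above might look like the following minimal sketch. The column names `timestamp` and `value`, the timestamp format, and the sample records are assumptions made for illustration; real log files (for example those written by typeperf) use their own headers and formats.

```python
import csv
import io
from datetime import datetime


def log_to_trend(csv_text, time_col="timestamp", value_col="value",
                 time_format="%Y-%m-%d %H:%M:%S"):
    """Turn timestamped performance records into a time-ordered series.

    Returns a list of (datetime, float) pairs sorted by time. Records with a
    missing or unparsable timestamp or value are skipped.
    """
    points = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            moment = datetime.strptime(row[time_col], time_format)
            value = float(row[value_col])
        except (KeyError, TypeError, ValueError):
            continue  # malformed or incomplete record
        points.append((moment, value))
    points.sort(key=lambda point: point[0])
    return points


# Illustrative sample only: out-of-order records and one unparsable value.
sample = """timestamp,value
2024-03-01 10:02:00,41.5
2024-03-01 10:00:00,37.0
2024-03-01 10:01:00,n/a
2024-03-01 10:03:00,39.25
"""
trend = log_to_trend(sample)
for moment, value in trend:
    print(moment.isoformat(), value)
```

Sorting by timestamp rather than trusting file order keeps the series usable when records arrive out of order.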

In some embodiments, the fixed trend type is defined as indicator data whose fluctuation over time follows a recurring, periodic pattern of change; within a fixed time interval, the data follow a fixed trend. Examples of the fixed trend type include performance indicators of a storage device, such as processor utilization and the per-second transfer rate of the input/output ports, and capacity indicators of a storage device, such as read or write speed and latency time.

The non-fixed trend type is defined as indicator data whose fluctuation over time changes by an unexpectedly large amount at unpredictable times. Data of this type have no fixed trend, and the changes are usually caused by a sudden surge of jobs or processes. Examples include database performance indicators such as SQL service response time and active sessions, as well as the number of processes the database is running or handling and the number of database sessions.

The multiple non-fixed trend type is defined as multiple indicator data, corresponding to different performance or capacity characteristics of multiple pieces of software or hardware, each of whose fluctuation over time changes by an unexpectedly large amount at unpredictable times. In addition to containing multiple performance records that each carry the time of recording, the performance log files of the multiple non-fixed trend type also record the source of each record, which may be an Internet Protocol address (IP address) or a hardware device name. Examples include the information security anomaly detection log files that security monitoring software produces for multiple servers, and the login failure log files of multiple databases that store data of the same nature.

The multiple different fixed trend type is defined as indicator data, corresponding to the same performance or capacity characteristic of multiple pieces of software or hardware, whose fluctuations over time differ from one another. Take the number of open files on a server as the performance indicator: the open-file counts of many servers can be grouped into a few different fixed-trend patterns. Suppose there are 100 servers, of which 30 usually have 30 open files over a given period, 20 usually have 20, and 50 usually have 50. This is what the present utility model calls the multiple different fixed trend type.

In some embodiments, the trend monitoring module corresponding to the fixed trend type is built by training a first machine learning model on multiple past indicator data of the fixed trend type to produce the trend criterion, and the first machine learning model generates the recognition result according to that trend criterion. The first machine learning model for the fixed trend type is an autoencoder model. Actual implementations are not limited to this; the model may also be a long short-term memory network (LSTM), a convolutional neural network (CNN), a gated recurrent unit (GRU), or another machine learning model.
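An autoencoder flags anomalies by how poorly it reconstructs new data after being trained on normal data. The sketch below illustrates only that reconstruction-error idea and deliberately swaps the autoencoder for a much simpler stand-in, a per-phase mean profile of the cycle, so that it runs with the standard library alone. The period, the margin, the sample values, and the assumption that each window starts at phase 0 of the cycle are all illustrative choices, not part of the description.

```python
from statistics import mean


def fit_profile(history, period):
    """Learn the mean value at each phase of the cycle from past data."""
    return [mean(history[phase::period]) for phase in range(period)]


def reconstruction_error(window, profile):
    """Mean absolute gap between a window and its reconstruction from the profile.

    Assumes the window starts at phase 0 of the cycle.
    """
    return mean(abs(x - profile[i % len(profile)]) for i, x in enumerate(window))


def fit_threshold(history, profile, margin=1.5):
    """Trend criterion: the largest per-cycle training error times a safety margin."""
    period = len(profile)
    cycles = [history[i:i + period]
              for i in range(0, len(history) - period + 1, period)]
    return margin * max(reconstruction_error(cycle, profile) for cycle in cycles)


# Four past cycles of a repeating load pattern with small offsets (made-up data).
cycle_shape = [10, 20, 60, 80, 50, 15]
history = [v + offset for offset in (0, 1, -1, 2) for v in cycle_shape]

profile = fit_profile(history, period=6)
threshold = fit_threshold(history, profile)

normal_window = [11, 21, 61, 81, 51, 16]
drifting_window = [10, 20, 60, 80, 70, 45]  # load fails to fall off at the end

normal_is_abnormal = reconstruction_error(normal_window, profile) > threshold
drift_is_abnormal = reconstruction_error(drifting_window, profile) > threshold
print(normal_is_abnormal, drift_is_abnormal)
```

With a real autoencoder, `fit_profile` would be replaced by model training and the reconstruction would come from the decoder, while the thresholding step would stay structurally the same.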

In some embodiments, the trend monitoring module corresponding to the non-fixed trend type is built by training a second machine learning model on multiple past indicator data of the non-fixed trend type to produce the trend criterion, and the second machine learning model generates the recognition result according to that trend criterion. The second machine learning model is an isolation forest model. Actual implementations are not limited to this; the model may also be a random forest algorithm (Random Forest), density-based spatial clustering of applications with noise (DBSCAN), or the K-nearest neighbors algorithm (KNN).
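To make the isolation forest idea concrete, here is a small single-feature version written with the standard library only. It is a teaching sketch under stated assumptions, not the model used in the description: the tree count, the sample size, the fixed seed, the made-up response times, and the 0.6 cut-off mentioned in the comments are all illustrative. A production system would more likely use an established library implementation.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful search in a binary tree of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Split the values at random points until each is isolated or depth runs out."""
    low, high = min(values), max(values)
    if len(values) <= 1 or depth >= max_depth or low == high:
        return ("leaf", len(values))
    split = rng.uniform(low, high)
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    if not left or not right:
        return ("leaf", len(values))
    return ("node", split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(x, tree, depth=0):
    """Number of splits needed to reach x's leaf, adjusted for unsplit leaf size."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, split, left, right = tree
    return path_length(x, left if x < split else right, depth + 1)


def fit_forest(values, n_trees=100, sample_size=64, seed=0):
    """Build the forest; needs at least two values."""
    rng = random.Random(seed)
    size = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(values, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size


def anomaly_score(x, forest):
    """Score in (0, 1]; values near 1 mean x is isolated after very few splits."""
    trees, size = forest
    mean_path = sum(path_length(x, tree) for tree in trees) / len(trees)
    return 2.0 ** (-mean_path / average_path_length(size))


# Made-up response times in seconds: steady values plus one sudden spike.
response_times = [0.2 + 0.005 * i for i in range(40)] + [5.0]
forest = fit_forest(response_times)
spike_score = anomaly_score(5.0, forest)
typical_score = anomaly_score(0.3, forest)
print(round(spike_score, 2), round(typical_score, 2))
# An assumed cut-off such as 0.6 would flag the spike and pass the typical value.
```

The spike is separated from the rest by almost any random split, so its average path is short and its score is high, which is why this family of models suits indicators that change abruptly at unpredictable times.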

In some embodiments, for the trend monitoring module corresponding to the multiple non-fixed trend type, the multiple indicator data of the same kind of performance or capacity within that trend type are processed by a single third machine learning model to produce an intermediate result, and an explanation factor of an explanation algorithm is then applied to the intermediate result to generate the recognition result. The third machine learning model is built by training on multiple past indicator data of the same kind of performance or capacity of the multiple non-fixed trend type, which produces an intermediate criterion; the third machine learning model generates the intermediate result according to that intermediate criterion. The intermediate criterion together with the explanation factor constitutes the trend criterion. The third machine learning model is an isolation forest model, and the explanation algorithm is the Shapley additive explanations (SHAP) algorithm. The third machine learning model may also be a random forest algorithm (Random Forest), density-based spatial clustering of applications with noise (DBSCAN), or the K-nearest neighbors algorithm (KNN), and the explanation algorithm may also be a global explanation model (Global Explanations).
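The role of the explanation factor can be illustrated with exact Shapley values, which attribute a model's score to the individual inputs. The sketch below is an assumption-laden toy: `toy_score` is a hypothetical stand-in for the third model's intermediate result, the source names and counts are invented, and the values are computed by brute force over all orderings rather than with the SHAP library.

```python
from itertools import permutations


def shapley_values(score, observed, baseline):
    """Exact Shapley attribution of score(observed) - score(baseline).

    Each feature's value is its average marginal effect on the score when the
    features are switched from baseline to observed in every possible order.
    Brute force, so only suitable for a handful of features.
    """
    names = list(observed)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = score(current)
        for name in order:
            current[name] = observed[name]
            now = score(current)
            totals[name] += now - previous
            previous = now
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_score(counts):
    """Hypothetical anomaly score that grows with total failed logins."""
    total = sum(counts.values())
    return total * total / 1000.0


# Invented failed-login counts per database in one interval, and a usual level.
observed = {"db-a": 3, "db-b": 40, "db-c": 5}
baseline = {"db-a": 2, "db-b": 2, "db-c": 2}

contributions = shapley_values(toy_score, observed, baseline)
main_source = max(contributions, key=contributions.get)
print(main_source, contributions)
```

The contributions always add up to the difference between the observed score and the baseline score, so the largest one identifies the source that drove the anomaly, which is the kind of traceability the description asks of the explanation factor.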

In some embodiments, the multiple different fixed trend type is defined as multiple indicator data, corresponding to the same performance or capacity characteristic of multiple pieces of software or hardware, whose fluctuations over time follow different recurring, periodic patterns of change. The trend monitoring module corresponding to this trend type applies a clustering algorithm to the multiple indicator data to produce multiple clusters. The indicator data belonging to each cluster are then processed by that cluster's own machine learning model to generate its own recognition result, and each machine learning model is built by training on the past indicator data of its cluster. The clustering algorithm is the K-means clustering algorithm, and the machine learning model is an autoencoder model. The clustering algorithm may also be hierarchical clustering, spectral clustering, or a Gaussian mixture model (GMM), and the machine learning model may also be a long short-term memory network (LSTM), a convolutional neural network (CNN), a gated recurrent unit (GRU), or another machine learning model.
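The cluster-then-monitor structure can be sketched with a one-dimensional K-means over each server's typical open-file count, reusing the 30/20/50 server split from the example in this description. The starting centroids, the small jitter added to the counts, and the per-cluster tolerance check (a simple stand-in for the per-cluster autoencoder) are assumptions made for illustration.

```python
from statistics import mean


def nearest(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda k: abs(value - centroids[k]))


def kmeans_1d(values, initial_centroids, iterations=20):
    """Plain K-means on scalar values; returns (centroids, labels)."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for value in values:
            groups[nearest(value, centroids)].append(value)
        updated = [mean(group) if group else old
                   for group, old in zip(groups, centroids)]
        if updated == centroids:
            break  # assignments have stabilised
        centroids = updated
    return centroids, [nearest(value, centroids) for value in values]


def is_abnormal(reading, cluster, centroids, tolerance=5.0):
    """Stand-in for the per-cluster model: judge a reading against its own cluster."""
    return abs(reading - centroids[cluster]) > tolerance


# Typical open-file counts for 100 servers: 20 near 20, 30 near 30, 50 near 50.
typical = ([20 + (i % 3 - 1) for i in range(20)]
           + [30 + (i % 3 - 1) for i in range(30)]
           + [50 + (i % 3 - 1) for i in range(50)])

centroids, labels = kmeans_1d(typical, initial_centroids=[10, 35, 60])
cluster_sizes = [labels.count(k) for k in range(len(centroids))]
print([round(c) for c in centroids], cluster_sizes)

# 48 open files is abnormal for a server in the "20" cluster, normal in the "50" one.
print(is_abnormal(48, 0, centroids), is_abnormal(48, 2, centroids))
```

Judging each server against its own cluster is what lets one shared indicator carry several different notions of normal, which a single global model or threshold would blur together.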

In some embodiments of the present invention, a system health value may also be calculated for each predetermined statistical interval. Each piece of software or hardware is assigned a health weight value. For each piece of software or hardware, the total number of identification results and the total number of anomalies of its performance or capability indicators within the predetermined statistical interval are accumulated. The total number of anomalies is divided by the total number of identification results and multiplied by the health weight value to obtain the health deduction value:

$h_i = (A_i \div T_i) \times W_i$

where $h_i$ is the health deduction value of the $i$-th piece of software or hardware, $T_i$ is its total number of identification results, $A_i$ is its total number of anomalies, and $W_i$ is its health weight value. The system health value is obtained by subtracting the health deduction values from the initial health value:

$H_s = H_I - (h_1 + h_2 + \dots + h_n)$

where $H_s$ is the system health value and $H_I$ is the initial health value, for example 100 or 100%. The index $i = 1$ denotes the first piece of hardware or software, $i = 2$ the second, and $i = n$ the last.

For example, in a certain service system, the operating system, database, storage device, application, and network device are the objects used to evaluate the system health value of the service system. Each health weight value is 20%, and the predetermined statistical interval is one hour. Within the current interval, the operating system has 72 identification results and 4 anomalies, so its health deduction value is 1.11. The database and the storage device each have 48 identification results and 4 anomalies, so each has a health deduction value of 1.67. The application has 192 identification results and 3 anomalies, so its health deduction value is 0.31. The network device has 36 identification results and 1 anomaly, so its health deduction value is 0.56, as shown in the following table:

Object              Health weight   Total identification results   Total anomalies   Health deduction value
Operating system    20%             72                             4                 1.11
Database            20%             48                             4                 1.67
Storage device      20%             48                             4                 1.67
Application         20%             192                            3                 0.31
Network device      20%             36                             1                 0.56

Accordingly, when the initial health value is 100, the system health value is 100 - (1.11 + 1.67 + 1.67 + 0.31 + 0.56) = 94.68.
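The health value calculation can be checked with a short sketch that applies the two formulas to the five objects of the example. The class and function names are hypothetical, and treating an object with no identification results as contributing no deduction is an assumption not stated in the text. The weight is expressed in percentage points (20 for 20%), which is what reproduces the deduction values given above.

```python
from dataclasses import dataclass


@dataclass
class ComponentStats:
    name: str
    total_results: int    # T_i: identification results in the interval
    total_anomalies: int  # A_i: results flagged as abnormal
    weight: float         # W_i: health weight in percentage points


def health_deduction(c: ComponentStats) -> float:
    """h_i = (A_i / T_i) * W_i."""
    if c.total_results == 0:
        return 0.0  # assumption: nothing observed, nothing deducted
    return c.total_anomalies / c.total_results * c.weight


def system_health(components, initial_health=100.0) -> float:
    """H_s = H_I - (h_1 + h_2 + ... + h_n)."""
    return initial_health - sum(health_deduction(c) for c in components)


components = [
    ComponentStats("operating system", 72, 4, 20.0),
    ComponentStats("database", 48, 4, 20.0),
    ComponentStats("storage device", 48, 4, 20.0),
    ComponentStats("application", 192, 3, 20.0),
    ComponentStats("network device", 36, 1, 20.0),
]
deductions = {c.name: health_deduction(c) for c in components}
health = system_health(components)
```

Without intermediate rounding the result is 94.6875. The figure of 94.68 in the example follows from summing the deduction values after each has been written to two decimal places.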

In some embodiments, as shown in FIG. 3, the occurrence time of every identification result that is abnormal is presented on a human-machine interface for each piece of hardware or software. The time axis can be adjusted to the period to be viewed, for example 1 hour, 2 hours, or 24 hours. FIG. 3 is drawn using the anomaly occurrence times, within the predetermined statistical interval (1 hour), of the software and hardware described in the preceding paragraph.

As a further example, consider the number of open files, which belongs to the multiple different fixed trend type. Suppose there are 100 servers, 10 of which usually have about 10K open files over a period of time (K denotes 1,000). On one of these servers, the number of open files begins to show an unusual, repeatedly increasing trend between 12/15 and 12/30, and after 12/30 it suddenly rises above 20K (the dashed line in FIG. 4). A conventional alert mechanism issues a warning message only once the number of open files exceeds 20K. In the present invention, suppose the fourth machine learning model (for example, an autoencoder model) is trained on the indicator data of each month of the past year (as shown in FIG. 5). The monthly changing trend standard of the fourth machine learning model then corresponds to the solid line in FIG. 4. When the fourth machine learning model performs identification, it issues five early-warning messages during the unusual, repeatedly increasing trend that begins between 12/15 and 12/30, before the 20K threshold is reached.
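The difference between a fixed threshold and a learned trend standard can be sketched as below. The learned standard here is a per-phase band (mean plus or minus three standard deviations at each position of a repeating cycle), which is a simple stand-in for the autoencoder named in the text. All readings and the cycle length are hypothetical, and the sketch does not attempt to reproduce the five early warnings of the example.

```python
from statistics import mean, pstdev

STATIC_LIMIT = 20_000  # conventional fixed threshold (20K open files)


def learn_phase_bands(cycles, n_sigma=3.0):
    """For each position in the cycle, learn an acceptable (low, high) band."""
    bands = []
    for phase_values in zip(*cycles):
        centre, spread = mean(phase_values), pstdev(phase_values)
        bands.append((centre - n_sigma * spread, centre + n_sigma * spread))
    return bands


def first_index(series, predicate):
    """Index of the first reading for which predicate(index, value) holds."""
    for i, value in enumerate(series):
        if predicate(i, value):
            return i
    return None


# Hypothetical history: four past cycles, each with four readings.
cycles = [
    [9_000, 10_000, 11_000, 10_000],
    [9_200, 10_100, 10_900, 9_900],
    [8_900, 9_900, 11_100, 10_100],
    [9_100, 10_000, 11_000, 10_000],
]
bands = learn_phase_bands(cycles)

# Hypothetical recent readings: one normal cycle, then a steady abnormal rise.
recent = [9_100, 10_050, 11_000, 10_000,
          9_600, 11_500, 12_800, 12_000,
          14_000, 16_500, 19_000, 21_500]


def outside_band(i, value):
    low, high = bands[i % len(bands)]
    return not (low <= value <= high)


static_alarm = first_index(recent, lambda i, v: v > STATIC_LIMIT)
trend_alarm = first_index(recent, outside_band)
early_warnings = sum(1 for i, v in enumerate(recent[:static_alarm]) if outside_band(i, v))
```

With these readings the trend check fires at index 4, while the fixed threshold is not crossed until index 11, so several early warnings are available before the conventional alert.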

As an example of indicator data of the multiple non-fixed trend type, suppose the Data Management Center (DMC) of a commercial database-monitoring software provides log files of the various performance and capability indicators of the monitored databases, and the Search, Analytics & Report Center (SARC) of the same commercial software detects an average of 200 abnormal login events (as shown in FIG. 6). In the present invention, the log files of the various performance and capability indicators of the different databases are first structured into indicator data. Each performance or capability indicator of the same type is then identified by its own third machine learning model (for example, an isolation forest algorithm), and the Shapley additive explanations algorithm computes Shapley values for the intermediate result of each type in order to confirm whether it is truly abnormal. On the same data as the commercial software, the final identification result contains about 20 anomalies (as shown in FIG. 7). This approach treats the erroneous or excessive judgments of the original commercial monitoring software as noise and filters them out, which greatly reduces the number of anomaly alerts and therefore the manpower needed to rule out non-abnormal conditions.

As another example of indicator data of the multiple non-fixed trend type, suppose login failures are to be monitored for anomalies on 200 servers. Each server has a different normal level of login failures. A single third machine learning model is trained on the combined data, and the Shapley additive explanations algorithm then computes the Shapley value of each of the 200 servers for the current anomaly result. The servers with the highest Shapley contribution scores are taken as the ones that are actually behaving abnormally. This removes the need to train and run a separate machine learning model for each server.
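The attribution step can be illustrated with an exact Shapley computation over a handful of servers. The joint anomaly score below (Euclidean distance from each server's own baseline, in standard-deviation units) is an assumed stand-in for the output of the trained model, and the server names and counts are hypothetical. Exact enumeration over all orderings is only feasible for a few servers. For 200 servers a sampling approximation or a dedicated library such as SHAP would be needed.

```python
from itertools import permutations
from math import sqrt


def anomaly_score(current, baseline_mean, baseline_std, revealed):
    """Joint score using current values only for the servers in `revealed`.

    Servers not yet revealed are treated as sitting at their own baseline,
    so they add nothing to the distance.
    """
    total = 0.0
    for name in revealed:
        z = (current[name] - baseline_mean[name]) / baseline_std[name]
        total += z * z
    return sqrt(total)


def shapley_values(current, baseline_mean, baseline_std):
    """Average marginal contribution of each server over all orderings."""
    names = list(current)
    contrib = {n: 0.0 for n in names}
    orders = list(permutations(names))
    for order in orders:
        revealed = []
        previous = 0.0
        for n in order:
            revealed.append(n)
            score = anomaly_score(current, baseline_mean, baseline_std, revealed)
            contrib[n] += score - previous
            previous = score
    return {n: v / len(orders) for n, v in contrib.items()}


# Hypothetical login-failure counts; each server has its own normal level.
baseline_mean = {"srv-a": 5.0, "srv-b": 50.0, "srv-c": 20.0, "srv-d": 100.0}
baseline_std = {"srv-a": 2.0, "srv-b": 10.0, "srv-c": 5.0, "srv-d": 20.0}
current = {"srv-a": 6.0, "srv-b": 55.0, "srv-c": 60.0, "srv-d": 95.0}

values = shapley_values(current, baseline_mean, baseline_std)
full_score = anomaly_score(current, baseline_mean, baseline_std, list(current))
culprit = max(values, key=values.get)
```

Here `srv-d` has the highest raw failure count, but that count is normal for it. The largest Shapley value goes to `srv-c`, whose count is far above its own baseline, and the values sum to the full joint score.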

In summary, the present invention monitors different types of changing trend data with different trend monitoring modules. Because each trend monitoring module monitors the changing trend itself, it can issue an early-warning message before the performance or capability of the corresponding software or hardware becomes abnormal. This avoids a decline or loss of the service system's service capability, and it can also rule out false anomaly detections, thereby reducing the manpower required for system debugging.

The above merely illustrates preferred embodiments of the present invention and is not intended to limit the scope of implementation. All simple substitutions and equivalent changes made according to the claims and the specification of the present invention fall within the scope of the patent application.

1: System for monitoring multiple abnormal events

10: Receiving module

12: Conversion module

14: Induction module

16: Trend monitoring module

18: Alert module

2: Service system

Claims (15)

1. A system for monitoring multiple abnormal events, comprising: a receiving module that receives a plurality of indicator data, each indicator data representing one performance or capability of a different piece of software or hardware; a conversion module, connected to the receiving module, that receives the plurality of indicator data and converts each indicator data into changing trend data that varies over time; an induction module, connected to the conversion module, that categorizes each changing trend data into one of a plurality of different trend types; a plurality of trend monitoring modules, each connected to the induction module, wherein each trend monitoring module monitors the changing trend data corresponding to its trend type and has its own changing trend standard for generating its own identification result; and an alert module, connected to the plurality of trend monitoring modules, wherein when the identification result of any one of the trend monitoring modules is abnormal, the alert module issues an early-warning message notifying that one or more of the indicator data are abnormal.

2. The system for monitoring multiple abnormal events as described in claim 1, wherein one of the plurality of different trend types is a fixed trend type, and the fixed trend type is defined as the fluctuation of the indicator data over time following a repeating periodic variation.

3. The system for monitoring multiple abnormal events as described in claim 2, wherein the trend monitoring module corresponding to the fixed trend type is built by training a first machine learning model on a plurality of past indicator data of the fixed trend type to generate the changing trend standard, and the first machine learning model generates the identification result according to the changing trend standard.

4. The system for monitoring multiple abnormal events as described in claim 3, wherein the first machine learning model is an autoencoder model.

5. The system for monitoring multiple abnormal events as described in claim 1, wherein one of the plurality of different trend types is a non-fixed trend type, and the non-fixed trend type is defined as the fluctuation of the indicator data over time changing by an unexpectedly large amount at unpredictable times.

6. The system for monitoring multiple abnormal events as described in claim 5, wherein the trend monitoring module corresponding to the non-fixed trend type is built by training a second machine learning model on a plurality of past indicator data of the non-fixed trend type to generate the changing trend standard, and the second machine learning model generates the identification result according to the changing trend standard.

7. The system for monitoring multiple abnormal events as described in claim 6, wherein the second machine learning model is an isolation forest model.

8. The system for monitoring multiple abnormal events as described in claim 1, wherein one of the plurality of different trend types is a multiple non-fixed trend type, and the multiple non-fixed trend type is defined as a plurality of indicator data, corresponding to different performances or capabilities of multiple pieces of software or hardware, whose respective fluctuations over time change by an unexpectedly large amount at unpredictable times.

9. The system for monitoring multiple abnormal events as described in claim 8, wherein the trend monitoring module corresponding to the multiple non-fixed trend type generates an intermediate result from the plurality of indicator data of the same kind of performance or capability within the multiple non-fixed trend type using a single shared third machine learning model, and then generates the identification result from the intermediate result using at least one explanation factor of an inference explanation algorithm; the third machine learning model is built by training on a plurality of past indicator data of the same kind of performance or capability within the multiple non-fixed trend type and generates an intermediate standard; the third machine learning model generates the intermediate result according to the intermediate standard; and the intermediate standard together with the at least one explanation factor constitutes the changing trend standard.

10. The system for monitoring multiple abnormal events as described in claim 9, wherein the third machine learning model is an isolation forest algorithm model.

11. The system for monitoring multiple abnormal events as described in claim 9, wherein the inference explanation algorithm is a Shapley additive explanations algorithm.

12. The system for monitoring multiple abnormal events as described in claim 1, wherein one of the plurality of different trend types is a multiple different fixed trend type, and the multiple different fixed trend type is defined as indicator data, corresponding to the same performance or capability of multiple pieces of software or hardware, whose respective fluctuations over time follow repeating periodic variations that differ from one another.

13. The system for monitoring multiple abnormal events as described in claim 12, wherein the trend monitoring module corresponding to the multiple different fixed trend type applies a clustering algorithm to the plurality of indicator data of the multiple different fixed trend type to generate a plurality of clustering results, all of the indicator data belonging to each clustering result are then processed by a respective fourth machine learning model to generate a respective identification result, and each fourth machine learning model is built by training on the plurality of past indicator data corresponding to its clustering result.

14. The system for monitoring multiple abnormal events as described in claim 13, wherein the clustering algorithm is a K-means clustering algorithm.

15. The system for monitoring multiple abnormal events as described in claim 13, wherein the fourth machine learning model is an autoencoder model.
TW112214145U 2023-12-25 2023-12-25 System for monitoring multiple abnormal events TWM655004U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW112214145U TWM655004U (en) 2023-12-25 2023-12-25 System for monitoring multiple abnormal events

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW112214145U TWM655004U (en) 2023-12-25 2023-12-25 System for monitoring multiple abnormal events

Publications (1)

Publication Number Publication Date
TWM655004U true TWM655004U (en) 2024-05-01

Family

ID=92074546

Family Applications (1)

Application Number Title Priority Date Filing Date
TW112214145U TWM655004U (en) 2023-12-25 2023-12-25 System for monitoring multiple abnormal events

Country Status (1)

Country Link
TW (1) TWM655004U (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120572548A (en) * 2025-08-01 2025-09-02 南京深度智控科技有限公司 Distribution room operation and maintenance robot based on semantic analysis model

Similar Documents

Publication Publication Date Title
CN114205211B (en) System and method for fault diagnosis using fault trees
CN112162878B (en) Database fault discovery method and device, electronic equipment and storage medium
CN111712813B (en) Intelligent Preprocessing of Multidimensional Time Series Data
CN109634818A (en) Log analysis method, system, terminal and computer readable storage medium
EP4091110A1 (en) Systems and methods for distributed incident classification and routing
CN111352794B (en) Abnormality detection method, abnormality detection device, computer device, and storage medium
CN107704387B (en) Method, device, electronic equipment and computer readable medium for system early warning
CN114610561A (en) System monitoring method, apparatus, electronic device, and computer-readable storage medium
CN108234199A (en) Monitoring method, apparatus and system based on Kafka
CN112052134A (en) Method and device for monitoring service data
CN117041029A (en) Network equipment fault processing method and device, electronic equipment and storage medium
CN118796950A (en) An information collection system based on big data
TWM655004U (en) System for monitoring multiple abnormal events
US10936401B2 (en) Device operation anomaly identification and reporting system
CN116541728B (en) Fault diagnosis method and device based on density clustering
CN106961358A (en) Web application system cluster method for monitoring operation states and its system based on daily record
CN117354206A (en) Method, device, system and medium for monitoring API (application program interface)
CN120321102B (en) Alarm intelligent preprocessing method of self-adaptive rule engine
CN113992496B (en) Abnormal alarm method and device based on quartile algorithm and computing equipment
CN115118574A (en) A data processing method, device and storage medium
US20250200073A1 (en) Implementing Large Language Models to Extract Customized Insights from Log Data
CN119477266A (en) Equipment operation and maintenance method and system with alarm notification
CN118819994A (en) Abnormal detection method and device for big data integrated host
CN116016288A (en) Flow monitoring method, device, equipment and storage medium of industrial equipment
CN115587717A (en) Data quality detection method, device, storage medium and equipment