JP2010009313A

JP2010009313A - Fault sign detection device

Info

Publication number: JP2010009313A
Application number: JP2008167805A
Authority: JP
Inventors: Nobuyuki Takai; 伸之高井; Hiroshi Takano; 啓高野
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2008-06-26
Filing date: 2008-06-26
Publication date: 2010-01-14

Abstract

Problem: To provide a failure sign detection device capable of quickly detecting signs of failure.
Solution: A failure sign detection device according to the present invention detects signs of failure in an information system and includes, for example: a monitoring target measurement value database 5 that periodically stores the actual measurement value of each measurement item; a failure singularity database 3 that stores, for each past failure case, the failure-time measurement value of each measurement item recorded when that failure occurred; a notification type threshold database 4 that stores thresholds used as the criterion for deciding whether a failure sign is detected; and a failure monitoring server unit 1 that, by evaluating a predetermined formula for each failure case, quantifies as a single value the per-item differences between the actual measurement value and the failure-time measurement value of the same measurement item, and then compares that single value with a threshold to decide whether a failure sign is detected.
Selected drawing: Figure 1

Description

The present invention relates to a failure sign detection device that detects signs of failure in an information system.

In a conventional failure monitoring scheme, one or more measurement items (for example, the CPU load of a router) are set in order to monitor a target system, and for each measurement item a range regarded as normal, or a threshold, is set on the basis of past experience. A monitoring device periodically measures the monitored system and checks whether each measured value lies within the range regarded as normal, or whether it exceeds the threshold. For example, if a measured value falls outside the range regarded as normal or exceeds the threshold, the monitoring device determines that a failure has occurred or may occur, and notifies the monitoring and maintenance staff. Such a failure monitoring scheme is disclosed in Patent Document 1 below.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2006-099249 (JP 2006-099249 A)

However, with the conventional technique described above, the monitoring and maintenance staff can tell that a measurement item has left the range regarded as normal or has exceeded the threshold, but cannot conclude from that alone that a failure has occurred. The staff must also check measurement items other than the one that triggered the notification, as well as compound factors, and judge from experience and knowledge whether a failure exists. The conventional technique therefore has the problem that failures may not be detected quickly.

The present invention has been made in view of the above, and its object is to obtain a failure sign detection device capable of detecting signs of failure more quickly.

To solve the problem described above and achieve the object, the present invention is a failure sign detection device that detects signs of failure in an information system, comprising: actual measurement value storage means for periodically storing, for each measurement item, an actual measurement value, which is the measured value of a measurement item used to monitor the state of the information system; failure-time measurement value storage means for storing, for each past failure case, a failure-time measurement value for each measurement item, which is the measured value of that measurement item when the past failure occurred; threshold storage means for storing a threshold that serves as the criterion for deciding whether a failure sign is detected; and failure monitoring means for quantifying as a single value, by evaluating a predetermined formula for each failure case, the per-item differences between the actual measurement value and the failure-time measurement value of the same measurement item, and for deciding whether a failure sign is detected by comparing that single value with the threshold.

According to the present invention, signs of failure can be detected without relying on the experience and knowledge of the monitoring and maintenance staff.

An embodiment of the failure sign detection device according to the present invention is described in detail below with reference to the drawings. The invention is not limited to this embodiment.

Embodiment 1.
FIG. 1 is a diagram showing a configuration example of the failure sign detection device of Embodiment 1. The failure sign detection device includes a failure monitoring server unit 1, a failure monitoring terminal unit 2 operated by the monitoring and maintenance staff, a failure singularity database 3, a notification type threshold database 4, and a monitoring target measurement value database 5.

The failure monitoring server unit 1 is the device that performs failure monitoring of the monitored system. Specifically, it monitors (measures) the monitored system according to the measurement items received from the failure monitoring terminal unit 2 and registers the measured values in the monitoring target measurement value database 5. It also calculates the distance between the measured values and each failure singularity registered in the failure singularity database 3. It then compares each calculated distance with the thresholds registered in the notification type threshold database 4 and, if the distance is less than or equal to a registered threshold, notifies the monitoring and maintenance staff through the failure monitoring terminal unit 2 that a sign of failure in the monitored system has been detected. The failure monitoring server unit 1 can also read and update the contents of the failure singularity database 3, the notification type threshold database 4, and the monitoring target measurement value database 5.

Here, the "distance" is the difference between the measured values and a failure singularity, and a "failure singularity" is the combination of measured values recorded when a failure occurred. For example, if the measurement items are X and Y, the combination (X1, Y1) of the value X1 of X and the value Y1 of Y measured when a failure occurred in the past is a failure singularity. In this embodiment, the failure sign detection device measures two items, for example the Ping response and the traffic load between key sites of a network system. In what follows, the Ping response value is denoted "X" and the traffic load value "Y". Two measurement items are used here for convenience of explanation, but the number is not limited to two; there may be one, or three or more.

The failure monitoring terminal unit 2 is used to enter the items to be measured by the failure monitoring server unit 1 and other settings and, once measurement has started, displays notifications from the failure monitoring server unit 1. The failure singularity database 3 is a database in which failure singularities are registered. The notification type threshold database 4 is a database in which the thresholds to be compared with the distance are registered for each notification type. The monitoring target measurement value database 5 is a database in which the values measured by the failure monitoring server unit 1 are registered.

When monitoring the target system, the failure sign detection device also allows, in addition to the measurement items, the measurement cycle, the monitoring method (that is, how the measured value is determined, for example by measuring several times and taking the average, or by taking the maximum or minimum), and the distance calculation method to be set freely.
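The ways of determining a measured value listed above can be sketched as a small helper. This is only an illustration under assumptions, not the device's implementation: the function name `determine_measurement` and the method labels `"single"`, `"average"`, `"max"`, and `"min"` are names introduced here.

```python
from statistics import mean


def determine_measurement(samples, method="single"):
    """Reduce one or more raw samples of a measurement item to one measured value.

    The `method` labels are hypothetical names for the options in the description.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if method == "single":
        return samples[0]       # use the result of one measurement as is
    if method == "average":
        return mean(samples)    # measure several times and take the average
    if method == "max":
        return max(samples)     # take the maximum of the samples
    if method == "min":
        return min(samples)     # take the minimum of the samples
    raise ValueError(f"unknown method: {method}")
```

For instance, `determine_measurement([100, 150, 200], "average")` reduces three Ping response samples to a single value before any distance is calculated.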

Next, the failure monitoring process is described in detail with reference to FIG. 2, which is a flowchart of the failure monitoring process of this embodiment. First, as preparation before monitoring (measurement) starts, the failure monitoring server unit 1 sets the measurement conditions and related settings based on input from the failure monitoring terminal unit 2 (step S1). The measurement conditions to be set are, for example, the measurement items, the measurement cycle, the monitoring method, the distance calculation method, the failure singularities (the measured values at the time failures occurred in the past), and the threshold for each notification type. In this embodiment, as an example, the measurement cycle is 5 minutes and, as the monitoring method, the result of a single measurement is used as is. The distance is calculated by the least squares method.

In step S1, the failure monitoring server unit 1 registers the failure singularities in the failure singularity database 3. Here, three failure singularities (X, Y) are registered: Z1 (100, 150), Z2 (400, 400), and Z3 (700, 100). This means that the Ping response value X and the traffic load value Y measured when failures occurred in the past were (100, 150), (400, 400), and (700, 100), respectively.
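As a rough sketch, the registration in step S1 can be pictured as filling an in-memory stand-in for the failure singularity database 3. The `FailureSingularity` class, the `failure_singularity_db` dictionary, and the `register` helper are hypothetical names; the description does not say how the database is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureSingularity:
    """Measured values recorded when a past failure occurred (hypothetical type)."""
    ping_response: float   # X
    traffic_load: float    # Y


# In-memory stand-in for the failure singularity database 3 (an assumption).
failure_singularity_db = {}


def register(name, ping_response, traffic_load):
    """Register one past failure case under its label."""
    failure_singularity_db[name] = FailureSingularity(ping_response, traffic_load)


# The three failure singularities of this embodiment.
register("Z1", 100, 150)
register("Z2", 400, 400)
register("Z3", 700, 100)

# Number of registered failure singularities.
N = len(failure_singularity_db)
```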

Also in step S1, the failure monitoring server unit 1 registers the threshold for each notification type in the notification type threshold database 4. Here, as shown in FIG. 1, thresholds common to the three failure singularities are used: the caution notification threshold, which determines whether a caution notification is issued, is set to "100", and the failure warning notification threshold, which determines whether a failure warning notification is issued, is set to "45". Although thresholds common to the three failure singularities are registered in this embodiment as an example, thresholds may instead be registered for each failure singularity. When the failure monitoring server unit 1 calculates the distance between the measured values and a failure singularity and the distance is greater than "100", it determines that no sign of failure in the monitored system has been detected and issues no notification. If the distance is "100" or less and greater than "45", it sends a caution notification to the failure monitoring terminal unit 2. If the distance is "45" or less, it sends a failure warning notification to the failure monitoring terminal unit 2. Two notification types are used here, but the number is not limited to two; there may be one, or three or more.
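The three distance ranges described above amount to a simple classification. The following is a minimal sketch, assuming the two thresholds of this embodiment; the function name `notification_type` and the return labels `"caution"` and `"failure_warning"` are illustrative names, not terms defined by the device.

```python
# Thresholds shared by all failure singularities in this embodiment.
CAUTION_THRESHOLD = 100
FAILURE_WARNING_THRESHOLD = 45


def notification_type(distance,
                      caution=CAUTION_THRESHOLD,
                      warning=FAILURE_WARNING_THRESHOLD):
    """Return the notification to send for a distance, or None for no notification."""
    if distance <= warning:
        return "failure_warning"   # distance is 45 or less
    if distance <= caution:
        return "caution"           # distance is 100 or less and greater than 45
    return None                    # distance is greater than 100
```

Because the thresholds are parameters, thresholds registered per failure singularity could be passed in for each comparison instead of the shared defaults.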

FIG. 3 is a diagram in which the failure singularities registered in the failure singularity database 3 are plotted. For the three failure cases that occurred in the past, the value of each measurement item at the time of the failure is plotted on a two-dimensional X-Y graph. The three registered failure singularities are labeled Z1, Z2, and Z3. Two circles are centered on each failure singularity: the outer circle, of radius 100, represents the caution notification threshold, and the inner circle, of radius 45, represents the failure warning notification threshold. These correspond to the thresholds registered in the notification type threshold database 4.

After the preparation in step S1 is complete, the failure monitoring server unit 1 sets the number of failure singularities. Since three failure singularities are registered in this embodiment, the number of failure singularities N is set to "3" (step S2).

Next, upon receiving a measurement start instruction from the failure monitoring terminal unit 2, the failure monitoring server unit 1 starts the first measurement of each measurement item (step S3) and registers the measured values in the monitoring target measurement value database 5. In this embodiment, as an example, the first measurement starts at 13:00 on February 22 (step S3). If the first measurement yields, for example, a Ping response value of "100" and a traffic load value of "400" between key sites of the network system, the failure monitoring server unit 1 registers these values in the monitoring target measurement value database 5 and sets the failure singularity index I to 1 (step S4).

Next, since the value of I is not greater than the number of failure singularities N (= 3) set in step S2 (step S5: No), the failure monitoring server unit 1 calculates the distance between the measured values and the failure singularity Z1 and checks whether the distance is less than or equal to the caution notification threshold registered in the notification type threshold database 4 (step S6). The distance between the measured values and a failure singularity is calculated by the least squares method as
"square root of ((measured Ping response value − Ping response value of the failure singularity)² + (measured traffic load value − traffic load value of the failure singularity)²)",
which is the Euclidean distance between the two points. Specifically, the distance between the measured values (100, 400) and the failure singularity Z1 (100, 150) is
"square root of ((100 − 100)² + (400 − 150)²)" = "250".
Comparing the distance "250" with the caution notification threshold "100", the distance is larger (step S6: No). The condition for issuing a caution notification to the monitoring and maintenance staff is therefore not met, so the failure monitoring server unit 1 adds 1 to the value of I (= 1) (step S10) and returns to step S5.
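The distance calculation used in step S6 can be sketched in a few lines. This is an illustrative helper, not the device's code: the function name `distance` is an assumption, and it is written for any number of measurement items because the description allows one, two, or more.

```python
import math


def distance(measured, singularity):
    """Square root of the summed squared per-item differences.

    Both arguments are sequences with one entry per measurement item,
    for example (Ping response value, traffic load value).
    """
    if len(measured) != len(singularity):
        raise ValueError("both points need the same measurement items")
    return math.sqrt(sum((m - s) ** 2 for m, s in zip(measured, singularity)))


# First measurement (100, 400) against failure singularity Z1 (100, 150).
d = distance((100, 400), (100, 150))
```

With these inputs the per-item differences are 0 and 250, so `d` matches the distance of 250 worked out in the text.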

Next, since the value of I is 2, which is not greater than N (= 3) (step S5: No), the failure monitoring server unit 1 calculates the distance between the measured values and the failure singularity Z2 and checks whether it is less than or equal to the caution notification threshold (step S6). As above, the distance between the measured values (100, 400) and the failure singularity Z2 (400, 400) is
"square root of ((100 − 400)² + (400 − 400)²)" = "300".
Comparing the distance "300" with the caution notification threshold "100", the distance is larger (step S6: No). As before, the condition for issuing a caution notification to the monitoring and maintenance staff is not met, so the failure monitoring server unit 1 adds 1 to the value of I (= 2) (step S10) and returns to step S5.

Next, since the value of I is 3, which is not greater than N (= 3) (step S5: No), the failure monitoring server unit 1 calculates the distance between the measured values and the failure singularity Z3 and checks whether it is less than or equal to the caution notification threshold (step S6). As above, the distance between the measured values (100, 400) and the failure singularity Z3 (700, 100) is
"square root of ((100 − 700)² + (400 − 100)²)" = "670.8".
Comparing the distance "670.8" with the caution notification threshold "100", the distance is larger (step S6: No). As before, the condition for issuing a caution notification to the monitoring and maintenance staff is not met, so the failure monitoring server unit 1 adds 1 to the value of I (= 3) (step S10) and returns to step S5.

Next, since the value of I is 4, which is greater than N (= 3) (step S5: Yes), the failure monitoring server unit 1 waits until the next measurement time (step S11). Because the measurement cycle is set to 5 minutes in this embodiment, the second measurement is performed 5 minutes after the first (step S3). Specifically, the failure monitoring server unit 1 performs the second measurement at 13:05 on February 22 (step S3). If this yields, for example, a Ping response value of "150" and a traffic load value of "300" between key sites of the network system, the failure monitoring server unit 1 registers these values in the monitoring target measurement value database 5 and resets the value of I to "1" (step S4).

Next, since the value of I is 1, which is not greater than N (= 3) (step S5: No), the failure monitoring server unit 1, as in the first measurement, calculates the distance between the measured values and the failure singularity and checks whether it is less than or equal to the caution notification threshold (step S6). As above, the distance between the measured values (150, 300) and the failure singularity Z1 (100, 150) is
"square root of ((150 − 100)² + (300 − 150)²)" = "158.1".
Comparing the distance "158.1" with the caution notification threshold "100", the distance is larger (step S6: No). The condition for issuing a caution notification to the monitoring and maintenance staff is therefore not met, so the failure monitoring server unit 1 adds 1 to the value of I (= 1) (step S10) and returns to step S5.

Next, since the value of I is 2, which is not greater than N (= 3) (step S5: No), the failure monitoring server unit 1 calculates the distance between the measured values and the failure singularity Z2 and checks whether it is less than or equal to the caution notification threshold (step S6). As above, the distance between the measured values (150, 300) and the failure singularity Z2 (400, 400) is
"square root of ((150 − 400)² + (300 − 400)²)" = "269.3".
Comparing the distance "269.3" with the caution notification threshold "100", the distance is larger (step S6: No). The condition for issuing a caution notification to the monitoring and maintenance staff is therefore not met, so the failure monitoring server unit 1 adds 1 to the value of I (= 2) (step S10) and returns to step S5.

Next, since the value of I is 3, which is not greater than N (= 3) (step S5: No), the failure monitoring server unit 1 calculates the distance between the measured values and the failure singularity Z3 and checks whether it is less than or equal to the caution notification threshold (step S6). As above, the distance between the measured values (150, 300) and the failure singularity Z3 (700, 100) is
"square root of ((150 − 700)² + (300 − 100)²)" = "585.2".
Comparing the distance "585.2" with the caution notification threshold "100", the distance is larger (step S6: No). The condition for issuing a caution notification to the monitoring and maintenance staff is therefore not met, so the failure monitoring server unit 1 adds 1 to the value of I (= 3) (step S10) and returns to step S5.

Next, since the value of I is 4, which is greater than N (= 3) (step S5: Yes), the failure monitoring server unit 1 waits until the next measurement time (step S11) and performs the third measurement 5 minutes after the second (step S3). Specifically, the failure monitoring server unit 1 performs the third measurement at 13:10 on February 22 (step S3). If this yields, for example, a Ping response value of "360" and a traffic load value of "360" between key sites of the network system, the failure monitoring server unit 1 registers these values in the monitoring target measurement value database 5 and resets the value of I to "1" (step S4).

Next, since the value of I is 1, which is not greater than N (= 3) (step S5: No), the failure monitoring server unit 1 calculates the distance between the measured values and the failure singularity and checks whether it is less than or equal to the caution notification threshold (step S6). The distance between the measured values (360, 360) and the failure singularity Z1 (100, 150) is
"square root of ((360 − 100)² + (360 − 150)²)" = "334.2".
Comparing the distance "334.2" with the caution notification threshold "100", the distance is larger (step S6: No). The condition for issuing a caution notification to the monitoring and maintenance staff is therefore not met, so the failure monitoring server unit 1 adds 1 to the value of I (= 1) (step S10) and returns to step S5.

Next, since the value of I is 2, which is not greater than N (= 3) (step S5: No), the failure monitoring server unit 1 calculates the distance between the measured values and the failure singularity Z2 and checks whether it is less than or equal to the caution notification threshold (step S6). As above, the distance between the measured values (360, 360) and the failure singularity Z2 (400, 400) is
"square root of ((360 − 400)² + (360 − 400)²)" = "56.6".
Comparing the distance "56.6" with the caution notification threshold "100", the distance is smaller (step S6: Yes). The failure monitoring server unit 1 therefore next checks whether the distance is less than or equal to the failure warning notification threshold registered in the notification type threshold database 4 (step S7). Comparing the distance "56.6" with the failure warning notification threshold "45", the distance is larger (step S7: No). The failure monitoring server unit 1 therefore determines that the condition for a caution notification is met and sends a caution notification to the failure monitoring terminal unit 2 (step S8). The failure monitoring terminal unit 2 displays the received caution notification to inform the monitoring and maintenance staff of the state of the monitored system. If, in the comparison of step S7, the distance had been less than or equal to the failure warning notification threshold "45" (step S7: Yes), the failure monitoring server unit 1 would determine that the condition for a failure warning notification is met and send a failure warning notification to the failure monitoring terminal unit 2 (step S9). After sending the caution notification in step S8, the failure monitoring server unit 1 adds 1 to the value of I (= 2) (step S10) and returns to step S5.

Next, since the value of I is 3, which is not greater than N (= 3) (step S5: No), the failure monitoring server unit 1 calculates the distance between the measured values and the failure singularity Z3 and checks whether it is less than or equal to the caution notification threshold (step S6). As above, the distance between the measured values (360, 360) and the failure singularity Z3 (700, 100) is
"square root of ((360 − 700)² + (360 − 100)²)" = "428.0".
Comparing the distance "428.0" with the caution notification threshold "100", the distance is larger (step S6: No). The condition for issuing a caution notification to the monitoring and maintenance staff is therefore not met, so the failure monitoring server unit 1 adds 1 to the value of I (= 3) (step S10) and returns to step S5.

Next, since the value of I is 4, which is greater than N (= 3) (step S5: Yes), the failure monitoring server unit 1 waits until the next measurement time (step S11) and performs the fourth measurement 5 minutes after the third measurement (step S3).

Thereafter, the failure monitoring server unit 1 repeats the processing of steps S3 to S12 until it receives a measurement stop instruction from the failure monitoring terminal unit 2.
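One pass over the failure singularities (steps S5 to S10) can be sketched as follows. The function name `check_measurement` and the returned label strings are assumptions made for illustration; the thresholds and coordinates come from the example, and the waiting step (S11) and the stop instruction are left out.

```python
import math

ATTENTION_THRESHOLD = 100   # attention notification threshold from the example
WARNING_THRESHOLD = 45      # failure warning notification threshold from the example

SINGULAR_POINTS = [(100, 150), (400, 400), (700, 100)]  # Z1, Z2, Z3

def check_measurement(measured):
    """Compare one measurement with every failure singularity (I = 1 .. N)."""
    notifications = []
    for i, (zx, zy) in enumerate(SINGULAR_POINTS, start=1):
        d = math.hypot(measured[0] - zx, measured[1] - zy)
        if d > ATTENTION_THRESHOLD:      # step S6: No, move on to the next I
            continue
        if d <= WARNING_THRESHOLD:       # step S7: Yes, step S9
            notifications.append((f"Z{i}", "failure warning"))
        else:                            # step S7: No, step S8
            notifications.append((f"Z{i}", "attention"))
    return notifications

for m in [(100, 400), (150, 300), (360, 360)]:
    print(m, check_measurement(m))
```

Only the third measurement produces a notification, an attention notification for Z2, which matches the walkthrough above.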

FIG. 4 is a diagram in which the failure singularities and the measured values are plotted. The first and second measured values lie outside the attention notification threshold circle of every failure singularity. The third measured value, however, lies inside the attention notification threshold circle of the failure singularity Z2 and outside its failure warning notification threshold circle.

As described above, in the present embodiment the failure sign detection device calculates the distance between a failure singularity and the measured value and, when the obtained distance is equal to or less than a set threshold, determines that a sign of failure in the monitored system has been detected. This makes it possible to judge failure signs quickly without relying on the experience and knowledge of the monitoring maintenance personnel.

Although the distance is calculated with the same least-squares formula (the square root of the sum of the squared differences) for all three failure singularities, a different formula may be used for each of them. When the formula is changed for each failure singularity, a formula other than the least-squares formula may be used.

The items registered in the advance preparation for the failure monitoring process (step S1) can be added, changed, or deleted even while monitoring is in progress, either manually or by a predefined determination process. For example, adding measurement items or failure singularities, or changing the measurement cycle, according to how failures are occurring in the monitored system can raise the probability of detecting a failure sign, and monitoring can continue to follow the failure tendency even when that tendency changes. A failure singularity may also be registered by designating a measured value stored in the monitoring target measurement value database 5 and registering the designated value in the failure singularity database 3.

In the second and subsequent embodiments, only the differences are described, on the premise of what was described in the first embodiment.

Embodiment 2.
In the present embodiment, the measured values are weighted when the distance is obtained. FIG. 5 is a diagram illustrating a configuration example of the failure sign detection device according to the second embodiment. This system includes a failure monitoring server unit 1a, a failure monitoring terminal unit 2 operated by the monitoring maintenance personnel, a failure singularity database 3a, a notification type threshold value database 4, and a monitoring target measurement value database 5. Components identical to those of the first embodiment are given the same reference numerals, and their description is omitted.

The failure monitoring server unit 1a differs from the failure monitoring server unit 1 described above in that it weights the measured values when calculating the distance between the measured value and a failure singularity. The failure singularity database 3a is a failure history database in which failure singularities are registered, and a weighting coefficient can also be registered for each failure singularity. Here, the weighting coefficients applied to X and Y are denoted A and B, respectively. A “weighting coefficient” is a coefficient that the failure monitoring server unit 1a applies when calculating the distance between the measured value and a failure singularity. Even when the measured values X and Y differ in unit or in their degree of influence on a failure, giving X and Y weights that reflect the unit and the degree of influence enables more accurate failure monitoring. In the present embodiment, the distance is calculated as
“Square root of ((measured Ping response value − Ping response value of the failure singularity)² × weighting coefficient A + (measured traffic load value − traffic load value of the failure singularity)² × weighting coefficient B)”
Hereinafter, the combination of weighting coefficients of each failure singularity is written (A, B).
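The weighted formula translates directly into code. The following is a minimal sketch; the function name `weighted_distance` and its argument layout are assumptions, not part of the described device.

```python
import math

def weighted_distance(measured, singular, weights):
    """sqrt(A*(x - zx)^2 + B*(y - zy)^2), with the weights applied after squaring."""
    (x, y), (zx, zy), (a, b) = measured, singular, weights
    return math.sqrt(a * (x - zx) ** 2 + b * (y - zy) ** 2)

# With A = B = 1 the formula reduces to the unweighted distance of Embodiment 1.
print(weighted_distance((100, 400), (100, 150), (1, 1)))  # 250.0
```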

Next, the failure monitoring process of the second embodiment is described with reference to FIG. 2. Only the processing that differs from the first embodiment is described.

First, as advance preparation before monitoring (measurement) starts, the failure monitoring server unit 1a sets the measurement conditions and the like based on input from the failure monitoring terminal unit 2 (step S1). At this time, the failure monitoring server unit 1a registers, in the failure singularity database 3a, the weighting coefficients corresponding to the X and Y values of each failure singularity. Specifically, it registers a weighting coefficient A (for X) of “1” and a weighting coefficient B (for Y) of “2” for the failure singularity Z1, A of “1” and B of “0.2” for the failure singularity Z2, and A of “1” and B of “5” for the failure singularity Z3.

The failure monitoring server unit 1a then performs the first measurement in the same procedure as in the first embodiment and calculates the distance from the measured value, the failure singularity Z1, and the weighting coefficients. It then checks whether the obtained distance is equal to or less than the attention notification threshold registered in the notification type threshold value database 4 (step S6). Specifically, using the first measured value (100, 400), the failure singularity Z1 (100, 150), and the weighting coefficients (1, 2) of the failure singularity Z1, the distance is calculated as follows.
“Square root of ((100−100)² × 1 + (400−150)² × 2)” = “353.6”
Comparing the obtained distance “353.6” with the attention notification threshold “100” shows that the obtained distance is larger (step S6: No), so the failure monitoring server unit 1a determines that the condition for an attention notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1a also calculates the distance as follows, using the first measured value (100, 400), the failure singularity Z2 (400, 400), and the weighting coefficients (1, 0.2) of the failure singularity Z2 (step S6).
“Square root of ((100−400)² × 1 + (400−400)² × 0.2)” = “300”
Comparing the obtained distance “300” with the attention notification threshold “100” shows that the obtained distance is larger (step S6: No), so the failure monitoring server unit 1a again determines that the condition for an attention notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1a also calculates the distance as follows, using the first measured value (100, 400), the failure singularity Z3 (700, 100), and the weighting coefficients (1, 5) of the failure singularity Z3 (step S6).
“Square root of ((100−700)² × 1 + (400−100)² × 5)” = “900”
Comparing the obtained distance “900” with the attention notification threshold “100” shows that the obtained distance is larger (step S6: No), so the failure monitoring server unit 1a again determines that the condition for an attention notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1a then performs the second measurement in the same procedure as in the first embodiment and calculates the distance as follows, using the second measured value (150, 300), the failure singularity Z1 (100, 150), and the weighting coefficients (1, 2) of the failure singularity Z1 (step S6).
“Square root of ((150−100)² × 1 + (300−150)² × 2)” = “217.9”
Comparing the obtained distance “217.9” with the attention notification threshold “100” shows that the obtained distance is larger (step S6: No), so the failure monitoring server unit 1a determines that the condition for an attention notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1a also calculates the distance as follows, using the second measured value (150, 300), the failure singularity Z2 (400, 400), and the weighting coefficients (1, 0.2) of the failure singularity Z2 (step S6).
“Square root of ((150−400)² × 1 + (300−400)² × 0.2)” = “254.0”
Comparing the obtained distance “254.0” with the attention notification threshold “100” shows that the obtained distance is larger (step S6: No), so the failure monitoring server unit 1a determines that the condition for an attention notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1a also calculates the distance as follows, using the second measured value (150, 300), the failure singularity Z3 (700, 100), and the weighting coefficients (1, 5) of the failure singularity Z3 (step S6).
“Square root of ((150−700)² × 1 + (300−100)² × 5)” = “708.9”
Comparing the obtained distance “708.9” with the attention notification threshold “100” shows that the obtained distance is larger (step S6: No), so the failure monitoring server unit 1a determines that the condition for an attention notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1a then performs the third measurement in the same procedure as in the first embodiment and calculates the distance as follows, using the third measured value (360, 360), the failure singularity Z1 (100, 150), and the weighting coefficients (1, 2) of the failure singularity Z1 (step S6).
“Square root of ((360−100)² × 1 + (360−150)² × 2)” = “394.7”
Comparing the obtained distance “394.7” with the attention notification threshold “100” shows that the obtained distance is larger (step S6: No), so the failure monitoring server unit 1a determines that the condition for an attention notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1a also calculates the distance as follows, using the third measured value (360, 360), the failure singularity Z2 (400, 400), and the weighting coefficients (1, 0.2) of the failure singularity Z2 (step S6).
“Square root of ((360−400)² × 1 + (360−400)² × 0.2)” = “43.8”
Comparing the obtained distance “43.8” with the attention notification threshold “100” shows that the obtained distance is smaller (step S6: Yes), so the failure monitoring server unit 1a next checks whether the obtained distance is equal to or less than the failure warning notification threshold registered in the notification type threshold value database 4 (step S7). Comparing the obtained distance “43.8” with the failure warning notification threshold “45” shows that the obtained distance is smaller (step S7: Yes), so the failure monitoring server unit 1a determines that the condition for a failure warning notification is satisfied.

The failure monitoring server unit 1a also calculates the distance as follows, using the third measured value (360, 360), the failure singularity Z3 (700, 100), and the weighting coefficients (1, 5) of the failure singularity Z3 (step S6).
“Square root of ((360−700)² × 1 + (360−100)² × 5)” = “673.5”
Comparing the obtained distance “673.5” with the attention notification threshold “100” shows that the obtained distance is larger (step S6: No), so the failure monitoring server unit 1a determines that the condition for an attention notification to the monitoring maintenance personnel is not met.

With the weighting described above, the third measurement result, which was reported to the monitoring maintenance personnel as an attention notification in the first embodiment, is reported as a failure warning notification in the present embodiment.
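The change in notification type for the third measurement against Z2 can be checked with a short sketch. The helper `classify` and its label strings are illustrative assumptions; the thresholds (100 and 45), the coordinates, and the weights (1, 0.2) are the ones used in the example.

```python
import math

def classify(d, attention=100, warning=45):
    """Map a distance to a notification type using the two example thresholds."""
    if d > attention:
        return None
    return "failure warning" if d <= warning else "attention"

dx, dy = 360 - 400, 360 - 400                    # third measured value minus Z2
unweighted = math.sqrt(dx**2 + dy**2)            # Embodiment 1
weighted = math.sqrt(1 * dx**2 + 0.2 * dy**2)    # Embodiment 2, weights (1, 0.2)

print(round(unweighted, 1), classify(unweighted))  # 56.6 attention
print(round(weighted, 1), classify(weighted))      # 43.8 failure warning
```

Down-weighting Y shrinks the distance from 56.6 to 43.8, which moves it below the failure warning notification threshold of 45.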

As described above, in the present embodiment the failure sign detection device weights each measurement item when calculating the distance. This enables more accurate failure monitoring even when the measured values differ in unit or in their degree of influence on a failure.

Here, the difference between the measured value and the failure singularity is squared and then multiplied by the weighting coefficient; however, other weightings are also possible, such as multiplying by or adding the weighting coefficient before squaring.
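To illustrate how the placement of the coefficient matters, the sketch below compares the form used in this embodiment with one possible "multiply before squaring" variant. The text does not give a formula for the variant, so `weight_before_square` is an assumption, as are both function names.

```python
import math

def weight_after_square(dx, dy, a, b):
    """Form used in this embodiment: sqrt(A*dx^2 + B*dy^2)."""
    return math.sqrt(a * dx**2 + b * dy**2)

def weight_before_square(dx, dy, a, b):
    """Assumed alternative: sqrt((A*dx)^2 + (B*dy)^2)."""
    return math.hypot(a * dx, b * dy)

dx, dy = 360 - 700, 360 - 100   # third measured value minus Z3, weights (1, 5)
print(round(weight_after_square(dx, dy, 1, 5), 1))   # 673.5
print(round(weight_before_square(dx, dy, 1, 5), 1))  # 1343.7
```

Under this assumed variant the two forms give the same distance when the pre-square coefficients are the square roots of the post-square ones, so the choice mainly affects how the registered coefficients are interpreted.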

As in the first embodiment, the items registered in the advance preparation for the failure monitoring process (step S1) can be added, changed, or deleted even while monitoring is in progress, either manually or by a predefined determination process. For example, changing the weighting coefficients according to how failures are occurring in the monitored system can raise the probability of detecting a failure sign, and monitoring can continue to follow the failure tendency even when that tendency changes.

Embodiment 3.
In the first and second embodiments, whether the current state of the monitored system (the measured value) approximates the state at the time a failure occurred was determined from the difference between the measured value and the failure singularity. In the present embodiment, it is determined whether the current state approximates the state a predetermined time before a failure occurred.

In the present embodiment, a distance is calculated between the measured value at the time a failure occurred in the past (the failure singularity) and the measured value taken a predetermined time before that failure, and this distance is registered as a threshold. When the distance between the current measured value of the monitored system and the failure singularity is smaller than the registered threshold, the device determines that the system is in the state it was in the predetermined time before the past failure and notifies the monitoring maintenance personnel that a failure may occur after the predetermined time has elapsed.

FIG. 6 is a diagram illustrating a configuration example of the failure sign detection device according to the third embodiment. This system includes a failure monitoring server unit 1b, a failure monitoring terminal unit 2 operated by the monitoring maintenance personnel, a failure singularity database 3, a notification type threshold value database 4a, and a monitoring target measurement value database 5. Components identical to those of the first embodiment are given the same reference numerals, and their description is omitted.

The failure monitoring server unit 1b differs from the failure monitoring server unit 1 described above in that it reports monitoring results according to the thresholds registered in the notification type threshold value database 4a. The notification type threshold value database 4a stores, as a threshold, the distance between the measured value at the time of a past failure (the failure singularity) and the measured value taken a predetermined time before that failure.

Next, the failure monitoring process of the third embodiment is described with reference to FIG. 7. FIG. 7 is a flowchart showing the failure monitoring process of the present embodiment. Only the processing that differs from the first embodiment is described.

First, as advance preparation before monitoring (measurement) starts, the failure monitoring server unit 1b sets the measurement conditions and the like based on input from the failure monitoring terminal unit 2 (step S1). At this time, the failure monitoring server unit 1b registers, in the notification type threshold value database 4a, the distance between the measured value at the time of a past failure (the failure singularity) and the measured value taken a predetermined time before that failure, as the threshold for each notification type. In the present embodiment, the predetermined times are 60 minutes and 30 minutes, and two notification types are registered: a 60-minutes-before-failure notification and a 30-minutes-before-failure notification.

Because measurements are taken at 5-minute intervals, the monitoring target measurement value database 5 also holds the measured values taken 60 minutes and 30 minutes before the time each past failure occurred (not shown). The failure monitoring server unit 1b therefore calculates the distance between the measured value 60 minutes before the failure and the failure singularity, and the distance between the measured value 30 minutes before the failure and the failure singularity, and registers these distances in the notification type threshold value database 4a as the thresholds for the 60-minutes-before-failure notification and the 30-minutes-before-failure notification. Because the thresholds are calculated from specific measured values, they generally differ from one failure singularity to another.
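The threshold derivation can be sketched as below. The pre-failure measured values are not shown in the source, so the values in `history_z1` are hypothetical and were picked only so that the distances come out as round numbers; `derive_thresholds` is likewise an illustrative name.

```python
import math

def derive_thresholds(singular, history):
    """Distance from the failure singularity to each pre-failure measurement.

    history maps minutes-before-failure to the (X, Y) measured at that time.
    """
    zx, zy = singular
    return {
        minutes: math.sqrt((x - zx) ** 2 + (y - zy) ** 2)
        for minutes, (x, y) in history.items()
    }

z1 = (100, 150)
# HYPOTHETICAL pre-failure measurements; the source does not list them.
history_z1 = {60: (40, 70), 30: (73, 114)}
print(derive_thresholds(z1, history_z1))  # {60: 100.0, 30: 45.0}
```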

Here, as the thresholds corresponding to the failure singularity Z1, the threshold for deciding whether to issue the 60-minutes-before-failure notification (the 60-minutes-before-failure notification threshold) is set to “100”, and the threshold for deciding whether to issue the 30-minutes-before-failure notification (the 30-minutes-before-failure notification threshold) is set to “45”. When the calculated distance between the measured value and the failure singularity is greater than “100”, the failure monitoring server unit 1b determines that no sign of failure in the monitored system has been detected and issues no notification. When the distance is equal to or less than “100” and greater than “45”, it sends the 60-minutes-before-failure notification to the failure monitoring terminal unit 2. When the distance is equal to or less than “45”, it sends the 30-minutes-before-failure notification to the failure monitoring terminal unit 2. Although two notification types are used here, the number is not limited to two and may be one, or three or more.

As the thresholds corresponding to the failure singularity Z2, the failure monitoring server unit 1b sets the 60-minutes-before-failure notification threshold to “105” and the 30-minutes-before-failure notification threshold to “50”. As the thresholds corresponding to the failure singularity Z3, it sets the 60-minutes-before-failure notification threshold to “95” and the 30-minutes-before-failure notification threshold to “40”.
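With a separate threshold pair per failure singularity, the decision can be sketched as a table lookup followed by two comparisons. The names `THRESHOLDS` and `classify` and the label strings are assumptions; the threshold values are those registered above, and the sample distances are the Embodiment 1 distances for the third measured value (360, 360).

```python
# (60-minute threshold, 30-minute threshold) per failure singularity
THRESHOLDS = {"Z1": (100, 45), "Z2": (105, 50), "Z3": (95, 40)}

def classify(point, d):
    """Return the notification type for distance d to the given failure singularity."""
    t60, t30 = THRESHOLDS[point]
    if d > t60:
        return None  # no failure sign detected
    return "30 minutes before failure" if d <= t30 else "60 minutes before failure"

for point, d in {"Z1": 334.2, "Z2": 56.6, "Z3": 428.0}.items():
    print(point, classify(point, d))
```

For these distances only Z2 yields a result, the 60-minutes-before-failure notification, because 56.6 is at most 105 but greater than 50.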

The failure monitoring server unit 1b then performs the first measurement and the distance calculation in the same procedure as in the first embodiment. It then checks whether the obtained distance is equal to or less than the 60-minutes-before-failure notification threshold of the failure singularity Z1 registered in the notification type threshold value database 4a (step S20). The formula, the measured values, and the failure singularities are the same as in the first embodiment, and the distance obtained with this formula is “250” (hereinafter, the distances obtained in each step are the same as in the first embodiment). Comparing the obtained distance “250” with the 60-minutes-before-failure notification threshold “100” shows that the obtained distance is larger (step S20: No), so the failure monitoring server unit 1b determines that the condition for a 60-minutes-before-failure notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1b also checks whether the obtained distance is equal to or less than the 60-minutes-before-failure notification threshold of the failure singularity Z2 (step S20). Comparing the obtained distance “300” with the 60-minutes-before-failure notification threshold “105” shows that the obtained distance is larger (step S20: No), so the failure monitoring server unit 1b determines that the condition for a 60-minutes-before-failure notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1b also checks whether the obtained distance is equal to or less than the 60-minutes-before-failure notification threshold of the failure singularity Z3 (step S20). Comparing the obtained distance “670.8” with the 60-minutes-before-failure notification threshold “95” shows that the obtained distance is larger (step S20: No), so the failure monitoring server unit 1b determines that the condition for a 60-minutes-before-failure notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1b then performs the second measurement in the same procedure as in the first embodiment and checks whether the distance obtained from the second measured value is equal to or less than the 60-minutes-before-failure notification threshold of the failure singularity Z1 (step S20). Comparing the obtained distance “158.1” with the 60-minutes-before-failure notification threshold “100” shows that the obtained distance is larger (step S20: No), so the failure monitoring server unit 1b determines that the condition for a 60-minutes-before-failure notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1b also checks whether the obtained distance is equal to or less than the 60-minutes-before-failure notification threshold of the failure singularity Z2 (step S20). Comparing the obtained distance “269.3” with the 60-minutes-before-failure notification threshold “105” shows that the obtained distance is larger (step S20: No), so the failure monitoring server unit 1b determines that the condition for a 60-minutes-before-failure notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1b also checks whether the obtained distance is equal to or less than the 60-minutes-before-failure notification threshold of the failure singularity Z3 (step S20). Comparing the obtained distance “585.2” with the 60-minutes-before-failure notification threshold “95” shows that the obtained distance is larger (step S20: No), so the failure monitoring server unit 1b determines that the condition for a 60-minutes-before-failure notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1b then performs the third measurement in the same procedure as in the first embodiment and checks whether the distance obtained from the third measured value is equal to or less than the 60-minutes-before-failure notification threshold of the failure singularity Z1 (step S20). Comparing the obtained distance “334.2” with the 60-minutes-before-failure notification threshold “100” shows that the obtained distance is larger (step S20: No), so the failure monitoring server unit 1b determines that the condition for a 60-minutes-before-failure notification to the monitoring maintenance personnel is not met.

The failure monitoring server unit 1b also checks whether the obtained distance is equal to or less than the 60-minutes-before-failure notification threshold of failure singular point Z2 (step S20). Comparing the obtained distance “56.6” with the 60-minutes-before-failure notification threshold “105”, the obtained distance is smaller (step S20: Yes), so the failure monitoring server unit 1b next checks whether the obtained distance is equal to or less than the 30-minutes-before-failure notification threshold of failure singular point Z2 registered in the notification type threshold database 4a (step S21). Comparing the obtained distance “56.6” with the 30-minutes-before-failure notification threshold “50”, the obtained distance is larger (step S21: No), so the failure monitoring server unit 1b determines that the condition for a 60-minutes-before-failure notification is met and transmits a 60-minutes-before-failure notification to the failure monitoring terminal unit 2 (step S22). The failure monitoring terminal unit 2 displays the received 60-minutes-before-failure notification to inform the monitoring maintenance staff of the state of the monitored system. If, in the comparison of step S21, the obtained distance is equal to or less than the 30-minutes-before-failure notification threshold “50” (step S21: Yes), the failure monitoring server unit 1b determines that the condition for a 30-minutes-before-failure notification is met and transmits a 30-minutes-before-failure notification to the failure monitoring terminal unit 2 (step S23).
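The two-stage comparison of steps S20 to S23 can be illustrated with a minimal Python sketch. The function name `classify` and the `Notice` enumeration are assumptions introduced only for this illustration and do not appear in the specification; the thresholds and the first two distances are the values quoted for failure singular point Z2, while the third distance is a hypothetical input.

```python
from enum import Enum
from typing import Optional


class Notice(Enum):
    """Notification types (names are illustrative assumptions)."""
    BEFORE_60_MIN = "60-minutes-before-failure notification"
    BEFORE_30_MIN = "30-minutes-before-failure notification"


def classify(distance: float, threshold_60: float, threshold_30: float) -> Optional[Notice]:
    """Return the notification for one failure singular point, or None."""
    # Step S20: no notification unless the distance is at or below the 60-minute threshold.
    if distance > threshold_60:
        return None
    # Step S21: at or below the 30-minute threshold leads to step S23.
    if distance <= threshold_30:
        return Notice.BEFORE_30_MIN
    # Otherwise step S22: only the 60-minute condition is met.
    return Notice.BEFORE_60_MIN


# Failure singular point Z2: thresholds 105 (60 minutes) and 50 (30 minutes).
print(classify(269.3, 105, 50))  # distance from the second measurement
print(classify(56.6, 105, 50))   # distance from the third measurement
print(classify(40.0, 105, 50))   # hypothetical distance, not taken from the text
```

Because both comparisons use "equal to or less than", a distance exactly on a threshold is treated as meeting the corresponding condition in this sketch.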

The failure monitoring server unit 1b also checks whether the obtained distance is equal to or less than the 60-minutes-before-failure notification threshold of failure singular point Z3 (step S20). Comparing the obtained distance “428.0” with the 60-minutes-before-failure notification threshold “95”, the obtained distance is larger (step S20: No), so the failure monitoring server unit 1b determines that the condition for issuing a 60-minutes-before-failure notification to the monitoring maintenance staff is not met.

As described above, according to the present embodiment, the failure sign detection device informs the monitoring maintenance staff that a failure may occur in the monitored system after a predetermined time has elapsed. This allows a failure sign to be judged quickly without relying on the experience and knowledge of the monitoring maintenance staff.

An application example of the present embodiment is shown in FIG. 8. FIG. 8 is a diagram showing an application example of the failure sign detection device of Embodiment 3. This system includes a failure monitoring server unit 1c, a failure monitoring terminal unit 2 operated by the monitoring maintenance staff, a failure singularity database 3, a notification type threshold database 4a, a monitoring target measurement value database 5, and a maintenance contractor and affected customer database 6.

The failure monitoring server unit 1c differs from the failure monitoring server unit 1b in that it is connected to the maintenance contractor and affected customer database 6 and, when it detects a sign of a failure, contacts the maintenance contractors and customers registered in that database in addition to notifying the failure monitoring terminal unit 2. Its other functions are the same as those of the failure monitoring server unit 1b, so a detailed description is omitted. The maintenance contractor and affected customer database 6 registers, for each failure singular point, the responsible maintenance contractor and its contact information, and the affected customers and their contact information. Although FIG. 8 uses telephone numbers as the contact information, FAX numbers or e-mail addresses may be used instead. As with the other settings, the failure monitoring server unit 1c registers entries in the maintenance contractor and affected customer database 6 during the advance preparation of step S1 in the flowchart of FIG. 7.

When the failure monitoring server unit 1c determines that a 60-minutes-before-failure notification or a 30-minutes-before-failure notification is to be issued, it contacts the vendor of the corresponding maintenance contractor and the affected customers in addition to notifying the failure monitoring terminal unit 2. In this case, the failure monitoring server unit 1c contacts the vendor of the maintenance contractor and the affected customers by automatic dialing or automated voice call and requests repair of the target device. This promotes prevention of failures before they occur and advance risk countermeasures.
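The lookup performed against the maintenance contractor and affected customer database 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Contact` record, the `CONTACTS` table, the function `build_messages`, and every name and telephone number in it are hypothetical placeholders, and the actual automatic dialing or voice call is replaced by returning message strings.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Contact:
    role: str     # "contractor" or "customer"
    name: str     # placeholder name
    address: str  # telephone number, FAX number, or e-mail address


# Hypothetical contents of database 6, keyed by failure singular point.
CONTACTS: Dict[str, List[Contact]] = {
    "Z2": [
        Contact("contractor", "Contractor A", "000-0000-0001"),
        Contact("customer", "Customer X", "000-0000-0002"),
    ],
}


def build_messages(singular_point: str, notice: str) -> List[str]:
    """List every message sent when a failure sign is detected."""
    # The failure monitoring terminal unit 2 is always notified.
    messages = [f"terminal unit 2: {notice} ({singular_point})"]
    # Contractors and customers registered for this singular point are contacted as well.
    for contact in CONTACTS.get(singular_point, []):
        messages.append(
            f"{contact.role} {contact.name} <{contact.address}>: {notice} ({singular_point})"
        )
    return messages


for line in build_messages("Z2", "60-minutes-before-failure notification"):
    print(line)
```

A singular point with no registered entries still produces the terminal notification, which mirrors the behavior of the failure monitoring server unit 1b.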

Although FIG. 8 registers the maintenance contractor and customer data in a database for each failure singular point, maintenance contractors and customers may instead be linked to a failure singular point by, for example, determination processing performed by a preconfigured program.

As in Embodiment 1, each item registered in the advance preparation for failure monitoring (step S1) can be added, changed, or deleted even while failure monitoring is in progress.

As described above, the failure sign detection device according to the present invention is useful for systems that monitor failures of an information system, and is particularly suitable for making highly accurate failure sign determinations.

FIG. 1 is a diagram showing a configuration example of the failure sign detection device.
FIG. 2 is a flowchart showing the failure monitoring process.
FIG. 3 is a diagram showing failure singular points.
FIG. 4 is a diagram showing failure singular points and measured values.
FIG. 5 is a diagram showing a configuration example of the failure sign detection device.
FIG. 6 is a diagram showing a configuration example of the failure sign detection device.
FIG. 7 is a flowchart showing the failure monitoring process.
FIG. 8 is a diagram showing a configuration example of the failure sign detection device.

Explanation of symbols

1, 1a, 1b, 1c Failure monitoring server unit
2 Failure monitoring terminal unit
3, 3a Failure singularity database
4, 4a Notification type threshold database
5 Monitoring target measurement value database
6 Maintenance contractor and affected customer database

Claims

A failure sign detection device for detecting a sign of a failure in an information system, comprising:
actual measurement value storage means for periodically storing an actual measurement value for each measurement item, which is a measurement value of a measurement item for monitoring the state of the information system;
failure measurement value storage means for storing, for each past failure case, a failure measurement value for each measurement item, which is a measurement value of the measurement item at the time a past failure occurred;
threshold value storage means for storing a threshold value that is a criterion for determining whether or not a failure sign is detected; and
failure monitoring means for quantifying as a single value, by evaluating a predetermined calculation formula for each failure case, the differences for the respective measurement items, each being the difference between the actual measurement value and the failure measurement value for the same measurement item, and for determining the presence or absence of failure sign detection by comparing the single value with the threshold value.

The failure sign detection device according to claim 1, wherein a plurality of threshold values defined stepwise according to the magnitude of the possibility of a failure are stored in the threshold value storage means as the threshold value, and
the failure monitoring means notifies the magnitude of the possibility of a failure as the result of determining the presence or absence of the failure sign detection.

The failure sign detection device according to claim 2, wherein the plurality of threshold values are individually stored for each past failure case.

The failure sign detection device according to claim 1, wherein the failure monitoring means
executes, for each failure case, a calculation that quantifies as a single value the differences for the respective measurement items, each being the difference between the failure measurement value and a measurement value measured a predetermined time before the failure occurred for the same measurement item, and records the single value obtained for each failure case in the threshold value storage means as the threshold value, and
determines the presence or absence of the failure sign detection using the threshold value obtained for each failure case.

The failure sign detection device according to claim 4, wherein, when a plurality of the predetermined times are specified in stages,
the failure monitoring means records, as the threshold value, threshold values obtained stepwise for each failure case, and notifies, as the result of determining the presence or absence of the failure sign detection, that the current situation is similar to the situation a predetermined time before the occurrence of a past failure.

The failure sign detection device according to any one of claims 1 to 5, wherein the failure monitoring means periodically performs a measurement process for obtaining an actual measurement value for each measurement item, and determines the presence or absence of the failure sign detection for all past failure cases at every measurement.

The failure sign detection device according to any one of claims 1 to 6, wherein, in the process of comparing the single value with the threshold value, the failure monitoring means determines that a failure sign has been detected when the single value is smaller than the threshold value.

The failure sign detection device according to any one of claims 1 to 7, wherein a calculation by the least squares method is performed as the predetermined calculation formula.

The failure sign detection device according to any one of claims 1 to 7, wherein a different calculation formula can be specified as appropriate for each failure case as the predetermined calculation formula.

The failure sign detection device according to any one of claims 1 to 9, wherein the measurement items include a Ping response between two important points of the system and a traffic load.

The failure sign detection device according to claim 1, wherein the measurement item can be added and deleted.

The failure sign detection device according to any one of claims 1 to 11, wherein the failure measurement value can be added and deleted.

The failure sign detection device according to any one of claims 1 to 12, wherein the measurement cycle is changeable.

The failure sign detection device according to any one of claims 1 to 13, wherein the failure monitoring means weights the measurement value of each measurement item according to the difference in the unit of the measurement values or the difference in the degree of influence on the failure, and reflects the result in the predetermined calculation formula.

The failure sign detection device according to claim 14, wherein a weighting coefficient used for the weighting is changeable.