JP2019066339A

JP2019066339A - Diagnostic device, diagnostic method and diagnostic system each using sound

Info

Publication number: JP2019066339A
Application number: JP2017192490A
Authority: JP
Inventors: 洋平川口; Yohei Kawaguchi
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2017-10-02
Filing date: 2017-10-02
Publication date: 2019-04-25
Anticipated expiration: 2037-10-02
Also published as: JP6850709B2

Abstract

To provide diagnosis capable of abnormality detection even when the accuracy of sound source separation is insufficient. SOLUTION: A diagnostic device that performs diagnosis using sound comprises: a signal acquisition unit that acquires a sound signal, which is an electric signal converted from sound, and outputs the sound signal; a preprocessing unit that converts the sound signal output by the signal acquisition unit into a frequency domain signal; a spatial correlation calculation unit that calculates a spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit; a spatial correlation abnormality detection unit that determines an abnormality based on the spatial correlation matrix calculated by the spatial correlation calculation unit; and an abnormality display unit that displays information related to the abnormality based on the determination made by the spatial correlation abnormality detection unit. SELECTED DRAWING: Figure 2

Description

The present invention relates to a diagnostic device, a diagnostic method, and a diagnostic system that use sound.

The state of a machine or a person often shows up in sound and vibration. Diagnosis based on the sound and vibration that a machine or person emits is therefore important for understanding that state. The problem is that such diagnosis can be wrong, and the causes fall broadly into two groups.

One is an external factor: noise originating from something other than the diagnosis target. The other is an internal factor: variation in the normal state of the diagnosis target itself, that is, the sound and vibration differ even from one normal state to another.

As a method for solving these problems, Patent Document 1 describes “a flow line estimation device for walking sounds in an indoor space, comprising a microphone array device and a sensor information integration device, wherein the sensor information integration device further includes a display unit that displays the indoor space and a signal processing unit, and is connected to the microphone array device; microphone arrays are arranged in the indoor space in pairs, each pair in a 'ハ' shape; walking sounds in the indoor space are recorded by one pair of microphone arrays, and the analog walking sound signal is AD-converted to generate a digital walking sound signal; the sound source position and direction of arrival are estimated from the digital walking sound signal using the MUSIC method (step 1); the sound sources are separated by a modified minimum variance beamformer (step 2); features are extracted from the separated sound sources, the likelihood of an acoustic model is calculated, and abnormal sounds are detected (step 3); the number of walkers and their flow lines are estimated by a particle filter (step 4); and the estimated walking flow lines are displayed on the display unit of the sensor information integration device.”

Patent Document 1: JP 2014-191616 A

The device described in Patent Document 1 detects abnormal sounds in step 3 from the sound that underwent sound source separation in step 2. In such a cascade configuration, however, the abnormality detection in the later stage is likely to fail when the accuracy of the sound source separation in the earlier stage is insufficient.

An object of the present invention is therefore to provide diagnosis capable of abnormality detection even when the accuracy of sound source separation is insufficient.

To solve the above problem, the configurations described in the claims are adopted, for example. The present application includes a plurality of means for solving the above problem. One example is a diagnostic device that performs diagnosis using sound, comprising: a signal acquisition unit that acquires a sound signal, which is an electric signal converted from sound, and outputs the sound signal; a preprocessing unit that converts the sound signal output by the signal acquisition unit into a frequency domain signal; a spatial correlation calculation unit that calculates a spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit; a spatial correlation abnormality detection unit that determines an abnormality based on the spatial correlation matrix calculated by the spatial correlation calculation unit; and an abnormality display unit that displays information about the abnormality based on the determination made by the spatial correlation abnormality detection unit.

According to the present invention, diagnosis capable of abnormality detection can be provided even when the accuracy of sound source separation is insufficient. Problems, configurations, and effects other than those described above will become apparent from the following description of the embodiments.

A diagram showing an example of the diagnostic device.
A diagram showing an example of the diagnostic process.
A diagram showing examples of abnormality detection modes and the information displayed.
A diagram showing an example of a screen.
A diagram showing an example of the diagnostic device of the second embodiment.
A diagram showing an example of the diagnostic process of the second embodiment.
A diagram showing examples of abnormality detection modes and the information displayed in the second embodiment.
A diagram showing an example of the diagnostic system of the third embodiment.

Preferred modes for carrying out the present invention are described below as embodiments, with reference to the drawings.

FIG. 1 shows an example of a diagnostic device 100. The diagnostic device 100 may be a general-purpose computer. The processor 121 and the memory 122 may be the processor and memory of a general-purpose computer, and the processor 121 executes programs stored in the memory 122 or the storage unit 126.

The signal input unit 123 is an electronic circuit that receives sound signals. When it is connected to microphones and receives analog electrical sound signals, the signal input unit 123 includes an ADC (Analog-Digital Converter). When it is connected to an ADC outside the diagnostic device 100 and receives digital sound signals, the signal input unit 123 need not include an ADC.

The signal input unit 123 may convert the analog voltage level of the input digital signal, its data format, or its sampling frequency. When sound signals are input via a network, the signal input unit 123 may be a network interface. Even when the signal input unit 123 receives a digital signal, that signal originates from a microphone, so the following description uses an example in which microphones are connected to the signal input unit 123.

The display unit 124 is, for example, a liquid crystal display and displays the display data generated by the processor 121. The display unit 124 may instead be a network interface, in which case the display data may be sent over the network and displayed on another computer.

The input unit 125 is a user interface, for example a keyboard and mouse or a touch panel, through which information is entered by user operation. The input unit 125 may also receive information from the control unit of the machine to be diagnosed. The entered information is processed by the processor 121.

The input unit 125 may also be a network interface, in which case it may receive, via the network, information entered on another computer.

The storage unit 126 is, for example, a hard disk drive, a solid state drive, or a flash memory, and stores programs and data. Programs and data stored in the storage unit 126 may be transferred to the memory 122, and programs and data stored in the memory 122 may be transferred to the storage unit 126.

It therefore does not matter whether a program is stored in the storage unit 126 or in the memory 122. The description below refers to programs stored in the storage unit 126, but this may be read as programs stored in the memory 122. Of the programs stored in the storage unit 126, those shown in FIG. 1 are described with reference to FIG. 2.

The storage unit 126 may store other programs, for example a program that controls, as a whole, the programs from the signal acquisition program 101a through the abnormality detection mode input program 112a, and an OS (Operating System) that provides the basic operation of the computer serving as the diagnostic device 100.

The storage unit 126 may also store further information, such as threshold values used for decisions during program execution. A database may be built in the storage unit 126 and information accumulated in it, for example the input signal spatial correlation matrices described later with reference to FIG. 2.

Beyond what is illustrated in FIG. 1, the diagnostic device 100 may include other hardware, for example a network interface or a storage medium reader. Programs and information stored in the storage unit 126 may be input through the network interface (not shown) or through the storage medium reader. The diagnostic device 100 may communicate with other devices via the network interface.

FIG. 2 shows an example of the diagnostic process. The signal acquisition unit 101 is realized by the processor 121 executing the signal acquisition program 101a, together with the signal input unit 123. The signal acquisition unit 101 acquires sound signals from M microphones as an M-channel analog signal, converts it into an M-channel digital signal, and outputs the digital signal to the preprocessing unit 102.

The microphones may be arranged in a straight line, in a circle, or in various other layouts. In this embodiment, however, non-uniform spacing is particularly desirable. The frequencies that an array handles well (those free of spatial aliasing and with high direction estimation accuracy) depend on the microphone spacing, so non-uniformly spaced microphones can acquire signals efficiently across a range of frequencies. M is preferably an integer of 3 or more.

The preprocessing unit 102 is realized by the processor 121 executing the preprocessing program 102a. It divides the M-channel digital signal into frames, multiplies each frame by a window function, applies a short-time Fourier transform to the windowed signal, and outputs the resulting M-channel frequency domain signal to the input signal spatial correlation calculation unit 103, the per-source spatial correlation calculation unit 105, and the sound source separation unit 107.

Here, for a frame size of N, the M-channel frequency domain signal is a set of K × M complex numbers: M complex numbers for each of K = N/2 + 1 frequency bins.
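As a non-limiting illustration, the framing, windowing, and short-time Fourier transform could be sketched in Python with NumPy as follows. The Hann window, the hop size, and the function name `stft_frames` are assumptions made for illustration; the description does not specify them.

```python
import numpy as np

def stft_frames(signal, frame_size, hop):
    """Frame, window, and transform an M-channel signal.

    signal: real array of shape (num_samples, M).
    Returns a complex array of shape (num_frames, K, M), K = frame_size // 2 + 1.
    """
    num_samples, _ = signal.shape
    window = np.hanning(frame_size)  # assumed window; not specified in the text
    starts = range(0, num_samples - frame_size + 1, hop)
    # Each frame has shape (frame_size, M); the window is applied per channel.
    frames = np.stack([signal[s:s + frame_size] * window[:, None] for s in starts])
    # The real-input FFT keeps only the N/2 + 1 non-redundant frequency bins.
    return np.fft.rfft(frames, axis=1)
```

For a frame size of N = 256 and M = 3 channels, each frame yields 129 × 3 complex values, matching K = N/2 + 1.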

The input signal spatial correlation calculation unit 103 is realized by the processor 121 executing the input signal spatial correlation program 103a. Based on the M-channel frequency domain signal at each frequency k, it calculates an input signal spatial correlation matrix for each frequency k and outputs these matrices to the sound source presence direction cluster estimation unit 104 and the input signal spatial correlation abnormality detection unit 108.

Here, the spatial correlation matrix is the time average of the matrix x x^H, the product of the M-channel frequency domain signal vector x = [x_1, ..., x_M]^T and x^H, where ^H denotes the conjugate transpose. The time average may be an arithmetic mean over some T frames or a forgetting average.
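The two averaging options could be sketched as follows, purely as an illustration. The array layout (T, K, M), the function names, and the exponentially weighted form of the forgetting average with factor `alpha` are assumptions, not details given in this description.

```python
import numpy as np

def spatial_correlation(X):
    """Arithmetic mean of x x^H over T frames, per frequency bin.

    X: complex array of shape (T, K, M). Returns shape (K, M, M).
    """
    # R[k, m, n] = (1/T) * sum_t X[t, k, m] * conj(X[t, k, n])
    return np.einsum('tkm,tkn->kmn', X, X.conj()) / X.shape[0]

def forgetting_update(R_prev, x, alpha=0.95):
    """One step of an exponentially weighted (forgetting) average.

    R_prev: (K, M, M) previous estimate; x: (K, M) current frame.
    alpha is a hypothetical forgetting factor in [0, 1).
    """
    outer = np.einsum('km,kn->kmn', x, x.conj())
    return alpha * R_prev + (1.0 - alpha) * outer
```

Both functions return Hermitian matrices for every frequency bin, which is the property the later vectorization step relies on.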

The abnormality detection mode input unit 112 is realized by the processor 121 executing the abnormality detection mode input program 112a, together with the input unit 125. It accepts abnormality detection modes entered by user operation or other means. The modes are, for example, (1) whether a noise source moves, (2) whether the target sound source moves in the normal state, and (3) the normal operating state of the machine to be diagnosed.

Mode (1), whether a noise source moves, relates to the abnormality detection display. Mode (2), whether the target sound source moves in the normal state, also relates to the abnormality detection display and may be input from the control unit of the machine to be diagnosed. Mode (3), the normal operating state of the machine to be diagnosed, is a mode for accumulating normal-state information without detecting abnormalities, and may be input from the control unit of the machine to be diagnosed.

The input signal spatial correlation abnormality detection unit 108 is realized by the processor 121 executing the input signal spatial correlation abnormality detection program 108a, and detects abnormalities based on the input signal spatial correlation at each frequency k.

The input signal spatial correlation abnormality detection unit 108 calculates the degree to which the calculated input signal spatial correlation matrices for each frequency k resemble the normal-state input signal spatial correlation matrices for each frequency k accumulated in the database. If this first similarity is equal to or higher than a preset first threshold, the unit outputs a determination of normal; if the first similarity is lower, it outputs a determination of abnormal.

The normal-state input signal spatial correlation matrices for each frequency k are those accumulated in accordance with mode (3), the normal operating state of the machine to be diagnosed, input from the abnormality detection mode input unit 112.

Abnormality detection that goes through sound source separation, described later, loses accuracy markedly because the separation accuracy degrades when the directions of the sound sources are too close together, when multiple parts of the same type are present and the sources are insufficiently independent, or when the noise is too loud.

Even in those cases, however, an abnormality in the target sound changes the input signal spatial correlation matrix, so the input signal spatial correlation abnormality detection unit 108 can detect abnormalities even when the sound source separation accuracy degrades.

The input signal spatial correlation abnormality detection unit 108 compares the input signal spatial correlation matrices under diagnosis with the normal-state ones by, for example, vectorizing the spatial correlation matrices of the K frequency bins.

That is, because each spatial correlation matrix is Hermitian, only the upper triangle and the diagonal are extracted, and the first similarity is computed between vectors whose elements are these K × M × (M − 1) / 2 components. Reducing the number of dimensions in this way mitigates overfitting and lowers the computational cost.

Let v be the vectorized input signal spatial correlation matrices under diagnosis and w the vectorized normal-state ones. The first similarity can be, for example, the squared Euclidean distance between v and the mean vector of w, multiplied by −1. In this case, abnormality detection can be expected to run fast.
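The vectorization and the Euclidean similarity could be sketched as follows. This is only an illustrative sketch: the function names and array layouts are assumptions. Note that keeping the diagonal gives M(M + 1)/2 entries per frequency bin, which differs from the K × M × (M − 1) / 2 count stated above; the sketch follows the "upper triangle and diagonal" wording.

```python
import numpy as np

def vectorize_upper(R):
    """Flatten the upper triangle (including the diagonal) of each bin.

    R: Hermitian matrices of shape (K, M, M).
    Returns a 1-D complex vector of length K * M * (M + 1) / 2.
    """
    M = R.shape[-1]
    rows, cols = np.triu_indices(M)
    return R[:, rows, cols].reshape(-1)

def euclidean_similarity(v, W):
    """Negated squared Euclidean distance between v and the mean of W.

    v: (D,) vector under diagnosis; W: (num_normal, D) normal-state vectors.
    """
    diff = v - W.mean(axis=0)
    return -float(np.real(np.vdot(diff, diff)))

def is_normal(similarity, threshold):
    """Normal if the similarity is at or above the preset threshold."""
    return similarity >= threshold
```

Because the normal-state model is reduced to a single mean vector, each decision costs one subtraction and one inner product, which is consistent with the speed advantage noted above.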

Alternatively, w can be fitted to a multivariate complex Gaussian distribution, and the first similarity taken as the log-likelihood of v under the probability density function of the fitted distribution. When sensitivity differs greatly among the microphones, or when the microphone spacings differ greatly, the simple Euclidean distance above tends to produce detection errors; a multivariate complex Gaussian distribution can absorb and learn these variations, so correct abnormality detection can be expected.
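One possible form of the Gaussian similarity is sketched below. It assumes a circularly symmetric complex Gaussian, which is a simplification not stated in this description, and adds a small diagonal regularization term `reg` (hypothetical) so that the covariance stays invertible when few normal-state samples are available.

```python
import numpy as np

def fit_complex_gaussian(W, reg=1e-6):
    """Fit mean and covariance to normal-state vectors.

    W: complex array of shape (num_normal, D).
    Returns (mu, cov) with shapes (D,) and (D, D).
    """
    mu = W.mean(axis=0)
    Z = W - mu
    cov = Z.T @ Z.conj() / W.shape[0]       # sample estimate of E[z z^H]
    cov = cov + reg * np.eye(W.shape[1])    # assumed regularization
    return mu, cov

def complex_gaussian_loglik(v, mu, cov):
    """Log-density of v under a circular complex Gaussian CN(mu, cov)."""
    d = v - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.real(np.vdot(d, np.linalg.solve(cov, d)))  # d^H cov^-1 d
    return float(-len(v) * np.log(np.pi) - logdet - maha)
```

Unlike the Euclidean similarity, the covariance term rescales each component, so components that vary widely in the normal state contribute less to the decision.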

Alternatively, w can be fitted to a complex Gaussian mixture distribution, and the first similarity taken as the log-likelihood of v under the probability density function of the fitted mixture. When multiple sound sources are present in the normal state, the single multivariate complex Gaussian distribution above cannot model them and tends to produce detection errors; a complex Gaussian mixture can model multiple sound sources, so correct abnormality detection can be expected.

Other methods that may be used to model the normal-state w include a general one-class support vector classifier, the subspace method, the local subspace method, k-means clustering, a Deep Neural Network (DNN) autoencoder, a Convolutional Neural Network (CNN) autoencoder, a Long Short Term Memory (LSTM) autoencoder, and a variational autoencoder (VAE).

The sound source presence direction cluster estimation unit 104 is realized by the processor 121 executing the sound source presence direction cluster estimation program 104a. Based on the input signal spatial correlation matrices for each frequency k output by the input signal spatial correlation calculation unit 103, it estimates the sound source presence direction clusters within the T frames used to calculate the spatial correlation matrices.

First, a frequency-direction histogram representing the sound level in each direction is estimated from the spatial correlation matrices. General techniques such as the Minimum Variance Distortionless Response (MVDR) beamformer or the MUltiple SIgnal Classification (MUSIC) beamformer may be used for this estimation.
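For one frequency bin, an MVDR spatial spectrum could be computed roughly as follows. This is a sketch under stated assumptions: a far-field source, a linear array described by hypothetical `mic_positions` in meters, a sound speed of 343 m/s, and a small diagonal loading term `reg`. None of these values come from this description. Stacking the result over all K bins would give a frequency-direction histogram.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed

def steering_vectors(mic_positions, freq_hz, angles_deg):
    """Far-field steering vectors for a linear array.

    Returns a complex array of shape (num_angles, M).
    """
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))
    delays = np.outer(np.sin(theta), mic_positions) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)

def mvdr_spectrum(R, A, reg=1e-6):
    """MVDR power P(theta) = 1 / (a^H R^-1 a) for each steering vector.

    R: (M, M) spatial correlation matrix of one bin; A: (num_angles, M).
    """
    M = R.shape[0]
    loading = reg * np.trace(R).real / M   # assumed diagonal loading
    R_inv = np.linalg.inv(R + loading * np.eye(M))
    denom = np.real(np.einsum('am,mn,an->a', A.conj(), R_inv, A))
    return 1.0 / denom
```

The spectrum peaks in directions whose steering vector matches the dominant component of R, which is what the subsequent thresholding step looks for.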

The frequency-direction histogram is then summed along the frequency axis to obtain a direction histogram. The summation may be a simple sum of the frequency-direction histogram values, or a sum of the logarithms of those values after a constant has been added to each.

If the sound level in a direction of the direction histogram is greater than a preset fourth threshold, a sound source is determined to be present in that direction; if it is smaller than the fourth threshold, no sound source is determined to be present there. When sound sources are found in multiple directions, directions that are sufficiently close to one another are clustered together.
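The integration and thresholding steps could be sketched as follows. The gap-based grouping shown here is an assumed stand-in for the clustering step: on a sorted one-dimensional grid it behaves like single-linkage grouping with a distance cutoff, and it ignores wrap-around at 360 degrees. The names `max_gap_deg` and `eps` are hypothetical parameters.

```python
import numpy as np

def direction_histogram(freq_dir_hist, use_log=False, eps=1e-12):
    """Sum a (K, num_directions) histogram along the frequency axis.

    With use_log=True, sums log(value + eps) instead of the raw values.
    """
    if use_log:
        return np.log(freq_dir_hist + eps).sum(axis=0)
    return freq_dir_hist.sum(axis=0)

def group_active_directions(hist, angles_deg, threshold, max_gap_deg=10.0):
    """Keep directions above the threshold and group nearby ones.

    angles_deg must be sorted in ascending order.
    Returns a list of clusters, each a list of angles.
    """
    active = [a for a, h in zip(angles_deg, hist) if h > threshold]
    clusters = []
    for a in active:
        if clusters and a - clusters[-1][-1] <= max_gap_deg:
            clusters[-1].append(a)   # close to the previous direction
        else:
            clusters.append([a])     # start a new cluster
    return clusters
```

The number of clusters returned corresponds to the number of sound sources in the sense used in this description.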

General techniques such as agglomerative clustering or k-means clustering may be used for the clustering. The number of clusters C is taken as the number of sound sources. Each cluster c has a directional-statistics sample mean direction and a directional-statistics sample variance, calculated over the sound source presence directions belonging to the cluster, and is characterized by the von Mises distribution that they define.

If the ordinary mean and variance of angle values in degrees or radians were used, the error would be large; for example, the arithmetic mean of 350° and 10° is 180°, although both directions lie near 0°. Using the directional-statistics sample mean direction and sample variance solves this problem.
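The wrap-around problem can be checked numerically with a short sketch. The function name and the use of one minus the mean resultant length as the variance measure are assumptions based on common directional-statistics definitions, not definitions given in this description.

```python
import numpy as np

def circular_mean_and_variance(angles_deg):
    """Sample mean direction and circular variance of a set of angles.

    Returns (mean direction in degrees within (-180, 180], variance in [0, 1]).
    """
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))
    resultant = np.mean(np.exp(1j * theta))    # mean of unit vectors
    mean_deg = np.rad2deg(np.angle(resultant))
    variance = 1.0 - np.abs(resultant)         # 0 when all angles coincide
    return mean_deg, variance
```

For the angles 350° and 10°, the mean direction returned is approximately 0°, whereas `np.mean([350.0, 10.0])` gives 180°.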

The per-source spatial correlation calculation unit 105 is realized by the processor 121 executing the per-source spatial correlation calculation program 105a. Based on the M-channel frequency domain signal and the sound source presence direction clusters, it calculates per-source spatial correlation matrices R_c (c = 1, ..., C) and outputs them to the sound source separation filter update unit 106 and the per-source spatial correlation abnormality detection unit 109.

Specifically, direction estimation is first performed independently for each frame and each frequency bin of the M-channel frequency domain signal. For this purpose, a steering vector is calculated in advance for each direction according to the microphone arrangement.

For this calculation, the processing disclosed in, for example, M. Togami, Y. Obuchi, and A. Amano, “Automatic speech recognition of human-symbiotic robot emiew,” in Human-Robot Interaction, Nilanjan Sarkar, Ed., pp. 395-404, I-tech Education and Publishing, 2007, may be used.

The direction whose steering vector has the largest inner product with the normalized M-channel frequency domain signal vector is taken as the sound source direction for that frame and frequency. The likelihood that this sound source direction is generated by the von Mises distribution associated with sound source presence direction cluster c is then calculated, and if the likelihood is sufficiently high, that frame and frequency is assigned to c.

The per-source spatial correlation matrix R_c is then updated by a time average over only those M-channel frequency domain signal vectors x whose frame and frequency were assigned to c. As with the input signal spatial correlation matrix above, the time average is an arithmetic mean or a forgetting average of the product x x^H.
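The per-bin assignment and update could be sketched as follows, as an illustration only. The steering vectors `A` are assumed to be precomputed as described above; the concentration parameter `kappa`, the forgetting factor `alpha`, and all function names are hypothetical.

```python
import numpy as np

def assign_direction(x, A):
    """Index of the direction whose steering vector best matches x.

    x: (M,) complex vector for one frame and frequency bin.
    A: (num_angles, M) precomputed steering vectors.
    """
    x_norm = x / (np.linalg.norm(x) + 1e-12)
    scores = np.abs(A.conj() @ x_norm)   # |a^H x| for every direction
    return int(np.argmax(scores))

def von_mises_logpdf(theta, mu, kappa):
    """Log-density of the von Mises distribution (angles in radians)."""
    return kappa * np.cos(theta - mu) - np.log(2.0 * np.pi * np.i0(kappa))

def update_source_correlation(R_c, x, alpha=0.95):
    """Forgetting-average update of R_c with one assigned vector x."""
    return alpha * R_c + (1.0 - alpha) * np.outer(x, x.conj())
```

In use, a frame and frequency bin would be passed to `update_source_correlation` for cluster c only when `von_mises_logpdf` of its estimated direction under that cluster exceeds a chosen threshold; bins that match no cluster are simply skipped.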

The per-source spatial correlation abnormality detection unit 109 is realized by the processor 121 executing the per-source spatial correlation abnormality detection program 109a. It calculates the degree to which the calculated per-source spatial correlation matrices R_c for each frequency k resemble the normal-state per-source spatial correlation matrices for each frequency k accumulated in the database. If this second similarity is equal to or higher than a preset second threshold, the unit outputs a determination of normal; if the second similarity is lower, it outputs a determination of abnormal.

The normal-state per-source spatial correlation matrices for each frequency k are those accumulated in accordance with mode (3), the normal operating state of the machine to be diagnosed, input from the abnormality detection mode input unit 112.

As described above, the input signal spatial correlation abnormality detection unit 108 exploits the fact that an abnormality in the target sound changes the input signal spatial correlation matrix even under conditions where the sound source separation accuracy degrades. However, the input signal spatial correlation matrix also changes when only noise other than the target sound changes, so the input signal spatial correlation abnormality detection unit 108 has low detection accuracy in the presence of noise.

これに対し、音源毎空間相関異常検知部１０９は、目的音・雑音毎に分解した音源毎空間相関を用いるので、雑音が存在する条件であっても異常検知が可能であるという効果を奏する。 On the other hand, since the per-sound source space correlation abnormality detection unit 109 uses per-sound source space correlation decomposed for each of the target sound and noise, there is an effect that abnormality detection is possible even under the condition where noise is present.

As in the input-signal spatial correlation abnormality detection unit 108 described above, the per-source spatial correlation matrix under diagnosis is compared with the normal-state per-source spatial correlation matrix by, for example, vectorizing the spatial correlation matrices of the K frequency bins.

Specifically, because a spatial correlation matrix is Hermitian, only its upper-triangular and diagonal entries are extracted, and the second similarity is computed between vectors whose elements are these K × M × (M+1)/2 entries. Reducing the number of dimensions in this way mitigates overfitting and lowers the computational cost.

Let v be the vectorized per-source spatial correlation matrix under diagnosis, and let w be the vectorized normal-state per-source spatial correlation matrix. The second similarity can be, for example, the squared Euclidean distance between v and the mean vector of w, multiplied by −1. This choice can be expected to make abnormality detection fast.
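The vectorization and the negated squared Euclidean distance can be sketched as below. The function names and the convention that `normal_ws` is a list of normal-state vectors are assumptions made for illustration only.

```python
def vectorize_hermitian(R_per_freq):
    """Stack the diagonal and upper-triangular entries of each per-frequency matrix."""
    v = []
    for R in R_per_freq:
        m = len(R)
        for i in range(m):
            for j in range(i, m):
                v.append(R[i][j])
    return v


def mean_vector(ws):
    """Element-wise mean of a list of equally long vectors."""
    n = len(ws)
    return [sum(w[i] for w in ws) / n for i in range(len(ws[0]))]


def neg_sq_euclid(v, mu):
    """Similarity = -(squared Euclidean distance); larger means closer to normal."""
    return -sum(abs(a - b) ** 2 for a, b in zip(v, mu))


def is_normal(v, normal_ws, threshold):
    """Normal when the similarity is at or above the threshold."""
    return neg_sq_euclid(v, mean_vector(normal_ws)) >= threshold
```

Because the similarity is a negated distance, thresholds are negative numbers or zero, and a threshold closer to zero is stricter.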

Alternatively, the second similarity can be the log-likelihood of v under a multivariate complex Gaussian distribution fitted to w. When sensitivity differs greatly among the microphones, or when the spacing between microphones varies greatly, the simple Euclidean distance above is prone to detection errors. A multivariate complex Gaussian distribution can absorb this variation during fitting, so correct abnormality detection can be expected.
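As a rough illustration of why a fitted Gaussian tolerates dimensions that naturally vary, the sketch below fits a circularly symmetric complex Gaussian with a diagonal covariance. The diagonal covariance is a simplifying assumption of this example (the text describes a general multivariate distribution), and the function names and the variance floor are hypothetical.

```python
import math


def fit_diag_complex_gaussian(ws, floor=1e-6):
    """Fit a per-dimension mean and variance to normal-state vectors ws."""
    n = len(ws)
    dim = len(ws[0])
    mu = [sum(w[i] for w in ws) / n for i in range(dim)]
    var = [max(sum(abs(w[i] - mu[i]) ** 2 for w in ws) / n, floor)
           for i in range(dim)]
    return mu, var


def log_likelihood(v, mu, var):
    """Log density of v: sum over dimensions of -log(pi*var) - |v - mu|^2 / var."""
    return sum(-math.log(math.pi * s) - abs(x - m) ** 2 / s
               for x, m, s in zip(v, mu, var))
```

A deviation in a dimension that varied widely during normal operation lowers the log-likelihood far less than the same deviation in a dimension that was nearly constant, which the plain Euclidean distance cannot express.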

The second similarity can also be the log-likelihood of v under a complex Gaussian mixture distribution fitted to w. When several sound sources lie in the same direction in the normal state, a single multivariate complex Gaussian distribution cannot model them and is prone to detection errors. A complex Gaussian mixture distribution can model multiple sound sources, so correct abnormality detection can be expected.

Other methods may also be used to model the normal-state w, such as a general one-class support vector classifier, the subspace method, the local subspace method, k-means clustering, a DNN autoencoder, a CNN autoencoder, an LSTM autoencoder, or a VAE.

The sound source separation filter update unit 106 is the processor 121 executing the sound source separation filter update program 106a. It computes the sound source separation filter from the spatial correlation matrix of each sound source. The filter is, for example, a general Generalized EigenValue (GEV) beamformer, which uses as the filter the generalized eigenvector e obtained when R_n is the spatial correlation matrix of the noise and R_x is the spatial correlation matrix of the target sound.

That is,

$$R_n e = \lambda R_x e$$

Here, when the sound-source direction cluster c' is taken as the target sound direction and all other clusters as noise directions, R_x and R_n can be computed as follows.

$$R_x = R_{c'}$$

$$R_n = \sum_{c \neq c'} R_c$$

Because the scale of e is indeterminate, the final sound source separation filter is e', obtained by applying a general normalization such as Blind Analytic Normalization (BAN) to e.
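The generalized eigenvector can be sketched with power iteration, as below. The text does not say which generalized eigenvector is chosen; this sketch assumes the usual maximum signal-to-noise choice, which is the dominant eigenvector of R_n^{-1} R_x (equivalently, the smallest λ in the equation as written above). The helper names, the start vector, and the iteration count are assumptions, and the BAN normalization step is omitted.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]


def gev_filter(R_x, R_n, iters=200):
    """Dominant eigenvector of R_n^{-1} R_x, returned with unit norm."""
    e = [1 + 0j] * len(R_x)  # arbitrary start; must not be orthogonal to the answer
    for _ in range(iters):
        e = solve(R_n, matvec(R_x, e))
        norm = sum(abs(v) ** 2 for v in e) ** 0.5
        e = [v / norm for v in e]
    return e
```

In practice a library routine for the Hermitian generalized eigenproblem would normally replace the hand-written solver; the loop is shown only to keep the example dependency-free.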

A general sound source separation filter such as an MVDR beamformer may be used instead of the GEV beamformer. Because these filters are linear, they have the advantage of not introducing distortion into the separated signal.

The sound source separation unit 107 is the processor 121 executing the sound source separation program 107a. It separates the sound sources by applying the sound source separation filter to the M-channel frequency-domain signal and outputs the separated signal.

The separated-signal abnormality detection unit 110 is the processor 121 executing the separated-signal abnormality detection program 110a. It first computes a feature vector from the separated signal. The feature vector is composed of, for example, the power spectrum, the amplitude cepstrum, and the mel-frequency cepstral coefficients (MFCC) of the separated signal.

The separated-signal abnormality detection unit 110 then calculates the degree to which the computed feature vector resembles the normal-state feature vectors stored in the database. If this third similarity is equal to or higher than a preset third threshold, the unit outputs a determination of normal; if the third similarity is lower, it outputs a determination of abnormal.

The normal-state feature vectors are those accumulated for (3) the normal operating state of the machine under diagnosis, as input from the abnormality detection mode input unit 112. Because the separated signal is a sound from which noise has been removed and only the target sound extracted, abnormality detection is more accurate than when it is performed on the input signal.

The abnormality display unit 111 consists of the processor 121 executing the abnormality display program 111a and the display unit 124. Depending on (1) whether the noise source moves and (2) whether the target sound source moves in the normal state, both input from the abnormality detection mode input unit 112, it displays the determination results received from the input-signal spatial correlation abnormality detection unit 108, the per-source spatial correlation abnormality detection unit 109, and the separated-signal abnormality detection unit 110.

FIG. 3 shows an example of abnormality detection modes and the information displayed. The abnormality detection mode 301 consists of (1) a mode 302 indicating whether the noise source moves and (2) a mode 303 indicating whether the target sound source moves in the normal state. The displayed information 304 is determined by which of the four combinations of "yes" and "no" in modes 302 and 303 applies.

The displayed information 304 specifies which determination result is displayed: that of the input-signal spatial correlation abnormality detection unit 108, the per-source spatial correlation abnormality detection unit 109, or the separated-signal abnormality detection unit 110. In some cases several determination results are displayed side by side.

If the same information were displayed regardless of the combination of modes 302 and 303, the user could not easily tell whether a displayed abnormality is genuine. Switching the display in this way solves that problem.

FIG. 4 shows an example of the screen of the display unit 124. For example, when the determination result from the input-signal spatial correlation abnormality detection unit 108 indicates an abnormality, the abnormality display unit 111 may display message 401, message 402, or both on the display unit 124. Messages 401 and 402 tell the user that the sound source position differs from normal.

Likewise, when the determination result from the per-source spatial correlation abnormality detection unit 109 indicates an abnormality, message 401, message 402, or both may be displayed on the display unit 124.

When the determination result from the separated-signal abnormality detection unit 110 indicates an abnormality, the abnormality display unit 111 displays message 403 on the display unit 124, telling the user that the sound was judged abnormal on the basis of its features.

Whether messages 401 and 402 or message 403 appears on the display unit 124 follows the displayed information 304 shown in FIG. 3; the abnormality display unit 111 selects according to the combination of modes 302 and 303.

Sound source separation degrades when the directions of the sources are too close, when several parts of the same kind exist so that the sound qualities of the sources are not sufficiently independent, or when the noise is too loud. Even under these conditions, the spatial correlation matrix still changes.

The diagnostic processing based on sound and vibration in this embodiment has, in addition to the separated-signal abnormality detection unit 110, which relies on the separated signal, the input-signal spatial correlation abnormality detection unit 108 and the per-source spatial correlation abnormality detection unit 109, which detect abnormalities without relying on the separated signal. Abnormalities can therefore be detected even when separation accuracy is degraded.

The first threshold, used by the input-signal spatial correlation abnormality detection unit 108 for the first similarity, the second threshold, used by the per-source spatial correlation abnormality detection unit 109 for the second similarity, and the third threshold, used by the separated-signal abnormality detection unit 110 for the third similarity, may have different values.

Because the three similarities are measured on different scales, the three thresholds are not directly comparable as they stand. They may be made comparable by normalizing the three similarities, or the quantities from which they are computed, such as the input-signal spatial correlation. When the thresholds are made comparable in this way, the first threshold may be higher than the second, and the second higher than the third.

When three thresholds are set in the diagnostic device 100, they may be compared with one another, and unless the first threshold is higher than the second and the second is higher than the third, a warning may be displayed to prompt the user to set them again.
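The ordering check can be sketched in a few lines. The function name and the wording of the warning are hypothetical, and the thresholds are assumed to have already been normalized to a comparable scale.

```python
def check_threshold_order(th1, th2, th3):
    """Return None if th1 > th2 > th3, otherwise a warning message."""
    if th1 > th2 > th3:
        return None
    return ("Warning: expected first > second > third threshold, "
            f"got {th1}, {th2}, {th3}; please set the thresholds again.")
```

A caller would display the returned message, if any, and ask for new values before accepting the settings.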

Alternatively, enough signal may be acquired to give a statistically sufficient number of samples, including cases finally concluded to be abnormal, and processed statistically. The first and second thresholds may then be set so that the probability that the first similarity is at or above the first threshold is lower than the probability that the second similarity is at or above the second threshold. The second and third thresholds may be set so that the probability that the second similarity is at or above the second threshold is lower than the probability that the third similarity is at or above the third threshold.

Furthermore, enough signal may be acquired to give a statistically sufficient number of samples, including cases finally concluded to be abnormal, and processed statistically. The probability density functions of the first, second, and third similarities may each be normalized, with the first threshold set higher than the second and the second set higher than the third.

The second embodiment describes an example of diagnostic processing that detects abnormalities with higher accuracy than the first embodiment even when sound source separation is insufficiently accurate and the sound sources outnumber the microphones. The second embodiment differs from the first in that it computes a frequency-direction power signal, performs signal separation on that signal, and uses the separated frequency-direction power signal for abnormality detection.

FIG. 5 shows an example of the diagnostic device 500 of the second embodiment. Components identical to those of the diagnostic device 100 shown in FIG. 1 carry the same reference numerals, and their description is omitted. The storage unit 126 does not store the sound-source direction cluster estimation program 104a or the abnormality display program 111a.

Instead, the storage unit 126 stores a frequency-direction power calculation program 501a, a frequency-direction power signal separation program 502a, a frequency-direction power abnormality detection program 503a, a sound-source direction cluster estimation program 504a, and an abnormality display program 505a.

FIG. 6 shows an example of the diagnostic processing of the second embodiment. The frequency-direction power calculation unit 501 is the processor 121 executing the frequency-direction power calculation program 501a. From the M-channel frequency-domain signal for each frame t and each frequency k, it computes the power X(t, k, d) for each frame t, each frequency k, and each direction d.

Specifically, direction estimation is first performed independently for each frame and each frequency bin of the M-channel frequency-domain signal. The technique disclosed in the aforementioned "Automatic speech recognition of human-symbiotic robot emiew", for example, may be used for this purpose. Steering vectors are computed in advance for each direction d according to the microphone arrangement.

The direction d whose steering vector has the largest inner product with the normalized M-channel frequency-domain signal vector is taken as the sound source direction for that frame and frequency. The power of that frame and frequency component is then taken as the frequency-direction power X(t, k, d).
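The per-bin direction estimate can be sketched as follows. The far-field model for a linear microphone array, the sign convention of the phase, the speed of sound `c`, and all function names are assumptions of this example; the text only requires that steering vectors be precomputed from the microphone arrangement.

```python
import cmath
import math


def steering_vector(theta, mic_x, freq_hz, c=343.0):
    """Far-field steering vector for microphones on a line at positions mic_x (meters)."""
    return [cmath.exp(-2j * math.pi * freq_hz * px * math.cos(theta) / c)
            for px in mic_x]


def estimate_direction(x, steering):
    """Return (d, power) for one frame and one frequency bin.

    d is the index of the steering vector with the largest inner-product
    magnitude against the normalized signal vector; power is the energy of x,
    which is then used as X(t, k, d) for that d.
    """
    power = sum(abs(v) ** 2 for v in x)
    norm = math.sqrt(power) or 1.0  # avoid dividing by zero for silent bins
    xn = [v / norm for v in x]
    scores = [abs(sum(a.conjugate() * v for a, v in zip(s, xn)))
              for s in steering]
    d = max(range(len(steering)), key=scores.__getitem__)
    return d, power
```

In this sketch, X(t, k, d') for every other direction d' of the same bin would simply be left at zero, which is an interpretation rather than something the text states.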

The sound-source direction cluster estimation unit 504 is the processor 121 executing the sound-source direction cluster estimation program 504a. From the power X(t, k, d) for each frame t, each frequency k, and each direction d output by the frequency-direction power calculation unit 501, it estimates the sound-source direction clusters over the directions within T frames. First, Y(k, d) is computed by accumulating X(t, k, d) over the T frames.

Y(k, d) is then accumulated over the frequencies k = 1, …, K to obtain the direction histogram Z(d). The accumulation may simply be the sum of the frequency-direction histogram values, or it may be the sum of the logarithms of those values after a constant has been added to each.
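Both accumulation variants can be sketched as below, assuming X is a nested list indexed as `X[t][k][d]`. The function name, the `use_log` flag, and the default constant are hypothetical.

```python
import math


def direction_histogram(X, use_log=False, const=1.0):
    """Compute Z[d] from X[t][k][d]: sum over frames, then over frequencies."""
    T, K, D = len(X), len(X[0]), len(X[0][0])
    # Y[k][d]: frequency-direction histogram accumulated over the T frames
    Y = [[sum(X[t][k][d] for t in range(T)) for d in range(D)]
         for k in range(K)]
    if use_log:
        # Sum of log(value + constant) over frequencies
        return [sum(math.log(Y[k][d] + const) for k in range(K))
                for d in range(D)]
    return [sum(Y[k][d] for k in range(K)) for d in range(D)]
```

The logarithmic variant compresses loud frequency bins, so a direction supported by many bins scores higher than one dominated by a single strong bin.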

When the sound level of a direction in the direction histogram exceeds a preset threshold, a sound source is judged to exist in that direction; when it is below the threshold, no sound source is judged to exist there. When sound sources exist in several directions, directions that are sufficiently close to one another are clustered together.

General techniques such as agglomerative clustering or k-means clustering may be used for the clustering. The number of clusters C is taken as the number of sound sources. Each cluster c has a sample mean direction and a sample variance in the sense of directional statistics, computed over the sound source directions belonging to the cluster, and is characterized by the von Mises distribution that they determine.

Taking the ordinary mean or variance of angle values in degrees or radians gives large errors. Using the directional-statistics sample mean direction and sample variance solves this problem.
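A small example shows the difference. For directions of 350° and 10°, the ordinary mean is 180°, which points the wrong way, while the circular mean is 0°. The sketch below uses the standard directional-statistics definitions (mean direction from the averaged unit vectors, variance as one minus the mean resultant length); the function name is hypothetical.

```python
import math


def circular_mean_and_variance(angles_rad):
    """Return (sample mean direction in radians, circular sample variance)."""
    n = len(angles_rad)
    c = sum(math.cos(a) for a in angles_rad) / n
    s = sum(math.sin(a) for a in angles_rad) / n
    mean_dir = math.atan2(s, c)
    variance = 1.0 - math.hypot(c, s)  # 0 when all directions coincide
    return mean_dir, variance
```

Calling it with `[math.radians(350), math.radians(10)]` gives a mean direction of approximately 0 radians and a small variance, reflecting that the two directions are close together.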

When the sound sources are highly sparse in time and frequency, the direction histogram computed by the sound-source direction cluster estimation unit 504 has sharper directivity than the direction histogram of the sound-source direction cluster estimation unit 104 of the first embodiment.

Therefore, when the sound sources are highly sparse in time and frequency, the sound-source direction cluster estimation unit 504 estimates more accurately than the sound-source direction cluster estimation unit 104 of the first embodiment.

The frequency-direction power signal separation unit 502 is the processor 121 executing the frequency-direction power signal separation program 502a. It performs signal separation on the power X(t, k, d) for each frame t, each frequency k, and each direction d output by the frequency-direction power calculation unit 501, and outputs the separated frequency-direction power.

First, X(t, k, d) is converted into a matrix Q(t, a) in which frequency k and direction d are merged into a single axis. Specifically, the index a is defined as

$$a = d \times K + k$$

and the values are assigned as

$$Q(t, a) = X(t, k, d)$$
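The index mapping can be sketched as below. The sketch assumes zero-based indices k = 0, …, K−1 and d = 0, …, D−1 so that every (k, d) pair maps to a distinct a in 0, …, K×D−1; the function name and the `X[t][k][d]` layout are assumptions.

```python
def flatten_freq_direction(X):
    """Convert X[t][k][d] into Q[t][a] with a = d * K + k (zero-based)."""
    T, K, D = len(X), len(X[0]), len(X[0][0])
    Q = [[0.0] * (K * D) for _ in range(T)]
    for t in range(T):
        for k in range(K):
            for d in range(D):
                Q[t][d * K + k] = X[t][k][d]
    return Q
```

With this layout, all frequencies of one direction occupy a contiguous block of K entries in each row of Q.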

Next, with Q(t, a) for T frames as input, signal separation is performed to extract the frequency-direction power Q_b(t, a) corresponding to each basis index b. General supervised non-negative matrix factorization can be used for the separation.

The bases are learned in advance from normal-state Q(t, a). General learning methods such as multiplicative updates may be used for basis learning, and general initialization methods such as non-negative independent component analysis may be used to initialize it.

Q(t, a) can be separated by supervised non-negative matrix factorization in this way because the method exploits two properties. First, if the sounds of the basis indices b are mutually uncorrelated, all values appearing in the frequency-direction power Q(t, a) and its components Q_b(t, a) are non-negative. Second, in the normal state, the frequency-direction power Q(t, a) is expressed as a linear sum of a limited number of bases.

Instead of supervised non-negative matrix factorization, a Deep Neural Network (DNN) autoencoder, a Convolutional Neural Network (CNN) autoencoder, a Long Short Term Memory (LSTM) autoencoder, or the like may be used.

Finally, the separated frequency-direction power

$$P(t, a) = Q(t, a) - \sum_b Q_b(t, a)$$

is computed and output.

This processing assumes that the component Σ_b Q_b(t, a) expressible by the bases is the matrix that best approximates Q(t, a) within the range regarded as normal, and that the approximation error P(t, a) is the component corresponding to the abnormality.
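The residual computation for one frame can be sketched as follows. The bases are held fixed and only the non-negative activations are estimated, here with multiplicative updates for the squared-error cost; the cost function, the iteration count, the small constant `eps`, and the function names are assumptions of this example rather than details given in the text.

```python
def activations(q, W, iters=100, eps=1e-12):
    """Estimate h >= 0 such that sum_b h[b] * W[b] approximates q, with W fixed.

    q: one frame of Q(t, a), a list of length A
    W: pre-trained bases, W[b] is a list of length A
    """
    B = len(W)
    h = [1.0] * B
    wq = [sum(wa * qa for wa, qa in zip(W[b], q)) for b in range(B)]
    gram = [[sum(x * y for x, y in zip(W[i], W[j])) for j in range(B)]
            for i in range(B)]
    for _ in range(iters):
        denom = [sum(gram[b][c] * h[c] for c in range(B)) + eps
                 for b in range(B)]
        h = [h[b] * wq[b] / denom[b] for b in range(B)]
    return h


def residual(q, W, iters=100):
    """Return P = q - sum_b h[b] * W[b], the part the normal bases cannot explain."""
    h = activations(q, W, iters)
    recon = [sum(h[b] * W[b][a] for b in range(len(W))) for a in range(len(q))]
    return [qa - ra for qa, ra in zip(q, recon)]
```

A frame that the normal-state bases can reproduce yields a residual near zero, while energy at a frequency-direction index that no basis covers stays in the residual and can be passed to the abnormality detector.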

The frequency-direction power abnormality detection unit 503 is the processor 121 executing the frequency-direction power abnormality detection program 503a. It calculates the degree to which the separated frequency-direction power P(t, a) resembles the normal-state separated frequency-direction power stored in the database. If the similarity is equal to or higher than a preset threshold, the unit outputs a determination of normal; if the similarity is lower, it outputs a determination of abnormal.

The normal-state separated frequency-direction power is that accumulated for (3) the normal operating state of the machine under diagnosis, as input from the abnormality detection mode input unit 112.

As described above, the input-signal spatial correlation abnormality detection unit 108 exploits the fact that an abnormality in the target sound changes the input-signal spatial correlation matrix even under conditions that degrade sound source separation. However, the input-signal spatial correlation matrix also changes when only noise other than the target sound changes, so the unit 108 has low detection accuracy when noise is present.

In contrast, the frequency-direction power abnormality detection unit 503 uses frequency-direction power from which only the component corresponding to the abnormality has been extracted, so it can detect abnormalities even when noise is present.

The separated frequency-direction power P(t, a) of the frame t under diagnosis is compared with the normal-state separated frequency-direction power by, for example, regarding P(t, a) as a K×D-dimensional vector v, where D is the number of discretized directions. The normal-state separated frequency-direction power, likewise regarded as a K×D-dimensional vector, is denoted w.

The similarity can be, for example, the squared Euclidean distance between v and the mean vector of w, multiplied by −1. This choice can be expected to make abnormality detection fast. Alternatively, the similarity can be the log-likelihood of v under a multivariate complex Gaussian distribution fitted to w.

When the correlation between frequencies is high in the normal state, or when echo and reverberation are strong, the simple Euclidean distance above is prone to detection errors. A multivariate complex Gaussian distribution can absorb this variation during fitting, so correct abnormality detection can be expected.

The similarity can also be the log-likelihood of v under a complex Gaussian mixture distribution fitted to w. For a sound source that has several frequency patterns even in the normal state, a single multivariate complex Gaussian distribution cannot model them and is prone to detection errors. A complex Gaussian mixture distribution can model multiple frequency patterns, so correct abnormality detection can be expected.

The abnormality display unit 505 consists of the processor 121 executing the abnormality display program 505a and the display unit 124. Depending on (1) whether the noise source moves and (2) whether the target sound source moves in the normal state, both input from the abnormality detection mode input unit 112, it displays the determination results received from the input-signal spatial correlation abnormality detection unit 108, the per-source spatial correlation abnormality detection unit 109, the separated-signal abnormality detection unit 110, and the frequency-direction power abnormality detection unit 503.

図７は、実施例２の異常検知モードと表示される情報の例を示す図である。異常検知モード３０１、モード３０２、およびモード３０３は、図３を用いて説明したとおりであり、表示される情報７０４は、表示される情報３０４といずれの判定結果を表示するかの情報が異なるだけである。 FIG. 7 is a diagram illustrating an example of information displayed as an abnormality detection mode according to the second embodiment. The abnormality detection mode 301, the mode 302, and the mode 303 are as described with reference to FIG. 3, and the information 704 to be displayed is different from the information 304 to be displayed as to which judgment result is to be displayed. It is.

図３の例と同じく、モード３０２とモード３０３の組み合わせによらず表示する情報が同じであると異常と表示された場合に本当に異常なのかどうかがユーザに判りにくいという問題がある。このような表示切り換えによって、この問題を解決するという効果を奏する。 As in the example of FIG. 3, there is a problem that it is difficult for the user to know whether the information is displayed as abnormal if the displayed information is the same regardless of the combination of the mode 302 and the mode 303. Such display switching has the effect of solving this problem.

In this embodiment, it is also possible to calculate a frequency/direction power signal, perform signal separation on it, and use the separated frequency/direction power signal for abnormality detection.

Signal separation of the frequency/direction power signal is not beamforming based on the phase between multiple microphones. It is processing, such as non-negative matrix factorization, that relies on the frequency/direction amplitude characteristics differing between bases, so it is not limited by the number of microphones. Therefore, even when there are more sound sources than microphones, abnormality detection with higher accuracy than in the first embodiment is possible.
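One way to picture this kind of separation is a semi-supervised non-negative matrix factorization in which bases learned in the normal state are held fixed and the remainder is explained by extra bases. The sketch below is an assumption-laden illustration, not the procedure of the embodiment: it assumes `V` is a non-negative matrix whose rows are flattened (frequency, direction) bins and whose columns are time frames, uses Euclidean-cost multiplicative updates, and all names and parameters are hypothetical.

```python
import numpy as np

def remove_normal_bases(V, W_normal, k_extra=1, n_iter=500, seed=0, eps=1e-9):
    """Split V (bins x frames) into a part explained by fixed normal bases
    and a residual part explained by k_extra freely learned bases."""
    rng = np.random.default_rng(seed)
    k_n = W_normal.shape[1]
    W_extra = rng.random((V.shape[0], k_extra)) + eps
    H = rng.random((k_n + k_extra, V.shape[1])) + eps
    for _ in range(n_iter):
        W = np.hstack([W_normal, W_extra])
        # Update all activations with the bases held fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the extra bases; the normal bases stay fixed.
        H_extra = H[k_n:]
        W_extra *= (V @ H_extra.T) / (W @ H @ H_extra.T + eps)
    normal_part = W_normal @ H[:k_n]
    residual = W_extra @ H[k_n:]
    return normal_part, residual
```

Because the factorization works on amplitude patterns rather than inter-microphone phase, the number of bases is a free parameter and is not tied to the number of microphones. The factorization is not unique in general, so the residual should be read as an approximate indicator.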

The diagnostic device 100 or the diagnostic device 500 may be composed of a plurality of devices. FIG. 8 shows an example of a diagnostic system composed of a plurality of devices. The description below is based on FIGS. 5 and 6. The case of FIGS. 1 and 2 differs only in that the corresponding programs or units are substituted, so a description based on FIGS. 1 and 2 is omitted.

In terms of FIG. 5, the signal analysis device 801 is a computer (server) whose storage unit stores the signal acquisition program 101a through the sound source separation program 107a, the frequency/direction power calculation program 501a, the frequency/direction power signal separation program 502a, and the sound source presence direction cluster estimation program 504a.

The abnormality detection device 802 is a computer (server) whose storage unit stores the input signal spatial correlation abnormality detection program 108a through the sound source separation signal abnormality detection program 110a, the frequency/direction power abnormality detection program 503a, and the abnormality display program 505a.

In terms of FIG. 6, the signal analysis device 801 is a device that includes the signal acquisition unit 101 through the sound source separation unit 107, the frequency/direction power calculation unit 501, the frequency/direction power signal separation unit 502, and the sound source presence direction cluster estimation unit 504.

The abnormality detection device 802 is a device that includes the input signal spatial correlation abnormality detection unit 108 through the sound source separation signal abnormality detection unit 110, the frequency/direction power abnormality detection unit 503, and the abnormality display unit 505.

As described with reference to FIGS. 2 and 6, the input signal spatial correlation calculation unit 103 outputs the input signal spatial correlation matrix, the per-sound-source spatial correlation calculation unit 105 outputs the per-sound-source spatial correlation matrix, the sound source separation unit 107 outputs the sound source separation signal, and the frequency/direction power signal separation unit 502 outputs the separated frequency/direction power. The signal analysis device 801 outputs these four kinds of information to the abnormality detection device 802 via the signal lines 811 to 814, respectively.

When the abnormality detection device 802 receives information from the signal analysis device 801 over each of the signal lines 811 to 814, then, as described with reference to FIGS. 2 and 6, the input signal spatial correlation abnormality detection unit 108, the per-sound-source spatial correlation abnormality detection unit 109, the sound source separation signal abnormality detection unit 110, and the frequency/direction power abnormality detection unit 503 each calculate a similarity for their respective information and output a determination result, and the abnormality display unit 505 displays the determination results.
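The similarity calculation and threshold decision performed by each detection unit can be sketched for the spatial correlation matrix as follows. This is a minimal illustration under stated assumptions: the spatial correlation matrix is taken as the frame average of x x^H at a single frequency bin, the similarity is the negative Frobenius distance to a stored normal-state matrix (the simple Euclidean case), and the function names and threshold value are hypothetical.

```python
import numpy as np

def spatial_correlation(frames):
    """frames: (T, M) complex values of M microphones over T frames at one
    frequency bin. Returns the (M, M) average of x x^H over the frames."""
    return frames.T @ frames.conj() / frames.shape[0]

def similarity(observed, normal):
    """Negative Frobenius distance: larger means closer to the normal state."""
    return -float(np.linalg.norm(observed - normal))

def is_abnormal(observed, normal, threshold):
    """Report an abnormality when the similarity falls below the threshold."""
    return similarity(observed, normal) < threshold
```

The same decision shape applies to the other three kinds of information; only the feature and the similarity measure change.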

The abnormality detection device 802 receives, from the normal model management device 803 via the signal lines 821 to 824, the information accumulated in the database that the input signal spatial correlation abnormality detection unit 108, the per-sound-source spatial correlation abnormality detection unit 109, the sound source separation signal abnormality detection unit 110, and the frequency/direction power abnormality detection unit 503 each use to calculate their similarity. Each threshold may be stored in the abnormality detection device 802 or may be input from the normal model management device 803.

The normal model management device 803 is a computer (server) that accumulates information in a database. It stores in advance the normal-state input signal spatial correlation matrix, the normal-state per-sound-source spatial correlation matrix, the normal-state feature vectors, and the normal-state separated frequency/direction power, and outputs them to the abnormality detection device 802 via the signal lines 821 to 824.

The normal model management device 803 may also acquire this normal-state information in advance from the signal lines 811 to 814 output by the signal analysis device 801 and accumulate it. For this purpose, the normal model management device 803 may apply machine learning to the information input via the signal lines 811 to 814.

The abnormality detection mode input program 112a may be stored in the storage unit of the abnormality detection device 802 or in the storage unit of the normal model management device 803. Likewise, the abnormality detection mode input unit 112 may be provided in the abnormality detection device 802 or in the normal model management device 803. The abnormality detection mode information may be transmitted via the signal line 825 from either one of the abnormality detection device 802 and the normal model management device 803 to the other.

The signal analysis device 801, the abnormality detection device 802, and the normal model management device 803 may be connected by a network instead of the signal lines 811 to 814 and 821 to 825, and any two of the three devices may be integrated into a single device.

The signal analysis device 801 may also be split partway along the processing flow from the signal acquisition unit 101 to the sound source separation unit 107 and thus be composed of a plurality of devices, that is, a plurality of computers. Configuring the diagnostic system from a plurality of devices makes the hardware configuration flexible; for example, devices can be allocated according to the processing load of each unit.

Furthermore, there may be a plurality of signal analysis devices 801 and abnormality detection devices 802, and the normal-state information may be distributed from one normal model management device 803 to the plurality of abnormality detection devices 802 via the signal lines 821 to 824 or a network. In this way, even when there are multiple diagnosis targets and multiple signal analysis devices 801, the normal-state information can be managed in a unified manner.
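The one-to-many distribution of normal-state information can be pictured with the following sketch. It is an illustration only: the class and field names are hypothetical stand-ins for the normal model management device 803 and the abnormality detection devices 802, and the transport (signal lines or network) is replaced by a direct in-process assignment.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import numpy as np

@dataclass
class NormalModel:
    """The four kinds of normal-state information kept in the database."""
    input_spatial_corr: np.ndarray
    per_source_spatial_corr: np.ndarray
    feature_vectors: np.ndarray
    separated_power: np.ndarray

@dataclass
class AnomalyDetector:
    """Stand-in for one abnormality detection device 802."""
    name: str
    normal_model: Optional[NormalModel] = None

@dataclass
class NormalModelManager:
    """Stand-in for the single normal model management device 803."""
    model: NormalModel
    detectors: List[AnomalyDetector] = field(default_factory=list)

    def register(self, detector: AnomalyDetector) -> None:
        self.detectors.append(detector)

    def distribute(self) -> None:
        # Every registered detector receives the same normal-state model.
        for detector in self.detectors:
            detector.normal_model = self.model
```

Keeping a single source of normal-state information means that an update made at the manager reaches every detector on the next distribution.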

The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments have been described in detail to make the present invention easy to understand, and the invention is not necessarily limited to configurations that include all of the elements described.

Each configuration above has been described as implemented in software, that is, by the processor 121 executing programs stored in the storage unit 126. Some or all of them may instead be implemented in hardware, for example by designing them as integrated circuits.

The control lines and information lines (signal lines) shown are those considered necessary for the explanation; not all control lines and information lines of a product are necessarily shown. In practice, almost all components may be considered to be interconnected.

103 Input signal spatial correlation calculation unit
104 Sound source presence direction cluster estimation unit
105 Per-sound-source spatial correlation calculation unit
106 Sound source separation filter update unit
107 Sound source separation unit
108 Input signal spatial correlation abnormality detection unit
109 Per-sound-source spatial correlation abnormality detection unit
110 Sound source separation signal abnormality detection unit

Claims

A diagnostic device that performs diagnosis using sound, comprising:
A signal acquisition unit that acquires a sound signal, which is an electrical signal converted from sound, and outputs the sound signal;
A preprocessing unit that converts the sound signal output by the signal acquisition unit into a frequency domain signal;
A spatial correlation calculation unit that calculates a spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit;
A spatial correlation abnormality detection unit that determines an abnormality based on the spatial correlation matrix calculated by the spatial correlation calculation unit; and
An abnormality display unit that displays information on the abnormality based on the determination of the abnormality by the spatial correlation abnormality detection unit.

The diagnostic device according to claim 1, further comprising:
A per-sound-source processing unit that calculates a per-sound-source spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit;
A sound source processing unit that separates a sound source and generates a sound source separation signal based on the per-sound-source spatial correlation matrix calculated by the per-sound-source processing unit and the frequency domain signal converted by the preprocessing unit;
A per-sound-source spatial correlation abnormality detection unit that determines an abnormality based on the per-sound-source spatial correlation matrix calculated by the per-sound-source processing unit; and
A sound source separation signal abnormality detection unit that determines an abnormality based on the sound source separation signal generated by the sound source processing unit,
Wherein the abnormality display unit displays information on the abnormality based on, in addition to the determination of the abnormality by the spatial correlation abnormality detection unit, the determination of the abnormality by the per-sound-source spatial correlation abnormality detection unit and the determination of the abnormality by the sound source separation signal abnormality detection unit.

The diagnostic device according to claim 2, wherein
The sound source processing unit comprises:
A filter update unit that calculates a sound source separation filter based on the per-sound-source spatial correlation matrix calculated by the per-sound-source processing unit; and
A sound source separation unit that applies the sound source separation filter calculated by the filter update unit to the frequency domain signal converted by the preprocessing unit, and separates the sound source to generate the sound source separation signal.

The diagnostic device according to claim 3, wherein
The per-sound-source processing unit comprises:
A direction cluster estimation unit that estimates a sound source presence direction cluster based on the spatial correlation matrix calculated by the spatial correlation calculation unit; and
A per-sound-source spatial correlation calculation unit that calculates the per-sound-source spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit and the sound source presence direction cluster estimated by the direction cluster estimation unit.

The diagnostic device according to claim 4,
Further comprising an input unit,
Wherein the abnormality display unit,
When information indicating that the noise source does not move and that the target sound source does not move in the normal state is obtained from the input unit, displays information on the abnormality based on the determination of the abnormality by the spatial correlation abnormality detection unit, the determination of the abnormality by the per-sound-source spatial correlation abnormality detection unit, and the determination of the abnormality by the sound source separation signal abnormality detection unit.

The diagnostic device according to claim 3, wherein
The per-sound-source processing unit comprises:
A power calculation unit that calculates, for each frequency, the power of the frequency component in each direction based on the frequency domain signal converted by the preprocessing unit;
A power signal separation unit that separates the power calculated by the power calculation unit so as to remove a basis learned in the normal state;
A direction cluster estimation unit that estimates a sound source presence direction cluster based on the power calculated by the power calculation unit; and
A per-sound-source spatial correlation calculation unit that calculates the per-sound-source spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit and the sound source presence direction cluster estimated by the direction cluster estimation unit,
The diagnostic device further comprises
A power abnormality detection unit that determines an abnormality based on the power separated by the power signal separation unit, and
The abnormality display unit
Displays information on the abnormality based on the determination of the abnormality by the spatial correlation abnormality detection unit, the determination of the abnormality by the per-sound-source spatial correlation abnormality detection unit, the determination of the abnormality by the sound source separation signal abnormality detection unit, and the determination of the abnormality by the power abnormality detection unit.

The diagnostic device according to claim 6,
Further comprising an input unit,
Wherein the abnormality display unit,
When information indicating that the noise source does not move and that the target sound source does not move in the normal state is obtained from the input unit, displays information on the abnormality based on the determination of the abnormality by the spatial correlation abnormality detection unit, the determination of the abnormality by the per-sound-source spatial correlation abnormality detection unit, the determination of the abnormality by the sound source separation signal abnormality detection unit, and the determination of the abnormality by the power abnormality detection unit.

A diagnostic method in which a computer performs diagnosis using sound, wherein
The computer comprises:
A storage unit in which a program is stored; and
A processor that executes the program stored in the storage unit, and
The processor:
Acquires a sound signal, which is an electrical signal converted from sound;
Converts the acquired sound signal into a frequency domain signal;
Calculates a spatial correlation matrix based on the converted frequency domain signal;
Calculates a per-sound-source spatial correlation matrix based on the converted frequency domain signal;
Separates a sound source and generates a sound source separation signal based on the calculated per-sound-source spatial correlation matrix and the converted frequency domain signal;
Determines an abnormality based on the calculated spatial correlation matrix;
Determines an abnormality based on the calculated per-sound-source spatial correlation matrix;
Determines an abnormality based on the generated sound source separation signal; and
Displays information on the abnormality according to the determination of the abnormality based on the spatial correlation matrix, the determination of the abnormality based on the per-sound-source spatial correlation matrix, and the determination of the abnormality based on the sound source separation signal.

The diagnostic method according to claim 8, wherein
The processor:
Calculates the per-sound-source spatial correlation matrix by estimating a sound source presence direction cluster based on the calculated spatial correlation matrix and calculating the per-sound-source spatial correlation matrix based on the converted frequency domain signal and the estimated sound source presence direction cluster; and
Separates the sound source by calculating a sound source separation filter based on the calculated per-sound-source spatial correlation matrix, applying the calculated sound source separation filter to the converted frequency domain signal, and separating the sound source to generate the sound source separation signal.

The diagnostic method according to claim 9, wherein
The processor,
When information indicating that the noise source does not move and that the target sound source does not move in the normal state is obtained, displays information on the abnormality according to the determination of the abnormality based on the spatial correlation matrix, the determination of the abnormality based on the per-sound-source spatial correlation matrix, and the determination of the abnormality based on the sound source separation signal.

The diagnostic method according to claim 8, wherein
The processor:
Calculates, for each frequency, the power of the frequency component in each direction based on the converted frequency domain signal; separates the calculated power so as to remove a basis learned in the normal state; and calculates the per-sound-source spatial correlation matrix by estimating a sound source presence direction cluster based on the calculated power and calculating the per-sound-source spatial correlation matrix based on the converted frequency domain signal and the estimated sound source presence direction cluster;
Determines an abnormality based on the separated power; and
Displays information on the abnormality according to the determination of the abnormality based on the spatial correlation matrix, the determination of the abnormality based on the per-sound-source spatial correlation matrix, the determination of the abnormality based on the sound source separation signal, and the determination of the abnormality based on the power.

The diagnostic method according to claim 11, wherein
The processor,
When information indicating that the noise source does not move and that the target sound source does not move in the normal state is obtained, displays information on the abnormality according to the determination of the abnormality based on the spatial correlation matrix, the determination of the abnormality based on the per-sound-source spatial correlation matrix, the determination of the abnormality based on the sound source separation signal, and the determination of the abnormality based on the power.

A diagnostic system that includes a plurality of computers and performs diagnosis using sound, wherein
A first computer of the plurality of computers comprises:
A signal acquisition unit that acquires a sound signal, which is an electrical signal converted from sound, and outputs the sound signal;
A preprocessing unit that converts the sound signal output by the signal acquisition unit into a frequency domain signal;
A spatial correlation calculation unit that calculates a spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit;
A per-sound-source processing unit that calculates a per-sound-source spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit; and
A sound source processing unit that separates a sound source and generates a sound source separation signal based on the per-sound-source spatial correlation matrix calculated by the per-sound-source processing unit and the frequency domain signal converted by the preprocessing unit,
And transmits the spatial correlation matrix calculated by the spatial correlation calculation unit, the per-sound-source spatial correlation matrix calculated by the per-sound-source processing unit, and the sound source separation signal generated by the sound source processing unit, and
A second computer of the plurality of computers comprises:
A spatial correlation abnormality detection unit that determines an abnormality based on the spatial correlation matrix received from the first computer;
A per-sound-source spatial correlation abnormality detection unit that determines an abnormality based on the per-sound-source spatial correlation matrix received from the first computer;
A sound source separation signal abnormality detection unit that determines an abnormality based on the sound source separation signal received from the first computer; and
An abnormality display unit that displays information on the abnormality based on the determination of the abnormality by the spatial correlation abnormality detection unit, the determination of the abnormality by the per-sound-source spatial correlation abnormality detection unit, and the determination of the abnormality by the sound source separation signal abnormality detection unit.

The diagnostic system according to claim 13, wherein
The per-sound-source processing unit of the first computer comprises:
A direction cluster estimation unit that estimates a sound source presence direction cluster based on the spatial correlation matrix calculated by the spatial correlation calculation unit; and
A per-sound-source spatial correlation calculation unit that calculates the per-sound-source spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit and the sound source presence direction cluster estimated by the direction cluster estimation unit, and
The sound source processing unit of the first computer comprises:
A filter update unit that calculates a sound source separation filter based on the per-sound-source spatial correlation matrix calculated by the per-sound-source spatial correlation calculation unit; and
A sound source separation unit that applies the sound source separation filter calculated by the filter update unit to the frequency domain signal converted by the preprocessing unit, and separates the sound source to generate the sound source separation signal.

The diagnostic system according to claim 13, wherein
The per-sound-source processing unit of the first computer comprises:
A power calculation unit that calculates, for each frequency, the power of the frequency component in each direction based on the frequency domain signal converted by the preprocessing unit;
A power signal separation unit that separates the power calculated by the power calculation unit so as to remove a basis learned in the normal state;
A direction cluster estimation unit that estimates a sound source presence direction cluster based on the power calculated by the power calculation unit; and
A per-sound-source spatial correlation calculation unit that calculates the per-sound-source spatial correlation matrix based on the frequency domain signal converted by the preprocessing unit and the sound source presence direction cluster estimated by the direction cluster estimation unit,
The sound source processing unit of the first computer comprises:
A filter update unit that calculates a sound source separation filter based on the per-sound-source spatial correlation matrix calculated by the per-sound-source spatial correlation calculation unit; and
A sound source separation unit that applies the sound source separation filter calculated by the filter update unit to the frequency domain signal converted by the preprocessing unit, and separates the sound source to generate the sound source separation signal,
The first computer
Transmits the power separated by the power signal separation unit,
The second computer
Further comprises a power abnormality detection unit that determines an abnormality based on the power received from the first computer, and
The abnormality display unit of the second computer
Displays information on the abnormality based on the determination of the abnormality by the spatial correlation abnormality detection unit, the determination of the abnormality by the per-sound-source spatial correlation abnormality detection unit, the determination of the abnormality by the sound source separation signal abnormality detection unit, and the determination of the abnormality by the power abnormality detection unit.