
CN111914875A - Fault early warning method of rotating machinery based on Bayesian LSTM model - Google Patents

Fault early warning method of rotating machinery based on Bayesian LSTM model

Info

Publication number
CN111914875A
Authority
CN
China
Prior art keywords
model
data
signal
lstm
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010520887.2A
Other languages
Chinese (zh)
Inventor
游东东
黎家良
沈小成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202010520887.2A
Publication of CN111914875A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention discloses a fault early warning method for rotating machinery based on a Bayesian LSTM model. The method mainly targets fault early warning for large rotating machinery. The monitoring data are integrated and analyzed, then denoised, reduced in dimension and reconstructed using the discrete wavelet packet transform, probabilistic principal component analysis (PPCA) and the C-C algorithm to obtain a well-conditioned data set. A data prediction model is then built with an LSTM recurrent neural network, and a Bayesian hypothesis test is used to compute the confidence of the predictions and to output the time at which an abnormal condition occurs, which achieves the required fault early warning. The invention quantitatively analyzes the operating data of mechanical equipment and integrates several advanced intelligent monitoring and diagnosis techniques, which improves the reliability and robustness of the prediction. The invention provides a new and feasible method for fault detection and early warning.

Figure 202010520887

Description

Fault early warning method of rotating machinery based on Bayesian LSTM model
Technical Field
The invention belongs to the field of fault early warning of large-scale rotating machinery equipment, and relates to a fault early warning method of rotating machinery based on a Bayesian LSTM model.
Background
Data preprocessing improves data quality, reduces the influence of bad data on the analysis result, and makes the data easier to analyze and mine. It consists of identifying bad data and integrating and transforming the data, and it directly affects the accuracy of subsequent modeling and prediction. Lee et al. highlighted the key role of signal processing techniques in the diagnosis and prognosis of rotating machines (Lee, Jay; Wu, Fangji; Zhao, Wenyu; Ghaffari, Masoud; Liao, Linxia; Siegel, David. Prognostics and health management design for rotary machinery systems - Reviews, methodology and applications. Mechanical Systems and Signal Processing, 2014, 42(1-2): 314-334. doi:10.1016/j.ymssp.2013.06.004). The wavelet transform is a common method and has been applied to monitoring data from nuclear power plants. For example, Upadhyaya et al. enhanced sensor measurements by using wavelet transforms to filter the low- and high-frequency components of nuclear power plant sensor signals, which conditions the signal with minimal distortion of its bandwidth and provides an efficient preprocessing method (Upadhyaya, Belle R.; Mehta, Chaitanya; Bayram, Duygu. Integration of Time Series Modeling and Wavelet Transform for Monitoring Nuclear Plant Sensors. IEEE Transactions on Nuclear Science, 2014, 61(5): 2628-2635. doi:10.1109/TNS.2014.2341035). However, the wavelet transform further decomposes only the low-pass filtered result, whereas the discrete wavelet packet transform is a finer signal analysis method that also decomposes the high-frequency part.
To simplify a problem, the number of variables is reduced so that fewer variables represent most of the information. Principal component analysis (PCA) is a linear dimensionality reduction method that is widely used in pattern recognition, feature extraction and related fields. For example, a fault detection and diagnosis (FDD) framework was built for a pressurized water reactor in a nuclear power plant (NPP): PCA was used to remove the information from faulty sensors, and fuzzy theory was combined with data fusion to fuse the data of several sensors onto one node (Guohua Wu, Jeijua Tong, Liguo Zhuang, Yunfei Zhuao, Zhuhing Duan. Framework for fault diagnosis with multi-source sensor nodes in nuclear power plants based on a Bayesian network. Annals of Nuclear Energy, 2018, 122: 297-308. https://doi.org/10.1016/j.anucene.2018.08.08.). Probabilistic PCA (PPCA) is derived from conventional PCA and overcomes its limitation of simply discarding the non-principal components. In PPCA, the discarded information is treated as an estimate of Gaussian noise, which preserves as much of the useful information in the original signal as possible.
Deep learning has recently been applied more and more to fault diagnosis and health management of mechanical equipment, and recurrent neural networks (RNNs) work particularly well on sequence data. The long short-term memory (LSTM) network is an RNN variant that effectively alleviates the exploding-gradient and vanishing-gradient problems that RNNs encounter on long time series. When processing long sequences, the LSTM therefore handles long-term dependencies well, is more sensitive to deep features of long-term historical data, captures the key information hidden in that data more easily, suits big-data processing, and improves prediction accuracy. For example, Yang et al. proposed a fault detection and isolation method for electro-mechanical actuators that uses an LSTM neural network to model the sensor data and processes time series data effectively (Yang, Jing; Guo, Yingqing; Zhao, Wanli. Long short-term memory network based fault detection and isolation for electric-mechanical actuators. Neurocomputing, 2019, 360: 85-96. doi:10.1016/j.neucom.2019.06.029).
The accuracy of a neural network model is usually judged by two indexes, the mean square error (MSE) and the coefficient of determination $R^2$. The MSE indicates how far the predicted result deviates from the actual result, but because sample sets differ in scale and units, its value is hard to interpret on its own. $R^2$ expresses how much better the model is than simply predicting the mean; it normally lies between 0 and 1, and the closer it is to 1, the better the model. Research shows that Bayesian methods for quantitative model validation leave considerable room for exploration, so the invention introduces a Bayesian hypothesis testing method that quantifies the reliability of the model while also taking the prior information of the training set into account. For example, Jiang et al. studied the calculation of the Bayes factor expression, which facilitates an overall reliability assessment in model validation (Jiang X M, Mahadevan S. Bayesian structural evaluation modeling for the systematic uncertain qualification. International Journal for Numerical Methods in Engineering, 2009, 80(6): 717-737.).
While studying fault early warning for large rotating machinery, the invention proposes a deep-learning-based fault early warning method that processes and analyzes real-time monitoring data and trains a model on it, and that establishes an LSTM neural network prediction model accounting for data uncertainty together with a quantitative reliability test based on the Bayesian λ value.
Disclosure of Invention
To address the shortcomings of the prior art, the invention combines advanced signal processing, pattern recognition, intelligent algorithms and probabilistic decision methods into a rotating machinery fault early warning method based on a Bayesian LSTM model. Multiple parameter data share one algorithm framework, which makes better use of the correlation between data, improves efficiency, and allows fault information to be judged and analyzed more accurately. The method first processes the data with discrete wavelet packet threshold denoising and probabilistic principal component analysis. Next, it performs phase space reconstruction on the data set with the C-C algorithm, obtaining a reconstructed data set from the embedding dimension and the optimal time delay. Then one part of the data set is used as a training set to train the LSTM model and another part as a verification set to verify the reliability of the model, for which a Bayesian model reliability test is proposed. Finally, a criterion for deciding whether to issue an early warning is given on the basis of the prior information of the historical data set, so that the model can raise an alarm accurately when a unit fault is developing slowly or when the monitoring system itself has a problem.
The invention is realized by at least one of the following technical solutions.
A fault early warning method of a rotating machine based on a Bayesian LSTM model comprises the following steps:
S1, reading n groups of time-series signal data and arranging them into a p-dimensional matrix $X_{p \times n}$ for convenient processing, which gives noisy, anomaly-free data; decomposing the signals by the discrete wavelet packet threshold denoising method, filtering the wavelet coefficients and reconstructing the signals to remove the noise effectively;
S2, for p-dimensional sample data, a single dimension may not represent the equipment state on its own and the dimensions may be correlated with each other; to reduce the dimension and improve efficiency, probabilistic principal component analysis (PPCA) is performed on the sample data, reducing it to the q-dimensional $X_{q \times n}$;
S3, performing phase space reconstruction on the dimension-reduced signal data by using the C-C algorithm;
S4, dividing the sample data proportionally into training, verification and test data for training, verifying and testing the prediction model; constructing an LSTM (long short-term memory) recurrent neural network with the sigmoid function as the activation function and the back propagation algorithm as the training algorithm to obtain the optimal parameters of the model; verifying the model accuracy with the verification samples and judging whether it meets the application requirement, and continuing training if it does not; if the model predicts correctly, applying the model to the test samples, computing the confidence of the predicted data with the Bayesian hypothesis reliability test, and finally giving a criterion for whether to issue an early warning on the basis of the prior information of the historical data set, so that the model can raise an alarm accurately when a unit fault is developing slowly or when the monitoring system has a problem.
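The proportional split in step S4 can be sketched as follows. This is a minimal illustration only: the function name `chronological_split` and the default ratios are assumptions, since the text does not state the proportions, and the split keeps the time order because the samples come from a time series.

```python
def chronological_split(samples, ratios=(0.7, 0.15, 0.15)):
    """Split an ordered list into training, verification and test parts.

    The ratios are hypothetical; the order of the samples is preserved
    so that later data never leaks into the training part.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    n = len(samples)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]  # remainder goes to the test part
    return train, val, test
```

For example, `chronological_split(list(range(20)), (0.5, 0.25, 0.25))` returns three consecutive blocks of the original sequence.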
Further, in step S1, the signal data that is read is the original signal, that is, the signal collected by the sensor without any processing. Data recorded by a real-time monitoring system usually contains noise, and without denoising this noise biases the subsequent signal analysis and can make the result meaningless; therefore, the discrete wavelet packet threshold denoising method is used to denoise the original signal, specifically as follows:
after a noisy signal is decomposed over several levels, the energy of the signal is concentrated mainly in a subset of the wavelet packet decomposition coefficients, whereas the noise energy is spread over the coefficients of the whole wavelet domain, so the wavelet packet coefficients of the signal have larger amplitudes than those of the noise; a suitable threshold is set to keep the coefficients produced by the signal and filter out those produced by noise, and wavelet reconstruction then yields the denoised signal, so the choice of threshold has a decisive influence on the effect of wavelet packet threshold denoising;
the threshold is determined by a Bayesian estimation method, which exploits prior information and sets the threshold with respect to a risk function, effectively improving the denoising effect;
in discrete wavelet packet transform, the basic wavelet function is typically:
Figure BDA0002526260210000031
where t is the continuous time variable; one index of the wavelet is the time (translation) index and the other is the frequency (scale) index, both taking values in Z, the set of all integers; $L^2(R)$ denotes the Hilbert space of square-integrable functions; and ψ is the basic wavelet function;
in practical applications of the discrete wavelet packet technique, a signal f (t) with n discrete data points is wavelet decomposed:
Figure BDA0002526260210000041
the wavelet series $f_j(t)$ is further decomposed into wavelet packet components:
Figure BDA0002526260210000042
where $w_n$ denotes the family of wavelet packet functions indexed by n, n = 0, 1, 2, …; k is the number of further decomposition levels, 0 ≤ k ≤ j; and the coefficient
Figure BDA0002526260210000043
is called the orthogonal wavelet packet decomposition coefficient of f(t) at resolution j − k; $w_n(t)$ satisfies the two-scale equation:
Figure BDA0002526260210000044
where $\{h_n\}_{n \in Z}$ is a conjugate quadrature mirror filter satisfying
Figure BDA0002526260210000045
$g_k = (-1)^k h_{1-k}$, and $\delta_{k,l}$ is the Kronecker function, which satisfies
Figure BDA0002526260210000046
k,l∈Z;
When n = 0, $w_0(t)$ is the scaling function, and when n = 1, $w_1(t)$ is the wavelet function ψ(t); in this case, equation (1) can be expressed as:
Figure BDA0002526260210000047
From the wavelet packet decomposition coefficient
Figure BDA0002526260210000048
the next-level decomposition coefficient
Figure BDA0002526260210000049
is obtained by the recurrence formula:
Figure BDA00025262602100000410
if the initial value of the wavelet packet decomposition coefficient is the discrete data, the reconstruction algorithm is as follows:
Figure BDA00025262602100000411
The noisy signal is decomposed with formula (6) to obtain the wavelet packet decomposition coefficients at every node of the binary tree of a 3-level wavelet packet decomposition; the coefficients are filtered according to the threshold, and the signal is then reconstructed with formula (7).
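The decompose, filter and reconstruct cycle described above can be sketched in a few lines. This is an illustrative sketch, not the patented implementation: it assumes the Haar wavelet (the text does not name a wavelet basis), a signal length divisible by 2 to the power of the number of levels, a single fixed threshold, and hypothetical function names.

```python
import math

def haar_split(x):
    """One Haar analysis step: return the (low-pass, high-pass) halves."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_merge(approx, detail):
    """Inverse of haar_split."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def wp_decompose(x, levels=3):
    """Wavelet packet tree: every node is split, not only the low-pass branch."""
    nodes = [list(x)]
    for _ in range(levels):
        nxt = []
        for node in nodes:
            nxt.extend(haar_split(node))
        nodes = nxt
    return nodes  # 2**levels leaf nodes

def wp_reconstruct(nodes):
    """Merge sibling leaves pairwise until the root signal is recovered."""
    while len(nodes) > 1:
        nodes = [haar_merge(nodes[i], nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

def soft(c, t):
    """Soft threshold: shrink the magnitude of c by t, never past zero."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def wp_denoise(x, threshold, levels=3):
    leaves = wp_decompose(x, levels)
    # Assumption of this sketch: the all-low-pass leaf is left untouched.
    shrunk = [leaves[0]] + [[soft(c, threshold) for c in leaf] for leaf in leaves[1:]]
    return wp_reconstruct(shrunk)
```

With a threshold of zero the signal is returned unchanged, which is a quick check that decomposition and reconstruction are exact inverses.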
Further, the threshold is determined by using a bayesian estimation method, which includes:
the threshold is determined with the adaptive BayesShrink threshold estimation method, calculated as:
$T = \sigma^2 / \sigma_x$; (8)
where $\sigma^2$ is the noise variance and $\sigma_x$ is the standard deviation of the original signal ($\sigma_x^2$ being its variance);
the noisy signal is:
$f(t) = g(t) + \varepsilon(t)$; (9)
where g(t) is the original signal and ε(t) is the noise signal;
the wavelet packet decomposition coefficient d obtained by applying the discrete wavelet packet transform to the signal is written as:
$d = d_x + d_\varepsilon$; (10)
where $d_x$ is the wavelet coefficient of the original signal and $d_\varepsilon$ is the wavelet coefficient of the noise;
the noise standard deviation was estimated as:
Figure BDA0002526260210000051
wherein,
Figure BDA0002526260210000052
that is, the wavelet coefficients of each node in the binary tree are averaged; with $\sigma_y^2$ denoting the variance of the noisy signal:
Figure BDA0002526260210000053
the raw signal variance is estimated as:
Figure BDA0002526260210000054
this gives a Bayes threshold that adapts to each scale:
Figure BDA0002526260210000055
The wavelet coefficients can be filtered by selecting a soft threshold function:
Figure BDA0002526260210000056
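A minimal sketch of the threshold estimate and soft thresholding follows. The formulas in the code are the commonly used BayesShrink forms (median-based noise estimate, clipped signal variance); they are assumptions of this sketch and may differ in detail from the formulas of the original filing, which are not legible here. The function names are hypothetical.

```python
import math
import statistics

def bayes_shrink_threshold(coeffs):
    """Return T = sigma_noise**2 / sigma_signal for one node's coefficients."""
    # Robust noise estimate (assumed form): median(|d|) / 0.6745.
    sigma_noise = statistics.median([abs(c) for c in coeffs]) / 0.6745
    # Variance of the noisy coefficients, assuming zero mean.
    var_noisy = sum(c * c for c in coeffs) / len(coeffs)
    # Signal standard deviation, clipped at zero.
    sigma_signal = math.sqrt(max(var_noisy - sigma_noise ** 2, 0.0))
    if sigma_signal == 0.0:
        # Everything looks like noise: threshold at the largest magnitude.
        return max(abs(c) for c in coeffs)
    return sigma_noise ** 2 / sigma_signal

def soft_threshold(c, t):
    """Shrink the magnitude of c by t; coefficients below t become zero."""
    return math.copysign(max(abs(c) - t, 0.0), c)
```

Because the threshold is computed per node, nodes dominated by noise receive a large threshold while nodes dominated by signal receive a small one.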
Further, in step S2, probabilistic principal component analysis (PPCA) is performed on the denoised signal to reduce the dimension of the multidimensional data and to handle the uncertainty in the data; PPCA defines a proper probability model for principal component analysis (PCA), which overcomes the limitation that conventional PCA simply discards the non-principal components; in PPCA the discarded information is treated as an estimate of Gaussian noise, so the useful information in the original signal is retained to the greatest extent;
given a p-dimensional sample $X_{p \times n}$, PPCA assumes that the sample can be expressed as $X = Wz + \mu + \varepsilon$, where W is the weight matrix of dimension p × q with q ≤ p; z is a q × n random Gaussian vector with $z \sim N(0, I_q)$, also called the dimension-reduced result of X; μ is the sample mean; and ε is noise, assumed to follow a Gaussian distribution with variance $\sigma^2$:
Figure BDA0002526260210000061
substituting into Bayes' formula gives the posterior probability of z, $P(z \mid X) \sim N(M^{-1} W^T (X - \mu), \sigma^2 M^{-1})$, where $M = \sigma^2 I_q + W^T W$; in Bayesian probabilistic principal component analysis, the posterior expectation of z, $M^{-1} W^T (X - \mu)$, is taken as the dimension-reduced result of X; the only unknowns in this expression are W and σ, which are estimated by maximum likelihood, with the result:
Figure BDA0002526260210000062
Figure BDA0002526260210000063
where $\lambda_j$ is obtained from the eigenvalue decomposition of the covariance matrix C of the sample X, i.e. $C v_j = \lambda_j v_j$, $v_j$ is the corresponding eigenvector, $U_q = (v_1, v_2, \dots, v_q)$, and $\Delta_q = \mathrm{diag}(\lambda_1, \dots, \lambda_q)$;
When q = p, the equation $W^{-1} X = z + W^{-1}\mu + W^{-1}\varepsilon$ holds, and $W^{-1}$ can be used to trace back from a principal component to the signals for preliminary fault diagnosis; $W^{-1}$ is expressed as:
Figure BDA0002526260210000064
where $w_{12}$ represents the contribution rate of the second-dimension signal of the data sample $X_{p \times n}$ to the first principal component; in the same way, the degree of influence of every signal on the first principal component can be obtained; when a principal component is abnormal, the signal with the highest contribution rate is the most likely to be faulty and is diagnosed first, so the problem can be found quickly and effectively.
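For reference, the widely cited closed-form maximum likelihood estimates for PPCA (due to Tipping and Bishop) are given below in the notation of this section. They are stated here as the standard result and are assumed, not confirmed, to correspond to the estimates of the original filing; R is an arbitrary q × q orthogonal matrix, often taken as the identity.

```latex
\sigma^2_{ML} = \frac{1}{p - q} \sum_{j=q+1}^{p} \lambda_j ,
\qquad
W_{ML} = U_q \left( \Delta_q - \sigma^2_{ML} I_q \right)^{1/2} R
```

The noise variance is thus the average of the p − q discarded eigenvalues, which is the precise sense in which the discarded components are treated as Gaussian noise.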
Further, in step S3, because artificial neural networks have strong nonlinear mapping capability, they are often combined with the C-C algorithm: in the reconstructed phase space, the neural network approximates the mapping between the current state and the future state in order to predict the chaotic time series, and the input and output of the neural network are determined by the embedding dimension m and the delay time τ of the reconstruction; the delay time τ and the embedding dimension m can be estimated simultaneously using the correlation integral, as follows:
first construct disjoint time series matrices:
Figure BDA0002526260210000065
assume that the maximum value of the delay time τ is T, with the candidate delay t a natural number in [1, T], k = INT(N/t), and N the length of the time series data set; each row of the time series matrix is called a sub-time-series vector; the correlation integral of each subsequence is defined as:
Figure BDA0002526260210000071
where r > 0; M is the number of phase space points, M = N − (m − 1)t; the embedding dimension m may be 2, 3, 4 or 5; and $d_{bc} = \lVert Y_b - Y_c \rVert$
Figure BDA0002526260210000072
The correlation integral is a cumulative distribution function that gives the probability that the distance between any two points in the phase space is less than the radius r, where the distance between points is the infinity norm of the difference of the vectors; the test statistic is defined as:
$S(m, N, r, t) = C(m, N, r, t) - C^m(1, N, r, t)$; (21)
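The correlation integral and the statistic of formula (21) can be sketched directly from their definitions. This is a brute-force illustration with hypothetical function names; it counts pairs whose infinity-norm distance is strictly below r and does not include the subsequence averaging of the full C-C method.

```python
def embed(series, m, t):
    """Delay vectors Y_i = (x_i, x_{i+t}, ..., x_{i+(m-1)t})."""
    count = len(series) - (m - 1) * t  # M = N - (m - 1) t
    return [[series[i + j * t] for j in range(m)] for i in range(count)]

def correlation_integral(series, m, r, t):
    """Fraction of point pairs whose infinity-norm distance is below r."""
    pts = embed(series, m, t)
    n_pts = len(pts)
    close = 0
    for b in range(n_pts):
        for c in range(b + 1, n_pts):
            dist = max(abs(u - v) for u, v in zip(pts[b], pts[c]))
            if dist < r:
                close += 1
    return 2.0 * close / (n_pts * (n_pts - 1))

def s_statistic(series, m, r, t):
    """S(m, N, r, t) = C(m, N, r, t) - C(1, N, r, t)**m."""
    return correlation_integral(series, m, r, t) - correlation_integral(series, 1, r, t) ** m
```

The pairwise loop is quadratic in the number of points, so a practical implementation would vectorize it or subsample long series.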
the delay time τ may be obtained by the first local minimum of the following equation:
Figure BDA0002526260210000073
where ΔS(m, t, N) is the difference between the maximum and the minimum of the statistic S over the radii $r_j$:
$\Delta S(m, t, N) = \max_j\{S(m, N, r_j, t)\} - \min_j\{S(m, N, r_j, t)\}$; (23)
while the delay window $\tau_w = (m - 1)\tau$ can be obtained from the minimum of the following formula:
Figure BDA0002526260210000074
this gives the embedding dimension m, and the time series is finally reconstructed to obtain the reconstruction matrix F of formula (19); the elements of each row of the matrix are taken as the input nodes of the neural network, and the value τ time steps after the last element of that row is taken as the output node; this constructs the input and output layers, completes the phase space reconstruction, and yields the reconstructed data, i.e. the sample data.
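The construction of input and output nodes from the reconstructed phase space can be sketched as below. The function name `make_samples` is hypothetical; the code assumes a one-dimensional series and follows the description literally: m inputs spaced τ apart, and as target the value τ steps after the last input.

```python
def make_samples(series, m, tau):
    """Build (input vector, target) pairs from a delay embedding.

    Each input is (x_i, x_{i+tau}, ..., x_{i+(m-1)tau}); the target is
    the value tau steps after the last element of that input.
    """
    inputs, targets = [], []
    last = len(series) - (m - 1) * tau - tau  # number of usable start indices
    for i in range(last):
        inputs.append([series[i + j * tau] for j in range(m)])
        targets.append(series[i + (m - 1) * tau + tau])
    return inputs, targets
```

For a series of length 10 with m = 3 and τ = 2, this yields four samples, the first being the input `[x0, x2, x4]` with target `x6`.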
Further, step S4 specifically includes the following steps:
S4.1, constructing an LSTM prediction model;
S4.2, training the LSTM prediction model;
S4.3, carrying out reliability evaluation on the LSTM model;
S4.4, issuing an early warning by setting a threshold on the basis of the prior information of the historical data set.
Further, in step S4.1, the data set obtained after phase space reconstruction is divided into a training set, a verification set and a test set; the training set is used to build the prediction model, which is constructed with the LSTM recurrent neural network method; the embedding dimension m obtained by the C-C algorithm is the number of input units, and the delay time τ is the spacing between input points;
each LSTM unit contains 3 gates: a forget gate, an input gate and an output gate;
in the LSTM model used, the forget gate discards useless information; the output vector of the previous time step $h_{t-1}$ and the input vector of the current time step $x_t$ are fed in together, and irrelevant information is discarded; the forget gate determines how much of the cell state of the previous time step $c_{t-1}$ is kept in the current cell state $c_t$; the forget gate is computed as:
$f_t = \sigma(W_{xf} \cdot [h_{t-1}, x_t] + b_f)$; (25)
where $W_{xf}$ and $b_f$ are the weight matrix and bias term of the forget gate, σ is the sigmoid function, and $[h_{t-1}, x_t]$ denotes the concatenation of two vectors into one longer vector;
the input gate controls what information enters the network, and updating the cell state takes two parts working together: the input gate first decides which information should be updated, and a tanh layer then generates a new candidate vector; the input gate determines how much of the network input $x_t$ at the current time is saved to the cell state $c_t$; $i_t$ denotes the input gate, $l_t$ the candidate cell state computed from the current input, and $c_t$ the cell state at the current time:
$i_t = \sigma(W_{xi} \cdot [h_{t-1}, x_t] + b_i)$; (26)
$l_t = \tanh(W_{xl} \cdot [h_{t-1}, x_t] + b_l)$; (27)
$c_t = f_t \circ c_{t-1} + i_t \circ l_t$; (28)
where $W_{xi}$ and $b_i$ are the weight matrix and bias term of the input gate, $W_{xl}$ and $b_l$ are the weight matrix and bias term of the cell state, and ∘ denotes element-wise multiplication;
the output gate controls the output of the LSTM prediction model: it determines how much of the cell state $c_t$ is passed to the current output value $h_t$ of the LSTM; $o_t$ denotes the output gate value:
$o_t = \sigma(W_{xo} \cdot [h_{t-1}, x_t] + b_o)$; (29)
$h_t = o_t \circ \tanh(c_t)$; (30)
where $W_{xo}$ and $b_o$ are the weight matrix and bias term of the output gate; the final output value for each training sample is:
$y_t = W_{hy} \cdot h_t$; (31)
where $W_{hy}$ is the weight matrix of the final LSTM model output.
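One forward step of the cell described above can be written out in plain Python. This is a minimal sketch for illustration: the dictionary keys mirror the weight and bias names of this section, the cell-state and hidden-output updates use the standard LSTM form, and the helper names `sigmoid`, `affine` and `lstm_step` are hypothetical. A practical model would use a deep learning library instead.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, v, b):
    """Return W v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM forward step; p maps parameter names to lists."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f = [sigmoid(v) for v in affine(p["Wxf"], z, p["bf"])]    # forget gate
    i = [sigmoid(v) for v in affine(p["Wxi"], z, p["bi"])]    # input gate
    l = [math.tanh(v) for v in affine(p["Wxl"], z, p["bl"])]  # candidate state
    # Cell state: keep part of the old state, add part of the candidate.
    c = [fk * ck + ik * lk for fk, ck, ik, lk in zip(f, c_prev, i, l)]
    o = [sigmoid(v) for v in affine(p["Wxo"], z, p["bo"])]    # output gate
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]          # hidden output
    return h, c
```

With all weights and biases set to zero, every gate equals 0.5 and the candidate is zero, so the new cell state is exactly half of the previous one, which is a convenient sanity check.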
Further, in step S4.2, the back propagation algorithm is used as the LSTM training algorithm; the LSTM has 8 groups of parameters to learn: the weight matrix $W_{xf}$ and bias term $b_f$ of the forget gate, the weight matrix $W_{xi}$ and bias term $b_i$ of the input gate, the weight matrix $W_{xo}$ and bias term $b_o$ of the output gate, and the weight matrix $W_{xl}$ and bias term $b_l$ used to compute the cell state; because the two parts of each weight matrix use different formulas in back propagation, in the subsequent derivation the weight matrices $W_{xf}$, $W_{xi}$, $W_{xo}$ and $W_{xl}$ are each written as two separate matrices: $W_{fh}$, $W_{fx}$, $W_{ih}$, $W_{ix}$, $W_{oh}$, $W_{ox}$, $W_{lh}$, $W_{lx}$;
In the following formulas, unless otherwise specified, E denotes the loss function; the LSTM has four weighted inputs, corresponding to $f_t$, $i_t$, $c_t$ and $o_t$, through which the error term is passed back; step S4.2 proceeds as follows:
S4.2.1, the error term is propagated backwards through time; the error term passed back to an arbitrary earlier time k is given by:
Figure BDA0002526260210000091
S4.2.2, pass the error term to the previous layer:
Figure BDA0002526260210000092
S4.2.3, calculating the weight gradients and bias term gradients:
the error terms $\delta_{o,t}$, $\delta_{f,t}$, $\delta_{i,t}$ and $\delta_{l,t}$ have been found above, so the gradients of $W_{oh}$, $W_{ih}$, $W_{fh}$ and $W_{lh}$ at time t can be determined; the final gradient is the sum of the gradients over all time steps:
Figure BDA0002526260210000093
Figure BDA0002526260210000094
Figure BDA0002526260210000095
Figure BDA00025262602100000913
the gradients of the bias terms $b_f$, $b_i$, $b_l$ and $b_o$ are handled like the weight gradients: the final bias gradient is the sum of the bias gradients over all time steps, as follows:
Figure BDA0002526260210000096
Figure BDA0002526260210000097
Figure BDA0002526260210000098
Figure BDA0002526260210000099
for $W_{fx}$, $W_{ix}$, $W_{lx}$ and $W_{ox}$, the weight gradients can be calculated directly from the corresponding error terms:
Figure BDA00025262602100000910
Figure BDA00025262602100000911
Figure BDA00025262602100000912
Figure BDA0002526260210000101
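For orientation, the gradients described in step S4.2.3 take the following form in the standard back-propagation-through-time derivation, shown here for the output gate only; the forget gate, input gate and cell-state parameters follow the same pattern with $\delta_{f,j}$, $\delta_{i,j}$ and $\delta_{l,j}$. This is the textbook result in the notation of this section and is assumed, not confirmed, to match the formulas of the original filing.

```latex
\frac{\partial E}{\partial W_{oh}} = \sum_{j=1}^{t} \delta_{o,j} \, h_{j-1}^{T},
\qquad
\frac{\partial E}{\partial b_{o}} = \sum_{j=1}^{t} \delta_{o,j},
\qquad
\frac{\partial E}{\partial W_{ox}} = \delta_{o,t} \, x_{t}^{T}
```

The recurrent weights and the biases accumulate contributions from every time step, whereas the input weights at time t depend only on the error term and input of that time step.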
Further, in step S4.3, the verification set obtained in step S4.1 is used to verify whether the predictions of the recurrent neural network prediction model are accurate. At present, the accuracy of a recurrent neural network prediction model is judged by two indexes, the mean square error (MSE) and the coefficient of determination $R^2$. The MSE indicates how far the predicted results deviate from the actual results, but because it depends on the scale of the sample set, its value is not directly interpretable. The coefficient of determination $R^2$ indicates how much better the model is than simply taking the mean; its value lies between 0 and 1, and the closer it is to 1, the better the model;
A Bayesian hypothesis testing method is adopted to quantify the reliability of the model while taking the prior information of the training set into account, specifically as follows:
Assume that the data predicted by the LSTM prediction model are
Figure BDA0002526260210000102
and the corresponding actual data are $y_{exp} = \{y_1, y_2, \ldots, y_n\}$, where n is the number of data points; let
Figure BDA0002526260210000103
denote the residual between the i-th vibration signal and the i-th predicted signal, giving n residuals $\{e_1, e_2, \ldots, e_n\}$. If the residuals $e_i$ obey the normal distribution $N(\mu, \sigma_1^2)$, then the mean of the residuals
Figure BDA00025262602100001011
obeys the normal distribution $N(\mu, \sigma_1^2 / n)$. The null hypothesis $H_0$ and the alternative hypothesis $H_1$ are established:
$H_0: \mu = 0$, $H_1: \mu \neq 0$;
Assume that the prior probability density of μ is $N(0, \sigma_0^2)$. To determine $\sigma_0^2$, the verification set is used as prior information: the mean prediction error is calculated segment by segment over the verification set, and the variance of these means is taken as $\sigma_0^2$. The probability density function of μ is then:
Figure BDA0002526260210000104
Combining the properties of the normal distribution,
Figure BDA0002526260210000105
the marginal density function with respect to g(μ) is:
Figure BDA0002526260210000106
The Bayes factor is then expressed as:
Figure BDA0002526260210000107
Taking the logarithm of both sides of formula (48):
Figure BDA0002526260210000108
where λ is the confidence in the reliability of the recurrent neural network prediction model. When
Figure BDA0002526260210000109
λ → 0, which means that the confidence in the model is 0% and the recurrent neural network prediction model is unreliable; when
Figure BDA00025262602100001010
λ → ∞, which means that the confidence in the model is 100% and the recurrent neural network prediction model is very reliable.
Further, in step S4.4, if the predictions of the recurrent neural network prediction model are accurate, the judgment can be made from the residual between the actual value, i.e. the real-time monitored value after denoising, dimensionality reduction and phase space reconstruction, and the value predicted by the model;
When the unit or the monitoring system has a fault at a certain moment, the fault is reflected in the signal detected by the monitoring system, which deviates considerably from the signal of the ideal healthy state. The recurrent neural network prediction model gives its predictions before the moment of the fault, and the predicted values derived from the earlier healthy data are regarded as the ideal healthy data; a large difference therefore arises between the predicted values and the values monitored by the monitoring system, and whether to issue an early warning is judged by identifying this large error;
Whether to raise an alarm is determined by setting a threshold, namely the maximum absolute error $e_{max}$ between the predicted values of the recurrent neural network prediction model and the actual values on the verification set. Both the training set and the verification set are historical data sampled while the unit is healthy, but the training set is used to build the model, so its errors are naturally smaller overall and provide no reference; the verification set, by contrast, is an application of the model. While the unit is monitored with the recurrent neural network prediction model, a time threshold is also set manually: if the continuous duration for which the error is greater than ζ, or the total such time, exceeds the time threshold, the unit is judged to have a potential fault that needs to be repaired.
Compared with the prior art, the invention has the following beneficial effects:
(1) Denoising the noisy data set with the discrete wavelet packet transform overcomes the limitation that the conventional wavelet transform can only further decompose the low-pass filtering result. The discrete wavelet packet transform is a finer signal analysis method that decomposes the low-frequency and high-frequency parts at the same time, and the threshold is determined by a Bayesian estimation method, which enhances the denoising effect and can effectively improve the prediction accuracy of the LSTM model;
(2) PPCA is adopted to reduce the dimension of the data set, overcoming the limitation that traditional PCA simply discards the non-principal components. In PPCA the discarded information is estimated as Gaussian noise, which preserves the useful information in the original signal to the greatest extent;
(3) The Bayesian hypothesis testing method quantifies the reliability of the model while taking the prior information of the training set into account, and quantitatively analyzes the influence of uncertain factors on the accuracy of the LSTM prediction model. Quantitative evaluation and error correction with this method improve the prediction accuracy and realize seamless integration of Bayesian inference and the LSTM prediction model.
Drawings
FIG. 1 is a schematic diagram of a three-level wavelet decomposition according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an LSTM unit in an embodiment of the present invention;
FIG. 3 is a schematic diagram of error between a fault signal and a health signal and a predicted signal according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a noise reduction effect of a vibration signal according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the phase space reconstruction of the C-C algorithm of the principal component of the vibration signal according to the embodiment of the present invention;
FIG. 6 is a schematic diagram of a construction process of a prediction model according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating the prediction of the vibration principal component signal according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a reliability test λ value of the prediction model in an embodiment of the present invention;
FIG. 9 is a schematic diagram of the application of a Bayesian LSTM prediction model in monitoring crack failure in an embodiment of the present invention;
FIG. 10 is a flowchart of a fault warning method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following detailed description of the embodiments of the present invention is provided by way of a specific example with reference to the accompanying drawings, but the embodiments of the present invention are not limited thereto.
Example 1:
This embodiment provides a fault early warning method for rotating machinery based on a Bayesian LSTM model, applied to the monitoring and control system of the high-and-intermediate-pressure (HIP) combined cylinder of a steam turbine; when an abnormal condition occurs in the system, operation of the combined cylinder can be stopped. Steam fed into the cylinder expands and pushes the rotor to rotate and do work. A steam turbine is large, precise, high-speed rotating equipment, and the gap between the rotor and the cylinder is very narrow, so a matching monitoring system is required to monitor its operation. When the existing TSI raises an alarm, the unit is usually shut down. In this embodiment, one month of operating data of the HIP combined cylinder of a steam turbine, taken from the turbine supervisory instrumentation (TSI) system of a nuclear power plant in China, is used to explain the algorithm flow of the whole early warning model, and the validity of the model is demonstrated by whether the signal can be predicted correctly. As shown in fig. 10, the method comprises the following steps:
S1, read n groups of time-series signal data and arrange them into a p-dimensional matrix $X_{p \times n}$ to facilitate signal processing, obtaining noisy, anomaly-free data; decompose the signals with the discrete wavelet packet threshold denoising method, filter the wavelet coefficients, and reconstruct the signals to remove the noise effectively;
The signal data read are original signals, that is, signals collected from the sensors without any processing. Data recorded by a real-time monitoring system often contain noise; without denoising, the noise affects the subsequent signal analysis, biasing the analysis and rendering the results meaningless. The discrete wavelet packet threshold denoising method is therefore adopted to denoise the original signals, specifically as follows:
As shown in fig. 1, after a noisy signal is decomposed over multiple levels, the energy of the signal is concentrated mainly in a part of the wavelet packet decomposition coefficients, whereas the noise energy is spread over the coefficients of the whole wavelet domain, so the wavelet packet transform coefficients of the signal itself have larger amplitudes than those of the noise. By setting a suitable threshold, the coefficients caused by the signal are retained and those caused by the noise are filtered out, and wavelet reconstruction then yields the denoised signal; the choice of threshold therefore has a decisive influence on the effect of wavelet packet threshold denoising;
The threshold is determined by a Bayesian estimation method, which has the advantage of taking prior information into account; the threshold is established on the basis of a risk function, and the denoising effect is effectively enhanced;
in discrete wavelet packet transform, the basic wavelet function is typically:
Figure BDA0002526260210000131
where t is the continuous time variable; the two indices of the basic wavelet function are the time index and the frequency index, each ranging over Z, the set of all integers; $L^2(R)$ denotes the Hilbert space; and ψ is the basic wavelet function;
in practical applications of the discrete wavelet packet technique, a signal f (t) with n discrete data points is wavelet decomposed:
Figure BDA0002526260210000132
$f_j(t)$, the wavelet series, is further decomposed into wavelet packet components:
Figure BDA0002526260210000133
where $w_n$ denotes the wavelet function family associated with the n-th discrete data point, n = 0, 1, 2, …; k is the number of further decomposition levels, 0 ≤ k ≤ j; and the coefficient
Figure BDA0002526260210000134
is called the orthogonal wavelet packet decomposition coefficient of f(t) at resolution j - k. $w_n(t)$ satisfies the two-scale equation:
Figure BDA0002526260210000135
where $\{h_n\}_{n \in Z}$ is a conjugate quadrature mirror filter satisfying
Figure BDA0002526260210000136
$g_k = (-1)^k h_{1-k}$, and $\delta_{k,l}$ is the Kronecker function, which satisfies
Figure BDA0002526260210000137
$k, l \in Z$;
When n = 0, $w_0(t)$ is the scaling function, and when n = 1, $w_1(t)$ is the wavelet function ψ(t); in this case, equation (1) can be expressed as:
Figure BDA0002526260210000138
From the wavelet packet decomposition coefficients
Figure BDA0002526260210000139
the decomposition coefficients of the next level
Figure BDA00025262602100001310
are obtained by the recurrence formula:
Figure BDA00025262602100001311
If the initial values of the wavelet packet decomposition coefficients are the discrete data, the reconstruction algorithm is:
Figure BDA00025262602100001312
Using formula (6), the noisy, anomaly-free signal is decomposed by a 3-level wavelet packet decomposition to obtain the wavelet packet decomposition coefficients at each node of the binary tree; the coefficients are filtered according to a threshold, and the signal is then reconstructed according to formula (7).
The threshold is determined by a Bayesian estimation method, specifically as follows:
The threshold is determined with the adaptive BayesShrink threshold estimation method, calculated as:
$T = \sigma^2 / \sigma_x$; (8)
where $\sigma^2$ is the noise variance and $\sigma_x$ is the standard deviation of the original signal;
The noisy signal is:
$f(t) = g(t) + \varepsilon(t)$; (9)
where g(t) is the original signal and $\varepsilon(t)$ is the noise signal;
The wavelet packet decomposition coefficient d obtained by applying the discrete wavelet packet transform to the signal is written as:
$d = d_x + d_\varepsilon$; (10)
where $d_x$ is the wavelet coefficient of the original signal and $d_\varepsilon$ is the wavelet coefficient of the noise;
The noise standard deviation is estimated as:
Figure BDA0002526260210000141
where
Figure BDA0002526260210000142
that is, the mean is taken over the wavelet coefficients at each node of the binary tree. With $\sigma_y^2$ denoting the variance of the noisy signal:
Figure BDA0002526260210000143
the variance of the original signal is estimated as:
Figure BDA0002526260210000144
This yields a Bayes threshold that adapts to the scale:
Figure BDA0002526260210000145
The wavelet coefficients can then be filtered with the soft threshold function:
Figure BDA0002526260210000146
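The thresholding step above can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the BayesShrink form $T = \sigma^2 / \sigma_x$ of formula (8) and the usual soft threshold rule sign(d)·max(|d| - T, 0). The median-based noise estimate with the constant 0.6745 is the common robust estimator and is assumed here, since formulas (11) to (14) are available only as images. The function names are hypothetical.

```python
import math
import statistics


def bayes_shrink_threshold(coeffs):
    """Threshold T = sigma^2 / sigma_x for one node's wavelet packet coefficients."""
    # Noise standard deviation: median absolute coefficient / 0.6745
    # (assumed estimator; the patent's own formula is not reproduced here).
    sigma = statistics.median(abs(d) for d in coeffs) / 0.6745
    # Variance of the noisy coefficients, treating them as zero-mean.
    sigma_y2 = sum(d * d for d in coeffs) / len(coeffs)
    # Standard deviation of the clean signal, clipped at zero.
    sigma_x = math.sqrt(max(sigma_y2 - sigma * sigma, 0.0))
    if sigma_x == 0.0:
        # Everything looks like noise: pick a threshold that removes all coefficients.
        return max(abs(d) for d in coeffs)
    return sigma * sigma / sigma_x


def soft_threshold(coeffs, threshold):
    """Shrink every coefficient toward zero by `threshold`, zeroing the small ones."""
    return [math.copysign(max(abs(d) - threshold, 0.0), d) for d in coeffs]


# Illustrative coefficients: two large "signal" values among small "noise" values.
node_coeffs = [0.1, -0.2, 0.15, 5.0, -0.05, -4.0, 0.12, -0.08]
T = bayes_shrink_threshold(node_coeffs)
denoised = soft_threshold(node_coeffs, T)
```

In a full pipeline the two functions would be applied to each node of the 3-level wavelet packet tree before reconstruction; the decomposition and reconstruction themselves are not shown.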
In this embodiment, one month of steam turbine signals is sampled, comprising 29 signals of six types including vibration, differential expansion (cylinder expansion minus rotor expansion), rotor shaft displacement, temperature and rotation speed, that is, 29 groups of time-series signal data, which are arranged into a 720-dimensional matrix $X_{720 \times 29}$. Table 1 describes the sample signals:
TABLE 1 sample Signal description
Figure BDA0002526260210000147
Figure BDA0002526260210000151
To predict the monitoring signals of the HIP combined cylinder, in this embodiment the data before April 18 are used as the training set to build the model and train the weights, the data from April 19 to April 24 are used as the verification set, and the data after April 24 are used as the test set to test how the model performs in actual use.
After the noisy, anomaly-free data set is obtained, DWPT denoising is applied to the signals. Fig. 4 shows the denoising effect on the vibration signal: the trends of the curves before and after denoising are essentially the same, so the algorithm retains the useful information while filtering out the noise, stays highly similar to the original signal, and makes it easier for the prediction model to learn the weight information during training.
S2, for p-dimensional sample data, each dimension may not represent the equipment condition independently, and the dimensions may be correlated with one another. To reduce the dimension and improve efficiency, probabilistic principal component analysis (PPCA) is performed on the sample data, reducing it to the q-dimensional $X_{q \times n}$.
For the denoised signals, probabilistic principal component analysis (PPCA) is adopted to reduce the dimension of the multidimensional data and to handle the uncertainty in the data. PPCA defines a proper probability model for principal component analysis (PCA), thereby overcoming the limitation that traditional PCA simply discards the non-principal components; in PPCA the discarded information is estimated as Gaussian noise, so the useful information in the original signal is retained to the greatest extent;
For a p-dimensional sample $X_{p \times n}$, probabilistic principal component analysis assumes that the sample can be expressed as $X = Wz + \mu + \varepsilon$, where W is the weight matrix of dimension p × q, q ≤ p; z is a q × n random Gaussian vector obeying $z \sim N(0, I_q)$, which is also the result of reducing the dimension of X; μ is the sample mean; and ε is the noise, assumed to follow a Gaussian distribution with variance $\sigma^2$:
Figure BDA0002526260210000152
Substituting into Bayes' formula gives the posterior probability of z, $P(z \mid X) \sim N(M^{-1} W^T (X - \mu), \sigma^2 M^{-1})$, where $M = \sigma^2 I_q + W^T W$. In Bayesian probabilistic principal component analysis, the expectation of the posterior of z, $M^{-1} W^T (X - \mu)$, is taken as the result of reducing the dimension of X. Only W and σ in the formula are then unknown; they are estimated by maximizing the likelihood function, and the estimates are:
Figure BDA0002526260210000153
Figure BDA0002526260210000161
where $\lambda_j$ is obtained from the eigenvalue decomposition of the covariance matrix C of the sample X, i.e. $C v_j = \lambda_j v_j$, $v_j$ is the corresponding eigenvector, $U_q = (v_1, v_2, \ldots, v_q)$, and $\Delta_q = \mathrm{diag}(\lambda_1, \ldots, \lambda_q)$;
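As a rough illustration of the maximum likelihood step, the Python sketch below computes the noise variance and the column scaling of W from a given list of covariance eigenvalues. It assumes the closed-form estimates commonly given in the PPCA literature, $\sigma^2 = \frac{1}{p-q}\sum_{j=q+1}^{p} \lambda_j$ and $W = U_q (\Delta_q - \sigma^2 I_q)^{1/2}$ with the arbitrary rotation taken as the identity; whether formulas (15) and (16) of this document have exactly this form cannot be confirmed from the text, because they are available only as images. The function name and the eigenvalues are hypothetical.

```python
import math


def ppca_ml_from_eigenvalues(eigenvalues, q):
    """Return (sigma2, column_scales) for a PPCA model with q retained components.

    sigma2 is the average of the discarded eigenvalues (assumed ML estimate);
    column_scales[j] is the factor sqrt(lambda_j - sigma2) that multiplies the
    j-th eigenvector v_j to give the j-th column of W.
    """
    lam = sorted(eigenvalues, reverse=True)
    p = len(lam)
    if not 0 < q < p:
        raise ValueError("this sketch needs 0 < q < p")
    sigma2 = sum(lam[q:]) / (p - q)
    column_scales = [math.sqrt(l - sigma2) for l in lam[:q]]
    return sigma2, column_scales


# Illustrative eigenvalues of a 5-dimensional covariance matrix, keeping q = 2.
sigma2, scales = ppca_ml_from_eigenvalues([4.0, 2.0, 0.5, 0.3, 0.2], q=2)
```

The sketch requires q < p; the case q = p discussed next has no discarded eigenvalues to average.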
When q = p, the equation $W^{-1} X = z + W^{-1} \mu + W^{-1} \varepsilon$ holds; using $W^{-1}$, the principal components can be traced back to the original signals for preliminary fault diagnosis. $W^{-1}$ is expressed as:
Figure BDA0002526260210000162
where $w_{12}$ represents the contribution rate of the second-dimension signal of the data sample $X_{p \times n}$ to the first principal component. By this principle, the degree of influence of every signal on the first principal component can be obtained; when a principal component is abnormal, the signals with the highest contribution rates are the most likely to be faulty and are diagnosed first, so the problem can be found quickly and effectively.
In this embodiment, to reduce the data dimension and represent the sample information with fewer variables so that the prediction results are representative, PPCA is performed on the data as follows: the signal types that have only one dimension, namely differential expansion and shaft displacement, are kept as they are, and PPCA dimension reduction is applied only to vibration, temperature and rotation speed; the result is shown in table 2. In the table, $w_i$ represents the contribution rate of each component signal to the first principal component for the three signal types subjected to PPCA. Bold entries mark components with a relatively high influence on the principal component; this is useful for fault diagnosis after a fault warning, since the health of these components can be checked first.
TABLE 2 vibration, temperature, rotational speed PCA weight, contribution ratio
Figure BDA0002526260210000163
Figure BDA0002526260210000171
S3, performing phase space reconstruction on the dimensionality reduction signal data by using a C-C algorithm;
As shown in fig. 5, because an artificial neural network has a strong nonlinear mapping capability, it is often combined with the C-C algorithm: in the reconstructed phase space, the artificial neural network approximates the mapping between the current state and the future state in order to predict the chaotic time series, and the input and output of the neural network are determined by the embedding dimension m and the delay time τ of the reconstruction. The delay time τ and the embedding dimension m can be estimated simultaneously using the correlation integral, as follows:
first construct disjoint time series matrices:
Figure BDA0002526260210000172
Assume that the maximum value of the delay time τ is T, and let t be a natural number in [1, T] and k = INT(N/t), where N is the length of the time series data set; each row of the time series matrix is called a sub-time-series vector. The correlation integral of each subsequence is defined as:
Figure BDA0002526260210000173
where r > 0; M is the number of phase space points, M = N - (m - 1)t; the embedding dimension m may take the values 2, 3, 4, 5; and $d_{bc} = \lVert Y_b - Y_c \rVert$;
Figure BDA0002526260210000174
The correlation integral is a cumulative distribution function that gives the probability that the distance between any two points in the phase space is less than the radius r, the distance between points being measured by the infinity norm of the difference of the vectors. The test statistic is defined as:
$S(m, N, r, t) = C(m, N, r, t) - C^m(1, N, r, t)$; (21)
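The following Python sketch illustrates the correlation integral and the statistic of formula (21) on a short series. It assumes the usual form of the correlation integral, the fraction of point pairs whose infinity-norm distance is below r, because formula (20) is available only as an image; the function names and the sample series are hypothetical.

```python
def correlation_integral(series, m, r, t):
    """Fraction of pairs of delay vectors closer than r in the infinity norm."""
    M = len(series) - (m - 1) * t  # number of phase space points
    if M < 2:
        return 0.0
    points = [[series[i + j * t] for j in range(m)] for i in range(M)]
    close_pairs = 0
    for b in range(M):
        for c in range(b + 1, M):
            d_bc = max(abs(p - q) for p, q in zip(points[b], points[c]))
            if d_bc < r:
                close_pairs += 1
    return 2.0 * close_pairs / (M * (M - 1))


def statistic_s(series, m, r, t):
    """S(m, N, r, t) = C(m, N, r, t) - C(1, N, r, t) ** m, as in formula (21)."""
    return correlation_integral(series, m, r, t) - correlation_integral(series, 1, r, t) ** m


# Illustrative series: an alternating signal of length 10.
alternating = [0.0, 1.0] * 5
c1 = correlation_integral(alternating, m=1, r=0.5, t=1)
s2 = statistic_s(alternating, m=2, r=0.5, t=1)
```

Evaluating `statistic_s` over several radii and delays gives the values from which the quantities in the following formulas are formed; that search is not shown here.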
the delay time τ may be obtained by the first local minimum of the following equation:
Figure BDA0002526260210000175
where ΔS(m, t, N) is the difference between the maximum and the minimum of the statistic S over the radii r:
$\Delta S(m, t, N) = \max\{S(m, N, r_j, t)\} - \min\{S(m, N, r_j, t)\}$; (23)
while the delay window $\tau_w = (m - 1)\tau$ can be obtained from the minimum of the following formula:
Figure BDA0002526260210000181
The embedding dimension m is thereby obtained, and the time series is finally reconstructed to give the reconstruction matrix F of formula (19). The elements of each row of the matrix are taken as the input nodes of the neural network, and the value τ time steps after the last element of that row is taken as the output node; this builds the input and output layers and completes the phase space reconstruction, and the reconstructed data are the sample data.
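The construction of the input and output layers from the reconstructed phase space can be sketched in Python as below. The sketch assumes a one-dimensional principal-component series and the layout described above: m inputs spaced τ apart, with the value τ steps after the last input as the target. The function name `delay_embed` and the toy series are hypothetical.

```python
def delay_embed(series, m, tau):
    """Build (inputs, targets) pairs for embedding dimension m and delay tau.

    Each input row is [x(i), x(i + tau), ..., x(i + (m - 1) * tau)] and its
    target is x(i + m * tau), the value tau steps after the last input.
    """
    inputs, targets = [], []
    for i in range(len(series) - m * tau):
        inputs.append([series[i + j * tau] for j in range(m)])
        targets.append(series[i + m * tau])
    return inputs, targets


# Toy series 0, 1, ..., 19 with the values chosen later in the embodiment (m = 4, tau = 3).
toy_inputs, toy_targets = delay_embed(list(range(20)), m=4, tau=3)
```

With a series of length N this yields N - m·τ samples, which can then be split into the training, verification and test sets.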
In this embodiment, the principal components obtained by PPCA have already disturbed the relationships among the original signal components and are likely to form a chaotic time series. The principal component therefore needs phase space reconstruction to determine the input layer of the prediction model. Fig. 5 shows the response of the discriminant formulas of the C-C algorithm for the principal component of the vibration signal at different delay times; from the figure, the delay time is τ = 4 or τ = 3 and $\tau_w = 7$. According to the formula $\tau_w = (m - 1)\tau$, the embedding dimension is taken as m = 4, so the number of input units of the prediction model is determined to be 4.
S4, divide the sample data proportionally into training, verification and test data for training, verifying and testing the prediction model. Construct an LSTM (long short-term memory) recurrent neural network with the sigmoid function as the activation function and the back-propagation algorithm as the training algorithm to obtain the optimal parameters of the model. Verify the accuracy of the model with the verification samples and judge whether it meets the application conditions; if the accuracy requirement is not met, continue training. If the model predicts correctly, apply the model to the test samples, compute the confidence of the data predicted by the model with the Bayesian hypothesis reliability test, and finally, on the basis of the prior information of the historical data set, give a criterion for whether to issue an early warning, so that the model raises an alarm accurately in the slowly developing stage of a unit fault or when the monitoring system has a problem. As shown in fig. 6, this specifically includes the following steps:
S4.1, constructing an LSTM prediction model;
The data set obtained after phase space reconstruction is divided into a training set, a verification set and a test set. The training set is used to build the recurrent neural network prediction model, which is constructed with the LSTM recurrent neural network method; the embedding dimension m obtained by the C-C algorithm is the number of input units, and the delay time τ is the spacing between input points;
As shown in fig. 2, each LSTM unit consists of 3 gates, namely a forget gate, an input gate and an output gate;
In the LSTM model used, the forget gate discards useless information: the output vector of the previous time step $h_{t-1}$ and the input vector of the current time step $x_t$ are fed in together, and the irrelevant information in the input is discarded. The forget gate determines how much of the cell state of the previous time step $c_{t-1}$ is retained in the current cell state $c_t$. The formula of the forget gate is:
$f_t = \sigma(W_{xf} \cdot [h_{t-1}, x_t] + b_f)$; (25)
where $W_{xf}$ and $b_f$ are the weight matrix and bias term of the forget gate, σ is the sigmoid function, and $[h_{t-1}, x_t]$ denotes the concatenation of the two vectors into one longer vector;
The input gate controls the information entering the network. Updating the cell state requires two parts working together: first it is decided which information should be updated, and second the tanh layer generates a new vector. The input gate determines how much of the current network input $x_t$ is saved to the cell state $c_t$. $i_t$ denotes the input gate, $l_t$ the cell state of the current input, and $c_t$ the cell state at the current time:
$i_t = \sigma(W_{xi} \cdot [h_{t-1}, x_t] + b_i)$; (26)
$l_t = \tanh(W_{xl} \cdot [h_{t-1}, x_t] + b_l)$; (27)
Figure BDA0002526260210000193
where $W_{xi}$ and $b_i$ are the weight matrix and bias term of the input gate, and $W_{xl}$ and $b_l$ are the weight matrix and bias term of the cell state;
The output gate controls the output information of the LSTM prediction model: it controls how much of the cell state $c_t$ is output to the current final output value $h_t$ of the LSTM; $o_t$ denotes the value of the output gate:
$o_t = \sigma(W_{xo} \cdot [h_{t-1}, x_t] + b_o)$; (29)
Figure BDA0002526260210000194
where $W_{xo}$ and $b_o$ are the weight matrix and bias term of the output gate; the final output value of each training group is:
$y_t = W_{hy} \cdot h_t$; (31)
$W_{hy}$ is the weight matrix of the final LSTM model output.
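A single forward step of the LSTM unit described above can be sketched in plain Python as follows. Formulas (25), (26), (27), (29) and (31) are taken from the text; the cell state update $c_t = f_t \circ c_{t-1} + i_t \circ l_t$ and the output $h_t = o_t \circ \tanh(c_t)$ are the standard LSTM forms and are assumed here, because formulas (28) and (30) are available only as images. The dictionary keys and helper names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, v, b):
    """Return W . v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step; returns (h_t, c_t, y_t)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_xf"], z, params["b_f"])]    # (25)
    i_t = [sigmoid(a) for a in affine(params["W_xi"], z, params["b_i"])]    # (26)
    l_t = [math.tanh(a) for a in affine(params["W_xl"], z, params["b_l"])]  # (27)
    o_t = [sigmoid(a) for a in affine(params["W_xo"], z, params["b_o"])]    # (29)
    # Assumed standard cell state update and hidden output.
    c_t = [f * c + i * l for f, c, i, l in zip(f_t, c_prev, i_t, l_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    y_t = [sum(w * h for w, h in zip(row, h_t)) for row in params["W_hy"]]  # (31)
    return h_t, c_t, y_t


# Toy parameters: hidden size 2, input size 1, all gate weights zero.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
toy_params = {
    "W_xf": zeros, "b_f": [0.0, 0.0],
    "W_xi": zeros, "b_i": [0.0, 0.0],
    "W_xl": zeros, "b_l": [0.0, 0.0],
    "W_xo": zeros, "b_o": [0.0, 0.0],
    "W_hy": [[1.0, 1.0]],
}
h_t, c_t, y_t = lstm_step([0.3], [0.0, 0.0], [1.0, -2.0], toy_params)
```

With all-zero weights every gate evaluates to 0.5 and the candidate state to 0, so the step simply halves the previous cell state; this makes the toy example easy to check by hand.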
S4.2, training an LSTM prediction model;
The back-propagation algorithm is adopted as the LSTM training algorithm. The LSTM has 8 groups of parameters to learn: the weight matrix $W_{xf}$ and bias term $b_f$ of the forget gate, the weight matrix $W_{xi}$ and bias term $b_i$ of the input gate, the weight matrix $W_{xo}$ and bias term $b_o$ of the output gate, and the weight matrix $W_{xl}$ and bias term $b_l$ used to compute the cell state. Because the two parts of each weight matrix use different formulas in back propagation, in the subsequent derivation the weight matrices $W_{xf}$, $W_{xi}$, $W_{xo}$, $W_{xl}$ are each written as two separate matrices: $W_{fh}$, $W_{fx}$, $W_{ih}$, $W_{ix}$, $W_{oh}$, $W_{ox}$, $W_{lh}$, $W_{lx}$.
In the following formulas, unless otherwise specified, E denotes the loss function. The LSTM has four weighted inputs, corresponding to $f_t$, $i_t$, $c_t$, $o_t$, through which the error term is passed upward; the specific steps of step S4.2 are as follows:
S4.2.1, the error term is passed backward in time; the formula for passing the error term back to any time k is:
Figure BDA0002526260210000191
S4.2.2, the error term is passed to the previous layer:
Figure BDA0002526260210000192
S4.2.3, calculate the weight gradients and the bias term gradients:
With the error terms $\delta_{o,t}$, $\delta_{f,t}$, $\delta_{i,t}$, $\delta_{l,t}$ obtained above, the gradients of $W_{oh}$, $W_{ih}$, $W_{fh}$ and $W_{lh}$ at time t can be determined; the final gradient is the sum of the gradients at all times:
Figure BDA0002526260210000201
Figure BDA0002526260210000202
Figure BDA0002526260210000203
Figure BDA0002526260210000204
The gradients of the bias terms $b_f$, $b_i$, $b_l$, $b_o$ are obtained in the same way as the weight gradients: the final bias term gradient is the sum of the bias term gradients at all times, as follows:
Figure BDA0002526260210000205
Figure BDA0002526260210000206
Figure BDA0002526260210000207
Figure BDA0002526260210000208
For $W_{fx}$, $W_{ix}$, $W_{lx}$, $W_{ox}$, the weight gradients can be calculated directly from the corresponding error terms:
Figure BDA0002526260210000209
Figure BDA00025262602100002010
Figure BDA00025262602100002011
Figure BDA00025262602100002012
s4.3, carrying out reliability evaluation on the LSTM model;
The verification set obtained in step S4.1 is used to verify whether the predictions of the recurrent neural network prediction model are accurate. At present, the accuracy of a recurrent neural network prediction model is judged by two indexes, the mean square error (MSE) and the coefficient of determination $R^2$. The MSE indicates how far the predicted results deviate from the actual results, but because it depends on the scale of the sample set, its value is not directly interpretable. The coefficient of determination $R^2$ indicates how much better the model is than simply taking the mean; its value lies between 0 and 1, and the closer it is to 1, the better the model;
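The two accuracy indexes can be computed as in the short Python sketch below, which uses the textbook definitions of the mean square error and of the coefficient of determination (one minus the ratio of the residual sum of squares to the total sum of squares). The function names and the sample values are illustrative only.

```python
def mean_square_error(actual, predicted):
    """Average squared deviation between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: improvement over predicting the mean."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative verification values and model predictions.
y_actual = [1.0, 2.0, 3.0, 4.0]
y_predicted = [1.1, 1.9, 3.2, 3.8]
mse_value = mean_square_error(y_actual, y_predicted)
r2_value = r_squared(y_actual, y_predicted)
```

Because the MSE scales with the units of the signal while $R^2$ does not, the two indexes are reported together.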
A Bayesian hypothesis testing method is adopted to quantify the reliability of the model while taking the prior information of the training set into account, specifically as follows:
Assume that the data predicted by the LSTM prediction model are
Figure BDA0002526260210000211
and the corresponding actual data are $y_{exp} = \{y_1, y_2, \ldots, y_n\}$, where n is the number of data points; let
Figure BDA0002526260210000212
denote the residual between the i-th vibration signal and the i-th predicted signal, giving n residuals $\{e_1, e_2, \ldots, e_n\}$. If the residuals $e_i$ obey the normal distribution $N(\mu, \sigma_1^2)$, then the mean of the residuals
Figure BDA0002526260210000213
obeys the normal distribution $N(\mu, \sigma_1^2 / n)$. The null hypothesis $H_0$ and the alternative hypothesis $H_1$ are established:
$H_0: \mu = 0$, $H_1: \mu \neq 0$;
Assume that the prior probability density of μ is $N(0, \sigma_0^2)$. To determine $\sigma_0^2$, the verification set is used as prior information: the mean prediction error is calculated segment by segment over the verification set, and the variance of these means is taken as $\sigma_0^2$. The probability density function of μ is then:
Figure BDA0002526260210000214
Combining the properties of the normal distribution,
Figure BDA0002526260210000215
the marginal density function with respect to g(μ) is:
Figure BDA0002526260210000216
The Bayes factor is then expressed as:
Figure BDA0002526260210000217
Taking the logarithm of both sides of formula (48):
Figure BDA0002526260210000218
where λ is the confidence in the reliability of the recurrent neural network prediction model. When
Figure BDA0002526260210000219
λ → 0, which means that the confidence in the model is 0% and the recurrent neural network prediction model is unreliable; when
Figure BDA00025262602100002110
λ → ∞, which means that the confidence in the model is 100% and the recurrent neural network prediction model is very reliable.
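The hypothesis test above can be illustrated with the Python sketch below. It is a reconstruction under stated assumptions, not the formulas of this document, which are available only as images: the Bayes factor is taken as the ratio of the density of the residual mean under $H_0$, $N(0, \sigma_1^2 / n)$, to its marginal density under $H_1$, $N(0, \sigma_0^2 + \sigma_1^2 / n)$, and the confidence is taken as the posterior probability of $H_0$ for a prior probability `pi0`. The function name, the parameter `pi0` and the sample residuals are hypothetical.

```python
import math


def bayes_confidence(residuals, sigma0_sq, pi0=0.5):
    """Return (bayes_factor, confidence) for H0: mu = 0 against H1: mu != 0."""
    n = len(residuals)
    mean_e = sum(residuals) / n
    # Sample variance of the residuals, used as sigma_1^2.
    sigma1_sq = sum((e - mean_e) ** 2 for e in residuals) / (n - 1)
    var_h0 = sigma1_sq / n            # variance of the residual mean under H0
    var_h1 = sigma0_sq + sigma1_sq / n  # marginal variance under H1
    log_bf = 0.5 * math.log(var_h1 / var_h0) \
        - 0.5 * mean_e ** 2 * (1.0 / var_h0 - 1.0 / var_h1)
    bayes_factor = math.exp(log_bf)
    prior_odds = pi0 / (1.0 - pi0)
    confidence = prior_odds * bayes_factor / (1.0 + prior_odds * bayes_factor)
    return bayes_factor, confidence


# Unbiased residuals support the model; residuals with a clear offset do not.
bf_good, conf_good = bayes_confidence([0.1, -0.1, 0.2, -0.2], sigma0_sq=0.01)
bf_bad, conf_bad = bayes_confidence([1.0, 1.1, 0.9, 1.0], sigma0_sq=0.01)
```

Under these assumptions a Bayes factor near 0 drives the confidence to 0 and a large Bayes factor drives it toward 1. The sketch does not handle residuals with zero sample variance.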
In this example, various combinations of delay time and embedding dimension are enumerated in conjunction with the phase space reconstruction, and the goodness of fit $R^2$ and the mean square error are used to judge the accuracy of the model built with each combination. Table 4 shows that, across the embedding dimensions, the goodness of fit and the mean square error differ little between delay times: $R^2$ is above 0.95 in every case and the mean square error is around 0.01. This shows that the prediction model can accurately predict future signal values, producing a reference value that can be compared with the monitored actual value to judge the health of the equipment.
TABLE 4 prediction model accuracy under different combinations of embedding dimension and delay time
Figure BDA0002526260210000221
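The two accuracy measures reported in Table 4 can be computed as in the short Python sketch below. It uses the usual textbook definitions of the mean square error and of R^2; the function names are illustrative and not taken from the patent.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Goodness of fit: 1 minus residual sum of squares over total sum of squares."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the mean of the actual values gives R^2 = 0, and a perfect prediction gives R^2 = 1.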
After a comprehensive comparison, an embedding dimension of 4 and a delay time of 3 are selected to construct the prediction model. The predicted signal is shown in fig. 7. The prior probability that the model is accurate is assumed to be 50%, i.e., π_0 = 0.5, and the confidence is solved on the validation set of the prediction model using the Bayesian reliability verification method provided by the invention. As can be seen from FIG. 8, as the sample size grows over time, the confidence eventually stabilizes at 92%, indicating that the prediction model is relatively reliable.
Furthermore, the delay time determines how far into the future the prediction model can see: the longer the delay time, the further into the future the model can predict.
S4.4, early warning is carried out by setting a threshold value on the basis of prior information of the historical data set;
if the prediction of the recurrent neural network prediction model is accurate, the judgment can be made from the residual between the actual value, i.e., the real-time monitored value after noise reduction, dimensionality reduction and phase space reconstruction, and the value predicted by the model;
as shown in fig. 3, when a fault occurs in the unit or the monitoring system at a certain moment, the fault is reflected in the signal detected by the monitoring system, which then deviates considerably from the signal in the ideal healthy state; the recurrent neural network prediction model gives its prediction before the fault moment, and the predicted value, being derived from earlier healthy data, is regarded as the ideal healthy data, so that a large difference arises between the predicted value and the value monitored by the monitoring system, and whether to issue an early warning is judged by identifying this large error;
whether an alarm is given is determined by setting a threshold, the threshold being the maximum absolute value e_max of the error between the predicted and actual values of the recurrent neural network prediction model on the validation set; the training set and the validation set are both historical data sampled while the unit is healthy, but the training set is used to build the model, so its error is naturally small overall and offers no reference, whereas the validation set is an application of the model; in addition, a time threshold is set manually when the unit is monitored with the recurrent neural network prediction model, and if the continuous duration for which the error is greater than ζ, or the total such time, exceeds the time threshold, the unit is judged to have a potential fault and needs to be repaired.
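The alarm rule of step S4.4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the error threshold is taken to be e_max, the continuous duration and the total time are checked against two separate limits (`continuous_limit` and `total_limit`, both hypothetical parameters), and `dt` is the sampling interval.

```python
def error_threshold(validation_errors):
    """e_max: largest absolute prediction error on the validation set."""
    return max(abs(e) for e in validation_errors)


def should_alarm(errors, e_max, continuous_limit, total_limit, dt=1.0):
    """Return True if the error stays above e_max for too long.

    errors           -- prediction errors observed during monitoring
    continuous_limit -- allowed length of one uninterrupted exceedance
    total_limit      -- allowed accumulated exceedance time
    """
    longest = current = total = 0.0
    for e in errors:
        if abs(e) > e_max:
            current += dt          # extend the current exceedance run
            total += dt            # accumulate total exceedance time
            longest = max(longest, current)
        else:
            current = 0.0          # run interrupted by a normal sample
    return longest > continuous_limit or total > total_limit
```

Using two limits makes isolated spikes harmless while still flagging either one sustained deviation or many short ones.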
Example 2:
In this embodiment, the steam turbine signal is monitored with the recurrent neural network prediction model, so that an early warning can be issued during the creep stage of a steam turbine fault or when a sensor fails.
Example 1 describes the case of a healthy signal. To illustrate the effectiveness of the fault early warning method of the invention, this embodiment describes a fault case. In general, a model trained on a healthy training set represents healthy, fault-free behavior, so the signal it predicts for a future time point is a healthy, fault-free one.
This embodiment uses a vibration signal to analyze crack formation in the secondary impeller of the pneumatic pump in the turbine. The solid line in fig. 9 shows the signal over about four months. During this period the monitoring system raised 6 alarms, indicating that the vibration signal exceeded the system threshold 6 times. In addition, from May 6 to May 7 the signal shows a small abnormality caused by the sensor, for which the monitoring system gave no alarm even though the value exceeded the system threshold. The monitoring signal is processed in the order of discrete wavelet packet denoising, phase space reconstruction and LSTM model prediction. In this embodiment, the healthy data before February 10 are used as the training set to build the recurrent neural network prediction model, and the healthy data from February 10 to March 1 are used as the validation set. The prediction error of the recurrent neural network prediction model is shown by the dashed line in fig. 9, and the thresholds are set to the maximum and minimum of the validation-set error, shown by the thick dashed lines parallel to the abscissa. As the figure shows, the recurrent neural network prediction model correctly predicts all 6 system alarms, giving a missed-diagnosis rate of 0%. It also raises an alarm for the abnormality from May 6 to May 7 that the monitoring system did not detect. The figure makes clear that the recurrent neural network prediction model always alarms before the monitoring system, on average 31 hours earlier. The equipment was then shut down for inspection on May 27, and severe crack fracture was observed.
The analysis shows that the recurrent neural network prediction model constructed by the invention can predict the alarm and sensor faults of the monitoring system in advance.

Claims (10)

1. A fault early warning method of a rotating machine based on a Bayesian LSTM model is characterized by comprising the following steps:
S1, reading n groups of time-series signal data and arranging the signal data into a p-dimensional matrix X_{p×n} to facilitate signal processing, thereby obtaining noisy, anomaly-free data; decomposing the signal by a discrete wavelet packet threshold denoising method, filtering the wavelet coefficients and reconstructing the signal so as to effectively remove the noise;
S2, since for p-dimensional sample data each dimension may not independently represent the equipment state and the dimensions may be correlated with one another, performing probabilistic principal component analysis (PPCA) on the sample data to reduce the dimension and improve efficiency, the sample data being reduced to q dimensions, X_{q×n};
S3, performing phase space reconstruction on the dimension-reduced signal data by using the C-C algorithm;
S4, dividing the sample data proportionally into training, validation and test data for training, validating and testing the prediction model; constructing an LSTM (long short-term memory) recurrent neural network with the sigmoid function as the activation function and the back propagation algorithm as the training algorithm, to obtain the optimal parameters of the model; verifying the model accuracy with the validation samples and judging whether it meets the application condition, training being continued if the accuracy requirement is not met; if the model predicts accurately, applying the model to the test samples and solving the confidence of the data predicted by the model by a Bayesian hypothesis reliability test method; and finally providing, on the basis of the prior information of the historical data set, a method for judging whether to issue an early warning, so that the model can accurately raise an alarm during the creep stage of a unit fault or when the monitoring system has a problem.
2. The fault early warning method for the rotating machine based on the Bayesian LSTM model as claimed in claim 1, wherein in step S1 the read signal data is the original signal, i.e., the signal collected from the sensor without any processing; the data recorded by a real-time monitoring system are often noisy, and if the signal is not denoised the noise affects the subsequent signal analysis, biasing the analysis and rendering the result meaningless; therefore, a discrete wavelet packet threshold denoising method is adopted to denoise the original signal, specifically as follows:
after a noisy signal is subjected to multi-layer decomposition, the energy of the signal is mainly concentrated in a small number of wavelet packet decomposition coefficients, while the noise energy is spread over the coefficients of the whole wavelet domain, and the amplitude of the wavelet packet transform coefficients of the signal is larger than that of the noise; by setting a suitable threshold, the wavelet packet transform coefficients caused by the signal are retained and the coefficients caused by noise are filtered out, and wavelet reconstruction then yields a noise-free signal; the selection of the threshold therefore has a decisive influence on the effect of wavelet packet threshold denoising;
the threshold is determined by a Bayesian estimation method, which takes advantage of prior information and establishes the threshold with respect to a risk function, effectively enhancing the denoising effect;
in discrete wavelet packet transform, the basic wavelet function is typically:
Figure FDA0002526260200000011
wherein t denotes the continuous time variable; the two indices of the basic wavelet function are the time index and the frequency index, respectively; Z is the set of all integers; L^2(R) denotes the Hilbert space; ψ is the basic wavelet function;
in practical applications of the discrete wavelet packet technique, a signal f (t) with n discrete data points is wavelet decomposed:
Figure FDA0002526260200000021
the wavelet series f_j(t) is further decomposed into wavelet packet components:
Figure FDA0002526260200000022
where w_n denotes the wavelet function family of the nth discrete data point, n = 0, 1, 2, …; k denotes the number of further decomposition layers, 0 ≤ k ≤ j; the coefficient
Figure FDA0002526260200000023
is called the orthogonal wavelet packet decomposition coefficient of f(t) at resolution j−k; w_n(t) satisfies the two-scale equation:
Figure FDA0002526260200000024
wherein {h_n}_{n∈Z} is a conjugate quadrature mirror filter satisfying
Figure FDA0002526260200000025
g_k = (−1)^k h_{1−k}; δ_{k,l} is the Kronecker function, which satisfies
Figure FDA0002526260200000026
k, l ∈ Z;
when n = 0, w_0(t) is the scaling function, and when n = 1, w_1(t) is the wavelet function ψ(t); in this case, equation (1) can be expressed as:
Figure FDA0002526260200000027
from the wavelet packet decomposition coefficient
Figure FDA0002526260200000028
the next-layer decomposition coefficient
Figure FDA0002526260200000029
is obtained by the recurrence formula:
Figure FDA00025262602000000210
if the initial values of the wavelet packet decomposition coefficients are the discrete data, the reconstruction algorithm is as follows:
Figure FDA00025262602000000211
the noisy signal is decomposed using formula (6) to obtain the wavelet packet decomposition coefficients at each node of the binary tree of a 3-level wavelet packet decomposition, the coefficients are filtered according to the threshold, and the signal is then reconstructed according to formula (7).
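The decompose, filter and reconstruct cycle described above can be illustrated with a small pure-Python wavelet packet tree. The sketch assumes the Haar filter pair, chosen only because it is the simplest conjugate quadrature mirror filter; the claim does not name a wavelet basis, and all function names are illustrative. The signal length must be divisible by 2 raised to the number of levels.

```python
import math

S = 1.0 / math.sqrt(2.0)  # Haar filter coefficient (assumed wavelet basis)


def haar_split(x):
    """One decomposition step: low-pass and high-pass halves of x."""
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) * S for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) * S for i in range(half)]
    return low, high


def haar_merge(low, high):
    """Inverse of haar_split: interleave the reconstructed samples."""
    x = []
    for a, d in zip(low, high):
        x.append((a + d) * S)
        x.append((a - d) * S)
    return x


def wavelet_packet_decompose(signal, levels=3):
    """Split every node at every level, giving 2**levels leaf nodes."""
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes


def wavelet_packet_reconstruct(nodes):
    """Merge sibling nodes pairwise until one signal remains."""
    nodes = [list(n) for n in nodes]
    while len(nodes) > 1:
        nodes = [haar_merge(nodes[i], nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]
```

Thresholding would be applied to the leaf nodes between the two calls; without thresholding, the reconstruction returns the original signal up to rounding error.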
3. The Bayesian LSTM model-based fault early warning method for rotary machines as recited in claim 2, wherein the threshold is determined by a Bayesian estimation method, specifically as follows:
the threshold is determined by the adaptive BayesShrink threshold estimation method, calculated as:
T = σ^2/σ_x; (8)
wherein σ^2 is the variance of the noise and σ_x is the original signal variance;
the noisy signal is:
f(t) = g(t) + ε(t); (9)
wherein g(t) is the original signal and ε(t) is the noise signal;
the wavelet packet decomposition coefficient d obtained by applying the discrete wavelet packet transform to the signal is written as:
d = d_x + d_ε; (10)
wherein d_x is the wavelet coefficient of the original signal and d_ε is the wavelet coefficient of the noise;
the noise standard deviation is estimated as:
Figure FDA0002526260200000031
wherein
Figure FDA0002526260200000032
is the mean of the wavelet coefficients at each node of the binary tree; with σ_y representing the noisy signal variance:
Figure FDA0002526260200000033
the variance of the original signal is estimated as:
Figure FDA0002526260200000034
thus giving a Bayesian threshold that adapts to the scale:
Figure FDA0002526260200000035
the wavelet coefficients are then filtered with the soft threshold function:
Figure FDA0002526260200000036
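A minimal Python sketch of the thresholding step follows. Because formulas (11) to (14) survive only as figures, the details are assumptions: the noise standard deviation is estimated with the common median-absolute-deviation rule (median of |d| divided by 0.6745), σ_x is treated as a standard deviation as in the usual BayesShrink formulation, and the function names are illustrative.

```python
import math
import statistics


def soft_threshold(d, threshold):
    """Zero small coefficients and shrink the rest toward zero."""
    if abs(d) <= threshold:
        return 0.0
    return math.copysign(abs(d) - threshold, d)


def bayes_shrink_threshold(coeffs):
    """Adaptive threshold T = sigma^2 / sigma_x for one tree node."""
    # Noise level from the median absolute coefficient (assumed estimator).
    sigma_noise = statistics.median([abs(d) for d in coeffs]) / 0.6745
    mean_d = sum(coeffs) / len(coeffs)
    sigma_y_sq = sum((d - mean_d) ** 2 for d in coeffs) / len(coeffs)
    # Signal spread is what remains after removing the noise variance.
    sigma_x = math.sqrt(max(sigma_y_sq - sigma_noise ** 2, 0.0))
    if sigma_x == 0.0:
        # Node looks like pure noise: threshold removes every coefficient.
        return max(abs(d) for d in coeffs)
    return sigma_noise ** 2 / sigma_x


def denoise_node(coeffs):
    """Apply the adaptive soft threshold to all coefficients of a node."""
    t = bayes_shrink_threshold(coeffs)
    return [soft_threshold(d, t) for d in coeffs]
```

Computing the threshold per node is what makes it adaptive to scale: nodes dominated by noise receive a large threshold, while nodes carrying signal energy are barely shrunk.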
4. The Bayesian LSTM model-based fault early warning method for the rotary machine according to claim 1, wherein in step S2, probabilistic principal component analysis (PPCA) is applied to the denoised signals to reduce the dimension of the multi-dimensional data and to handle data uncertainty; PPCA defines a proper probability model for principal component analysis (PCA), thereby overcoming the limitation that traditional principal component analysis simply discards the non-principal components; in PPCA the discarded information is estimated as Gaussian noise, so that the useful information in the original signal is retained to the greatest extent;
given a p-dimensional sample X_{p×n}, probabilistic principal component analysis assumes that the sample can be expressed as X = Wz + μ + ε, where W is the weight matrix of dimension p × q, q ≤ p; z is a q × n random Gaussian vector obeying z ~ N(0, I_q), and is the result of reducing the dimension of X; μ is the sample mean; and ε is noise, assumed to follow a Gaussian distribution with variance σ^2:
Figure FDA0002526260200000041
substituting into the Bayes formula gives the posterior probability of z, P(z|X) ~ N(M^{-1}W^T(X − μ), σ^2 M^{-1}), where M = σ^2 I_q + W^T W; in Bayesian probabilistic principal component analysis, the expectation of the posterior of z, M^{-1}W^T(X − μ), is taken as the result of reducing the dimension of X; only W and σ in the formula are then unknown, and they are estimated by maximizing the likelihood function, with the result:
Figure FDA0002526260200000042
Figure FDA0002526260200000043
wherein λ_j is obtained from the eigenvalue decomposition of the covariance matrix C of the sample X, i.e., C v_j = λ_j v_j, v_j being an eigenvector; U_q = (v_1, v_2, ..., v_q) and Δ_q = diag(λ_1, ..., λ_q);
when q = p, the equation W^{-1}X = z + W^{-1}μ + W^{-1}ε holds, and W^{-1} can be used to trace back from the principal components for preliminary fault diagnosis; W^{-1} is expressed as:
Figure FDA0002526260200000044
wherein w_{12} represents the contribution rate of the second-dimension signal of the data sample X_{p×n} to the first principal component; on this principle, the degree of influence of every signal on the first principal component can be obtained; when a principal component is abnormal, the signal with the highest contribution rate is the most likely to be faulty and is diagnosed first, so that the problem can be found quickly and effectively.
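For reference, the closed-form maximum-likelihood estimates of the standard PPCA formulation (Tipping and Bishop) are written out below in terms of the eigenvalues λ_j, the eigenvector matrix U_q and the diagonal matrix Δ_q defined in this claim. Whether they match the patent's own estimation formulas term by term is an assumption, since those formulas survive only as figures; R denotes an arbitrary q × q orthogonal rotation matrix and is not a symbol used in the claim.

```latex
\sigma^2_{\mathrm{ML}} = \frac{1}{p-q}\sum_{j=q+1}^{p}\lambda_j,
\qquad
W_{\mathrm{ML}} = U_q\left(\Delta_q - \sigma^2_{\mathrm{ML}} I_q\right)^{1/2} R
```

In words, the noise variance is the average of the discarded eigenvalues, which is how the non-principal components are absorbed into the Gaussian noise term instead of being thrown away.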
5. The fault early warning method of the rotating machine based on the Bayesian LSTM model as claimed in claim 1, wherein in step S3, because an artificial neural network has strong nonlinear mapping capability, it is often combined with the C-C algorithm: in the reconstructed phase space, the artificial neural network approximates the mapping between the current state and the future state in order to predict the chaotic time series, and the inputs and outputs of the neural network are determined by the embedding dimension m and the delay time τ of the reconstruction; the delay time τ and the embedding dimension m can be estimated simultaneously using the correlation integral, as follows:
first construct disjoint time series matrices:
Figure FDA0002526260200000051
let T be the maximum value of the delay time τ and t a natural number in [1, T]; k = INT(N/t), where N is the length of the time-series data set; each row of the time series matrix is called a sub-time-series vector; the correlation integral of each subsequence is defined as:
Figure FDA0002526260200000052
where r > 0; M is the number of phase space points, M = N − (m − 1)t; the embedding dimension m may take the values 2, 3, 4, 5; d_{bc} = ||Y_b − Y_c||
Figure FDA0002526260200000053
The correlation integral is a cumulative distribution function and represents the probability that the distance between any two points in the phase space is less than the radius r, and the distance between the points is represented by the infinite norm of the difference of vectors; define the detection statistic as:
S(m, N, r, t) = C(m, N, r, t) − C^m(1, N, r, t); (21)
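The correlation integral and the detection statistic of formula (21) can be sketched in Python as follows. The sketch assumes the usual form of the correlation integral, the fraction of point pairs closer than r under the infinity norm, since formula (20) survives only as a figure. It evaluates the statistic on the whole series and omits the averaging over disjoint subsequences; the function names are illustrative.

```python
def correlation_integral(series, m, t, r):
    """Fraction of pairs of delay vectors whose distance is below r.

    series -- the scalar time series (length N)
    m      -- embedding dimension
    t      -- delay, in samples
    r      -- search radius, r > 0
    """
    big_m = len(series) - (m - 1) * t          # number of phase space points
    points = [[series[i + j * t] for j in range(m)] for i in range(big_m)]
    count = 0
    for b in range(big_m):
        for c in range(b + 1, big_m):
            # Infinity norm of the difference between the two vectors.
            d_bc = max(abs(p - q) for p, q in zip(points[b], points[c]))
            if d_bc < r:
                count += 1
    return 2.0 * count / (big_m * (big_m - 1))


def s_statistic(series, m, t, r):
    """Detection statistic S(m, N, r, t) = C(m, N, r, t) - C(1, N, r, t)**m."""
    return (correlation_integral(series, m, t, r)
            - correlation_integral(series, 1, t, r) ** m)
```

The double loop costs on the order of M squared distance evaluations, which is acceptable for a sketch but would normally be vectorized for long monitoring records.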
the delay time τ may be obtained by the first local minimum of the following equation:
Figure FDA0002526260200000054
where ΔS(m, t, N) is the difference between the maximum and minimum of the statistic S over the radii r_j:
ΔS(m, t, N) = max{S(m, N, r_j, t)} − min{S(m, N, r_j, t)}; (23)
while the delay window τ_w = (m − 1)τ can be obtained from the minimum of the following formula:
Figure FDA0002526260200000055
this gives the embedding dimension m, and the time series is finally reconstructed to obtain the reconstruction matrix F of formula (19); the elements of each row of the matrix are taken as the input nodes of the neural network, and the value τ time steps after the last element of that row is taken as the output node, so as to construct the input and output layers; phase space reconstruction is thus completed, and the reconstructed data are the sample data.
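Once m and τ are known, building the network inputs and outputs amounts to a delay embedding of the series. The Python sketch below shows one way to do it under the reading that each input row is [x_i, x_{i+τ}, ..., x_{i+(m−1)τ}] and the target is the value τ steps after the last element of the row; the function name `delay_embed` is illustrative.

```python
def delay_embed(series, m, tau):
    """Build (inputs, targets) for one-step-ahead training.

    Each input row holds m samples spaced tau apart; the target is the
    sample tau steps after the last element of that row.
    """
    span = (m - 1) * tau
    inputs, targets = [], []
    for i in range(len(series) - span - tau):
        inputs.append([series[i + j * tau] for j in range(m)])
        targets.append(series[i + span + tau])
    return inputs, targets
```

For example, with m = 4 and τ = 3 the first row of a series x_0, x_1, ... is [x_0, x_3, x_6, x_9] and its target is x_12.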
6. The Bayesian LSTM model-based fault early warning method for the rotary machine according to claim 1, wherein the step S4 specifically comprises the following steps:
S4.1, constructing an LSTM prediction model;
S4.2, training the LSTM prediction model;
S4.3, carrying out reliability evaluation on the LSTM model;
S4.4, issuing an early warning by setting a threshold on the basis of the prior information of the historical data set.
7. The Bayesian LSTM model-based fault early warning method for the rotary machine according to claim 6, wherein in step S4.1, the data set obtained after phase space reconstruction is divided into a training set, a validation set and a test set, the training set being used to build the recurrent neural network prediction model, which is built with the long short-term memory (LSTM) recurrent neural network method; the embedding dimension m obtained by the C-C algorithm is the number of input units, and the delay time τ is the spacing between input points;
each LSTM unit consists of 3 gates: a forget gate, an input gate and an output gate;
in the LSTM model used, the function of the forget gate is to discard useless information; the output vector h_{t-1} of the previous moment and the input vector x_t of the current moment are fed in together, and irrelevant information in the input is discarded; the forget gate determines how much of the cell state c_{t-1} of the previous moment is kept in the current cell state c_t; the formula of the forget gate is as follows:
f_t = σ(W_xf·[h_{t-1}, x_t] + b_f); (25)
wherein W_xf and b_f denote the weight matrix and the bias term of the forget gate, respectively; σ is the sigmoid function; and [h_{t-1}, x_t] denotes the concatenation of the two vectors into one longer vector;
the input gate controls the information input to the network; updating the state requires two parts working together: first, deciding which information should be updated, and second, generating a new vector with the tanh layer; the input gate determines how much of the network input x_t at the current moment is saved to the cell state c_t; i_t denotes the input gate, l_t denotes the cell state of the current input, and c_t denotes the cell state at the current moment:
i_t = σ(W_xi·[h_{t-1}, x_t] + b_i); (26)
l_t = tanh(W_xl·[h_{t-1}, x_t] + b_l); (27)
Figure FDA0002526260200000061
wherein W_xi and b_i denote the weight matrix and bias term of the input gate, respectively, and W_xl and b_l denote the weight matrix and bias term of the cell state, respectively;
the output gate controls the output information of the LSTM prediction model: it controls how much of the cell state c_t is output to the current final output value h_t of the LSTM; o_t denotes the output of the output gate:
o_t = σ(W_xo·[h_{t-1}, x_t] + b_o); (29)
Figure FDA0002526260200000062
wherein W_xo and b_o denote the weight matrix and bias term of the output gate, respectively; the final output value of each training group is as follows:
y_t = W_hy·h_t; (31)
where W_hy denotes the weight matrix of the final LSTM model output.
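The gate equations of this claim can be collected into a single forward step, sketched below in plain Python. Formulas (28) and (30) survive only as figures, so the sketch assumes the standard LSTM updates c_t = f_t∘c_{t-1} + i_t∘l_t and h_t = o_t∘tanh(c_t); the `params` dictionary layout and the helper names are illustrative.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, vector, bias):
    """Matrix-vector product plus bias, with weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns (h_t, c_t, y_t)."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(v) for v in affine(params["W_xf"], z, params["b_f"])]
    i_t = [sigmoid(v) for v in affine(params["W_xi"], z, params["b_i"])]
    l_t = [math.tanh(v) for v in affine(params["W_xl"], z, params["b_l"])]
    # Assumed standard cell update: keep part of the old state, add new input.
    c_t = [f * c + i * l for f, c, i, l in zip(f_t, c_prev, i_t, l_t)]
    o_t = [sigmoid(v) for v in affine(params["W_xo"], z, params["b_o"])]
    # Assumed standard output: gated, squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    y_t = [sum(w * h for w, h in zip(row, h_t)) for row in params["W_hy"]]
    return h_t, c_t, y_t
```

Each gate weight matrix has one row per hidden unit and one column per element of the concatenated vector, so with hidden size 2 and input size 1 every gate matrix is 2 × 3.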
8. The Bayesian LSTM model-based fault early warning method for the rotary machine according to claim 6, wherein in step S4.2, the back propagation algorithm is adopted as the LSTM training algorithm; the LSTM has 8 groups of parameters to learn, namely: the weight matrix W_xf and bias term b_f of the forget gate, the weight matrix W_xi and bias term b_i of the input gate, the weight matrix W_xo and bias term b_o of the output gate, and the weight matrix W_xl and bias term b_l for computing the cell state; since the two parts of each weight matrix use different formulas in back propagation, in the subsequent derivation the weight matrices W_xf, W_xi, W_xo and W_xl are each written as two separate matrices: W_fh, W_fx, W_ih, W_ix, W_oh, W_ox, W_lh, W_lx;
in the following formulas, unless otherwise specified, E denotes the loss function; the LSTM has four weighted inputs, corresponding to f_t, i_t, c_t and o_t, through which the error term is passed upward; the specific steps of step S4.2 are as follows:
S4.2.1, propagating the error term backward in time; the formula for propagating the error term back to any earlier time k is:
Figure FDA0002526260200000071
S4.2.2, propagating the error term to the previous layer:
Figure FDA0002526260200000072
S4.2.3, calculating the weight gradients and the bias term gradients:
the error terms δ_{o,t}, δ_{f,t}, δ_{i,t} and δ_{l,t} have been found above, from which the gradients of W_oh, W_ih, W_fh and W_lh at time t are determined; the final gradient is obtained by adding the gradients at the individual times together:
Figure FDA0002526260200000073
Figure FDA0002526260200000074
Figure FDA0002526260200000075
Figure FDA0002526260200000076
the gradients of the bias terms b_f, b_i, b_l and b_o are obtained like the weight gradients, by adding the gradients at the individual times together; the final bias term gradients are as follows:
Figure FDA0002526260200000077
Figure FDA0002526260200000078
Figure FDA0002526260200000079
Figure FDA00025262602000000710
for W_fx, W_ix, W_lx and W_ox, the weight gradients can be calculated directly from the corresponding error terms:
Figure FDA0002526260200000081
Figure FDA0002526260200000082
Figure FDA0002526260200000083
Figure FDA0002526260200000084
9. The Bayesian LSTM model-based fault early warning method for the rotary machine according to claim 6, wherein in step S4.3, the validation set obtained in step S4.1 is used to verify whether the result predicted by the recurrent neural network prediction model is accurate; at present, the accuracy of a recurrent neural network prediction model is judged by two indexes, the mean square error (MSE) and the coefficient of determination R^2; the MSE indicates how far the predicted result deviates from the actual result, but because sample sets differ in scale the value is hard to interpret, while the coefficient of determination R^2 indicates how much better the model is than simply taking the mean; its value lies between 0 and 1, and the closer it is to 1, the better the model;
a Bayesian hypothesis testing method is adopted to quantify the reliability of the model while also taking the prior information of the training set into account, specifically as follows:
assume that the data predicted by the LSTM prediction model are
Figure FDA0002526260200000085
and that the corresponding actual data are y_exp = {y_1, y_2, …, y_n}, n being the number of data points; let
Figure FDA0002526260200000086
denote the residual between the ith vibration signal and the ith predicted signal, giving n residuals {e_1, e_2, …, e_n}; if the residuals e_i obey the normal distribution N(μ, σ_1^2), then the mean of the residuals
Figure FDA0002526260200000087
obeys the normal distribution N(μ, σ_1^2/n); the null hypothesis H_0 and the alternative hypothesis H_1 are established:
H_0: μ = 0, H_1: μ ≠ 0;
assume that the prior probability density of μ is N(0, σ_0^2); σ_0^2 is taken from the validation set as prior information: the mean prediction error of the validation set is computed segment by segment, and the variance of these means is used as σ_0^2; the probability density function of μ is then obtained as:
Figure FDA0002526260200000088
combining this with the normalization property of the normal distribution,
Figure FDA0002526260200000089
the marginal density function with respect to g(μ) is:
Figure FDA00025262602000000810
the Bayes factor is then expressed as:
Figure FDA0002526260200000091
taking the logarithm of both sides of formula (48):
Figure FDA0002526260200000092
where λ is the confidence in the reliability of the recurrent neural network prediction model; when
Figure FDA0002526260200000093
λ → 0, which means that the confidence in the model is 0% and the recurrent neural network prediction model is unreliable; when
Figure FDA0002526260200000094
λ → ∞, which means that the confidence in the model is 100% and the recurrent neural network prediction model is very reliable.
10. The Bayesian LSTM model-based fault early warning method for the rotary machine according to claim 6, wherein in step S4.4, if the prediction of the recurrent neural network prediction model is accurate, the judgment can be made from the residual between the actual value, i.e., the real-time monitored value after noise reduction, dimensionality reduction and phase space reconstruction, and the value predicted by the model;
when a fault occurs in the unit or the monitoring system at a certain moment, the fault is reflected in the signal detected by the monitoring system, which then deviates considerably from the signal in the ideal healthy state; the recurrent neural network prediction model gives its prediction before the fault moment, and the predicted value, being derived from earlier healthy data, is regarded as the ideal healthy data, so that a large difference arises between the predicted value and the value monitored by the monitoring system, and whether to issue an early warning is judged by identifying this large error;
whether an alarm is given is determined by setting a threshold, the threshold being the maximum absolute value e_max of the error between the predicted and actual values of the recurrent neural network prediction model on the validation set; the training set and the validation set are both historical data sampled while the unit is healthy, but the training set is used to build the model, so its error is naturally small overall and offers no reference, whereas the validation set is an application of the model; in addition, a time threshold is set manually when the unit is monitored with the recurrent neural network prediction model, and if the continuous duration for which the error is greater than ζ, or the total such time, exceeds the time threshold, the unit is judged to have a potential fault and needs to be repaired.
CN202010520887.2A 2020-06-05 2020-06-05 Fault early warning method of rotating machinery based on Bayesian LSTM model Pending CN111914875A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010520887.2A CN111914875A (en) 2020-06-05 2020-06-05 Fault early warning method of rotating machinery based on Bayesian LSTM model


Publications (1)

Publication Number Publication Date
CN111914875A true CN111914875A (en) 2020-11-10

Family

ID=73237460

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010520887.2A Pending CN111914875A (en) 2020-06-05 2020-06-05 Fault early warning method of rotating machinery based on Bayesian LSTM model

Country Status (1)

Country Link
CN (1) CN111914875A (en)

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241599A (en) * 2020-11-18 2021-01-19 中国联合网络通信集团有限公司 Method and device for establishing fault analysis model
CN112487910A (en) * 2020-11-24 2021-03-12 中广核工程有限公司 Fault early warning method and system for nuclear turbine system
CN112650166A (en) * 2020-12-14 2021-04-13 云南迦南飞奇科技有限公司 Production line condition big data system based on wireless network and diagnosis method thereof
CN112734201A (en) * 2020-12-31 2021-04-30 国网浙江省电力有限公司电力科学研究院 Multi-equipment overall quality evaluation method based on expected failure probability
CN112766657A (en) * 2020-12-31 2021-05-07 国网浙江省电力有限公司电力科学研究院 Single equipment quality evaluation method based on fault probability and equipment state
CN112990598A (en) * 2021-03-31 2021-06-18 浙江禹贡信息科技有限公司 Reservoir water level time sequence prediction method and system
CN113240099A (en) * 2021-07-09 2021-08-10 北京博华信智科技股份有限公司 LSTM-based rotating machine health state prediction method and device
CN113359682A (en) * 2021-06-30 2021-09-07 西安力传智能技术有限公司 Equipment fault prediction method, device, equipment fault prediction platform and medium
CN113486473A (en) * 2021-07-27 2021-10-08 上海电气风电集团股份有限公司 State monitoring method and system of wind generating set and computer readable storage medium
CN113657454A (en) * 2021-07-23 2021-11-16 杭州安脉盛智能技术有限公司 Autoregressive BiGRU-based nuclear power rotating machine state monitoring method
CN113743670A (en) * 2021-09-08 2021-12-03 电子科技大学 Circuit fault real-time prediction method and verification circuit based on GRU model
CN113762471A (en) * 2021-08-24 2021-12-07 浙江中辰城市应急服务管理有限公司 Phase space reconstruction parameter estimation method based on attention mechanism and Bayesian optimization
CN114239161A (en) * 2021-11-29 2022-03-25 武汉欧格莱液压动力设备有限公司 Mechanical state detection method for hydraulic element
CN114563950A (en) * 2022-01-21 2022-05-31 深圳铂今节能科技有限公司 A sensorless intelligent control method and system for electromechanical equipment
CN114996022A (en) * 2022-07-18 2022-09-02 浙江出海数字技术有限公司 Multi-channel available big data real-time decision making system
CN115018178A (en) * 2022-06-23 2022-09-06 北京华能新锐控制技术有限公司 Power station fan fault early warning method based on deep learning
CN115034410A (en) * 2022-06-06 2022-09-09 浙江理工大学 Multi-source data fusion-based textile machinery operation and maintenance method
CN115567943A (en) * 2022-10-11 2023-01-03 潍柴动力股份有限公司 Internet of vehicles pseudo AP identification method and device
CN115618209A (en) * 2022-09-16 2023-01-17 哈尔滨工业大学 Railway Track Condition Assessment Method Based on Sparse Extreme Learning Machine and Hypothesis Testing
CN116125327A (en) * 2023-02-22 2023-05-16 湘潭大学 Switching power supply fault diagnosis method based on improved wavelet denoising and LSTM
CN116383608A (en) * 2023-04-06 2023-07-04 中国人民解放军空军工程大学 A Small Sample Equipment Fault Online Prediction Method
CN116861164A (en) * 2023-05-08 2023-10-10 华电电力科学研究院有限公司 Turbine operation fault monitoring system
CN116989274A (en) * 2023-08-01 2023-11-03 华南理工大学 An oil and gas pipeline safety monitoring method, system, device and storage medium
CN117353455A (en) * 2023-10-17 2024-01-05 济南泉晓电气设备有限公司 Supervision method of power transmission and transformation system based on artificial intelligence
CN117574114A (en) * 2024-01-15 2024-02-20 安徽农业大学 A method for remote reconstruction of rotating machinery operating data and jump disturbance detection
CN117725480A (en) * 2023-12-16 2024-03-19 国网山东省电力公司青岛供电公司 An intelligent detection method and system for lightning arrester faults
CN117829822A (en) * 2024-03-04 2024-04-05 合肥工业大学 A power transformer fault early warning method and system
CN117992726A (en) * 2023-11-08 2024-05-07 大连蓝雪智能科技有限公司 A multi-level early warning method, device, equipment and medium for rotating machinery
CN118037014A (en) * 2024-04-12 2024-05-14 深圳市中航环海建设工程有限公司 Road construction monitoring system based on Internet of Things
CN118332477A (en) * 2024-06-13 2024-07-12 中国人民解放军海军航空大学 A method for generating equipment fault detection model based on state data analysis
CN118643320A (en) * 2024-05-28 2024-09-13 江南大学 Quality-related minor fault detection method based on dynamic orthogonal subspace
CN119046748A (en) * 2024-07-25 2024-11-29 江西大唐国际新余第二发电有限责任公司 Auxiliary equipment fault diagnosis method and system based on model analysis
CN119272121A (en) * 2024-12-10 2025-01-07 青岛华电高压电气有限公司 A method and system for online intelligent monitoring of high-voltage cable anti-theft return line
CN119533582A (en) * 2025-01-23 2025-02-28 上海且恩数据技术有限公司 A housing safety monitoring method and system based on machine learning
CN119719970A (en) * 2025-03-03 2025-03-28 北京工业大学 A flue gas desulfurization equipment health status monitoring method and system based on deep neural network
CN120063477A (en) * 2025-01-22 2025-05-30 西安交通大学 Sensor fault detection method based on wavelet packet decomposition and long-short-time memory network
CN121051610A (en) * 2025-10-31 2025-12-02 深圳技术大学 Spark MLlib-based actuator fault diagnosis method, Spark MLlib-based actuator fault diagnosis equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190285517A1 (en) * 2017-10-25 2019-09-19 Nanjing Univ. Of Aeronautics And Astronautics Method for evaluating health status of mechanical equipment
CN110501585A (en) * 2019-07-12 2019-11-26 武汉大学 A Transformer Fault Diagnosis Method Based on Bi-LSTM and Dissolved Gas Analysis in Oil

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GAOJUN LIU et al.: "Bayesian Long Short-Term Memory Model for Fault Early Warning of Nuclear Power Turbine", IEEE, 23 March 2020 (2020-03-23), pages 1 - 13 *

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241599A (en) * 2020-11-18 2021-01-19 中国联合网络通信集团有限公司 Method and device for establishing fault analysis model
CN112487910A (en) * 2020-11-24 2021-03-12 中广核工程有限公司 Fault early warning method and system for nuclear turbine system
CN112650166A (en) * 2020-12-14 2021-04-13 云南迦南飞奇科技有限公司 Production line condition big data system based on wireless network and diagnosis method thereof
CN112734201A (en) * 2020-12-31 2021-04-30 国网浙江省电力有限公司电力科学研究院 Multi-equipment overall quality evaluation method based on expected failure probability
CN112766657A (en) * 2020-12-31 2021-05-07 国网浙江省电力有限公司电力科学研究院 Single equipment quality evaluation method based on fault probability and equipment state
CN112766657B (en) * 2020-12-31 2022-07-05 国网浙江省电力有限公司电力科学研究院 Single equipment quality evaluation method based on fault probability and equipment state
CN112734201B (en) * 2020-12-31 2022-07-05 国网浙江省电力有限公司电力科学研究院 Multi-equipment overall quality evaluation method based on expected failure probability
CN112990598A (en) * 2021-03-31 2021-06-18 浙江禹贡信息科技有限公司 Reservoir water level time sequence prediction method and system
CN113359682B (en) * 2021-06-30 2022-12-02 西安力传智能技术有限公司 Equipment fault prediction method, device, equipment fault prediction platform and medium
CN113359682A (en) * 2021-06-30 2021-09-07 西安力传智能技术有限公司 Equipment fault prediction method, device, equipment fault prediction platform and medium
CN113240099A (en) * 2021-07-09 2021-08-10 北京博华信智科技股份有限公司 LSTM-based rotating machine health state prediction method and device
CN113657454B (en) * 2021-07-23 2024-02-23 杭州安脉盛智能技术有限公司 Nuclear power rotating machinery condition monitoring method based on autoregressive BiGRU
CN113657454A (en) * 2021-07-23 2021-11-16 杭州安脉盛智能技术有限公司 Autoregressive BiGRU-based nuclear power rotating machine state monitoring method
CN113486473A (en) * 2021-07-27 2021-10-08 上海电气风电集团股份有限公司 State monitoring method and system of wind generating set and computer readable storage medium
CN113762471A (en) * 2021-08-24 2021-12-07 浙江中辰城市应急服务管理有限公司 Phase space reconstruction parameter estimation method based on attention mechanism and Bayesian optimization
CN113762471B (en) * 2021-08-24 2023-07-28 浙江中辰城市应急服务管理有限公司 A Phase Space Reconstruction Parameter Estimation Method Based on Attention Mechanism and Bayesian Optimization
CN113743670A (en) * 2021-09-08 2021-12-03 电子科技大学 Circuit fault real-time prediction method and verification circuit based on GRU model
CN113743670B (en) * 2021-09-08 2023-05-09 电子科技大学 A Real-time Prediction Method and Verification Circuit of Circuit Fault Based on GRU Model
CN114239161A (en) * 2021-11-29 2022-03-25 武汉欧格莱液压动力设备有限公司 Mechanical state detection method for hydraulic element
CN114563950A (en) * 2022-01-21 2022-05-31 深圳铂今节能科技有限公司 A sensorless intelligent control method and system for electromechanical equipment
CN115034410A (en) * 2022-06-06 2022-09-09 浙江理工大学 Multi-source data fusion-based textile machinery operation and maintenance method
CN115018178A (en) * 2022-06-23 2022-09-06 北京华能新锐控制技术有限公司 Power station fan fault early warning method based on deep learning
CN114996022A (en) * 2022-07-18 2022-09-02 浙江出海数字技术有限公司 Multi-channel available big data real-time decision making system
CN114996022B (en) * 2022-07-18 2024-03-08 山西华美远东科技有限公司 A multi-channel big data real-time decision-making system
CN115618209A (en) * 2022-09-16 2023-01-17 哈尔滨工业大学 Railway Track Condition Assessment Method Based on Sparse Extreme Learning Machine and Hypothesis Testing
CN115567943A (en) * 2022-10-11 2023-01-03 潍柴动力股份有限公司 Internet of vehicles pseudo AP identification method and device
CN116125327A (en) * 2023-02-22 2023-05-16 湘潭大学 Switching power supply fault diagnosis method based on improved wavelet denoising and LSTM
CN116383608A (en) * 2023-04-06 2023-07-04 中国人民解放军空军工程大学 A Small Sample Equipment Fault Online Prediction Method
CN116861164A (en) * 2023-05-08 2023-10-10 华电电力科学研究院有限公司 Turbine operation fault monitoring system
CN116989274B (en) * 2023-08-01 2025-10-17 华南理工大学 Oil and gas pipeline safety monitoring method, system, device and storage medium
CN116989274A (en) * 2023-08-01 2023-11-03 华南理工大学 An oil and gas pipeline safety monitoring method, system, device and storage medium
CN117353455B (en) * 2023-10-17 2024-03-29 济南泉晓电气设备有限公司 Supervision method of power transmission and transformation system based on artificial intelligence
CN117353455A (en) * 2023-10-17 2024-01-05 济南泉晓电气设备有限公司 Supervision method of power transmission and transformation system based on artificial intelligence
CN117992726B (en) * 2023-11-08 2024-09-20 大连蓝雪智能科技有限公司 Multi-stage early warning method, device, equipment and medium for rotary machine
CN117992726A (en) * 2023-11-08 2024-05-07 大连蓝雪智能科技有限公司 A multi-level early warning method, device, equipment and medium for rotating machinery
CN117725480A (en) * 2023-12-16 2024-03-19 国网山东省电力公司青岛供电公司 An intelligent detection method and system for lightning arrester faults
CN117574114B (en) * 2024-01-15 2024-04-19 安徽农业大学 Remote reconstruction and jump disturbance detection method for running data of rotary machine
CN117574114A (en) * 2024-01-15 2024-02-20 安徽农业大学 A method for remote reconstruction of rotating machinery operating data and jump disturbance detection
CN117829822A (en) * 2024-03-04 2024-04-05 合肥工业大学 A power transformer fault early warning method and system
CN117829822B (en) * 2024-03-04 2024-06-04 合肥工业大学 Power transformer fault early warning method and system
CN118037014A (en) * 2024-04-12 2024-05-14 深圳市中航环海建设工程有限公司 Road construction monitoring system based on Internet of Things
CN118643320A (en) * 2024-05-28 2024-09-13 江南大学 Quality-related minor fault detection method based on dynamic orthogonal subspace
CN118332477A (en) * 2024-06-13 2024-07-12 中国人民解放军海军航空大学 A method for generating equipment fault detection model based on state data analysis
CN119046748A (en) * 2024-07-25 2024-11-29 江西大唐国际新余第二发电有限责任公司 Auxiliary equipment fault diagnosis method and system based on model analysis
CN119272121A (en) * 2024-12-10 2025-01-07 青岛华电高压电气有限公司 A method and system for online intelligent monitoring of high-voltage cable anti-theft return line
CN119272121B (en) * 2024-12-10 2025-03-25 青岛华电高压电气有限公司 A method and system for online intelligent monitoring of high-voltage cable anti-theft return line
CN120063477A (en) * 2025-01-22 2025-05-30 西安交通大学 Sensor fault detection method based on wavelet packet decomposition and long-short-time memory network
CN119533582A (en) * 2025-01-23 2025-02-28 上海且恩数据技术有限公司 A housing safety monitoring method and system based on machine learning
CN119719970A (en) * 2025-03-03 2025-03-28 北京工业大学 A flue gas desulfurization equipment health status monitoring method and system based on deep neural network
CN121051610A (en) * 2025-10-31 2025-12-02 深圳技术大学 Spark MLlib-based actuator fault diagnosis method, Spark MLlib-based actuator fault diagnosis equipment and medium

Similar Documents

Publication Publication Date Title
CN111914875A (en) Fault early warning method of rotating machinery based on Bayesian LSTM model
JP7657743B2 (en) Anomaly detection system, anomaly detection method, anomaly detection program, and trained model generation method
Sun et al. Deep transfer learning based on sparse autoencoder for remaining useful life prediction of tool in manufacturing
Liu et al. Bayesian long short-term memory model for fault early warning of nuclear power turbine
CN120217127A (en) Real-time monitoring method and system for thermal power equipment based on edge computing
Baraldi et al. Reconstruction of missing data in multidimensional time series by fuzzy similarity
CN112559598B (en) Telemetry time series data abnormity detection method and system based on graph neural network
CN110083593B (en) Power station operation parameter cleaning and repairing method and repairing system
CN114879612A (en) Blast furnace iron-making process monitoring method based on Local-DBKSSA
Kim et al. Opt-TCAE: Optimal temporal convolutional auto-encoder for boiler tube leakage detection in a thermal power plant using multi-sensor data
Cai et al. Knowledge embedded spatial–temporal graph convolutional networks for remaining useful life prediction
Zhong et al. Industrial Robot Vibration Anomaly Detection Based on Sliding Window One‐Dimensional Convolution Autoencoder
CN118966643A (en) An intelligent production scheduling and early warning method and system for hardware processing
CN120372510A (en) Phase change energy storage thermal reservoir anomaly detection system and method based on deep learning
Tian et al. Anomaly detection with convolutional autoencoder for predictive maintenance
Ashraf et al. DESIGN AND IMPLEMENTATION OF ERROR ISOLATION IN TECHNO METER
Hassani Meta-model structural monitoring with cutting-edge AAE-VMD fusion alongside optimized machine learning methods
Zhou et al. Multisensor‐Based Heavy Machine Faulty Identification Using Sparse Autoencoder‐Based Feature Fusion and Deep Belief Network‐Based Ensemble Learning
Yoo et al. Hybrid explainable anomaly detection framework of gas turbines for feature selection and fault localization
CN120654104A (en) Industrial time sequence event analysis method, equipment and medium based on causal regularization
CN120469364A (en) Anomaly detection method for non-stationary industrial processes based on slow eigendecomposition and Koopman high-dimensional space prediction
CN117556538B (en) Aeroengine aerodynamic instability prediction method and system based on sampling observer
Cao et al. An adaptive UKF algorithm for process fault prognostics
Ganguli Data rectification and detection of trend shifts in jet engine gas path measurements using median filters and fuzzy logic
Hill et al. Automated fault detection for in-situ environmental sensors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication Application publication date: 20201110