Disclosure of Invention
The invention aims to overcome the shortcomings of existing methods by providing a multi-working-condition fault detection method for chemical systems based on local adaptive standardization. The method applies a deep-neural-network variational autoencoder: the data in a local moving window are processed by local adaptive standardization and input into the variational autoencoder model, which detects whether the window shows a deviation trend and thereby judges whether the process is in a normal operating state or has a fault. An early warning is thus issued as soon as the data begin to deviate, reducing the likelihood of chemical accidents as far as possible.
The multi-working-condition fault detection method for chemical systems based on local adaptive standardization provided by the invention comprises the following steps:
(1) obtaining a normal operation data set $D_{history}$ under N working conditions from the historical database of the chemical system; $D_{history}$ has m rows and n columns in total, where m is the number of process variables of the chemical system and n is the total number of operating times;
(2) dividing the normal operation data set $D_{history}$ of step (1) into a training set $D_{train}$ and a verification set $D_{valid}$; $D_{train}$ comprises m rows and $n_{train}$ columns of data and $D_{valid}$ comprises m rows and $n_{valid}$ columns of data, where the proportion a of the training set $D_{train}$ in the historical normal operation data set $D_{history}$ satisfies 60% ≤ a ≤ 90%;
(3) performing local adaptive standardization on the training set $D_{train}$ and the verification set $D_{valid}$ of step (2) to obtain the transformed training set $T_{train}$ and verification set $T_{valid}$, specifically comprising the following steps:
(3-1) using the training set $D_{train}$ of step (2), calculating the global mean standard deviation $\mathrm{gmstd}(D_{train})$ of the m process variables of the chemical system by the following formula, $\mathrm{gmstd}(D_{train})$ comprising m values:
where i denotes the working-condition index of the chemical process, 1 ≤ i ≤ N; $D_{train,i}$ denotes the normal operation data of the i-th working condition in the training set $D_{train}$; $\mathrm{std}(D_{train,i})$ denotes the standard deviation vector of the i-th working condition in $D_{train}$, comprising m values, each obtained by calculating the standard deviation of the corresponding variable; and $n_{train,i}$ denotes the number of normal operation data of the i-th working condition in $D_{train}$;
(3-2) for the k-th normal operation datum $x_k$ in the training set $D_{train}$ of step (2), where k denotes the operating-time index in $D_{train}$, k = 1, 2, ..., $n_{train}$, and $x_k$ comprises the m variable values at time index k, selecting local moving window data $w_k$ with time window t by looking back in time; $w_k$ has m rows and t columns in total, where t is the time window, 10 ≤ t ≤ 100:
using the local moving window data $w_k$, calculating the mean of each of the m variables in $w_k$ to obtain $\mathrm{mean}(w_k)$, which comprises m values;
(3-3) using $\mathrm{gmstd}(D_{train})$ of step (3-1) and $\mathrm{mean}(w_k)$ of step (3-2), performing local adaptive standardization on the local moving window data $w_k$ of step (3-2) so that the m variables of $w_k$ are approximately converted to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(3-4) repeating steps (3-2) and (3-3), processing each normal operation datum in the training set $D_{train}$ in turn to obtain the locally adaptively standardized training set $T_{train}$;
(3-5) for the p-th normal operation datum $x_p$ in the verification set $D_{valid}$ of step (2), where p denotes the operating-time index in $D_{valid}$, p = 1, 2, ..., $n_{valid}$, and $x_p$ comprises the m variable values at time index p, selecting local moving window data $w_p$ with time window t by looking back in time; $w_p$ has m rows and t columns in total, where t is the time window of step (3-2):
using the local moving window data $w_p$, calculating the mean of each of the m variables in $w_p$ to obtain $\mathrm{mean}(w_p)$, which comprises m values;
(3-6) using $\mathrm{gmstd}(D_{train})$ of step (3-1) and $\mathrm{mean}(w_p)$ of step (3-5), performing local adaptive standardization on the local moving window data $w_p$ of step (3-5) so that the m variables of $w_p$ are approximately converted to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(3-7) repeating steps (3-5) and (3-6), processing each normal operation datum in the verification set $D_{valid}$ in turn to obtain the locally adaptively standardized verification set $T_{valid}$;
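The formulas referred to in steps (3-1) and (3-3) are not reproduced in this text. One reading consistent with the stated definitions, offered as an assumption rather than as the original equations, is a sample-count-weighted mean of the per-condition standard deviations, followed by centring each window on its own mean and scaling by that global value. The symbol $\tilde{w}_k$ and the indices j (variable) and τ (column within the window) are introduced here for illustration only.

```latex
\mathrm{gmstd}(D_{train}) =
  \frac{\sum_{i=1}^{N} n_{train,i}\,\mathrm{std}(D_{train,i})}
       {\sum_{i=1}^{N} n_{train,i}},
\qquad
\tilde{w}_k[j,\tau] =
  \frac{w_k[j,\tau] - \mathrm{mean}(w_k)[j]}{\mathrm{gmstd}(D_{train})[j]},
\quad 1 \le j \le m,\; 1 \le \tau \le t
```

Under this reading the offset is local, following the current operating level, while the scale is global, so a window is centred on zero whichever working condition it comes from. The same transformation would apply to $w_p$ in step (3-6).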
(4) constructing a variational autoencoder comprising an encoder part and a decoder part, and training it with the training set $T_{train}$ obtained in step (3-4) to obtain the trained variational autoencoder, specifically comprising the following steps:
(4-1) designing and constructing the encoder using a convolutional neural network, a recurrent neural network or a deep belief network; taking the locally adaptively standardized local moving window data of step (3-3) as the input of the encoder, mapping to obtain the feature vectors $\sigma_k$ and $\mu_k$ of that data; $\sigma_k$ and $\mu_k$ each comprise l values, where l denotes the dimension of the feature vectors, m ≤ l ≤ 4m:
(4-2) using the feature vectors $\sigma_k$ and $\mu_k$ of step (4-1), performing reparameterization to obtain the hidden feature vector $h_k$ of the standardized window data, $h_k$ comprising l values:
$$h_k = \mu_k + \sigma_k \odot \epsilon$$
where $\epsilon$ is randomly sampled from the standard normal distribution $\mathcal{N}(0, I)$ and $\odot$ denotes multiplication of corresponding elements of the vectors;
(4-3) designing and constructing the decoder using a convolutional neural network, a recurrent neural network or a deep belief network; taking the hidden feature vector $h_k$ of step (4-2) as the input of the decoder, reconstructing to obtain reconstructed data with the same dimensions as the locally adaptively standardized local moving window data of step (3-3), m rows and t columns in total:
(4-4) according to the following loss function, using the feature vectors $\sigma_k$ and $\mu_k$ of step (4-1) and the reconstructed data of step (4-3), calculating the error of the locally adaptively standardized local moving window data of step (3-3), i.e. the loss function of the variational autoencoder; the loss function comprises a reconstruction loss and a KL divergence loss, and λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, $10^3 \le \lambda \le 10^6$; the two loss terms are calculated as follows, where j denotes the variable index of the chemical process, 1 ≤ j ≤ m:
(4-5) repeating steps (4-1) to (4-4), inputting each datum of the training set $T_{train}$ of step (3-4) into the variational autoencoder in turn for error calculation, and training the variational autoencoder by the error back-propagation algorithm to obtain the trained variational autoencoder;
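The loss formulas of step (4-4) are likewise not reproduced in this text. As a hedged illustration only, the standard variational autoencoder objective with a squared-error reconstruction term would read as follows. The symbols $\tilde{w}_k$ (standardized window), $\hat{w}_k$ (reconstruction), $\mathcal{L}_{rec}$, $\mathcal{L}_{KL}$ and the latent index r are assumptions introduced here, and the original may define the two terms differently.

```latex
\mathcal{L}_k = \mathcal{L}_{rec} + \lambda\,\mathcal{L}_{KL},
\qquad
\mathcal{L}_{rec} = \sum_{j=1}^{m} \bigl\lVert \hat{w}_k[j,:] - \tilde{w}_k[j,:] \bigr\rVert_2^2,
\qquad
\mathcal{L}_{KL} = \frac{1}{2} \sum_{r=1}^{l}
  \bigl( \mu_{k,r}^2 + \sigma_{k,r}^2 - \ln \sigma_{k,r}^2 - 1 \bigr)
```

The placement of λ on the KL term follows the description of λ as the weighting coefficient of the KL divergence loss relative to the reconstruction loss; the KL expression is the closed form for a diagonal Gaussian against $\mathcal{N}(0, I)$.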
(5) using the trained variational autoencoder obtained in step (4) and the verification set $T_{valid}$ obtained in step (3-7), estimating the confidence interval of the abnormal scores of $T_{valid}$ to obtain the monitoring threshold η used when the variational autoencoder performs the fault detection task, specifically comprising the following steps:
(5-1) taking the locally adaptively standardized local moving window data of step (3-6) as the input of the encoder of the variational autoencoder trained in step (4), mapping to obtain the feature vectors $\sigma_p$ and $\mu_p$ of that data; $\sigma_p$ and $\mu_p$ each comprise l values, where l denotes the dimension of the feature vectors:
(5-2) using the feature vectors $\sigma_p$ and $\mu_p$ of step (5-1), performing reparameterization to obtain the hidden feature vector $h_p$ of the standardized window data, $h_p$ comprising l values:
$$h_p = \mu_p + \sigma_p \odot \epsilon$$
where $\epsilon$ is randomly sampled from the standard normal distribution $\mathcal{N}(0, I)$ and $\odot$ denotes multiplication of corresponding elements of the vectors;
(5-3) taking the hidden feature vector $h_p$ of step (5-2) as the input of the decoder of the variational autoencoder trained in step (4), reconstructing to obtain reconstructed data with the same dimensions as the locally adaptively standardized local moving window data of step (3-6), m rows and t columns in total:
(5-4) according to the following abnormal score formula, using the feature vectors $\sigma_p$ and $\mu_p$ of step (5-1) and the reconstructed data of step (5-3), calculating the abnormal score of the locally adaptively standardized local moving window data of step (3-6); the abnormal score comprises a reconstruction loss and a KL divergence loss, and λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, the same as λ in step (4-4); the two loss terms are calculated as follows, where j denotes the variable index of the chemical process, 1 ≤ j ≤ m:
(5-5) repeating steps (5-1) to (5-4), inputting each datum of the verification set $T_{valid}$ of step (3-7) into the variational autoencoder in turn to calculate its abnormal score, to obtain the abnormal score data set $S_{valid}$ of the verification set $T_{valid}$;
(5-6) fitting a normal distribution to the abnormal score data set $S_{valid}$, and taking the abnormal score at confidence level α of that normal distribution as the monitoring threshold η of the chemical system, 99% ≤ α ≤ 99.99%;
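Step (5-6) does not spell out how the confidence level maps to a score. A minimal reading, assuming a one-sided upper bound and using the sample mean $\bar{s}$ and sample standard deviation $\hat{\sigma}_s$ of $S_{valid}$ (both symbols introduced here for illustration), would be:

```latex
\eta = \bar{s} + z_{\alpha}\,\hat{\sigma}_s,
\qquad z_{\alpha} = \Phi^{-1}(\alpha)
```

Here $\Phi^{-1}$ is the inverse cumulative distribution function of the standard normal distribution, so that $z_{0.99} \approx 2.33$ at the lower end of the stated range of α. A two-sided interval would use $\Phi^{-1}\bigl((1+\alpha)/2\bigr)$ instead; the text does not say which is intended.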
(6) using the variational autoencoder trained in step (4) and the monitoring threshold η obtained in step (5), performing online fault detection on the process data of the chemical system under different working conditions, comprising the following steps:
(6-1) at the current detection time q, collecting process data from the real-time database of the chemical system and selecting local moving window data $w_q$ with time window t by looking back in time; $w_q$ has m rows and t columns in total, where t is the time window of step (3-2):
using the local moving window data $w_q$, calculating the mean of each of the m variables in $w_q$ to obtain $\mathrm{mean}(w_q)$, which comprises m values;
(6-2) using $\mathrm{gmstd}(D_{train})$ of step (3-1) and $\mathrm{mean}(w_q)$ of step (6-1), performing local adaptive standardization on the local moving window data $w_q$ of step (6-1) so that the m variables of $w_q$ are approximately converted to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(6-3) taking the locally adaptively standardized local moving window data of step (6-2) as the input of the encoder of the variational autoencoder trained in step (4), mapping to obtain the feature vectors $\sigma_q$ and $\mu_q$ of that data; the two feature vectors each comprise l values, where l denotes the dimension of the feature vectors and is the same as l in step (4-1):
(6-4) using the feature vectors $\sigma_q$ and $\mu_q$ of step (6-3), performing reparameterization to obtain the hidden feature vector $h_q$ of the standardized window data, $h_q$ comprising l values:
$$h_q = \mu_q + \sigma_q \odot \epsilon$$
where $\epsilon$ is randomly sampled from the standard normal distribution $\mathcal{N}(0, I)$ and $\odot$ denotes multiplication of corresponding elements of the vectors;
(6-5) taking the hidden feature vector $h_q$ of step (6-4) as the input of the decoder of the variational autoencoder trained in step (4), reconstructing to obtain reconstructed data with the same dimensions as the locally adaptively standardized local moving window data of step (6-2), m rows and t columns in total:
(6-6) according to the following abnormal score formula, using the feature vectors $\sigma_q$ and $\mu_q$ of step (6-3) and the reconstructed data of step (6-5), calculating the abnormal score of the locally adaptively standardized local moving window data of step (6-2); the abnormal score comprises a reconstruction loss and a KL divergence loss, and λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, the same as λ in step (4-4); the two loss terms are calculated as follows, where j denotes the variable index of the chemical process, 1 ≤ j ≤ m:
(6-7) comparing the abnormal score of step (6-6) with the monitoring threshold η obtained in step (5): if the abnormal score does not exceed η, the chemical system is currently in a normal operating state, and the method returns to step (6-1) to continue monitoring the online real-time data; if the abnormal score exceeds η, the chemical system currently has a fault and a fault warning is issued, thereby realizing multi-working-condition fault detection of the chemical system based on local adaptive standardization.
The multi-working-condition fault detection method for chemical systems based on local adaptive standardization provided by the invention has the following advantages:
The method differs from existing fault detection methods in that it proposes local adaptive standardization and applies a deep-neural-network variational autoencoder. Existing methods detect how far the current data deviate from the normal operation data; this method instead applies local adaptive standardization and detects whether the data within the local moving window are deviating. It is therefore applicable to any working condition, including working conditions that have not occurred in the historical data as well as those that have. In addition, a variational autoencoder is used to detect whether the current window data show a deviation trend, which gives higher accuracy and stronger generalization than traditional multivariate statistical methods. The method meets the requirements of real-time detection and can be applied to fault detection across all working conditions of a chemical process, so that early fault warnings help prevent chemical accidents or reduce the harm they cause.
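The claim that the method carries over to working conditions absent from the historical data can be illustrated with a small sketch. It assumes that local adaptive standardization subtracts each variable's window mean and divides by a fixed global scale; the function name `standardize_window` and the toy numbers are hypothetical.

```python
from statistics import mean

def standardize_window(window, gmstd):
    """Subtract each variable's window mean and divide by a fixed scale.

    window: list of m rows, each a list of t samples for one variable.
    gmstd:  list of m positive scale values (assumed computed offline).
    """
    out = []
    for row, scale in zip(window, gmstd):
        centre = mean(row)
        out.append([(x - centre) / scale for x in row])
    return out

# The same fluctuation pattern around two different operating levels.
pattern = [0.0, 0.5, -0.5, 1.0, -1.0]
low_load = [[100.0 + d for d in pattern]]
high_load = [[250.0 + d for d in pattern]]
gmstd = [0.5]  # hypothetical global scale for this single variable

z_low = standardize_window(low_load, gmstd)
z_high = standardize_window(high_load, gmstd)
```

Both windows standardize to the same values, because the operating level is removed by the window mean. A model trained on such windows sees only the local fluctuation, which is why a new steady operating level need not look abnormal.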
Detailed Description
The overall flow of the multi-working-condition fault detection method for chemical systems based on local adaptive standardization provided by the invention is shown in figure 1, and the method comprises the following steps:
(1) obtaining a normal operation data set $D_{history}$ under N working conditions from the historical database of the chemical system; $D_{history}$ has m rows and n columns in total, where m is the number of process variables of the chemical system, such as temperature, time, pressure, etc., and n is the total number of operating times;
(2) dividing the normal operation data set $D_{history}$ of step (1) into a training set $D_{train}$ and a verification set $D_{valid}$; $D_{train}$ comprises m rows and $n_{train}$ columns of data and $D_{valid}$ comprises m rows and $n_{valid}$ columns of data, where the proportion a of the training set $D_{train}$ in the historical normal operation data set $D_{history}$ satisfies 60% ≤ a ≤ 90%;
(3) performing local adaptive standardization on the training set $D_{train}$ and the verification set $D_{valid}$ of step (2) to obtain the transformed training set $T_{train}$ and verification set $T_{valid}$, specifically comprising the following steps:
(3-1) using the training set $D_{train}$ of step (2), calculating the global mean standard deviation $\mathrm{gmstd}(D_{train})$ of the m process variables of the chemical system by the following formula, $\mathrm{gmstd}(D_{train})$ comprising m values:
where i denotes the working-condition index of the chemical process, 1 ≤ i ≤ N; $D_{train,i}$ denotes the normal operation data of the i-th working condition in the training set $D_{train}$; $\mathrm{std}(D_{train,i})$ denotes the standard deviation vector of the i-th working condition in $D_{train}$, comprising m values, each obtained by calculating the standard deviation of the corresponding variable; and $n_{train,i}$ denotes the number of normal operation data of the i-th working condition in $D_{train}$;
(3-2) for the k-th normal operation datum $x_k$ in the training set $D_{train}$ of step (2), where k denotes the operating-time index in $D_{train}$, k = 1, 2, ..., $n_{train}$, and $x_k$ comprises the m variable values at time index k, selecting local moving window data $w_k$ with time window t by looking back in time; $w_k$ has m rows and t columns in total, where t is the time window, 10 ≤ t ≤ 100:
using the local moving window data $w_k$, calculating the mean of each of the m variables in $w_k$ to obtain $\mathrm{mean}(w_k)$, which comprises m values;
(3-3) using $\mathrm{gmstd}(D_{train})$ of step (3-1) and $\mathrm{mean}(w_k)$ of step (3-2), performing local adaptive standardization on the local moving window data $w_k$ of step (3-2) so that the m variables of $w_k$ are approximately converted to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(3-4) repeating steps (3-2) and (3-3), processing each normal operation datum in the training set $D_{train}$ in turn to obtain the locally adaptively standardized training set $T_{train}$;
(3-5) for the p-th normal operation datum $x_p$ in the verification set $D_{valid}$ of step (2), where p denotes the operating-time index in $D_{valid}$, p = 1, 2, ..., $n_{valid}$, and $x_p$ comprises the m variable values at time index p, selecting local moving window data $w_p$ with time window t by looking back in time; $w_p$ has m rows and t columns in total, where t is the time window of step (3-2):
using the local moving window data $w_p$, calculating the mean of each of the m variables in $w_p$ to obtain $\mathrm{mean}(w_p)$, which comprises m values;
(3-6) using $\mathrm{gmstd}(D_{train})$ of step (3-1) and $\mathrm{mean}(w_p)$ of step (3-5), performing local adaptive standardization on the local moving window data $w_p$ of step (3-5) so that the m variables of $w_p$ are approximately converted to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(3-7) repeating steps (3-5) and (3-6), processing each normal operation datum in the verification set $D_{valid}$ in turn to obtain the locally adaptively standardized verification set $T_{valid}$;
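Steps (3-1) to (3-3) can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the original formulas: the global mean standard deviation is taken as the sample-count-weighted mean of the per-condition standard deviations, the population standard deviation is used, and standardization subtracts the window mean and divides by that global value. The function names and toy data are hypothetical.

```python
from statistics import mean, pstdev

def gmstd(conditions):
    """Sample-count-weighted mean of per-condition standard deviations.

    conditions: one entry per working condition; each entry is a list of
    m rows (one per process variable) holding that condition's samples.
    Assumptions: population standard deviation, weights n_train_i.
    """
    m = len(conditions[0])
    counts = [len(cond[0]) for cond in conditions]
    total = sum(counts)
    return [
        sum(n * pstdev(cond[j]) for n, cond in zip(counts, conditions)) / total
        for j in range(m)
    ]

def local_adaptive_standardize(window, scale):
    """Centre each variable on its window mean, then divide by its scale."""
    out = []
    for row, s in zip(window, scale):
        centre = mean(row)
        out.append([(x - centre) / s for x in row])
    return out

# Two toy working conditions, m = 2 variables, 4 samples each.
cond_1 = [[1.0, 3.0, 1.0, 3.0], [10.0, 10.0, 14.0, 14.0]]
cond_2 = [[100.0, 104.0, 100.0, 104.0], [50.0, 52.0, 50.0, 52.0]]
scale = gmstd([cond_1, cond_2])

# One window (t = 3) taken from the second condition.
w_k = [[100.0, 104.0, 100.0], [50.0, 52.0, 50.0]]
z_k = local_adaptive_standardize(w_k, scale)
```

Computing the standard deviation within each working condition, rather than over the pooled data, keeps the large level differences between conditions out of the scale. Every row of `z_k` sums to zero up to rounding, since the window mean has been removed.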
(4) constructing a variational autoencoder comprising an encoder part and a decoder part, and training it with the training set $T_{train}$ obtained in step (3-4) to obtain the trained variational autoencoder, specifically comprising the following steps:
(4-1) designing and constructing the encoder using a convolutional neural network, a recurrent neural network or a deep belief network; taking the locally adaptively standardized local moving window data of step (3-3) as the input of the encoder, mapping to obtain the feature vectors $\sigma_k$ and $\mu_k$ of that data; $\sigma_k$ and $\mu_k$ each comprise l values, where l denotes the dimension of the feature vectors, m ≤ l ≤ 4m:
(4-2) using the feature vectors $\sigma_k$ and $\mu_k$ of step (4-1), performing reparameterization to obtain the hidden feature vector $h_k$ of the standardized window data, $h_k$ comprising l values:
$$h_k = \mu_k + \sigma_k \odot \epsilon$$
where $\epsilon$ is randomly sampled from the standard normal distribution $\mathcal{N}(0, I)$ and $\odot$ denotes multiplication of corresponding elements of the vectors;
(4-3) designing and constructing the decoder using a convolutional neural network, a recurrent neural network or a deep belief network; taking the hidden feature vector $h_k$ of step (4-2) as the input of the decoder, reconstructing to obtain reconstructed data with the same dimensions as the locally adaptively standardized local moving window data of step (3-3), m rows and t columns in total:
(4-4) according to the following loss function, using the feature vectors $\sigma_k$ and $\mu_k$ of step (4-1) and the reconstructed data of step (4-3), calculating the error of the locally adaptively standardized local moving window data of step (3-3), i.e. the loss function of the variational autoencoder; the loss function comprises a reconstruction loss and a KL divergence loss, and λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, $10^3 \le \lambda \le 10^6$; the two loss terms are calculated as follows, where j denotes the variable index of the chemical process, 1 ≤ j ≤ m:
(4-5) repeating steps (4-1) to (4-4), inputting each datum of the training set $T_{train}$ of step (3-4) into the variational autoencoder in turn for error calculation, and training the variational autoencoder by the error back-propagation algorithm to obtain the trained variational autoencoder, which is used for the fault detection task of the chemical system;
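The reparameterization of step (4-2) and the two-part loss of step (4-4) can be sketched as below. The reparameterization follows the formula given in the text. The forms of the two loss terms are assumptions, since their formulas are not reproduced here: a sum of squared errors for reconstruction and the closed-form KL divergence of a diagonal Gaussian from the standard normal. All function names are hypothetical.

```python
import math
import random

def reparameterize(mu, sigma, rng):
    """h = mu + sigma * eps, with eps drawn from a standard normal."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

def reconstruction_loss(window, recon):
    """Assumed form: sum of squared errors over all m x t entries."""
    return sum(
        (a - b) ** 2
        for row_w, row_r in zip(window, recon)
        for a, b in zip(row_w, row_r)
    )

def kl_loss(mu, sigma):
    """KL divergence of N(mu, diag(sigma^2)) from N(0, I)."""
    return 0.5 * sum(
        m * m + s * s - 2.0 * math.log(s) - 1.0 for m, s in zip(mu, sigma)
    )

def vae_loss(window, recon, mu, sigma, lam):
    """Reconstruction loss plus lam times the KL divergence loss."""
    return reconstruction_loss(window, recon) + lam * kl_loss(mu, sigma)

rng = random.Random(0)
mu = [0.0, 0.0]
sigma = [1.0, 1.0]
h = reparameterize(mu, sigma, rng)

window = [[0.0, 1.0], [2.0, 3.0]]   # stands in for a standardized window
recon = [[0.0, 1.5], [2.0, 3.0]]    # stands in for the decoder output
loss = vae_loss(window, recon, mu, sigma, lam=1e5)
```

In the method described, `mu` and `sigma` would come from the encoder and `recon` from the decoder; fixed values are used here so that the arithmetic can be followed by hand. With `mu` at zero and `sigma` at one the KL term vanishes, and the loss reduces to the single squared error of 0.25.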
(5) using the trained variational autoencoder obtained in step (4) and the verification set $T_{valid}$ obtained in step (3-7), estimating the confidence interval of the abnormal scores of $T_{valid}$ to obtain the monitoring threshold η used when the variational autoencoder performs the fault detection task, specifically comprising the following steps:
(5-1) taking the locally adaptively standardized local moving window data of step (3-6) as the input of the encoder of the variational autoencoder trained in step (4), mapping to obtain the feature vectors $\sigma_p$ and $\mu_p$ of that data; $\sigma_p$ and $\mu_p$ each comprise l values, where l denotes the dimension of the feature vectors and is the same as l in step (4-1):
(5-2) using the feature vectors $\sigma_p$ and $\mu_p$ of step (5-1), performing reparameterization to obtain the hidden feature vector $h_p$ of the standardized window data, $h_p$ comprising l values:
$$h_p = \mu_p + \sigma_p \odot \epsilon$$
where $\epsilon$ is randomly sampled from the standard normal distribution $\mathcal{N}(0, I)$ and $\odot$ denotes multiplication of corresponding elements of the vectors;
(5-3) taking the hidden feature vector $h_p$ of step (5-2) as the input of the decoder of the variational autoencoder trained in step (4), reconstructing to obtain reconstructed data with the same dimensions as the locally adaptively standardized local moving window data of step (3-6), m rows and t columns in total:
(5-4) according to the following abnormal score formula, using the feature vectors $\sigma_p$ and $\mu_p$ of step (5-1) and the reconstructed data of step (5-3), calculating the abnormal score of the locally adaptively standardized local moving window data of step (3-6); the abnormal score comprises a reconstruction loss and a KL divergence loss, and λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, the same as λ in step (4-4); the two loss terms are calculated as follows, where j denotes the variable index of the chemical process, 1 ≤ j ≤ m:
(5-5) repeating steps (5-1) to (5-4), inputting each datum of the verification set $T_{valid}$ of step (3-7) into the variational autoencoder in turn to calculate its abnormal score, to obtain the abnormal score data set $S_{valid}$ of the verification set $T_{valid}$;
(5-6) fitting a normal distribution to the abnormal score data set $S_{valid}$, and taking the abnormal score at confidence level α of that normal distribution as the monitoring threshold η of the chemical system, 99% ≤ α ≤ 99.99%;
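Step (5-6) can be sketched with the standard library's `statistics.NormalDist`. The sketch assumes that the threshold is the one-sided α-quantile of a normal distribution fitted to the verification scores; the function name `monitoring_threshold` and the toy scores are hypothetical.

```python
from statistics import NormalDist

def monitoring_threshold(scores, alpha):
    """Fit a normal distribution to the scores and return its alpha-quantile.

    scores: abnormal scores of the verification set (the set S_valid).
    alpha:  confidence level as a fraction, e.g. 0.99 for 99%.
    """
    fitted = NormalDist.from_samples(scores)  # sample mean and sample stdev
    return fitted.inv_cdf(alpha)

s_valid = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy scores, not real results
eta_99 = monitoring_threshold(s_valid, 0.99)
eta_9999 = monitoring_threshold(s_valid, 0.9999)
```

A higher confidence level moves the threshold further into the upper tail, trading fewer false alarms for later detection. If the scores are clearly non-normal, for example strongly right-skewed, the normal fit may place η poorly; the text specifies the normal assumption, so the sketch keeps it.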
(6) using the variational autoencoder trained in step (4) and the monitoring threshold η obtained in step (5), performing online fault detection on the process data of the chemical system under different working conditions, comprising the following steps:
(6-1) at the current detection time q, collecting process data from the real-time database of the chemical system and selecting local moving window data $w_q$ with time window t by looking back in time; $w_q$ has m rows and t columns in total, where t is the time window of step (3-2):
using the local moving window data $w_q$, calculating the mean of each of the m variables in $w_q$ to obtain $\mathrm{mean}(w_q)$, which comprises m values;
(6-2) using $\mathrm{gmstd}(D_{train})$ of step (3-1) and $\mathrm{mean}(w_q)$ of step (6-1), performing local adaptive standardization on the local moving window data $w_q$ of step (6-1) so that the m variables of $w_q$ are approximately converted to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(6-3) taking the locally adaptively standardized local moving window data of step (6-2) as the input of the encoder of the variational autoencoder trained in step (4), mapping to obtain the feature vectors $\sigma_q$ and $\mu_q$ of that data; the two feature vectors each comprise l values, where l denotes the dimension of the feature vectors and is the same as l in step (4-1):
(6-4) using the feature vectors $\sigma_q$ and $\mu_q$ of step (6-3), performing reparameterization to obtain the hidden feature vector $h_q$ of the standardized window data, $h_q$ comprising l values:
$$h_q = \mu_q + \sigma_q \odot \epsilon$$
where $\epsilon$ is randomly sampled from the standard normal distribution $\mathcal{N}(0, I)$ and $\odot$ denotes multiplication of corresponding elements of the vectors;
(6-5) taking the hidden feature vector $h_q$ of step (6-4) as the input of the decoder of the variational autoencoder trained in step (4), reconstructing to obtain reconstructed data with the same dimensions as the locally adaptively standardized local moving window data of step (6-2), m rows and t columns in total:
(6-6) according to the following abnormal score formula, using the feature vectors $\sigma_q$ and $\mu_q$ of step (6-3) and the reconstructed data of step (6-5), calculating the abnormal score of the locally adaptively standardized local moving window data of step (6-2); the abnormal score comprises a reconstruction loss and a KL divergence loss, and λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, the same as λ in step (4-4); the two loss terms are calculated as follows, where j denotes the variable index of the chemical process, 1 ≤ j ≤ m:
(6-7) comparing the abnormal score of step (6-6) with the monitoring threshold η obtained in step (5): if the abnormal score does not exceed η, the chemical system is currently in a normal operating state, and the method returns to step (6-1) to continue monitoring the online real-time data; if the abnormal score exceeds η, the chemical system currently has a fault and a fault warning is issued, thereby realizing multi-working-condition fault detection of the chemical system based on local adaptive standardization.
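The online loop of step (6) can be sketched as a generator that buffers the most recent t samples, standardizes the window and compares a score with η. The trained variational autoencoder is replaced here by a caller-supplied `score_fn`, which is a stand-in and not part of the method as described; the standardization again assumes "subtract window mean, divide by global scale", and all names are hypothetical.

```python
from collections import deque
from statistics import mean

def monitor(stream, t, gmstd, score_fn, eta):
    """Yield (time_index, score, is_fault) once t samples are buffered.

    stream:   iterable of samples, each a list of m variable values.
    gmstd:    list of m scale values computed offline from training data.
    score_fn: callable mapping a standardized m x t window to a float;
              stands in for the trained variational autoencoder.
    """
    buffer = deque(maxlen=t)
    for q, sample in enumerate(stream):
        buffer.append(sample)
        if len(buffer) < t:
            continue  # not enough history for a full window yet
        # Transpose the buffer into m rows of t samples each.
        window = [list(row) for row in zip(*buffer)]
        standardized = []
        for row, scale in zip(window, gmstd):
            centre = mean(row)
            standardized.append([(x - centre) / scale for x in row])
        score = score_fn(standardized)
        yield q, score, score > eta

def toy_score(window):
    """Placeholder score: sum of squares of the standardized window."""
    return sum(x * x for row in window for x in row)

# One variable, steady at 10.0 and then jumping to 20.0.
stream = [[10.0]] * 5 + [[20.0]]
results = list(monitor(stream, t=3, gmstd=[1.0], score_fn=toy_score, eta=1.0))
flags = [is_fault for _, _, is_fault in results]
```

The first window is available at index 2, and only the last window, which contains the jump, exceeds the threshold. How the first t - 1 samples are handled is not specified in the text; this sketch simply waits for a full window.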
Embodiments of the method of the present invention are described below with reference to the accompanying drawings:
(1) obtaining a normal operation data set $D_{history}$ under 2 working conditions from the historical database of the chemical system; $D_{history}$ has m rows and n columns in total, where m = 42 is the number of process variables of the chemical system and n = 16000 is the total number of operating times, each working condition comprising 8000 operating times;
(2) dividing the normal operation data set $D_{history}$ of step (1) into a training set $D_{train}$ and a verification set $D_{valid}$; $D_{train}$ comprises m rows and $n_{train}$ columns of data and $D_{valid}$ comprises m rows and $n_{valid}$ columns of data, where the proportion of the training set $D_{train}$ in the historical normal operation data set $D_{history}$ is a = 75%, $n_{train}$ = 12000 and $n_{valid}$ = 4000;
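The sizes in this embodiment can be checked with a short calculation. The sketch assumes the split is applied separately within each working condition, which is consistent with the per-condition training count of 6000 used later in the embodiment but is not stated explicitly; the function name is hypothetical.

```python
def split_by_condition(condition_lengths, a):
    """Return (n_train_i, n_valid_i) per condition for training ratio a."""
    splits = []
    for n_i in condition_lengths:
        n_train_i = round(n_i * a)
        splits.append((n_train_i, n_i - n_train_i))
    return splits

# Two working conditions of 8000 operating times each, a = 75%.
splits = split_by_condition([8000, 8000], 0.75)
n_train = sum(s[0] for s in splits)
n_valid = sum(s[1] for s in splits)
```

Each condition contributes 6000 training and 2000 verification columns, giving the totals of 12000 and 4000 stated above.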
(3) performing local adaptive standardization on the training set $D_{train}$ and the verification set $D_{valid}$ of step (2) to obtain the transformed training set $T_{train}$ and verification set $T_{valid}$, specifically comprising the following steps:
(3-1) using the training set $D_{train}$ of step (2), calculating the global mean standard deviation $\mathrm{gmstd}(D_{train})$ of the m process variables of the chemical system by the following formula, $\mathrm{gmstd}(D_{train})$ comprising m = 42 values:
where i denotes the working-condition index of the chemical process, 1 ≤ i ≤ N; $D_{train,i}$ denotes the normal operation data of the i-th working condition in the training set $D_{train}$; $\mathrm{std}(D_{train,i})$ denotes the standard deviation vector of the i-th working condition in $D_{train}$, comprising m values, each obtained by calculating the standard deviation of the corresponding variable; and $n_{train,i}$ denotes the number of normal operation data of the i-th working condition in $D_{train}$, $n_{train,i}$ = 6000;
(3-2) for the k-th normal operation datum $x_k$ in the training set $D_{train}$ of step (2), where k denotes the operating-time index in $D_{train}$, k = 1, 2, ..., $n_{train}$, and $x_k$ comprises the m variable values at time index k, selecting local moving window data $w_k$ with time window t by looking back in time; $w_k$ has m rows and t columns in total, where t is the time window, m = 42, t = 30:
using the local moving window data $w_k$, calculating the mean of each of the m variables in $w_k$ to obtain $\mathrm{mean}(w_k)$, which comprises m values;
(3-3) using $\mathrm{gmstd}(D_{train})$ of step (3-1) and $\mathrm{mean}(w_k)$ of step (3-2), performing local adaptive standardization on the local moving window data $w_k$ of step (3-2) so that the m variables of $w_k$ are approximately converted to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(3-4) repeating steps (3-2) and (3-3), processing each normal operation datum in the training set $D_{train}$ in turn to obtain the locally adaptively standardized training set $T_{train}$;
(3-5) for the p-th normal operation datum $x_p$ in the verification set $D_{valid}$ of step (2), where p denotes the operating-time index in $D_{valid}$, p = 1, 2, ..., $n_{valid}$, and $x_p$ comprises the m variable values at time index p, selecting local moving window data $w_p$ with time window t by looking back in time; $w_p$ has m rows and t columns in total, where t is the time window of step (3-2), m = 42, t = 30:
using the local moving window data $w_p$, calculating the mean of each of the m variables in $w_p$ to obtain $\mathrm{mean}(w_p)$, which comprises m values;
(3-6) using $\mathrm{gmstd}(D_{train})$ of step (3-1) and $\mathrm{mean}(w_p)$ of step (3-5), performing local adaptive standardization on the local moving window data $w_p$ of step (3-5) so that the m variables of $w_p$ are approximately converted to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(3-7) repeating steps (3-5) and (3-6), processing each normal operation datum in the verification set $D_{valid}$ in turn to obtain the locally adaptively standardized verification set $T_{valid}$;
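The window selection of steps (3-2) and (3-5) can be sketched as a generator over time indices. Two assumptions are made that the text does not settle: the window ends at, and includes, the current time index, and indices with fewer than t samples of history are skipped. The function name `iter_windows` and the toy sizes are hypothetical.

```python
def iter_windows(data, t):
    """Yield (k, window) for every time index k with t samples of history.

    data: list of m rows, each with n samples. Each window has m rows and
    t columns, ending at (and including) column k.
    """
    n = len(data[0])
    for k in range(t - 1, n):
        yield k, [row[k - t + 1 : k + 1] for row in data]

# Toy sizes; the embodiment uses m = 42 and t = 30.
m, n, t = 3, 10, 4
data = [[float(j * 100 + k) for k in range(n)] for j in range(m)]
windows = list(iter_windows(data, t))
```

A contiguous run of n samples yields n - t + 1 full windows under these assumptions, so a single working condition of 6000 training samples with t = 30 would give 5971 windows. Whether windows may straddle the boundary between two working conditions is not stated; keeping each condition separate avoids mixing operating levels inside one window.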
(4) constructing a variational autoencoder comprising an encoder part and a decoder part, and training it with the training set $T_{train}$ obtained in step (3-4) to obtain the trained variational autoencoder, specifically comprising the following steps:
(4-1) designing and constructing the encoder using bidirectional long short-term memory (BiLSTM) networks and a linear layer; the structure of the encoder is shown in figure 2 and comprises two BiLSTM layers and a linear layer; taking the locally adaptively standardized local moving window data of step (3-3) as the input of the encoder, mapping to obtain the feature vectors $\sigma_k$ and $\mu_k$ of that data; $\sigma_k$ and $\mu_k$ each comprise l values, where l denotes the dimension of the feature vectors, l = 50:
(4-2) using the feature vectors $\sigma_k$ and $\mu_k$ of step (4-1), performing reparameterization to obtain the hidden feature vector $h_k$ of the standardized window data, $h_k$ comprising l values, l = 50:
$$h_k = \mu_k + \sigma_k \odot \epsilon$$
where $\epsilon$ is randomly sampled from the standard normal distribution $\mathcal{N}(0, I)$ and $\odot$ denotes multiplication of corresponding elements of the vectors;
(4-3) Design and construct the decoder from bidirectional long short-term memory (BiLSTM) layers and a linear layer. The decoder has the structure shown in Fig. 2 and comprises two BiLSTM layers and a linear layer. The hidden feature vector h_k of step (4-2) is taken as the input of the decoder and reconstructed into data with the same dimensions as the locally adaptively standardized local moving window data of step (3-3); the reconstructed data has m rows and t columns, m = 42, t = 30;
(4-4) Using the feature vectors σ_k and μ_k of step (4-1) and the reconstructed data of step (4-3), calculate the error with respect to the locally adaptively standardized local moving window data of step (3-3), i.e. the loss function of the variational automatic encoder. The loss function comprises a reconstruction loss and a KL divergence loss; λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, λ = 10^5. Both parts of the loss are calculated over the chemical process variables with index j, 1 ≤ j ≤ m, m = 42;
(4-5) Repeat steps (4-1) to (4-4), inputting each data of the training set T_train of step (3-4) into the variational automatic encoder in turn for error calculation, and train the variational automatic encoder by the error back-propagation algorithm to obtain the trained variational automatic encoder;
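The two-part loss of step (4-4) can be sketched as below. The exact expressions of the invention are not reproduced here, so the sketch makes three labeled assumptions: the reconstruction loss is a sum of squared errors over the m by t window, the KL divergence loss is the standard closed form between a diagonal Gaussian N(μ, σ²) and the standard normal distribution, and the total is the reconstruction loss plus λ times the KL divergence loss. All function names and toy values are hypothetical.

```python
import math


def reconstruction_loss(window, reconstructed):
    """Assumed form: sum of squared errors over all m rows and t columns."""
    return sum(
        (a - b) ** 2
        for row_w, row_r in zip(window, reconstructed)
        for a, b in zip(row_w, row_r)
    )


def kl_divergence(mu, sigma):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over the l dimensions."""
    return 0.5 * sum(
        m * m + s * s - 1.0 - math.log(s * s) for m, s in zip(mu, sigma)
    )


def vae_loss(window, reconstructed, mu, sigma, lam):
    """Assumed combination: reconstruction loss + lam * KL divergence loss."""
    return reconstruction_loss(window, reconstructed) + lam * kl_divergence(mu, sigma)


# Toy values; lam is passed explicitly, the embodiment states its own value in step (4-4).
window = [[1.0, 2.0], [3.0, 4.0]]
reconstructed = [[1.0, 2.0], [3.0, 5.0]]
toy_loss = vae_loss(window, reconstructed, mu=[1.0], sigma=[1.0], lam=10.0)
print(toy_loss)
```

The same function can serve as the anomaly score of steps (5-4) and (6-6), since those steps reuse the two loss terms and the same λ.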
(5) Using the trained variational automatic encoder obtained in step (4) and the verification set T_valid obtained in step (3-7), estimate the confidence interval of the anomaly scores of the verification set T_valid to obtain the monitoring threshold η used when the variational automatic encoder performs the fault detection task, specifically comprising the following steps:
(5-1) Take the locally adaptively standardized local moving window data of step (3-6) as the input of the encoder in the variational automatic encoder trained in step (4), and map it to its feature vectors σ_p and μ_p; σ_p and μ_p each have l values, where l is the dimension of the feature vector, l = 50;
(5-2) Using the feature vectors σ_p and μ_p of step (5-1), perform reparameterization to obtain the hidden feature vector h_p of the window data; h_p contains l values, l = 50:
h_p = μ_p + σ_p ⊙ ε
where ε is randomly sampled from the standard normal distribution, and ⊙ denotes multiplication of corresponding elements of the vectors;
(5-3) Take the hidden feature vector h_p of step (5-2) as the input of the decoder in the variational automatic encoder trained in step (4), and reconstruct data with the same dimensions as the locally adaptively standardized local moving window data of step (3-6); the reconstructed data has m rows and t columns, m = 42, t = 30;
(5-4) Using the feature vectors σ_p and μ_p of step (5-1) and the reconstructed data of step (5-3), calculate the anomaly score of the locally adaptively standardized local moving window data of step (3-6). The anomaly score comprises a reconstruction loss and a KL divergence loss; λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss and is the same as λ in step (4-4), λ = 10^5. Both parts of the loss are calculated over the chemical process variables with index j, 1 ≤ j ≤ m, m = 42;
(5-5) Repeat steps (5-1) to (5-4), inputting each data of the verification set T_valid of step (3-7) into the variational automatic encoder in turn to calculate its anomaly score, to obtain the anomaly score data set S_valid of the verification set T_valid;
(5-6) Assuming that the anomaly score data set S_valid follows a normal distribution, take the anomaly score at normal-distribution confidence α in S_valid as the monitoring threshold η of the chemical system, α = 99.9%;
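The threshold estimate of step (5-6) can be sketched with the Python standard library. The sketch assumes a one-sided reading of the confidence level, i.e. η is the α-quantile of a normal distribution fitted to the scores in S_valid; a two-sided interval would give a different value. The function name and the toy scores are hypothetical.

```python
from statistics import NormalDist


def monitoring_threshold(scores, alpha=0.999):
    """Fit a normal distribution to the scores and return its alpha-quantile.

    Assumption: one-sided upper bound, eta = mean + z_alpha * standard deviation.
    """
    fitted = NormalDist.from_samples(scores)  # sample mean and sample std
    return fitted.inv_cdf(alpha)


# Toy anomaly scores standing in for S_valid.
s_valid = [1.0, 2.0, 3.0, 4.0, 5.0]
eta = monitoring_threshold(s_valid, alpha=0.999)
print(eta)
```

With α = 99.9%, roughly one normal window in a thousand is expected to exceed η if the normality assumption holds, which sets the false-alarm rate of the monitor.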
(6) Using the variational automatic encoder trained in step (4) and the monitoring threshold η obtained in step (5), perform online fault detection on the process data of the chemical system under different working conditions, comprising the following steps:
(6-1) Collect process data of 4 working conditions from the database of the chemical system as the test set D_test, with m rows and n_test columns in total, m = 42, n_test = 4 × (2000 + 4 × 1650). The 4 working conditions comprise the 2 historical working conditions of step (1) and 2 new working conditions, and are used to test the fault detection effect of the invention under different working conditions. Each working condition includes 2000 normal operation data and fault operation data of 4 fault types, with 1650 operation data for each fault type. Within the data of each fault type, the first 450 operation data are still normal operation data, the fault is introduced from the 450th operation data, and the last 1200 operation data are fault operation data. The 4 fault types are shown in the following table:
Table 1. The 4 fault types in the test data
For the q-th operation data x_q in the test set D_test, where q = 1, 2, ..., n_test is the operating-time index in D_test and x_q contains the m variable values at time index q, select forward in time the local moving window data w_q with time window t. w_q has m rows and t columns, where t is the time window of step (3-2), m = 42 and t = 30. Using the local moving window data w_q, calculate the mean of each of the m variables in w_q to obtain mean(w_q), which contains m numbers;
(6-2) Using gmstd(D_train) of step (3-1) and mean(w_q) of step (6-1), perform local adaptive standardization on the local moving window data w_q of step (6-1), approximately converting the m variables of w_q to a standard normal distribution, to obtain the locally adaptively standardized local moving window data;
(6-3) Take the locally adaptively standardized local moving window data of step (6-2) as the input of the encoder in the variational automatic encoder trained in step (4), and map it to its feature vectors σ_q and μ_q; both feature vectors have l values, where l is the dimension of the feature vector and is the same as l in step (4-1), l = 50;
(6-4) Using the feature vectors σ_q and μ_q of step (6-3), perform reparameterization to obtain the hidden feature vector h_q of the window data; h_q contains l values, l = 50:
h_q = μ_q + σ_q ⊙ ε
where ε is randomly sampled from the standard normal distribution, and ⊙ denotes multiplication of corresponding elements of the vectors;
(6-5) Take the hidden feature vector h_q of step (6-4) as the input of the decoder in the variational automatic encoder trained in step (4), and reconstruct data with the same dimensions as the locally adaptively standardized local moving window data of step (6-2); the reconstructed data has m rows and t columns, m = 42, t = 30;
(6-6) Using the feature vectors σ_q and μ_q of step (6-3) and the reconstructed data of step (6-5), calculate the anomaly score of the locally adaptively standardized local moving window data of step (6-2). The anomaly score comprises a reconstruction loss and a KL divergence loss; λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss and is the same as λ in step (4-4), λ = 10^5. Both parts of the loss are calculated over the chemical process variables with index j, 1 ≤ j ≤ m, m = 42;
(6-7) Compare the anomaly score of step (6-6) with the monitoring threshold η obtained in step (5): if the anomaly score is below η, the chemical system is currently in a normal operation state, and the method returns to step (6-1) to continue monitoring the online real-time data; if the anomaly score is above η, the chemical system currently has a fault and a fault warning is issued, thereby realizing multi-working-condition fault detection of the chemical system based on local adaptive standardization.
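The decision rule of step (6-7) amounts to a threshold comparison applied to each new anomaly score. The sketch below assumes that a score exactly equal to η is treated as normal, a boundary case the text leaves open; the function names and the toy scores are hypothetical.

```python
def is_fault(anomaly_score, eta):
    """Return True when the score exceeds the monitoring threshold.

    Assumption: a score equal to eta counts as normal operation.
    """
    return anomaly_score > eta


def first_alarm(scores, eta):
    """Return the 1-based time index of the first fault warning, or None."""
    for index, score in enumerate(scores, start=1):
        if is_fault(score, eta):
            return index
    return None


# Toy online scores; the fourth window is the first to cross the threshold.
online_scores = [0.8, 0.9, 1.1, 2.5, 3.0]
eta = 2.0
alarm_index = first_alarm(online_scores, eta)
print(alarm_index)
```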
According to the above decision rule, Fig. 3 shows the fault detection results of this embodiment under the 4 working conditions of step (6-1). Working condition 1 and working condition 2 are the 2 historical working conditions of step (1), and working condition 3 and working condition 4 are new working conditions that were not used in step (4) or step (5). Fig. 3 (a) to (e) show the monitoring results of the fault detection model on the normal operation data and on the fault operation data of faults 1 to 4, respectively. In each sub-graph the abscissa is the running time, the ordinate is the anomaly score, the horizontal dashed line is the monitoring threshold η obtained in step (5), and the vertical dashed line marks the introduction of the fault at the 450th operation data described in step (6-1). If the black solid line is below the monitoring threshold represented by the horizontal dashed line, the chemical system is operating normally; if the black solid line is above it, the chemical system has a fault. As shown in Fig. 3, in (a) the black solid line of the normal operation data under all 4 working conditions stays below the monitoring threshold, which shows that the method correctly identifies normal operation of the chemical system; in (b) to (e) the black solid line of the fault operation data under all 4 working conditions rises above the monitoring threshold from the 450th operation data (vertical dashed line), which shows that the method correctly identifies that the chemical system has a fault. The fault detection results are similar for working conditions 1 to 4, which shows that the fault detection method based on local adaptive standardization detects faults effectively under all 4 working conditions, including the 2 new working conditions not used for training.