CN111367253B

CN111367253B - Chemical system multi-working-condition fault detection method based on local adaptive standardization

Info

Publication number: CN111367253B
Application number: CN202010098141.7A
Authority: CN
Inventors: 赵劲松; 吴昊
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2020-02-18
Filing date: 2020-02-18
Publication date: 2021-03-16
Anticipated expiration: 2040-02-18
Also published as: CN111367253A

Abstract

The invention relates to a multi-working-condition fault detection method for chemical systems based on local adaptive standardization, and belongs to the technical fields of chemical process monitoring, industrial data processing, and process systems engineering. The method proposes local adaptive standardization and applies the variational autoencoder, a deep neural network technique. The mean of the data in a local moving window is used as the mean parameter of the standardization, so different data are standardized with different means, which gives the method its adaptive capability. Faults are detected by checking whether the data in the local moving window deviate from their trend. The method can be applied to any working condition, offers higher accuracy and stronger generalization, meets the needs of real-time detection, and, by giving early warning of faults, can avoid chemical accidents or reduce the harm they cause.

Description

Chemical system multi-working-condition fault detection method based on local adaptive standardization

Technical Field

The invention relates to a chemical system multi-working-condition fault detection method based on local adaptive standardization, and belongs to the technical field of chemical process monitoring, industrial data processing and process system engineering.

Background

Safe production in the petrochemical industry concerns every stage of a chemical's life cycle. Because production sites hold large quantities of chemicals and concentrate many personnel, an accident can cause serious property loss, casualties, and environmental damage. With the continuing progress and adoption of information technology, the chemical industry has entered the big-data era. Fault detection is a key basic technology in chemical process safety: it collects and analyzes real-time data from an industrial process to determine whether a chemical system is operating normally or has a fault.

As chemical plants become more automated, most are now equipped with advanced process control systems and industrial big-data storage platforms, so data-driven chemical fault detection has become a research focus in academia and industry in recent years. Data-driven fault detection mainly comprises two classes of methods. The first class is multivariate statistical process monitoring, including principal component analysis (PCA) and partial least squares (PLS). Because chemical processes are multivariable, dynamic, and nonlinear, researchers have proposed dynamic and kernel variants of PCA and PLS to make them more applicable to chemical processes. The second class is based on deep neural networks, including deep belief networks, convolutional neural networks, and variational autoencoders (VAEs). A VAE can be trained into a monitoring model for chemical fault detection using normal operation data only. Compared with multivariate statistical process monitoring, deep neural networks achieve higher accuracy and recall and generalize better. With the recent growth in the computing power of hardware such as CPUs and GPUs, these methods are fast enough to meet the real-time requirements of industrial data monitoring, which is a major advantage in practice. However, under the influence of raw materials, markets, the environment, and other factors, a chemical plant must continually adjust its operating conditions during production; that is, it exhibits multiple working conditions.
Existing deep-neural-network-based chemical fault detection methods generally assume that normal process variable data follow a normal or unimodal distribution, and the data must be standardized before being input into the model, so the model is suitable for a single working condition only. Such methods cannot cope with the multiple working conditions of chemical processes and therefore cannot complete the fault detection task.

For the multi-working-condition characteristics of chemical processes, current research generally combines local neighbor standardization with multivariate statistical process monitoring: the data are preprocessed by local neighbor standardization, and the standardized data are then used to build a PCA or PLS model. Conventional standardization estimates the distribution of each variable from the mean and standard deviation of historical normal operation data and applies these fixed historical values when standardizing online data. Because the mean and standard deviation differ greatly between working conditions, it can be used for a single working condition only. Local neighbor standardization instead finds the local neighbor set of the current data in the historical normal operation data and standardizes with the mean and standard deviation of that set. It thereby identifies the working condition to which the current data belong and standardizes with data from that working condition, so data from multiple working conditions are mapped to an approximately unimodal distribution and fault detection under historical working conditions becomes possible. Its drawback is that the mean and standard deviation are still calculated from historical data, so it depends heavily on the historical working conditions and applies only to them. Once the chemical process runs under a new working condition for which the historical data contain no neighbors, the fault detection task cannot be completed.
To date, no general fault detection method capable of monitoring all working conditions of a chemical process has appeared.
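The limitation of local neighbor standardization described above can be illustrated with a minimal sketch. The function name, the neighbor count k, the Euclidean distance, and the synthetic data are all assumptions made for illustration; they are not taken from the cited prior art.

```python
import numpy as np

def local_neighbor_standardize(x, history, k=50):
    """Standardize x (shape (m,)) with the mean/std of its k nearest historical samples.

    history has shape (n, m); k and the Euclidean metric are illustrative assumptions.
    """
    distances = np.linalg.norm(history - x, axis=1)
    neighbors = history[np.argsort(distances)[:k]]
    return (x - neighbors.mean(axis=0)) / neighbors.std(axis=0, ddof=1)

rng = np.random.default_rng(0)
history = np.vstack([
    rng.normal(0.0, 1.0, size=(500, 3)),    # historical working condition A
    rng.normal(100.0, 1.0, size=(500, 3)),  # historical working condition B
])

# A sample inside historical condition B is mapped close to zero.
z_known = local_neighbor_standardize(np.full(3, 100.0), history)
# A normal sample from an unseen working condition has no true neighbors,
# so its standardized value is very large and would be flagged as a fault.
z_new = local_neighbor_standardize(np.full(3, 200.0), history)
```

In this sketch the sample from the unseen working condition is standardized against the nearest historical condition, which is why the method cannot separate a new normal working condition from a fault.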

Disclosure of Invention

The invention aims to provide a multi-working-condition fault detection method for chemical systems based on local adaptive standardization that overcomes the shortcomings of existing methods. The method applies the variational autoencoder, a deep neural network technique: it processes the data of a local moving window with local adaptive standardization, inputs the window data into the variational autoencoder model to detect whether a deviation trend exists, and thereby determines whether the process is operating normally or has a fault. Early warning is thus given when the data first begin to deviate, minimizing the likelihood of chemical accidents.

The invention provides a chemical system multi-working-condition fault detection method based on local adaptive standardization, which comprises the following steps of:

(1) Obtain a normal operation data set D_history covering N working conditions from the historical database of the chemical system. D_history has m rows and n columns, where m is the number of process variables of the chemical system and n is the total number of operating time points;

(2) Divide the normal operation data set D_history of step (1) into a training set D_train and a validation set D_valid. D_train has m rows and n_train columns, and D_valid has m rows and n_valid columns. The proportion a of the training set D_train in the historical normal operation data set D_history is

a = n_train / n, 60% ≤ a ≤ 90%;

(3) Apply local adaptive standardization to the training set D_train and the validation set D_valid of step (2) to obtain the transformed training set T_train and validation set T_valid, as follows:

(3-1) Using the training set D_train of step (2), calculate the global mean standard deviation gmstd(D_train) of the m process variables of the chemical system, which contains m values, using the following formula:

where i is the index of the working condition of the chemical process, 1 ≤ i ≤ N; D_train,i is the normal operation data of the i-th working condition in the training set D_train; std(D_train,i) is the standard deviation vector of the i-th working condition in D_train, which contains m values obtained by calculating the standard deviation of each corresponding variable; and n_train,i is the number of normal operation data of the i-th working condition in D_train;
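The formula for gmstd(D_train) is not reproduced in this text. One reading consistent with the quantities defined above is a sample-count-weighted average of the per-condition standard deviation vectors; the sketch below assumes that form, and the function name and toy data are hypothetical.

```python
import numpy as np

def gmstd(condition_blocks):
    """Assumed form: sum_i n_i * std(D_i) / sum_i n_i, evaluated per variable.

    condition_blocks is a list of arrays of shape (m, n_i), one per working condition.
    """
    counts = np.array([block.shape[1] for block in condition_blocks], dtype=float)
    stds = np.stack([block.std(axis=1, ddof=1) for block in condition_blocks])  # (N, m)
    return (counts[:, None] * stds).sum(axis=0) / counts.sum()

# Two toy working conditions, m = 2 variables, 3 samples each.
cond_1 = np.array([[0.0, 2.0, 4.0], [1.0, 2.0, 3.0]])
cond_2 = np.array([[100.0, 103.0, 106.0], [5.0, 7.0, 9.0]])

per_condition = gmstd([cond_1, cond_2])                   # spread within a condition
pooled = np.hstack([cond_1, cond_2]).std(axis=1, ddof=1)  # inflated by the level shift
```

Under this assumption the result reflects the spread within each working condition, whereas a standard deviation pooled over all conditions is dominated by the difference in operating level between them.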

(3-2) For the k-th normal operation datum x_k in the training set D_train of step (2), where k is the operating time index in D_train, k = 1, 2, ..., n_train, and x_k contains the values of the m variables at time index k, select the local moving window data w_k with time window t by counting backward in time from k. w_k has m rows and t columns, where t is the time window, 10 ≤ t ≤ 100:

w_k = [x_(k-t+1), ..., x_k]

Using the local moving window data w_k, calculate the mean of each of the m variables in w_k to obtain mean(w_k), which contains m values;

(3-3) Using gmstd(D_train) of step (3-1) and mean(w_k) of step (3-2), apply local adaptive standardization to the local moving window data w_k of step (3-2), so that the m variables of w_k are approximately transformed to a standard normal distribution, to obtain the locally adaptive standardized local moving window data, that is, (w_k - mean(w_k)) / gmstd(D_train) evaluated variable by variable;

(3-4) Repeat steps (3-2) and (3-3) for each normal operation datum in the training set D_train in turn to obtain the locally adaptive standardized training set T_train;
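Steps (3-2) to (3-4) can be sketched as follows. This is a minimal illustration, assuming that each window ends at index k and covers the t most recent samples, and that the first t - 1 time points, which lack a full window, are skipped. The function names are hypothetical.

```python
import numpy as np

def local_adaptive_standardize(window, gmstd_vec):
    """window: (m, t); gmstd_vec: (m,). Subtract the window mean, divide by gmstd."""
    mean_w = window.mean(axis=1, keepdims=True)
    return (window - mean_w) / gmstd_vec[:, None]

def build_windows(data, t, gmstd_vec):
    """data: (m, n). Returns (n - t + 1, m, t): one standardized window per k = t..n."""
    n = data.shape[1]
    return np.stack([
        local_adaptive_standardize(data[:, k - t:k], gmstd_vec)
        for k in range(t, n + 1)
    ])

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=(3, 40))
gmstd_vec = np.ones(3)  # placeholder for the gmstd(D_train) of step (3-1)

windows = build_windows(data, t=10, gmstd_vec=gmstd_vec)
# The same process running at a different operating level (offset of 1000)
# yields the same standardized windows, because each window's own mean is removed.
shifted = build_windows(data + 1000.0, t=10, gmstd_vec=gmstd_vec)
```

The comparison of `windows` and `shifted` shows why the transformation does not depend on the operating level of a working condition.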

(3-5) For the p-th normal operation datum x_p in the validation set D_valid of step (2), where p is the operating time index in D_valid, p = 1, 2, ..., n_valid, and x_p contains the values of the m variables at time index p, select the local moving window data w_p with time window t by counting backward in time from p. w_p has m rows and t columns, where t is the time window of step (3-2):

w_p = [x_(p-t+1), ..., x_p]

Using the local moving window data w_p, calculate the mean of each of the m variables in w_p to obtain mean(w_p), which contains m values;

(3-6) Using gmstd(D_train) of step (3-1) and mean(w_p) of step (3-5), apply local adaptive standardization to the local moving window data w_p of step (3-5), so that the m variables of w_p are approximately transformed to a standard normal distribution, to obtain the locally adaptive standardized local moving window data, that is, (w_p - mean(w_p)) / gmstd(D_train) evaluated variable by variable;

(3-7) Repeat steps (3-5) and (3-6) for each normal operation datum in the validation set D_valid in turn to obtain the locally adaptive standardized validation set T_valid;

(4) Construct a variational autoencoder comprising an encoder and a decoder, and train it with the training set T_train obtained in step (3-4) to obtain the trained variational autoencoder, as follows:

(4-1) Design and construct the encoder using a convolutional neural network, a recurrent neural network, or a deep belief network. Take the locally adaptive standardized local moving window data of step (3-3) as the input of the encoder, and map it to its feature vectors σ_k and μ_k. σ_k and μ_k each contain l values, where l is the dimension of the feature vector, m ≤ l ≤ 4m:

(4-2) Using the feature vectors σ_k and μ_k of step (4-1), apply reparameterization to obtain the hidden feature vector h_k of the standardized window data; h_k contains l values:

h_k = μ_k + σ_k ⊙ ε

where ε is obtained by random sampling from the standard normal distribution, and ⊙ denotes element-wise multiplication of vectors;
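The reparameterization of step (4-2) can be sketched in a few lines. The function name and the random generator are illustrative assumptions; in a trained model μ and σ would come from the encoder rather than being set by hand.

```python
import numpy as np

def reparameterize(mu, sigma, rng):
    """Return h = mu + sigma * eps with eps drawn from a standard normal distribution."""
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps  # element-wise product, as in step (4-2)

rng = np.random.default_rng(0)
mu = np.linspace(-1.0, 1.0, 50)   # l = 50 is only an example size
sigma = np.full(50, 0.1)

h = reparameterize(mu, sigma, rng)
h_deterministic = reparameterize(mu, np.zeros(50), rng)  # sigma = 0 returns mu itself
```

Writing the sample as a deterministic function of μ and σ plus external noise is what allows the error to be propagated back through the sampling step during training.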

(4-3) Design and construct the decoder using a convolutional neural network, a recurrent neural network, or a deep belief network. Take the hidden feature vector h_k of step (4-2) as the input of the decoder, and reconstruct data with the same dimensions as the locally adaptive standardized local moving window data of step (3-3). The reconstructed data has m rows and t columns:

(4-4) Using the feature vectors σ_k and μ_k of step (4-1) and the reconstructed data of step (4-3), calculate the error of the locally adaptive standardized local moving window data of step (3-3) according to the following loss function. This error is the loss function of the variational autoencoder. It comprises a reconstruction loss and a KL divergence loss, where λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, 10³ ≤ λ ≤ 10⁶. The two parts of the loss are calculated as follows, where j is the index of the process variable of the chemical process, 1 ≤ j ≤ m:
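The loss formulas themselves are not reproduced in this text. The sketch below therefore uses common textbook forms as assumptions: a sum of squared errors for the reconstruction loss, the closed-form KL divergence between a diagonal Gaussian N(μ, σ²) and the standard normal distribution, and a total of the form reconstruction + λ · KL, which follows the literal wording above. The actual formulas of the method may differ, and all function names are hypothetical.

```python
import numpy as np

def reconstruction_loss(w_std, w_rec):
    """Assumed form: sum of squared errors over all m x t entries."""
    return float(np.sum((w_std - w_rec) ** 2))

def kl_divergence(mu, sigma):
    """Closed-form KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian."""
    return float(0.5 * np.sum(mu ** 2 + sigma ** 2 - 1.0 - np.log(sigma ** 2)))

def vae_loss(w_std, w_rec, mu, sigma, lam):
    """Assumed combination: reconstruction loss plus lam times the KL loss."""
    return reconstruction_loss(w_std, w_rec) + lam * kl_divergence(mu, sigma)

w_std = np.zeros((2, 3))   # toy standardized window, m = 2, t = 3
w_rec = np.ones((2, 3))    # toy reconstruction
mu = np.array([1.0, 0.0])
sigma = np.array([1.0, 1.0])

total = vae_loss(w_std, w_rec, mu, sigma, lam=10.0)  # 6.0 + 10 * 0.5
```

The KL term is zero when μ = 0 and σ = 1, so it penalizes hidden feature distributions that move away from the standard normal prior.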

(4-5) Repeat steps (4-1) to (4-4), inputting each datum of the training set T_train of step (3-4) into the variational autoencoder in turn for error calculation, and train the variational autoencoder by the error backpropagation algorithm to obtain the trained variational autoencoder;

(5) Using the trained variational autoencoder obtained in step (4) and the validation set T_valid obtained in step (3-7), estimate the confidence interval of the abnormal scores of T_valid to obtain the monitoring threshold η used when the variational autoencoder performs the fault detection task, as follows:

(5-1) Take the locally adaptive standardized local moving window data of step (3-6) as the input of the encoder of the variational autoencoder trained in step (4), and map it to its feature vectors σ_p and μ_p. σ_p and μ_p each contain l values, where l is the dimension of the feature vector:

(5-2) Using the feature vectors σ_p and μ_p of step (5-1), apply reparameterization to obtain the hidden feature vector h_p of the standardized window data; h_p contains l values:

h_p = μ_p + σ_p ⊙ ε

where ε is obtained by random sampling from the standard normal distribution;

(5-3) Take the hidden feature vector h_p of step (5-2) as the input of the decoder of the variational autoencoder trained in step (4), and reconstruct data with the same dimensions as the locally adaptive standardized local moving window data of step (3-6). The reconstructed data has m rows and t columns:

(5-4) Using the feature vectors σ_p and μ_p of step (5-1) and the reconstructed data of step (5-3), calculate the abnormal score of the locally adaptive standardized local moving window data of step (3-6) according to the following abnormal score formula. The abnormal score comprises a reconstruction loss and a KL divergence loss, where λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss and is the same as λ in step (4-4). The two parts of the loss are calculated as follows, where j is the index of the process variable of the chemical process, 1 ≤ j ≤ m:

(5-5) Repeat steps (5-1) to (5-4), inputting each datum of the validation set T_valid of step (3-7) into the variational autoencoder in turn to calculate its abnormal score, to obtain the abnormal score data set S_valid of the validation set T_valid;

(5-6) Fit the abnormal score data set S_valid to a normal distribution, and take the abnormal score of S_valid at normal-distribution confidence level α as the monitoring threshold η of the chemical system, where 99% ≤ α ≤ 99.99%;
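Step (5-6) can be sketched with the Python standard library. The sketch assumes a one-sided upper threshold, that is, the α-quantile of a normal distribution fitted to the validation scores; the text does not state whether the interval is one-sided or two-sided, and the function name and toy scores are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def monitoring_threshold(scores, alpha=0.999):
    """Fit a normal distribution to the scores and return its alpha-quantile (assumed one-sided)."""
    fitted = NormalDist(mu=mean(scores), sigma=stdev(scores))
    return fitted.inv_cdf(alpha)

validation_scores = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy stand-in for S_valid
eta = monitoring_threshold(validation_scores, alpha=0.999)
```

A higher α moves the threshold further into the upper tail, which reduces false alarms on normal data at the cost of slower detection of small deviations.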

(6) Using the variational autoencoder trained in step (4) and the monitoring threshold η obtained in step (5), perform online fault detection on the process data of the chemical system under different working conditions, as follows:

(6-1) At the current detection time q, collect process data from the real-time database of the chemical system, and select the local moving window data w_q with time window t by counting backward in time from q. w_q has m rows and t columns, where t is the time window of step (3-2):

w_q = [x_(q-t+1), ..., x_q]

Using the local moving window data w_q, calculate the mean of each of the m variables in w_q to obtain mean(w_q), which contains m values;

(6-2) Using gmstd(D_train) of step (3-1) and mean(w_q) of step (6-1), apply local adaptive standardization to the local moving window data w_q of step (6-1), so that the m variables of w_q are approximately transformed to a standard normal distribution, to obtain the locally adaptive standardized local moving window data, that is, (w_q - mean(w_q)) / gmstd(D_train) evaluated variable by variable;

(6-3) Take the locally adaptive standardized local moving window data of step (6-2) as the input of the encoder of the variational autoencoder trained in step (4), and map it to its feature vectors σ_q and μ_q. The two feature vectors each contain l values, where l is the dimension of the feature vector and is the same as l in step (4-1):

(6-4) Using the feature vectors σ_q and μ_q of step (6-3), apply reparameterization to obtain the hidden feature vector h_q of the standardized window data; h_q contains l values:

h_q = μ_q + σ_q ⊙ ε

where ε is obtained by random sampling from the standard normal distribution;

(6-5) Take the hidden feature vector h_q of step (6-4) as the input of the decoder of the variational autoencoder trained in step (4), and reconstruct data with the same dimensions as the locally adaptive standardized local moving window data of step (6-2). The reconstructed data has m rows and t columns:

(6-6) Using the feature vectors σ_q and μ_q of step (6-3) and the reconstructed data of step (6-5), calculate the abnormal score of the locally adaptive standardized local moving window data of step (6-2) according to the following abnormal score formula. The abnormal score comprises a reconstruction loss and a KL divergence loss, where λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss and is the same as λ in step (4-4). The two parts of the loss are calculated as follows, where j is the index of the process variable of the chemical process, 1 ≤ j ≤ m:

(6-7) Compare the abnormal score of step (6-6) with the monitoring threshold η obtained in step (5). If the abnormal score is lower than η, the chemical system is currently operating normally, and the method returns to step (6-1) to continue monitoring the online real-time data. If the abnormal score is higher than η, the chemical system currently has a fault, and a fault warning is issued, thereby realizing multi-working-condition fault detection of the chemical system based on local adaptive standardization.
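The online loop of step (6) can be sketched as follows. The scoring function here is only a stand-in (the mean squared value of the standardized window) used so that the example runs without a trained model; in the method itself the score comes from the trained variational autoencoder. The function names, the threshold value, the synthetic stream, and the treatment of a score exactly equal to η as normal are all assumptions.

```python
import numpy as np

def detect_online(stream, t, gmstd_vec, score_fn, eta):
    """stream: (m, n). Returns (q, is_fault) for every time q that has a full window."""
    results = []
    for q in range(t, stream.shape[1] + 1):
        w_q = stream[:, q - t:q]                                   # step (6-1)
        w_std = (w_q - w_q.mean(axis=1, keepdims=True)) / gmstd_vec[:, None]  # step (6-2)
        results.append((q, score_fn(w_std) > eta))                 # step (6-7)
    return results

def stand_in_score(w_std):
    """Hypothetical placeholder for the VAE abnormal score of steps (6-3) to (6-6)."""
    return float(np.mean(w_std ** 2))

rng = np.random.default_rng(1)
steady = 500.0 + rng.normal(0.0, 1.0, size=(1, 100))        # normal operation at level 500
drift = steady[:, -1:] + np.arange(1, 41, dtype=float)[None, :]  # slow ramp-type deviation
stream = np.hstack([steady, drift])

results = detect_online(stream, t=20, gmstd_vec=np.array([1.0]),
                        score_fn=stand_in_score, eta=5.0)
```

In this toy run the steady segment raises no alarm even though its level (500) was never used to fit anything, while the drifting segment is flagged once the window contains the ramp.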

The invention provides a chemical system multi-working-condition fault detection method based on local adaptive standardization, which has the advantages that:

The method differs from existing fault detection methods in that it proposes local adaptive standardization and applies the variational autoencoder, a deep neural network technique. Existing detection methods measure how far the current data deviate from normal operation data; this method instead applies local adaptive standardization and detects faults by checking whether the data within the local moving window deviate from their trend. The method is therefore suitable for any working condition: it applies both to working conditions that appear in the historical data and to working conditions that have never occurred. In addition, the invention uses the variational autoencoder to detect whether the current window data show a deviation trend, which gives higher accuracy and stronger generalization than traditional multivariate statistical methods. The invention meets the requirement of real-time detection, can be applied to fault detection of the chemical system under all working conditions of the chemical process, and, through early warning of faults, avoids chemical accidents or reduces the harm they cause.

Drawings

FIG. 1 is a block diagram of the overall process of the method of the present invention.

Fig. 2 is a schematic diagram of a variational auto-encoder configuration in an embodiment of the present invention.

Fig. 3 is a schematic diagram of a fault detection result under different working conditions according to an embodiment of the present invention.

Detailed Description

The invention provides a chemical system multi-working-condition fault detection method based on local adaptive standardization, which has an overall flow diagram shown in figure 1 and comprises the following steps:

(1) Obtain a normal operation data set D_history covering N working conditions from the historical database of the chemical system. D_history has m rows and n columns, where m is the number of process variables of the chemical system, such as temperature, time, and pressure, and n is the total number of operating time points;

(2) The proportion a of the training set D_train in the historical normal operation data set D_history satisfies 60% ≤ a ≤ 90%;

(3-2) For the k-th normal operation datum x_k in the training set D_train of step (2), where k is the operating time index in D_train, k = 1, 2, ..., n_train, and x_k contains the values of the m variables at time index k, select the local moving window data w_k with time window t by counting backward in time from k. w_k has m rows and t columns, where t is the time window, 10 ≤ t ≤ 100:

Using the local moving window data w_k, calculate the mean of each of the m variables in w_k to obtain mean(w_k), which contains m values;

(3-6) Using gmstd(D_train) of step (3-1) and mean(w_p) of step (3-5), apply local adaptive standardization to the local moving window data w_p of step (3-5), so that the m variables of w_p are approximately transformed to a standard normal distribution, to obtain the locally adaptive standardized local moving window data;

The standardized window data is taken as the input of the encoder and mapped to its feature vectors σ_k and μ_k;

(4-2) Using the feature vectors σ_k and μ_k of step (4-1), apply reparameterization to obtain the hidden feature vector h_k, which contains l values:

h_k = μ_k + σ_k ⊙ ε

where ε is obtained by random sampling from the standard normal distribution;

The reconstructed data has the same dimensions as the standardized window data, with m rows and t columns. The error is the loss function of the variational autoencoder, which comprises a reconstruction loss and a KL divergence loss;

Each datum is input into the variational autoencoder for error calculation, and the variational autoencoder is trained by the error backpropagation algorithm to obtain the trained variational autoencoder, which is used for the fault detection task of the chemical system;

(5) Using the trained variational autoencoder obtained in step (4) and the validation set T_valid obtained in step (3-7), estimate the confidence interval of the abnormal scores of T_valid to obtain the monitoring threshold η used when the variational autoencoder performs the fault detection task, as follows:

(5-1) The locally adaptive standardized local moving window data of step (3-6) is mapped to its feature vectors σ_p and μ_p. σ_p and μ_p each contain l values, where l is the dimension of the feature vector and is the same as l in step (4-1):

(5-2) Using the feature vectors σ_p and μ_p of step (5-1), apply reparameterization to obtain the hidden feature vector h_p, which contains l values:

h_p = μ_p + σ_p ⊙ ε

where ε is obtained by random sampling from the standard normal distribution;

The reconstructed data has the same dimensions as the standardized window data, with m rows and t columns. The abnormal score comprises a reconstruction loss and a KL divergence loss. Each datum is input into the variational autoencoder to calculate its abnormal score, giving the abnormal score data set S_valid of the validation set T_valid;

Using the local moving window data w_q, calculate the mean of each of the m variables in w_q to obtain mean(w_q), which contains m values;

(6-3) The locally adaptive standardized local moving window data of step (6-2) is mapped to its feature vectors σ_q and μ_q. The two feature vectors each contain l values, where l is the dimension of the feature vector and is the same as l in step (4-1). The hidden feature vector h_q contains l values:

h_q = μ_q + σ_q ⊙ ε

where ε is obtained by random sampling from the standard normal distribution;

The reconstructed data has the same dimensions as the standardized window data, with m rows and t columns. The abnormal score comprises a reconstruction loss and a KL divergence loss;

(6-7) Compare the abnormal score of step (6-6) with the monitoring threshold η obtained in step (5). If the abnormal score is higher than η, the chemical system currently has a fault, and a fault warning is issued, thereby realizing multi-working-condition fault detection of the chemical system based on local adaptive standardization.

Embodiments of the method of the present invention are described below with reference to the accompanying drawings:

(1) Obtain a normal operation data set D_history covering 2 working conditions from the historical database of the chemical system. D_history has m rows and n columns, where m = 42 is the number of process variables of the chemical system, n = 16000 is the total number of operating time points, and each working condition contains 8000 operating time points;

(2) Divide the normal operation data set D_history of step (1) into a training set D_train and a validation set D_valid. D_train has m rows and n_train columns, and D_valid has m rows and n_valid columns. The proportion of the training set D_train in the historical normal operation data set D_history is a = 75%, so n_train = 12000 and n_valid = 4000;
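The sizes in step (2) of this embodiment can be checked with a short sketch. It assumes that each working condition is split chronologically in the same proportion a, which is consistent with the per-condition count of 6000 training data given in step (3-1) of the embodiment; the function name and the placeholder data are hypothetical.

```python
import numpy as np

def split_by_condition(condition_blocks, a):
    """Split each (m, n_i) block into its first a * n_i columns (train) and the rest (valid)."""
    train, valid = [], []
    for block in condition_blocks:
        n_train_i = int(round(a * block.shape[1]))
        train.append(block[:, :n_train_i])
        valid.append(block[:, n_train_i:])
    return train, valid

# Placeholder data with the embodiment's shape: 2 working conditions, m = 42, 8000 points each.
blocks = [np.zeros((42, 8000)), np.zeros((42, 8000))]
train_blocks, valid_blocks = split_by_condition(blocks, a=0.75)

n_train = sum(block.shape[1] for block in train_blocks)
n_valid = sum(block.shape[1] for block in valid_blocks)
```

With a = 75% this gives 6000 training columns per working condition, 12000 training columns in total, and 4000 validation columns, matching the values stated in the embodiment.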

(3-1) Using the training set D_train of step (2), calculate the global mean standard deviation gmstd(D_train) of the m process variables of the chemical system, which contains m = 42 values, using the following formula:

where i is the index of the working condition of the chemical process, 1 ≤ i ≤ N; D_train,i is the normal operation data of the i-th working condition in the training set D_train; std(D_train,i) is the standard deviation vector of the i-th working condition in D_train, which contains m values obtained by calculating the standard deviation of each corresponding variable; and n_train,i is the number of normal operation data of the i-th working condition in D_train, n_train,i = 6000;

(3-2) For the k-th normal operation datum x_k in the training set D_train of step (2), where k is the operating time index in D_train, k = 1, 2, ..., n_train, and x_k contains the values of the m variables at time index k, select the local moving window data w_k with time window t by counting backward in time from k. w_k has m rows and t columns, where t is the time window, m = 42, t = 30:

(3-5) For the p-th normal operation datum x_p in the validation set D_valid of step (2), where p is the operating time index in D_valid, p = 1, 2, ..., n_valid, and x_p contains the values of the m variables at time index p, select the local moving window data w_p with time window t by counting backward in time from p. w_p has m rows and t columns, where t is the time window of step (3-2), m = 42, t = 30:

(4-1) Design and construct the encoder using a bidirectional long short-term memory (BiLSTM) network and a linear layer. The encoder has the structure shown in Fig. 2 and comprises two BiLSTM layers and a linear layer. Take the locally adaptive standardized local moving window data of step (3-3) as the input of the encoder, and map it to its feature vectors σ_k and μ_k. σ_k and μ_k each contain l values, where l is the dimension of the feature vector, l = 50:

The hidden feature vector h_k contains l values, l = 50:

h_k = μ_k + σ_k ⊙ ε

where ε is obtained by random sampling from the standard normal distribution;

(4-3) Design and construct the decoder using a bidirectional long short-term memory (BiLSTM) network and a linear layer. The decoder has the structure shown in Fig. 2 and comprises two BiLSTM layers and a linear layer. Take the hidden feature vector h_k of step (4-2) as the input of the decoder, and reconstruct data with the same dimensions as the locally adaptive standardized local moving window data of step (3-3). The reconstructed data has m rows and t columns, m = 42, t = 30:

The error is the loss function of the variational autoencoder, which comprises a reconstruction loss and a KL divergence loss. λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss, λ = 10⁵. The two parts of the loss are calculated as follows, where j is the index of the process variable of the chemical process, 1 ≤ j ≤ m, m = 42:

(5-1) The locally adaptive standardized local moving window data of step (3-6) is mapped to its feature vectors σ_p and μ_p. σ_p and μ_p each contain l values, where l is the dimension of the feature vector, l = 50:

(5-2) Using the feature vectors σ_p and μ_p of step (5-1), apply reparameterization to obtain the hidden feature vector h_p, which contains l values, l = 50:

h_p = μ_p + σ_p ⊙ ε

where ε is obtained by random sampling from the standard normal distribution;

The reconstructed data has the same dimensions as the standardized window data, with m rows and t columns, m = 42, t = 30:

Calculate the abnormal score of the locally adaptive standardized local moving window data of step (3-6). The abnormal score comprises a reconstruction loss and a KL divergence loss. λ is the weighting coefficient of the KL divergence loss relative to the reconstruction loss and is the same as λ in step (4-4), λ = 10⁵. The two parts of the loss are calculated as follows, where j is the index of the process variable of the chemical process, 1 ≤ j ≤ m, m = 42:

Each datum is input into the variational autoencoder to calculate its abnormal score, giving the abnormal score data set S_valid of the validation set T_valid;

(5-6) Fit the abnormal score data set S_valid to a normal distribution, and take the abnormal score of S_valid at normal-distribution confidence level α as the monitoring threshold η of the chemical system, where α = 99.9%;

(6-1) collecting the process data of 4 working conditions from the database of the chemical system as a test set D_testIn total, m rows n_testColumn data, m 42, n_test4 (2000+4 × 1650), wherein the 4 working conditions include 2 historical working conditions and 2 new working conditions in the step (1), and are used for testing the fault detection effect of the invention under different working conditions. Each condition includes normal operating data and 4 types of failed operating data. Wherein, each operating mode includes 2000 normal operating data, and each operating mode of the operating data that breaks down includes 4 fault types, and each fault type includes 1650 operating data, and preceding 450 operating data still belongs to normal operating data, introduces the trouble from 450 operating data, and 1200 last operating data belong to trouble operating data, and 4 fault types are as shown in the following table:

Table 1. The 4 fault types in the test data

For the q-th normal operation data x_q in the test set D_test, q denotes the running-time index in D_test, q = 1, 2, ..., n_test, and x_q comprises the m variable values at time index q; selecting backward in time, the local moving window data w_q with time window t is obtained, where w_q has m rows and t columns of data, t is the time window of step (3-2), m = 42, and t = 30:

(6-3) The locally adaptively standardized local moving window data of step (6-2) is mapped to the feature vectors σ_q and μ_q; each of the two feature vectors has l values, where l denotes the dimension of the feature vector and is the same as l in step (4-1), l = 50:

(6-4) Using the feature vectors σ_q and μ_q of step (6-3), perform reparameterization to obtain the hidden feature vector h_q of the standardized window data, where h_q includes l values, l = 50:

h_q = μ_q + σ_q ⊙ ∈

where ∈ is obtained by random sampling from the standard normal distribution;

The reconstructed data has the same dimensions, with m rows and t columns, where m = 42 and t = 30:

The anomaly score of the standardized window data comprises a reconstruction loss and a KL divergence loss.

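(6-7) Compare the anomaly score of step (6-6) with the monitoring threshold η obtained in step (5): if the anomaly score is lower than η, the chemical system is in a normal operation state; if it is higher than η, a fault has occurred in the chemical system and a fault warning is issued.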

Based on the above decision rule, Fig. 3 shows the fault detection results of this embodiment under the 4 working conditions of step (6-1). Working conditions 1 and 2 are the 2 historical working conditions of step (1), and working conditions 3 and 4 are new working conditions that were not used in steps (4) and (5). Fig. 3 (a) to (e) show the monitoring results of the fault detection model on the normal operation data and on the fault operation data of faults 1 to 4, respectively. In each subplot, the abscissa is the running time and the ordinate is the anomaly score; the solid black line is the anomaly score, the horizontal dashed line is the monitoring threshold η obtained in step (5), and the vertical dashed line marks the introduction of the fault at the 450th operation data described in step (6-1). If the solid black line is below the horizontal dashed line, the chemical system is operating normally; if it is above the horizontal dashed line, the chemical system has a fault. In Fig. 3 (a), the solid black lines of the normal operation data under all 4 working conditions stay below the monitoring threshold, showing that the method correctly identifies normal operation. In Fig. 3 (b) to (e), the solid black lines of the fault operation data under all 4 working conditions rise above the monitoring threshold from the 450th data point (vertical dashed line) onward, showing that the method correctly identifies the faults. The detection results are similar across working conditions 1 to 4, including the two new working conditions, which indicates that the fault detection method based on local adaptive standardization detects faults well under every working condition tested.

Claims

1. A multi-working-condition fault detection method for a chemical system based on local adaptive standardization, characterized by comprising the following steps:

(1) Obtain the normal operation data set D_history under N working conditions from the historical database of the chemical system; D_history has m rows and n columns of data, where m is the number of process variables of the chemical system and n is the total running time;

(2) Divide the normal operation data set D_history of step (1) into a training set D_train and a validation set D_valid; the training set D_train includes m rows and n_train columns of data, and the validation set D_valid includes m rows and n_valid columns of data, where the proportion a of the training set D_train in the historical normal operation data set D_history satisfies:

60% ≤ a ≤ 90%;

(3) Perform local adaptive standardization on the training set D_train and the validation set D_valid of step (2) to obtain the transformed training set T_train and validation set T_valid, with the following specific steps:

(3-1) Using the normal operation data in the training set D_train of step (2), calculate with the following formula the global mean standard deviation gmstd(D_train) of the m process variables of the chemical system, which includes m values:

where i denotes the working condition index of the chemical process, 1 ≤ i ≤ N; D_{train,i} denotes the normal operation data of the i-th working condition in the training set D_train; std(D_{train,i}) denotes the standard deviation vector of the i-th working condition in D_train, which includes m values and is obtained by calculating the standard deviation of each corresponding variable; and n_{train,i} denotes the number of normal operation data of the i-th working condition in D_train;
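By way of non-limiting illustration, step (3-1) can be sketched in Python as follows. The formula itself is not reproduced in this text, so the sketch assumes that gmstd(D_train) is the average of the per-condition standard deviation vectors std(D_{train,i}) weighted by the data counts n_{train,i}, which is one reading consistent with the symbols defined above. The function name gmstd and the use of the population standard deviation are likewise assumptions.

```python
import numpy as np

def gmstd(train_by_condition):
    """Count-weighted average of per-condition standard deviation vectors.

    train_by_condition: list of N arrays, one per working condition,
    each of shape (m, n_train_i): m process variables by time.
    ASSUMPTION: weights are n_train_i / sum(n_train_i) and the
    population standard deviation (ddof=0) is used.
    """
    counts = np.array([d.shape[1] for d in train_by_condition], dtype=float)
    stds = np.stack([d.std(axis=1) for d in train_by_condition])  # (N, m)
    return (counts[:, None] * stds).sum(axis=0) / counts.sum()

# Two toy working conditions with one variable (illustrative values only).
cond_1 = np.array([[0.0, 2.0]])                       # std 1, 2 samples
cond_2 = np.array([[0.0, 4.0, 0.0, 4.0, 0.0, 4.0]])   # std 2, 6 samples
g = gmstd([cond_1, cond_2])                           # (2*1 + 6*2) / 8
```

Because the result is computed once from the training set, the same m scale values can be reused unchanged for validation data and for online data.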

(3-2) For the k-th normal operation data x_k in the training set D_train of step (2), where k denotes the running-time index in D_train, k = 1, 2, ..., n_train, and x_k includes the m variable values at time index k, select backward in time the local moving window data w_k with time window t; w_k has m rows and t columns of data, where t is the time window, 10 ≤ t ≤ 100:

Using the local moving window data w_k, calculate the mean of each of the m variables in w_k to obtain mean(w_k), which includes m values;

(3-3) Using gmstd(D_train) of step (3-1) and mean(w_k) of step (3-2), perform local adaptive standardization on the local moving window data w_k of step (3-2), so that the m variables of w_k are approximately converted to a standard normal distribution, obtaining the locally adaptively standardized local moving window data;

(3-4) Repeat steps (3-2) and (3-3) for each normal operation data in the training set D_train in turn to obtain the locally adaptively standardized training set T_train;
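By way of non-limiting illustration, steps (3-2) to (3-4) can be sketched in Python as follows. The standardization formula is not reproduced in this text; the sketch assumes it subtracts mean(w_k) from each variable of the window and divides by the corresponding value of gmstd(D_train). It also assumes that time points with fewer than t preceding samples are skipped, which the text does not address. All function names are hypothetical.

```python
import numpy as np

def local_window(data, k, t):
    """Return the t most recent columns ending at 0-based time index k."""
    if k + 1 < t:
        raise ValueError("not enough history for a full window")
    return data[:, k + 1 - t : k + 1]

def local_adaptive_standardize(window, gmstd_vec):
    """ASSUMED form: (w - mean(w)) / gmstd, applied per variable (row)."""
    mean_w = window.mean(axis=1, keepdims=True)   # mean(w_k), m values
    return (window - mean_w) / gmstd_vec[:, None]

def transform(data, t, gmstd_vec):
    """Standardize every full window; result has shape (n - t + 1, m, t)."""
    n = data.shape[1]
    return np.stack([
        local_adaptive_standardize(local_window(data, k, t), gmstd_vec)
        for k in range(t - 1, n)
    ])

# Two variables drifting at different rates (illustrative values only).
data = np.vstack([np.arange(6.0), 10.0 + 2.0 * np.arange(6.0)])
T = transform(data, t=3, gmstd_vec=np.array([1.0, 2.0]))
```

In this toy case every standardized window equals [[-1, 0, 1], [-1, 0, 1]] even though the raw level keeps rising, which is the sense in which the window mean adapts to the current operating level.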

(3-5) For the p-th normal operation data x_p in the validation set D_valid of step (2), where p denotes the running-time index in D_valid, p = 1, 2, ..., n_valid, and x_p includes the m variable values at time index p, select backward in time the local moving window data w_p with time window t; w_p has m rows and t columns of data, where t is the time window of step (3-2):

Using the local moving window data w_p, calculate the mean of each of the m variables in w_p to obtain mean(w_p), which includes m values;

(3-6) Using gmstd(D_train) of step (3-1) and mean(w_p) of step (3-5), perform local adaptive standardization on the local moving window data w_p of step (3-5), so that the m variables of w_p are approximately converted to a standard normal distribution, obtaining the locally adaptively standardized local moving window data;

(3-7) Repeat steps (3-5) and (3-6) for each normal operation data in the validation set D_valid in turn to obtain the locally adaptively standardized validation set T_valid;

(4) Construct a variational autoencoder comprising an encoder and a decoder, and train it with the training set T_train obtained in step (3-4) to obtain the trained variational autoencoder, with the following specific steps:

(4-1) Design and construct an encoder using a convolutional neural network, a recurrent neural network or a deep belief network; take the locally adaptively standardized local moving window data of step (3-3) as the input of the encoder, and map it to the feature vectors σ_k and μ_k, each of which has l values, where l denotes the dimension of the feature vector, m ≤ l ≤ 4m:

(4-2) Using the feature vectors σ_k and μ_k of step (4-1), perform reparameterization to obtain the hidden feature vector h_k of the standardized window data, where h_k includes l values:

h_k = μ_k + σ_k ⊙ ∈

where ∈ is obtained by random sampling from the standard normal distribution, and ⊙ denotes element-wise multiplication of the vectors;
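By way of non-limiting illustration, the reparameterization of step (4-2) can be sketched in Python as follows. The function name reparameterize and the use of a NumPy random generator are choices made for the sketch, not part of the claimed method.

```python
import numpy as np

def reparameterize(mu, sigma, rng):
    """h = mu + sigma * eps, with eps drawn from a standard normal."""
    eps = rng.standard_normal(mu.shape)   # one sample per latent dimension
    return mu + sigma * eps               # element-wise product

# Latent dimension l = 50 as in the embodiment; values are illustrative.
mu_k = np.arange(50.0)
sigma_k = np.ones(50)
h_k = reparameterize(mu_k, sigma_k, np.random.default_rng(0))
```

Drawing the noise separately from μ and σ keeps the sampling step differentiable with respect to the encoder outputs, which is what allows training by error back-propagation.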

(4-3) Design and construct a decoder using a convolutional neural network, a recurrent neural network or a deep belief network; take the hidden feature vector h_k of step (4-2) as the input of the decoder, and reconstruct data with the same dimensions as the locally adaptively standardized local moving window data of step (3-3); the reconstructed data has m rows and t columns:

(4-4) According to the following loss function, use the feature vectors σ_k and μ_k of step (4-1) and the reconstructed data of step (4-3) to calculate the error of the locally adaptively standardized local moving window data of step (3-3); this error is the loss function of the variational autoencoder and comprises a reconstruction loss and a KL divergence loss; λ is the weight coefficient of the KL divergence loss relative to the reconstruction loss, 10³ ≤ λ ≤ 10⁶; the two parts of the loss are calculated as follows, where j denotes the index of the chemical process variable, 1 ≤ j ≤ m:
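By way of non-limiting illustration, a loss of the kind described in step (4-4) can be sketched in Python as follows. The loss formulas are not reproduced in this text, so every detail below is an assumption: the reconstruction loss is taken as the sum of squared errors over the window, the KL divergence loss is the standard closed form between a diagonal Gaussian N(μ, σ²) and the standard normal distribution, and λ multiplies the KL term. The function name vae_loss is hypothetical.

```python
import numpy as np

def vae_loss(w_std, w_rec, mu, sigma, lam):
    """Return (total, reconstruction, kl) for one standardized window.

    ASSUMPTIONS (not taken from the patent formulas):
      reconstruction = sum of squared errors over all m x t entries
      kl = 0.5 * sum(sigma^2 + mu^2 - 1 - ln(sigma^2))
      total = reconstruction + lam * kl
    """
    rec = np.sum((w_std - w_rec) ** 2)
    kl = 0.5 * np.sum(sigma ** 2 + mu ** 2 - 1.0 - np.log(sigma ** 2))
    return rec + lam * kl, rec, kl

# Shapes follow the embodiment (m = 42, t = 30, l = 50); values are toy data.
w_std = np.zeros((42, 30))
w_rec = np.full((42, 30), 0.1)
total, rec, kl = vae_loss(w_std, w_rec, np.ones(50), np.ones(50), lam=2.0)
```

Under these assumptions the same function can serve as the anomaly score at detection time, since the score is described as having the same two parts and the same λ.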

(4-5) Repeat steps (4-1) to (4-4), inputting each data of the training set T_train of step (3-4) in turn into the variational autoencoder for error calculation, and train the variational autoencoder with the error back-propagation algorithm to obtain the trained variational autoencoder;

(5) Using the trained variational autoencoder obtained in step (4) and the validation set T_valid obtained in step (3-7), obtain the monitoring threshold η for using the variational autoencoder in the fault detection task by estimating the confidence interval of the anomaly scores of the validation set T_valid, with the following specific steps:

(5-1) Take the locally adaptively standardized local moving window data of step (3-6) as the input of the variational autoencoder trained in step (4), and map it to the feature vectors σ_p and μ_p, each of which has l values, where l denotes the dimension of the feature vector:

(5-2) Using the feature vectors σ_p and μ_p of step (5-1), perform reparameterization to obtain the hidden feature vector h_p of the standardized window data, where h_p includes l values:

h_p = μ_p + σ_p ⊙ ∈

where ∈ is obtained by random sampling from the standard normal distribution;

(5-3) Take the hidden feature vector h_p of step (5-2) as the input of the decoder in the variational autoencoder trained in step (4), and reconstruct data with the same dimensions as the locally adaptively standardized local moving window data of step (3-6); the reconstructed data has m rows and t columns:

(5-4) According to the following anomaly score formula, use the feature vectors σ_p and μ_p of step (5-1) and the reconstructed data of step (5-3) to calculate the anomaly score of the locally adaptively standardized local moving window data of step (3-6); the anomaly score comprises a reconstruction loss and a KL divergence loss; λ is the weight coefficient of the KL divergence loss relative to the reconstruction loss and is the same as λ in step (4-4); the two parts of the loss are calculated as follows, where j denotes the index of the chemical process variable, 1 ≤ j ≤ m:

(5-5) Repeat steps (5-1) to (5-4), inputting each data of the validation set T_valid of step (3-7) in turn into the variational autoencoder and calculating its anomaly score, to obtain the anomaly score data set S_valid of the validation set T_valid;

(5-6) The anomaly score data set S_valid is taken to follow a normal distribution, and the anomaly score of S_valid at normal-distribution confidence level α is used as the monitoring threshold η of the chemical system, 99% ≤ α ≤ 99.99%;

(6) Using the variational autoencoder trained in step (4) and the monitoring threshold η obtained in step (5), perform online fault detection on the process data of the chemical system under different working conditions, comprising the following steps:

(6-1) At the current detection time q, collect process data from the real-time database of the chemical system, and select backward in time the local moving window data w_q with time window t; w_q has m rows and t columns of data, where t is the time window of step (3-2):

Using the local moving window data w_q, calculate the mean of each of the m variables in w_q to obtain mean(w_q), which includes m values;

(6-2) Using gmstd(D_train) of step (3-1) and mean(w_q) of step (6-1), perform local adaptive standardization on the local moving window data w_q of step (6-1), so that the m variables of w_q are approximately converted to a standard normal distribution, obtaining the locally adaptively standardized local moving window data;

(6-3) Take the locally adaptively standardized local moving window data of step (6-2) as the input of the encoder in the variational autoencoder trained in step (4), and map it to the feature vectors σ_q and μ_q; each of the two feature vectors has l values, where l denotes the dimension of the feature vector and is the same as l in step (4-1):

(6-4) Using the feature vectors σ_q and μ_q of step (6-3), perform reparameterization to obtain the hidden feature vector h_q of the standardized window data, where h_q includes l values:

h_q = μ_q + σ_q ⊙ ∈

where ∈ is obtained by random sampling from the standard normal distribution, and ⊙ denotes element-wise multiplication of the vectors;

(6-5) Take the hidden feature vector h_q of step (6-4) as the input of the decoder in the variational autoencoder trained in step (4), and reconstruct data with the same dimensions as the locally adaptively standardized local moving window data of step (6-2); the reconstructed data has m rows and t columns:

(6-6) According to the following anomaly score formula, use the feature vectors σ_q and μ_q of step (6-3) and the reconstructed data of step (6-5) to calculate the anomaly score of the locally adaptively standardized local moving window data of step (6-2); the anomaly score comprises a reconstruction loss and a KL divergence loss;

(6-7) Compare the anomaly score of step (6-6) with the monitoring threshold η obtained in step (5): if the anomaly score is lower than η, the chemical system is currently in a normal operation state, and the method returns to step (6-1) to continue monitoring the online real-time data; if the anomaly score is higher than η, a fault has occurred in the chemical system, and a fault warning is issued, thereby realizing multi-working-condition fault detection of the chemical system based on local adaptive standardization.
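By way of non-limiting illustration, the comparison against the monitoring threshold in step (6-7) can be sketched in Python as follows. The sketch takes a precomputed sequence of anomaly scores in place of the online encoder and decoder calls, and it assumes that a score exactly equal to η is treated as normal, which the text does not specify. The function names are hypothetical.

```python
def is_fault(score, eta):
    """Decision rule: a score above the threshold signals a fault.

    ASSUMPTION: a score exactly equal to eta counts as normal.
    """
    return score > eta

def first_alarm(scores, eta):
    """Index of the first time point flagged as a fault, or None."""
    for q, score in enumerate(scores):
        if is_fault(score, eta):
            return q          # a fault warning would be issued here
    return None               # no alarm: monitoring simply continues

# Toy score sequence (illustrative values only, not data from the patent).
alarm_at = first_alarm([0.2, 0.4, 3.1, 2.8], eta=1.0)
```

Here alarm_at is 2, the first index whose score exceeds the threshold.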