CN118885738B

CN118885738B - Comprehensive automated testing method and system for electric control cabinets

Info

Publication number: CN118885738B
Application number: CN202411338200.8A
Authority: CN
Inventors: 李艳春; 华星; 苏丽; 濮伟新
Original assignee: WUXI KANGBEI ELECTRONIC EQUIPMENT CO Ltd
Current assignee: WUXI KANGBEI ELECTRONIC EQUIPMENT CO Ltd
Priority date: 2024-09-25
Filing date: 2024-09-25
Publication date: 2025-04-15
Anticipated expiration: 2044-09-25
Also published as: CN118885738A

Abstract

The invention relates to the technical field of data cleaning, and in particular to a comprehensive automatic test method and system for an electric control cabinet. When a simulation test model is used to test the electric control cabinet, noise data are screened effectively and denoised in a targeted manner, which improves efficiency and accuracy. The parameter time sequence data are clustered based on data similarity and a frequent pattern model is constructed, so that normal data and noise data are clearly distinguished. The relevance and the frequent patterns in the model are analyzed to quantify the noise existence probability, which provides a basis for evaluating the quality of each data set. The data quality is then obtained by jointly considering the variation trend and the fluctuation of the data together with the noise existence probability. Noise data sets are accurately identified and efficiently denoised to obtain high-quality data. Finally, the performance test of the electric control cabinet is performed on the high-quality data sets, which improves the denoising efficiency and ensures the accuracy and reliability of the test result. The invention provides a high-efficiency and accurate data processing scheme for detection of the electric control cabinet.

Description

Comprehensive automatic test method and system for electric control cabinet

Technical Field

The invention relates to the technical field of data cleaning, in particular to a comprehensive automatic test method and system for an electric control cabinet.

Background

The electric control cabinet (in full, the electrical control cabinet) integrates various electrical components to realize centralized control of equipment, i.e., comprehensive automation of the electric control cabinet. To find potential fault points of the equipment and to improve the reliability and service life of the electric control cabinet, its performance must be tested; conventionally, comprehensive automatic testing of the electric control cabinet is carried out by software simulation detection.

Currently, when an electric control cabinet is tested, a large amount of its operation data is generally acquired first and the detection is then performed by software simulation, so the quality of the acquired data is crucial to the whole simulation detection process. Because the data are disturbed by factors such as the environment during acquisition, noise is introduced and the data must be denoised. The prior art generally denoises all data indiscriminately; however, the amount of data required for simulation model detection is large, so indiscriminate denoising is inefficient. Accurately and efficiently screening out the noisy data from a large amount of data is therefore a key problem in the simulation detection process.

Disclosure of Invention

In order to solve the technical problems that denoising of a large amount of data by using an indiscriminate denoising method causes low efficiency and further influences the accuracy of a test result, the invention aims to provide a comprehensive automatic test method and system for an electric control cabinet, and the adopted technical scheme is as follows:

For any performance parameter of the electric control cabinet, acquiring a parameter data set of the performance parameter in each test period, wherein each parameter data set comprises parameter time sequence data of a plurality of data periods;

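Clustering all the parameter time sequence data according to the similarity among the parameter time sequence data in each parameter data set to obtain a clustering result; according to the numerical characteristics of all parameter time sequence data in each parameter data set and a frequent pattern mining algorithm, carrying out fusion analysis on the parameter time sequence data in each cluster in the clustering result to obtain a frequent pattern model corresponding to each parameter data set;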

In the frequent pattern model, the noise existence probability of each parameter data set is determined based on the distribution condition of each item in all parameter time sequence data, the similarity condition among frequent item sets and the position of the frequent item set in the frequent pattern model;

And screening the noise data sets in all the parameter data sets according to the data quality corresponding to the parameter data sets of all the test periods, denoising the parameter time sequence data in the noise data sets to obtain high-quality data sets corresponding to all the test periods, and performing performance test of the electric control cabinet based on all the high-quality data sets corresponding to all the performance parameters.

Further, the method for acquiring the clustering result comprises the following steps:

In each parameter data set, for parameter time sequence data in any two different data periods, determining a difference characteristic value between the two parameter time sequence data based on the difference between the numerical values and the difference condition between the slope values of the numerical values;

And taking the difference characteristic value between the parameter time sequence data as distance measurement, and carrying out K-means cluster analysis on all the parameter time sequence data in the parameter data set based on a preset K value to obtain the cluster result.

Further, the method for acquiring the frequent pattern model comprises the following steps:

for any one parameter data set, determining a numerical range based on a numerical maximum value and a numerical minimum value in the parameter data set, uniformly dividing the numerical range to obtain a preset number of interval ranges, and marking different interval ranges with different labels;

replacing each numerical value in each parameter time sequence data by using the label to obtain a label sequence corresponding to each parameter time sequence data;

Obtaining the importance index of each cluster according to the occurrence frequency of all the labels and the number of label sequences in each cluster, wherein the number of label sequences in each cluster and the occurrence frequency of the labels are positively correlated with the importance index;

in the clustering result corresponding to the parameter data set, analyzing all label sequences in each cluster based on an FP-Growth algorithm to obtain a frequent pattern tree corresponding to each cluster, and arranging the frequent pattern trees of all clusters in descending order of importance index to obtain an arrangement sequence; in the arrangement sequence, taking the first frequent pattern tree as a target tree and the frequent pattern tree following the target tree as a tree to be analyzed, and adding the part of the tree to be analyzed that differs from the target tree into the target tree to obtain a new target tree; taking the frequent pattern tree following the tree to be analyzed as a new tree to be analyzed, and adding the part of the new tree to be analyzed that differs from the new target tree into the new target tree; continuing to determine new target trees in this way until all frequent pattern trees in the arrangement sequence have been traversed, and taking the final target tree as the frequent pattern model corresponding to the parameter data set.
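By way of illustration only, the tree construction and merging described above can be sketched as follows. The sketch assumes the usual FP-tree convention (each label is kept once per sequence and labels are ordered by descending support), takes the importance index of each cluster as an input, and leaves the counts of nodes already present in the target tree unchanged when merging. The names `Node`, `build_fp_tree`, `merge_into`, `frequent_pattern_model` and `paths` are hypothetical and do not come from the invention.

```python
from collections import Counter


class Node:
    """One node of a frequent pattern tree: a label, a count, and children."""

    def __init__(self, label=None, count=0):
        self.label = label
        self.count = count
        self.children = {}


def build_fp_tree(label_sequences, min_support=1):
    """Build an FP-tree from the label sequences of one cluster."""
    # Support of a label = number of sequences in which it occurs.
    support = Counter(lab for seq in label_sequences for lab in set(seq))
    root = Node()
    for seq in label_sequences:
        # Assumption: keep each frequent label once, most frequent first.
        items = sorted(
            (lab for lab in set(seq) if support[lab] >= min_support),
            key=lambda lab: (-support[lab], lab),
        )
        node = root
        for lab in items:
            node = node.children.setdefault(lab, Node(lab))
            node.count += 1
    return root


def copy_tree(node):
    """Return a deep copy of the subtree rooted at `node`."""
    clone = Node(node.label, node.count)
    clone.children = {k: copy_tree(c) for k, c in node.children.items()}
    return clone


def merge_into(target, other):
    """Add to `target` every branch of `other` that `target` lacks."""
    for lab, child in other.children.items():
        if lab in target.children:
            merge_into(target.children[lab], child)  # shared node: descend
        else:
            target.children[lab] = copy_tree(child)  # differing part: add


def frequent_pattern_model(clusters, importance):
    """Merge the per-cluster trees in descending order of importance."""
    order = sorted(range(len(clusters)), key=lambda c: -importance[c])
    trees = [build_fp_tree(clusters[c]) for c in order]
    target = copy_tree(trees[0])  # the first tree is the target tree
    for tree in trees[1:]:        # each following tree is a tree to be analyzed
        merge_into(target, tree)
    return target


def paths(node, prefix=()):
    """List every root-to-leaf label path of a tree."""
    if not node.children:
        return [prefix]
    out = []
    for lab in sorted(node.children):
        out.extend(paths(node.children[lab], prefix + (lab,)))
    return out
```

For example, merging a cluster with label sequences (A, B, C) and (A, B) and a less important cluster with the label sequence (A, D) yields a model with the main branch A-B-C and the additional branch A-D.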

Further, the method for acquiring the noise existence probability comprises the following steps:

For any parameter data set, acquiring all frequent item sets in the corresponding frequent pattern model, wherein the occurrence frequency of each item in the frequent item sets is the occurrence frequency of a label corresponding to the item in the parameter data set;

For any frequent item set, determining a first integral noise-containing factor of the frequent item set based on the occurrence frequency of each item in the frequent item set, wherein the first integral noise-containing factor is in negative correlation with the occurrence frequency;

Fusing and averaging the first integral noise-containing factors and the second integral noise-containing factors of all frequent item sets to obtain a first noise-containing index of the parameter data set, wherein the first integral noise-containing factors and the second integral noise-containing factors are positively correlated with the first noise-containing index;

determining a second noise-containing index of the parameter data set based on the similarity conditions among all the frequent item sets;

and obtaining the noise existence probability of the parameter data set according to the first noise-containing index and the second noise-containing index of the parameter data set, wherein the first noise-containing index and the second noise-containing index are positively correlated with the noise existence probability.

Further, the method for acquiring the second noisy indicator comprises the following steps:

in all frequent item sets, taking the frequent item sets with the same item number as a type of frequent item set;

For any combination of two frequent item sets of the same type, determining the independence degree value of the frequent item sets in the combination based on the differences between the items at the same position in the two frequent item sets;

and taking the average value of the independent degree values corresponding to all combinations of all types of frequent item sets as a second noisy index of the parameter data set.
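By way of illustration only, the second noisy index can be sketched as follows. The independence degree value is assumed here to be the fraction of positions at which two frequent item sets of equal length hold different items, and a combination is assumed to be any pair of frequent item sets of the same type; neither assumption is confirmed by the text above. The function names are hypothetical.

```python
from itertools import combinations


def independence(set_a, set_b):
    """Assumed measure: fraction of positions at which the items differ."""
    differing = sum(1 for a, b in zip(set_a, set_b) if a != b)
    return differing / len(set_a)


def second_noisy_index(frequent_item_sets):
    """Average independence over all pairs of equal-length item sets."""
    # Frequent item sets with the same number of items form one type.
    by_length = {}
    for items in frequent_item_sets:
        by_length.setdefault(len(items), []).append(items)
    values = [
        independence(a, b)
        for group in by_length.values()
        for a, b in combinations(group, 2)
    ]
    # With no pair to compare, the sketch returns 0.0 by convention.
    return sum(values) / len(values) if values else 0.0
```

Under these assumptions, item sets that share few items position by position give a value close to 1, and near-identical item sets give a value close to 0.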

Further, the method for acquiring the data quality comprises the following steps:

Taking a parameter data set with noise existence probability larger than or equal to a preset noise threshold value as a target data set, taking a parameter data set with noise existence probability smaller than the preset noise threshold value as a normal data set, and setting the data quality of the normal data set to be a fixed value not smaller than 1;

for any one target data set, determining a first quality parameter of the target data set based on how the change trend varies within each parameter time sequence data in the target data set;

In each parameter time sequence data corresponding to the target data set, based on the deviation condition between each numerical value and the average value of all numerical values, obtaining a deviation factor of each numerical value, and determining the overall quality factor of each parameter time sequence data based on the deviation factors of all numerical values, wherein the overall quality factor is in negative correlation with the deviation factor;

And obtaining the data quality of the target data set according to the noise existence probability, the first quality parameter and the second quality parameter of the target data set, wherein the noise existence probability, the first quality parameter and the second quality parameter are negatively related to the data quality, and the value of the data quality is a normalized value.

Further, the method for acquiring the first quality parameter includes:

And acquiring the number of extreme points in each parameter time sequence data in the target data set, and determining a first quality parameter of the target data set based on the ratio of the sum value of the number of extreme points of all parameter time sequence data to the total number of values of all parameter time sequence data.
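By way of illustration only, the ratio described above can be sketched as follows. An extreme point is assumed to be an interior value that is strictly greater than, or strictly smaller than, both of its neighbours; the text does not fix this definition, and the function names are hypothetical.

```python
def count_extreme_points(series):
    """Count interior values that are strict local maxima or minima."""
    count = 0
    for prev, cur, nxt in zip(series, series[1:], series[2:]):
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            count += 1
    return count


def first_quality_parameter(target_data_set):
    """Total number of extreme points divided by total number of values."""
    total_extremes = sum(count_extreme_points(s) for s in target_data_set)
    total_values = sum(len(s) for s in target_data_set)
    return total_extremes / total_values if total_values else 0.0
```

A target data set whose series change direction often therefore receives a larger first quality parameter than one whose series vary smoothly.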

Further, the method for acquiring the high-quality data set comprises the following steps:

Taking the parameter data set with the data quality smaller than or equal to a preset quality threshold value as a noise data set in all the parameter data sets;

Denoising all parameter time sequence data in each noise data set based on a Kalman filtering method to obtain a denoising data set corresponding to each noise data set;

and taking the parameter data set with the data quality larger than the preset quality threshold value and all the denoising data sets as high-quality data sets corresponding to all the test periods.
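By way of illustration only, the screening and denoising described above can be sketched as follows. The Kalman filter shown is a scalar filter with a random-walk state model; the state model, the variances `process_var` and `measurement_var`, and the function names are assumptions made for the sketch, since the text only names the Kalman filtering method.

```python
def kalman_smooth(series, process_var=1e-3, measurement_var=1.0):
    """Scalar Kalman filter with an assumed random-walk state model."""
    if not series:
        return []
    estimate, error = series[0], 1.0
    out = [estimate]
    for z in series[1:]:
        error += process_var                      # predict step
        gain = error / (error + measurement_var)  # Kalman gain
        estimate += gain * (z - estimate)         # update with measurement z
        error *= 1.0 - gain
        out.append(estimate)
    return out


def high_quality_data_sets(data_sets, qualities, quality_threshold):
    """Denoise only the data sets whose quality is at or below the threshold."""
    result = []
    for data_set, quality in zip(data_sets, qualities):
        if quality <= quality_threshold:
            # Noise data set: filter every parameter time sequence data.
            result.append([kalman_smooth(s) for s in data_set])
        else:
            # Data set above the threshold is kept unchanged.
            result.append(data_set)
    return result
```

Only the data sets screened as noise data sets pass through the filter, which is where the efficiency gain over indiscriminate denoising comes from.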

Further, the performance test of the electric control cabinet based on all high-quality data sets corresponding to all performance parameters includes:

And establishing a simulation test model corresponding to the electric control cabinet, taking all high-quality data sets corresponding to all performance parameters as a database of the simulation test model, and performing performance test of the electric control cabinet to obtain a test result.

The invention also provides a comprehensive automatic test system for the electric control cabinet, which comprises:

A memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of any one of the methods when the computer program is executed.

The invention has the following beneficial effects:

According to the invention, when the simulation test model is utilized to detect the electric control cabinet, noise data can be accurately screened out from a large amount of data, so that targeted denoising treatment is performed, the efficiency is effectively improved, and the accuracy of a subsequent model test result is ensured. Firstly, for any performance parameter, a parameter data set of the performance parameter in a plurality of test periods needs to be acquired, and the parameter data set contains a plurality of parameter time sequence data, because noise data belongs to accidental phenomena compared with normal data, the parameter time sequence data in each parameter data set can be subjected to cluster analysis based on similarity among the data to obtain a cluster result, and at the moment, the parameter time sequence data in each cluster in each cluster result can distinguish the normal data and the noise data in the parameter data set to a certain extent. Further, based on numerical characteristics of all parameter time sequence data in the parameter data set and a frequent pattern mining algorithm, fusion analysis is carried out on each cluster in the clustering result, so that a frequent pattern model is obtained, and the frequent pattern model can more remarkably represent the difference between normal data and noise data. Further, the frequent pattern model can be analyzed, the relevance among the data and the position distribution of the frequent pattern and the frequent item set in the frequent pattern model are analyzed, so that the possibility of noise data existence is quantified, the noise existence probability of the parameter data set is obtained, and a reference is provided for the quality of the parameter data set to be evaluated later. 
Further, since the noise data has characteristics on data fluctuation and trend change, the change trend and the data fluctuation condition of the parameter time sequence data are comprehensively considered, and the data quality of the parameter data set can be accurately estimated by combining the noise existence probability. This helps to accurately identify and screen out the noisy dataset, and then denoise it, resulting in a high quality dataset. And finally, performing performance test of the electric control cabinet based on the high-quality data sets corresponding to all the performance parameters to obtain a test result. The invention can accurately screen out the noise data set, so that only the noise data set is denoised in a large number of data sets, the data denoising processing efficiency can be improved, and the accuracy and the reliability of the test result are ensured.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a comprehensive automatic test method for an electric control cabinet according to one embodiment of the present invention;

FIG. 2 is a flow chart of a method for constructing a frequent pattern model according to an embodiment of the present invention;

FIG. 3 is an exemplary schematic diagram of a merging sub-step of a frequent pattern model creation process according to one embodiment of the present invention;

FIG. 4 is a flowchart of a method for obtaining a noise existence probability according to an embodiment of the present invention;

FIG. 5 is a flow chart of a method for obtaining data quality according to an embodiment of the present invention;

FIG. 6 is a system block diagram of a comprehensive automatic test system for an electric control cabinet according to one embodiment of the present invention;

FIG. 7 is a schematic diagram of a system architecture of a comprehensive automatic test system for an electric control cabinet according to one embodiment of the present invention;

FIG. 8 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present invention.

Detailed Description

In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following description refers to the specific implementation, structure, characteristics and effects of a comprehensive automatic testing method and system for an electric control cabinet according to the invention in combination with the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The invention provides a comprehensive automatic test method and a system for an electric control cabinet.

Referring to FIG. 1, a flow chart of a comprehensive automatic test method for an electric control cabinet according to an embodiment of the present invention is shown, the method comprising the steps of:

Step S1, for any performance parameter of an electric control cabinet, acquiring a parameter data set of the performance parameter in each test period, wherein each parameter data set comprises parameter time sequence data of a plurality of data periods.

In the process of carrying out comprehensive automatic test on an electric control cabinet by utilizing a software simulation detection mode, a large amount of data is required to be acquired so as to ensure the accuracy of a simulation detection result, noise data exists in a data set due to various external influencing factors in the process of data acquisition, and denoising processing is required to be carried out on the data set so as to ensure the accuracy of simulation operation, so that the reliability of the simulation result is improved. In view of the large scale of the data volume, in the embodiment of the invention, the data in the data set is analyzed, so that the data set with noise is screened out and denoised, the denoising efficiency can be effectively improved, and the credibility of the simulation result is ensured.

The electric control cabinet is widely applied to multiple fields of new energy sources, electric control, food machinery, machine tools and the like, and because the data characteristics of the electric control cabinet can show differences under different process flows, in order to obtain more accurate test results, the invention mainly tests under a certain process flow. The process flow can be, for example, that when a CNC machine tool of a machining center works, a spindle motor and a feeding system of the CNC machine tool can generate variable loads, so that current and power parameters in an electric control cabinet fluctuate. In the injection molding process, the power requirement of the injection molding machine is periodically changed in the processes of heating, injecting, cooling and opening and closing the mold, so that the current and power parameters in the electric control cabinet are regularly fluctuated.

For any performance parameter of the electric control cabinet, a sensor arranged in the electric control cabinet is used to collect parameter data sets of a plurality of test periods of the electric control cabinet under the current process flow, and each parameter data set should comprise parameter time sequence data of a plurality of data periods. It should be noted that the performance parameters include voltage, current, etc., and that a test period is the period from the start of the process flow to the end of the process flow. In this embodiment the data period is set to 10 minutes and the sampling time interval is set to 0.1 s; the specific values of the data period and the sampling time interval can be adjusted according to the implementation scenario and are not limited herein.
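By way of illustration only, dividing the samples of one test period into parameter time sequence data of successive data periods can be sketched as follows. With a data period of 10 minutes and a sampling time interval of 0.1 s, each data period holds 6000 values. The function name is hypothetical, and dropping an incomplete trailing data period is an assumption of the sketch.

```python
def split_into_data_periods(samples, period_s=600.0, interval_s=0.1):
    """Split one test period into consecutive, complete data periods."""
    per_period = round(period_s / interval_s)  # 6000 values per data period
    full = len(samples) // per_period          # incomplete tail is dropped
    return [
        samples[i * per_period:(i + 1) * per_period]
        for i in range(full)
    ]
```

The list returned for one test period corresponds to one parameter data set in the sense used above.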

So far, for a given process flow, a plurality of parameter data sets corresponding to each performance parameter of the electric control cabinet can be obtained, and each parameter data set can be analyzed subsequently to determine whether it contains noise data.

And S2, clustering all the parameter time sequence data according to the similarity condition among the parameter time sequence data in each parameter data set to obtain a clustering result, and carrying out fusion analysis on the parameter time sequence data in each cluster in the clustering result according to the numerical characteristics of all the parameter time sequence data in each parameter data set and a frequent pattern mining algorithm to obtain a frequent pattern model corresponding to each parameter data set.

Because noise data is an occasional phenomenon compared with normal data, the parameter time sequence data in each parameter data set can be subjected to cluster analysis based on the similarity between the data to obtain a cluster result, and the parameter time sequence data in each cluster result can distinguish the normal data and the noise data in the parameter data set to a certain extent. Because the relevance between the normal data is strong, based on the numerical characteristics of all parameter time sequence data in the parameter data set and a frequent pattern mining algorithm, fusion analysis is carried out on each cluster in the clustering result, so that a frequent pattern model is obtained, and the frequent pattern model at the moment can more remarkably represent the difference between the normal data and the noise data, so that preparation is made for the subsequent noise data screening process.

By clustering the parameter time sequence data in each parameter data set, the similar parameter time sequence data can be classified, so that the data distribution of the parameter data sets is simplified, meanwhile, the clustering result can reveal the internal structure and mode in the parameter data, and the understanding of the relation between the parameter time sequence data is facilitated. Therefore, according to the similarity among the numerical values in each parameter time sequence data in each parameter data set, clustering analysis is carried out on all the parameter time sequence data, so that a clustering result is obtained.

Preferably, in one embodiment of the present invention, the method for obtaining a clustering result includes:

The purpose of clustering is to make samples within the same cluster as similar as possible and samples of different clusters as dissimilar as possible. In this embodiment of the present invention, the similarity between parameter time sequence data is determined mainly from the numerical differences between them.

Therefore, in each parameter data set, for parameter time sequence data in any two different data periods, a difference characteristic value between the two parameter time sequence data is determined based on the difference between numerical values and the difference condition between slope values of the numerical values, wherein a formula model of the difference characteristic value comprises:

$$D_i = \mathrm{Norm}\left(\sum_{j=1}^{n}\left(\left|k_{i,1,j}-k_{i,2,j}\right| + \left|x_{i,1,j}-x_{i,2,j}\right|\right)\right);$$

wherein, $D_i$ represents the difference characteristic value between parameter time sequence data 1 and parameter time sequence data 2 in the $i$-th parameter data set; $k_{i,1,j}$ represents the slope value of the $j$-th value of parameter time sequence data 1 in the $i$-th parameter data set; $k_{i,2,j}$ represents the slope value of the $j$-th value of parameter time sequence data 2 in the $i$-th parameter data set; $x_{i,1,j}$ represents the $j$-th value of parameter time sequence data 1 in the $i$-th parameter data set; $x_{i,2,j}$ represents the $j$-th value of parameter time sequence data 2 in the $i$-th parameter data set; $\mathrm{Norm}$ represents a normalization function; $n$ represents the total number of values in the parameter time sequence data.

In the formula model of the difference characteristic value, for any two different parameter time sequence data in the parameter data set, the difference between the values at corresponding positions, $\left|x_{i,1,j}-x_{i,2,j}\right|$, is calculated; the smaller this value, the more similar the values at the same position in the two parameter time sequence data. Similarly, after the slope value of each value is obtained, the difference between the slope values of the values at the same position, $\left|k_{i,1,j}-k_{i,2,j}\right|$, is analyzed. Since the slope value represents the change trend of the data, a smaller difference indicates a more consistent change trend at the same position of the two parameter time sequence data. Therefore, at the same position, $\left|k_{i,1,j}-k_{i,2,j}\right|$ and $\left|x_{i,1,j}-x_{i,2,j}\right|$ are added to represent the similarity of the two parameter time sequence data at that position: the smaller the sum, the higher the degree of similarity, and conversely, the larger the sum, the higher the degree of difference. Finally, the sums corresponding to all positions, i.e., all values, are accumulated, and the accumulated sum is normalized to obtain the difference characteristic value between the two parameter time sequence data.

And then taking the difference characteristic value between the parameter time sequence data as a distance measure, and carrying out K-means cluster analysis on all the parameter time sequence data in the parameter data set based on a preset K value to obtain a cluster result, wherein the parameter time sequence data in each cluster in the cluster result has a relatively consistent numerical characteristic.

The slope value of a value is calculated as the ratio of the difference between the next value and that value to the time interval between the two values; the last value in the time sequence, which has no next value, is assigned the slope value of the value preceding it. The preset K value is set to 5; the specific value can be adjusted according to the implementation scenario and is not limited herein. The K-means clustering algorithm is a technical means well known to those skilled in the art and is not described herein. The same positions in two parameter time sequence data are explained as follows: for example, if the values in one parameter time sequence data are arranged in time sequence as (1, 2, 3, 4) and the values in the other parameter time sequence data are arranged in time sequence as (a, b, c, d), then 1 and a are two values at the same position; similarly, 2 and b, 3 and c, and 4 and d are each two values at the same position.
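By way of illustration only, the slope values, the difference characteristic value and the clustering described in this embodiment can be sketched as follows. The sketch assumes NumPy, series of equal length with at least two values, min-max normalization over all pairs as the normalization function, and a K-means loop whose cluster centres are pointwise means and whose assignment step uses the unnormalized sum (a monotonic normalization does not change which centre is nearest). All function names are hypothetical.

```python
import numpy as np


def slopes(series, dt=0.1):
    """Forward-difference slope; the last value reuses the previous slope."""
    x = np.asarray(series, dtype=float)
    k = np.empty_like(x)
    k[:-1] = np.diff(x) / dt
    k[-1] = k[-2]
    return k


def raw_difference(a, b, dt=0.1):
    """Sum over positions of |slope difference| + |value difference|."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    slope_part = np.abs(slopes(a, dt) - slopes(b, dt))
    return float(np.sum(slope_part + np.abs(a - b)))


def difference_matrix(data_set, dt=0.1):
    """Pairwise difference characteristic values, scaled to [0, 1]."""
    n = len(data_set)
    d = np.zeros((n, n))
    for p in range(n):
        for q in range(p + 1, n):
            d[p, q] = d[q, p] = raw_difference(data_set[p], data_set[q], dt)
    span = d.max() - d.min()  # the minimum is the zero diagonal
    return d / span if span > 0 else d


def kmeans_by_difference(data_set, k=5, dt=0.1, iterations=50, seed=0):
    """K-means whose assignment step uses the difference measure above."""
    x = np.asarray(data_set, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    labels = None
    for _ in range(iterations):
        new_labels = np.array([
            int(np.argmin([raw_difference(s, c, dt) for c in centroids]))
            for s in x
        ])
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable, so the centres are too
        labels = new_labels
        for c in range(k):
            members = x[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return labels
```

The preset K value of 5 is the default of `k`; it must not exceed the number of parameter time sequence data in the parameter data set.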

In other embodiments of the present invention, the following method may be adopted to analyze the similarity of the parameter time sequence data in each parameter data set, so as to implement clustering to obtain a clustering result.

Since the dynamic time warping (DTW) algorithm can be used to measure the similarity of two time series, in each parameter data set the DTW value between the parameter time sequence data in any two different data periods is acquired and normalized, so as to obtain the difference characteristic value between the two parameter time sequence data. The formula model of the difference characteristic value comprises:

$$D_i = \mathrm{Norm}\left(\mathrm{DTW}_i\right);$$

wherein, $D_i$ represents the difference characteristic value between parameter time sequence data 1 and parameter time sequence data 2 in the $i$-th parameter data set; $\mathrm{DTW}_i$ represents the DTW value between parameter time sequence data 1 and parameter time sequence data 2 in the $i$-th parameter data set; $\mathrm{Norm}$ represents the normalization function.

In the formula model, the smaller the DTW value between two parameter time sequence data, the more similar the two parameter time sequence data; therefore, the DTW value is normalized to obtain the difference characteristic value between the two parameter time sequence data.

And then taking the difference characteristic value between the parameter time sequence data as a distance measure, and carrying out K-means cluster analysis on all the parameter time sequence data in the parameter data set based on a preset K value to obtain a cluster result, wherein the parameter time sequence data in each cluster in the cluster result has a relatively consistent characteristic.
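By way of illustration only, the DTW value and its normalization can be sketched as follows. The sketch uses the classic dynamic-programming form of DTW with the absolute difference as local cost and assumes that normalization divides by the largest pairwise DTW value in the parameter data set; the text does not specify either choice. The function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def dtw_difference_matrix(data_set):
    """Pairwise DTW values divided by the largest pairwise DTW value."""
    n = len(data_set)
    d = [[0.0] * n for _ in range(n)]
    for p in range(n):
        for q in range(p + 1, n):
            d[p][q] = d[q][p] = dtw_distance(data_set[p], data_set[q])
    largest = max((max(row) for row in d), default=0.0)
    if largest == 0:
        return d
    return [[value / largest for value in row] for row in d]
```

This pure-Python form costs time proportional to the product of the two series lengths per pair, so for series of several thousand values a windowed or compiled DTW implementation would be preferable in practice.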

So far, in each parameter data set, all the clustering clusters can be obtained by carrying out cluster analysis on the parameter time sequence data in the parameter data set, different data characteristics are provided in different clustering clusters, normal data and noise data can be distinguished to a certain extent, and a reference is provided for the subsequent process.

The frequent pattern mining algorithm can be used for retrieving frequent item set information and can quickly find out frequent patterns and associated information in the parameter time sequence data, so that whether noise data exist in the parameter data set can be more effectively identified. The clustering results have obvious differences among different clustering clusters, so that the main components and the branch components can be more intuitively distinguished in a final frequent pattern model by carrying out frequent pattern mining and fusion analysis on the different clustering clusters, and the branch components are more likely to represent noise data because the noise data are less than normal data.

Preferably, in one embodiment of the present invention, the method for acquiring the frequent pattern model includes:

Referring to fig. 2, a method flowchart of a method for constructing a frequent pattern model according to an embodiment of the invention is shown, the method includes the following steps:

step S201, based on the numerical characteristics of all the parameter time sequence data in the parameter data set, carrying out label conversion on the numerical value in each parameter time sequence data, and determining a label sequence corresponding to each parameter time sequence data.

If each value were treated as a separate class, the frequent pattern model would become too large when the value range is wide, for example 0-100. Therefore, for any parameter data set, the value range is determined from the maximum and minimum values in the data set and is divided uniformly into a preset number of interval ranges, and each interval range is marked with a different label.

Each value in each parameter time sequence data is then replaced by its label to obtain the label sequence corresponding to that parameter time sequence data.

As an example, suppose the value range is 0-100 and the preset number is 10; the interval ranges are then 0-10, 11-20, 21-30, ..., 71-80, 81-90 and 91-100. Each interval range is marked with a different letter; in this embodiment of the invention the letters A-J are used, i.e. the label corresponding to the interval range 0-10 is A, the label corresponding to 11-20 is B, and so on. If the values in one parameter time sequence data are (0, 15, 14, 35, 21, 50, 78), the corresponding label sequence is (A, B, B, D, C, E, H).

It should be noted that the preset number of settings may be adjusted according to the implementation scenario, which is not limited herein.
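The label conversion of step S201 can be sketched as follows. This is a minimal illustration only: the function name `to_labels`, its parameters, and the choice of right-closed intervals (so that 10 falls in the first interval and 11 in the second) are assumptions made for the sketch, not details fixed by the embodiment.

```python
import math
import string


def to_labels(values, lo, hi, n_bins=10):
    """Replace each value in [lo, hi] by a letter label (A, B, C, ...).

    Assumption: the range is split into n_bins equal-width intervals and a
    value lying exactly on an interior boundary belongs to the lower interval.
    """
    if not 1 <= n_bins <= len(string.ascii_uppercase):
        raise ValueError("n_bins must be between 1 and 26")
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        # ceil(...) - 1 keeps a boundary value such as 10 in the lower interval.
        idx = 0 if width == 0 else math.ceil((v - lo) / width) - 1
        idx = min(max(idx, 0), n_bins - 1)  # clamp the minimum value and outliers
        labels.append(string.ascii_uppercase[idx])
    return labels


print(to_labels([0, 15, 14, 35, 21, 50, 78], lo=0, hi=100, n_bins=10))
# → ['A', 'B', 'B', 'D', 'C', 'E', 'H']
```

A smaller preset number gives a more compact frequent pattern model at the cost of coarser labels.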

Step S202, constructing frequent pattern trees corresponding to tag sequences in all clusters in a clustering result based on an FP-Growth algorithm, and merging all the frequent pattern trees to obtain a frequent pattern model corresponding to each parameter data set.

The FP-Growth algorithm utilizes an FP-Tree (frequent pattern Tree) data structure to retrieve frequent item set information, which enables the algorithm to more quickly discover frequent patterns and associated information in the parameter time sequence data, thereby more effectively identifying whether noise data exists in the parameter data set.

After the processing in step S201, each parameter time sequence data in the parameter data set corresponds to one tag sequence. The occurrence frequency of each tag among all tags can then be calculated; this frequency represents the importance of the values corresponding to the tag among all values.

Because the number of parameter time sequence data in the cluster can represent the importance degree of the cluster, in each cluster, according to the occurrence frequency of all labels and the number of label sequences in each cluster, an importance degree index of each cluster is obtained, and the number of label sequences in each cluster and the occurrence frequency of labels are positively correlated with the importance degree index. The formula model of the importance index may specifically be, for example:

$$I_{i,j} = \mathrm{Norm}\left(N_{i,j} \times \sum_{k=1}^{M_{i,j}} f_{i,j,k}\right);$$

wherein $I_{i,j}$ represents the importance index of the $j$-th cluster in the $i$-th parameter data set; $N_{i,j}$ represents the number of tag sequences in the $j$-th cluster of the $i$-th parameter data set; $M_{i,j}$ represents the total number of tags in the $j$-th cluster of the $i$-th parameter data set; $f_{i,j,k}$ represents the frequency of occurrence of the $k$-th tag in the $j$-th cluster of the $i$-th parameter data set; $\mathrm{Norm}(\cdot)$ represents the normalization function.

In the formula model of the importance index, the more tag sequences a cluster contains, the more likely the data in the cluster are normal data that represent the data characteristics of the process flow, and the higher the importance of the cluster. In addition, the occurrence frequency of each tag among all tags in the parameter data set is calculated: the greater the occurrence frequency, the more likely the data corresponding to the tag are normal data that represent the data characteristics of the process flow. Hence, the greater the occurrence frequencies of the tags in a cluster and the greater the number of tag sequences, the higher the importance of the cluster. Based on this logic, the formula model of the importance index is constructed and the importance index of each cluster is obtained.
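The logic above can be illustrated with a short sketch. Several points are assumptions made only for this illustration: the two positively correlated quantities are combined by multiplication, the tags of a cluster are taken to be its distinct tags, and normalization is done by dividing by the largest raw value. The function name `importance_indices` is hypothetical.

```python
from collections import Counter


def importance_indices(clusters):
    """clusters: list of clusters, each a list of tag sequences (lists of tags).

    Returns one importance index per cluster, scaled so the largest is 1.0.
    """
    all_tags = [tag for cluster in clusters for seq in cluster for tag in seq]
    if not all_tags:
        return [0.0 for _ in clusters]
    total = len(all_tags)
    # Occurrence frequency of each tag among all tags of the parameter data set.
    freq = {tag: n / total for tag, n in Counter(all_tags).items()}

    raw = []
    for cluster in clusters:
        distinct = {tag for seq in cluster for tag in seq}
        # Assumed combination: (number of tag sequences) x (sum of tag frequencies).
        raw.append(len(cluster) * sum(freq[tag] for tag in distinct))

    top = max(raw)
    return [value / top for value in raw]  # assumed normalization: divide by max


clusters = [
    [["A", "B"], ["A", "B"], ["A", "C"]],  # large cluster with frequent tags
    [["H", "J"]],                          # small cluster with rare tags
]
print(importance_indices(clusters))
```

In this toy input the large cluster receives the highest index and the single-sequence cluster a much lower one, which matches the intent that small clusters with rare tags are the more likely carriers of noise.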

In other embodiments of the present invention, the value obtained by normalizing the sum of the occurrence frequencies of all the tags in the parameter data set may be used as the importance index of the cluster, and the specific formula model is as follows:

$$I_{i,j} = \mathrm{Norm}\left(\sum_{k=1}^{M_{i,j}} f_{i,j,k}\right);$$

wherein $I_{i,j}$ represents the importance index of the $j$-th cluster in the $i$-th parameter data set; $M_{i,j}$ represents the total number of tags in the $j$-th cluster of the $i$-th parameter data set; $f_{i,j,k}$ represents the frequency of occurrence of the $k$-th tag in the $j$-th cluster of the $i$-th parameter data set; $\mathrm{Norm}(\cdot)$ represents the normalization function.

In this formula model, the occurrence frequency of each tag among all tags in the parameter data set is calculated. The greater the occurrence frequency, the more often the data corresponding to the tag appear in the parameter data set, the more likely those data are normal data, and the better they represent the data characteristics of the process flow. Therefore, the higher the occurrence frequencies of the tags in a cluster, the higher the importance of that cluster.

Then, in the clustering result corresponding to the parameter data set, all tag sequences in each cluster are analyzed based on the FP-Growth algorithm to obtain the frequent pattern tree corresponding to each cluster. It should be noted that the method for constructing a frequent pattern tree is a technical means well known to those skilled in the art and is not described herein.

All frequent pattern trees are arranged according to the importance index of their clusters to obtain a permutation sequence. The first frequent pattern tree in the sequence is taken as the target tree, and the next frequent pattern tree is taken as the tree to be analyzed; the part of the tree to be analyzed that differs from the target tree is added to the target tree to obtain a new target tree. The frequent pattern tree following the tree to be analyzed is then taken as the new tree to be analyzed, and the part in which it differs from the new target tree is added to the new target tree. New target trees are determined in this way until all frequent pattern trees in the sequence have been traversed, and the final target tree is taken as the frequent pattern model corresponding to the parameter data set.

The merging process is illustrated as follows. Suppose the permutation sequence contains 4 frequent pattern trees, denoted 1, 2, 3 and 4. Frequent pattern tree 1 is taken as the target tree; frequent pattern tree 2 is compared with frequent pattern tree 1, and the part that exists in tree 2 but not in tree 1 is added to tree 1 to obtain a first merged tree. Frequent pattern tree 3 is then compared with the first merged tree, and the part that exists in tree 3 but not in the first merged tree is added to obtain a second merged tree. Finally, frequent pattern tree 4 is compared with the second merged tree, and the part that exists in tree 4 but not in the second merged tree is added to obtain a third merged tree, which is the final frequent pattern model. Referring to FIG. 3, an exemplary schematic diagram of a merging sub-step in the frequent pattern model building process is shown.
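The merging of frequent pattern trees can be sketched as follows. For brevity the sketch represents each tree as a plain prefix tree of nested dictionaries without node counts, which is a simplification of a real FP-tree; the helper names `build_tree`, `merge_into` and `build_model` are hypothetical, and the trees are assumed to be passed in already arranged in the permutation sequence.

```python
import copy


def build_tree(sequences):
    """Build a prefix tree (nested dicts) from tag sequences. Simplified FP-tree."""
    root = {}
    for seq in sequences:
        node = root
        for tag in seq:
            node = node.setdefault(tag, {})
    return root


def merge_into(target, other):
    """Add to `target` every branch of `other` that `target` does not yet contain."""
    for tag, subtree in other.items():
        if tag in target:
            merge_into(target[tag], subtree)        # shared prefix: descend
        else:
            target[tag] = copy.deepcopy(subtree)    # differing part: add it
    return target


def build_model(trees_in_order):
    """Merge the trees one by one, the first tree acting as the initial target tree."""
    model = {}
    for tree in trees_in_order:
        merge_into(model, tree)
    return model


tree_1 = build_tree([["A", "B"], ["A", "C"]])
tree_2 = build_tree([["A", "B", "D"], ["H"]])
print(build_model([tree_1, tree_2]))
# → {'A': {'B': {'D': {}}, 'C': {}}, 'H': {}}
```

Paths shared by several clusters stay in the trunk of the merged model, while paths contributed only by later trees appear as additional branches.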

The frequent pattern model is thus obtained. Because a frequent pattern tree is built for each cluster, each tree preserves the data characteristics and association patterns of its cluster well. Meanwhile, because noise data are often assigned to smaller clusters during clustering, merging the frequent pattern trees based on the importance index of each cluster yields a frequent pattern model that better preserves the normal data, i.e. the characteristic data, of the current parameter data set. Normal data can therefore be distinguished from noise data: normal data are more prominent in the frequent pattern model because of their frequency and association, whereas noise data tend to appear as isolated or low-frequency items.

Step S3, determining the noise existence probability of each parameter data set based on the distribution of each item in all parameter time sequence data, the similarity among frequent item sets, and the positions of the frequent item sets in the frequent pattern model; and obtaining the data quality of each parameter data set according to the noise existence probability of the parameter data set, the changes in the trend of the values in each parameter time sequence data in the parameter data set, and the fluctuation of those values.

Through the above steps, the frequent pattern model corresponding to the parameter data set is constructed. Compared with noise data, normal data are high-frequency items and tend to form the trunk part of the frequent pattern model, whereas noise data are low-frequency items that form atypical paths clearly distinct from the paths formed by normal data. The noise existence probability of the parameter data set can therefore be determined based on the distribution of each item, the similarity among frequent item sets, and the positions of the frequent item sets in the frequent pattern model. Compared with normal data, noise data show no obvious trend distribution and change trend frequently, so the trend changes of the values in the parameter time sequence data can be used as a reference for evaluating data quality. Noise data and abnormal data both appear as outlier data points and therefore show similar characteristics in the frequent pattern model. However, abnormal data can be used to simulate possible fault conditions of the electric control cabinet, whereas noise data do not help the simulation and detection of the electric control cabinet, so the two also need to be distinguished. The outlier degree of abnormal data is more pronounced than that of noise data and has a greater influence on the fluctuation of the data as a whole, so the fluctuation of the data in the parameter time sequence data is also used as a reference for evaluating data quality. Combined with the noise existence probability of the parameter data set, the data quality of the parameter data set is thus analyzed more comprehensively.

Because noise data generally appears as items that are infrequent, dissimilar to other item sets, or are located abnormally in a tree, by comprehensively considering the frequency of occurrence of the items, the similarity of the item sets, and the location information, the characteristics and structure of the data can be more comprehensively known, so that the noise existence probability of the parameter data set can be calculated for identifying noise data.

Preferably, in one embodiment of the present invention, the method for acquiring the noise existence probability includes:

Referring to FIG. 4, which shows a flowchart of a method for obtaining the noise existence probability according to an embodiment of the present invention, the method includes the following steps:

Step S401, determining a first noise-containing index of the parameter data set based on the distribution condition of each item in all parameter time sequence data and the position of the frequent item set in the frequent pattern model.

For any parameter data set, all frequent item sets in the corresponding frequent pattern model are acquired.

The frequency of occurrence of each item in the frequent item set is the frequency of occurrence of the label corresponding to the item in each parameter data set, and by considering the frequency of occurrence of the item, the items with low frequency of occurrence can be more effectively identified, and the items are likely to correspond to noise data. For any one frequent item set, determining a first overall noise-containing factor of the frequent item set based on the occurrence frequency of each item in the frequent item set, namely the occurrence frequency of each label, wherein the first overall noise-containing factor is inversely related to the occurrence frequency.

The frequent pattern model reflects the association relations among items through its tree structure: items corresponding to normal data tend to form the trunk part of the tree model, and items corresponding to noise data tend to form the branch parts. The position information of a frequent item set in the frequent pattern model, in particular the number of bifurcation points, can therefore be used to further capture the intrinsic structural characteristics of the data. Considering the number of bifurcation points helps in understanding the complexity and diversity of the data and thus in assessing the presence of noise more accurately. A second overall noise-containing factor of the frequent item set is determined based on the number of bifurcation points of the path corresponding to the frequent item set in the frequent pattern model, and the second overall noise-containing factor is negatively correlated with the number of bifurcation points.

Finally, the first overall noise-containing factors and the second overall noise-containing factors of all the frequent item sets are fused and averaged to obtain a first noisy index of the parameter data set, wherein both factors are positively correlated with the first noisy index. The formula model of the first noisy index may specifically be, for example:

$$A_i = \frac{1}{Z_i}\sum_{z=1}^{Z_i} \frac{1}{b_{i,z} \times \sum_{k=1}^{n_{i,z}} p_{i,z,k} + \varepsilon_1};$$

wherein $A_i$ represents the first noisy index of the $i$-th parameter data set; $Z_i$ represents the total number of frequent item sets in the frequent pattern model corresponding to the $i$-th parameter data set; $b_{i,z}$ represents the number of bifurcation points of the path of the $z$-th frequent item set corresponding to the $i$-th parameter data set in the frequent pattern model; $n_{i,z}$ represents the number of items of the $z$-th frequent item set corresponding to the $i$-th parameter data set; $p_{i,z,k}$ represents the frequency of occurrence of the $k$-th item of the $z$-th frequent item set corresponding to the $i$-th parameter data set; $\varepsilon_1$ represents a preset first parameter.

In the formula model of the first noisy index, items corresponding to normal data occur more frequently in the parameter data set. For each frequent item set, the greater the occurrence frequencies of its items, the lower the probability that the frequent item set contains noise data. The first overall noise-containing factor of the frequent item set is therefore determined from the occurrence frequency of each of its items and, to correct the logical relationship, is negatively correlated with the occurrence frequency. Meanwhile, items corresponding to normal data are mostly distributed in the trunk part of the frequent pattern model. If the path of a frequent item set has more bifurcation points in the frequent pattern model, the path lies in the trunk part, i.e. in a core position; the data corresponding to its items are then more likely to be normal data, and the frequent item set is less likely to contain noise data. The number of bifurcation points is therefore subjected to negative correlation mapping to correct the logical relationship, which gives the second overall noise-containing factor of the frequent item set. Finally, the first and second overall noise-containing factors are multiplied, and the product is taken as the comprehensive noise-containing factor of the frequent item set; the comprehensive noise-containing factors of all frequent item sets are averaged to obtain the first noisy index of the parameter data set.

The preset first parameter serves to prevent the denominator from being 0; its value may be 0.001, and the specific value can be adjusted according to the implementation scenario, which is not limited herein.
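A hedged sketch of step S401 follows. The exact negative-correlation mappings are not fixed by the text, so the sketch assumes reciprocal mappings combined into a single fraction, treats each frequent item set as a root-to-node path in a nested-dictionary tree, and counts as a bifurcation point every node on the path that has more than one child. The names `bifurcation_count` and `first_noisy_index` are hypothetical.

```python
def bifurcation_count(tree, path):
    """Number of nodes along `path` that have more than one child (assumed definition)."""
    node, count = tree, 0
    for tag in path:
        node = node[tag]
        if len(node) > 1:
            count += 1
    return count


def first_noisy_index(tree, item_sets, freq, eps=0.001):
    """Average, over all frequent item sets, of an assumed comprehensive noise factor.

    freq maps each item (tag) to its occurrence frequency in the parameter data set.
    eps plays the role of the preset first parameter (keeps the denominator non-zero).
    """
    factors = []
    for items in item_sets:
        freq_sum = sum(freq[item] for item in items)
        forks = bifurcation_count(tree, items)
        # Low item frequency or few bifurcation points -> larger noise factor.
        factors.append(1.0 / (forks * freq_sum + eps))
    return sum(factors) / len(factors) if factors else 0.0


tree = {"A": {"B": {"D": {}}, "C": {}}, "H": {}}
freq = {"A": 0.4, "B": 0.3, "C": 0.15, "D": 0.1, "H": 0.05}
print(first_noisy_index(tree, [["A", "B", "D"]], freq))  # trunk path, small value
print(first_noisy_index(tree, [["H"]], freq))            # isolated path, large value
```

With reciprocal mappings the value is dominated by paths that have no bifurcation point at all, so in practice the preset first parameter also controls how strongly such isolated paths are penalized.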

Step S402, determining a second noisy index of the parameter data set based on the similarity condition among the frequent item sets.

Among all the frequent item sets, the frequent item sets with the same number of items are taken as one class of frequent item sets, and within each class every two frequent item sets form one combination.

For any one combination, the difference between the items at the same position in the two frequent item sets is analyzed: the difference judgment index at a position is set to 0 when the items at that position are the same, and to 1 otherwise. The independence degree value of the frequent item sets in the combination is then determined based on the difference judgment indexes of all positions and the sequence numbers of the positions. Since similar frequent item sets should have the same item at the same position, the independence degree value reflects the inherent relationship between the two frequent item sets. The formula model of the independence degree value comprises:

$$D_{c,u} = \sum_{k=1}^{n_c} \frac{1}{k} \times d_{c,u,k};$$

wherein $D_{c,u}$ represents the independence degree value between the frequent item sets of the $u$-th combination in the $c$-th class of frequent item sets; $n_c$ represents the number of items of each frequent item set in the $c$-th class of frequent item sets; $k$ represents the item number index of a frequent item set; $d_{c,u,k}$ represents the difference judgment index of the $k$-th item between the frequent item sets of the $u$-th combination in the $c$-th class of frequent item sets.

In the formula model of the independence degree value, normal data form the main body compared with noise data, so the frequent item sets corresponding to normal data should be highly similar: items at the same position should be the same, and positions where the items differ should lie toward the end of the frequent item sets. The items at the same position are therefore compared first; the difference judgment index at a position is set to 1 if the items differ and to 0 if they are the same. The more positions at which the two frequent item sets differ, the greater the difference between them, the lower their similarity, and the more likely they are to contain noise. Meanwhile, the position of an item, i.e. its item number index, is used as a weight during the comparison: the earlier the position at which a difference occurs, the more noise components are indicated. The item number index is therefore subjected to negative correlation mapping to correct the logical relationship and is then used as the adjustment weight. Finally, the difference judgment indexes are weighted by the adjustment weights and accumulated over all items to obtain the independence degree value. The larger the independence degree value, the stronger the difference between the two frequent item sets in the combination and the higher the possibility of noise.

The determination of the adjustment weight and the difference judgment index is exemplified here. Take two frequent item sets of two items each, {A, F} and {C, F}. A and C are the first items, so the item number index is 1, the adjustment weight of the first item is 1, and, because A and C differ, the difference judgment index is 1. F and F are the second items, so the item number index is 2, the adjustment weight of the second item is smaller than that of the first item, and, because the two items are the same, the difference judgment index is 0.

At this point, each combination in each class of frequent item sets corresponds to an independence degree value. The average of the independence degree values of all combinations is taken as the second noisy index of the parameter data set; the larger the value, the higher the likelihood that noise is present in the parameter data set.
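Step S402 can be sketched as below. The adjustment weight is assumed to be the reciprocal of the item number index (1 for the first item, 1/2 for the second, and so on), which is one possible negative-correlation mapping consistent with the first item having weight 1; the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def independence(set_a, set_b):
    """Weighted count of positions at which two equally long item sets differ."""
    return sum(
        (1.0 / k) * (x != y)  # assumed adjustment weight 1/k times difference index
        for k, (x, y) in enumerate(zip(set_a, set_b), start=1)
    )


def second_noisy_index(item_sets):
    """Mean independence degree value over all pairs within each length class."""
    classes = defaultdict(list)
    for items in item_sets:
        classes[len(items)].append(items)
    values = [
        independence(a, b)
        for group in classes.values()
        for a, b in combinations(group, 2)
    ]
    return sum(values) / len(values) if values else 0.0


print(independence(["A", "F"], ["C", "F"]))  # → 1.0 (differs at the first position)
print(independence(["A", "F"], ["A", "G"]))  # → 0.5 (differs at the second position)
```

A difference in the first position therefore contributes twice as much as one in the second position, reflecting the idea that early differences indicate more noise.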

Step S403, obtaining the noise existence probability of the parameter data set according to the first noise-containing index and the second noise-containing index of the parameter data set.

And taking the value obtained after normalization of the sum value of the first noisy index and the second noisy index of the parameter data set as the noise existence probability of the parameter data set. The formula model of the noise existence probability includes:

$$P_i = \mathrm{Norm}\left(A_i + B_i\right);$$

wherein $P_i$ represents the noise existence probability of the $i$-th parameter data set; $A_i$ represents the first noisy index of the $i$-th parameter data set; $B_i$ represents the second noisy index of the $i$-th parameter data set; $\mathrm{Norm}(\cdot)$ represents the normalization function.

In the formula model of the noise existence probability, based on the analysis in step S401 and step S402, the larger the first noisy index, the greater the possibility that noise exists in the parameter data set, and the larger the second noisy index, the greater that possibility as well. The sum of the first noisy index and the second noisy index is therefore normalized to obtain the noise existence probability of the parameter data set.

In other embodiments of the present invention, the value obtained by normalizing the product of the first noisy index and the second noisy index may be used as the noise existence probability of the parameter data set.

So far, based on the method, the noise existence probability of the parameter data set of each test period can be obtained, and in the subsequent process, the data quality of each parameter data set can be continuously analyzed by combining the noise existence probability.

Abnormal data can be used to simulate possible fault conditions of the electric control cabinet, so that faults can be found more effectively during simulation. Noise data, by contrast, do not help the fully automatic simulation test of the electric control cabinet and may affect the final simulation result, so the more noise data there are, the worse the data quality. Meanwhile, both noise data and abnormal data exhibit outlier characteristics, and the outlier characteristics of abnormal data points are more pronounced. The normal data sets are therefore screened out first and their data quality is set to a fixed value; the abnormal data sets and the noise data sets are then further distinguished among the remaining data sets according to characteristics such as the fluctuation of the values. In this way the data quality of the parameter data sets is measured comprehensively, and the noise data sets can be screened out accurately in the subsequent process.

Preferably, in one embodiment of the present invention, the method for acquiring data quality includes:

Referring to FIG. 5, which shows a flowchart of a data quality acquisition method according to an embodiment of the present invention, the method includes the following steps:

Step S501, distinguishing a normal data set from a target data set based on the noise existence probability of the parameter data set, and acquiring the data set quality of the normal data set.

The parameter data set having a noise existence probability greater than or equal to a preset noise threshold is taken as a target data set, the parameter data set having a noise existence probability less than the preset noise threshold is taken as a normal data set, and the data quality of the normal data set is set to a fixed value not less than 1. It should be noted that, in the embodiment of the present invention, the preset noise threshold is set to 0.6, and the data quality of the normal data set is set to 1, and specific values can be adjusted according to the implementation scenario, which is not limited herein.
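The normalization of step S403 and the split of step S501 can be sketched together. The normalization function is not specified in the text, so min-max scaling across the parameter data sets is assumed here; the function names are hypothetical, while the threshold 0.6 and the fixed quality 1 are the values given in this embodiment.

```python
def noise_probabilities(first_indices, second_indices):
    """Normalize the sum of the two noisy indices (assumed: min-max over data sets)."""
    sums = [a + b for a, b in zip(first_indices, second_indices)]
    if not sums:
        return []
    low, high = min(sums), max(sums)
    if high == low:
        return [0.0 for _ in sums]
    return [(s - low) / (high - low) for s in sums]


def split_data_sets(probabilities, noise_threshold=0.6, normal_quality=1.0):
    """Return indices of target data sets and a quality map for normal data sets."""
    targets = [i for i, p in enumerate(probabilities) if p >= noise_threshold]
    normal = {i: normal_quality for i, p in enumerate(probabilities) if p < noise_threshold}
    return targets, normal


probs = noise_probabilities([0.2, 1.0, 3.0], [0.1, 0.5, 2.0])
targets, normal = split_data_sets(probs)
print(targets, normal)  # → [2] {0: 1.0, 1: 1.0}
```

Only the data sets listed in `targets` need the further quality analysis of steps S502 to S504.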

Step S502, for any one target data set, determining a first quality parameter of the target data set based on the changes in the trend of each parameter time sequence data in the target data set.

Since noise data appears as irregularly distributed data points in the dataset, i.e., there is no apparent trend distribution, it can be stated that noise components in the target dataset may be more if the data trend of the parametric temporal data changes more frequently.

The extreme points of a group of data comprise the maximum points and the minimum points, i.e. the locally highest or locally lowest points in the data sequence, and mark the nodes at which the data pass from one trend to another. For any target data set, the number of extreme points in each parameter time sequence data in the target data set is therefore obtained, and the first quality parameter of the target data set is determined as the ratio of the sum of the numbers of extreme points of all parameter time sequence data to the total number of values of all parameter time sequence data. The first quality parameter quantifies how frequently the data trend changes in the time sequence data of the target data set; the larger the first quality parameter, the more noise data the target data set contains and the worse the data quality.
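The first quality parameter can be sketched as follows. The sketch assumes strict local maxima and minima over interior points (plateaus and end points are not counted); the function names are hypothetical.

```python
def count_extreme_points(series):
    """Count interior points that are strictly higher or lower than both neighbours."""
    count = 0
    for prev, cur, nxt in zip(series, series[1:], series[2:]):
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            count += 1
    return count


def first_quality_parameter(data_set):
    """Ratio of the total number of extreme points to the total number of values."""
    total_values = sum(len(series) for series in data_set)
    if total_values == 0:
        return 0.0
    total_extremes = sum(count_extreme_points(series) for series in data_set)
    return total_extremes / total_values


data_set = [[1, 3, 2, 4, 3], [1, 2, 3, 4]]
print(count_extreme_points(data_set[0]))  # → 3
print(count_extreme_points(data_set[1]))  # → 0
print(first_quality_parameter(data_set))  # 3 extreme points out of 9 values
```

A steadily rising series contributes no extreme points, while a series that zigzags at every step pushes the ratio toward 1.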

Step S503, obtaining a second quality parameter of the target data set according to the fluctuation condition of the numerical value in each parameter time sequence data in the target data set.

The target data set may contain noise data and anomaly data, which should be distinguished from noise data in order to measure the data quality of the target data set more accurately, since anomaly data is more valuable than noise data. Although both noise data and abnormal data exhibit outlier characteristics, since the abnormal data is generally more deviated from normal data, the difference between the data can be analyzed to distinguish the noise data from the abnormal data.

In each parameter time sequence data of the target data set, the deviation of each value from the average of all values is taken as the deviation factor of that value, and the value obtained by negative correlation mapping of the sum of the deviation factors of all values is taken as the overall quality factor of the parameter time sequence data. The smaller the deviation factors, the larger the overall quality factor, which indicates that the parameter time sequence data contain more noise data.

And carrying out averaging treatment on the overall quality factors of all the parameter time sequence data to obtain a second quality parameter of the target data set, wherein the larger the second quality parameter is, the more noise data contained in the target data set is, and the worse the data quality is. The formula model of the second quality parameter includes:

$$Q_{2,i} = \frac{1}{T_i}\sum_{t=1}^{T_i} \frac{1}{\sum_{v=1}^{V_{i,t}} \left| x_{i,t,v} - \bar{x}_{i,t} \right| + \varepsilon_2};$$

wherein $Q_{2,i}$ represents the second quality parameter of the $i$-th target data set; $T_i$ represents the number of parameter time sequence data in the $i$-th target data set; $V_{i,t}$ represents the total number of values in the $t$-th parameter time sequence data of the $i$-th target data set; $x_{i,t,v}$ represents the $v$-th value in the $t$-th parameter time sequence data of the $i$-th target data set; $\bar{x}_{i,t}$ represents the average of the values in the $t$-th parameter time sequence data of the $i$-th target data set; $\varepsilon_2$ represents a preset second parameter.

In the formula model of the second quality parameter, for each parameter time sequence data in any target data set, the average of all values is calculated; this average characterizes the average level of the values in that parameter time sequence data. The difference between each value and the average is then calculated to obtain the deviation factor of the value. Compared with abnormal data, noise data have smaller deviation factors, so the sum of the deviation factors of all values is subjected to negative correlation mapping to obtain the overall quality factor of each parameter time sequence data; the larger this value, the more noise data the parameter time sequence data contain. Finally, the overall quality factors of all parameter time sequence data in the target data set are averaged to obtain the second quality parameter of the target data set.

The preset second parameter serves to prevent the denominator from being 0; its value is 0.001, and the specific value can be adjusted according to the implementation scenario, which is not limited herein.
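A minimal sketch of the second quality parameter is given below. It assumes that the deviation factor is the absolute difference from the series mean and that the negative correlation mapping is a reciprocal with the preset second parameter in the denominator; the function name is hypothetical.

```python
def second_quality_parameter(data_set, eps=0.001):
    """Mean over all series of 1 / (sum of absolute deviations from the mean + eps)."""
    factors = []
    for series in data_set:
        if not series:
            continue  # an empty series carries no information
        mean = sum(series) / len(series)
        deviation_sum = sum(abs(x - mean) for x in series)
        # Small total deviation -> large overall quality factor (noise-like data).
        factors.append(1.0 / (deviation_sum + eps))
    return sum(factors) / len(factors) if factors else 0.0


mild = [[10.0, 10.1, 9.9]]    # small fluctuations around the mean
wild = [[10.0, 30.0, -10.0]]  # strongly deviating values, more anomaly-like
print(second_quality_parameter(mild) > second_quality_parameter(wild))  # → True
```

Under these assumptions a target data set whose values stay close to their mean scores high, whereas one with strongly deviating values scores low, which is what separates noise-like sets from anomaly-like sets.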

Step S504, the noise existence probability, the first quality parameter and the second quality parameter of the target data set are integrated to obtain the data quality of the target data set.

The product of the noise existence probability, the first quality parameter and the second quality parameter of the target data set is subjected to negative correlation mapping and normalization to obtain the data quality of the target data set. The formula model of the data quality may specifically be, for example:

$$W_i = \exp\left(-\left(P_i \times Q_{1,i} \times Q_{2,i}\right)\right);$$

wherein $W_i$ represents the data quality of the $i$-th target data set; $P_i$ represents the noise existence probability of the $i$-th target data set; $Q_{1,i}$ represents the first quality parameter of the $i$-th target data set; $Q_{2,i}$ represents the second quality parameter of the $i$-th target data set; $\exp(\cdot)$ represents the exponential function with the natural constant $e$ as its base.

In the formula model of the data quality, the higher the noise existence probability, the more likely the target data set contains noise and the worse the data quality; likewise, the larger the first quality parameter or the second quality parameter, the more noise data the target data set contains and the worse the data quality. The product of the noise existence probability, the first quality parameter and the second quality parameter is therefore subjected to negative correlation mapping and normalization to obtain the data quality of the target data set.
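The combination in step S504 can be sketched with an exponential of the negated product, which performs the negative correlation mapping and keeps the result in the interval (0, 1] for non-negative inputs. The function name and the sample inputs are hypothetical.

```python
import math


def data_quality(noise_probability, first_quality, second_quality):
    """exp(-(P * Q1 * Q2)): equals 1 when the product is 0 and decreases toward 0."""
    return math.exp(-(noise_probability * first_quality * second_quality))


print(data_quality(0.0, 0.5, 2.0))  # → 1.0
print(data_quality(0.9, 0.5, 2.0))  # lower quality for a higher noise probability
```

Because all three inputs enter as a product, a target data set is rated poorly only when the noise existence probability and both quality parameters are simultaneously large.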

So far, the data quality of all parameter data sets can be obtained, and the noise data sets can be screened out and denoised based on the data quality in the subsequent process.

Step S4, screening out the noise data sets among all the parameter data sets according to the data quality corresponding to the parameter data sets of all the test periods, denoising the parameter time sequence data in the noise data sets to obtain the high-quality data sets corresponding to all the test periods, and performing the performance test of the electric control cabinet based on all the high-quality data sets corresponding to all the performance parameters.

Because the worse the data quality, the more noise data is contained in the data, in order to ensure the accuracy of the subsequent simulation test result, the noise data set needing to be denoised should be screened and denoised, so that the noise data set and the parameter data set not needing to be denoised together form a high-quality data set.

Preferably, in one embodiment of the present invention, a method for acquiring a high quality data set includes:

First, among all the parameter data sets, the parameter data sets whose data quality is less than or equal to a preset quality threshold are taken as noise data sets.

Then, all parameter time series data in each noise data set are denoised based on the Kalman filtering method to obtain a denoised data set corresponding to each noise data set.

Finally, the parameter data sets whose data quality is greater than the preset quality threshold, together with all the denoised data sets, are taken as the high-quality data sets corresponding to all the test periods.

It should be noted that the preset quality threshold is set to 0.6 in this embodiment; the specific value can be adjusted according to the implementation scenario and is not limited herein. The Kalman filtering method is a technical means well known to those skilled in the art, and its specific process is not described herein.
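The screening and denoising steps above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the scalar Kalman filter uses an assumed constant-state (random-walk) model, and the variances `process_var` and `measurement_var`, as well as all function names, are hypothetical choices. Only the threshold 0.6 comes from the text.

```python
from typing import Dict, List

QUALITY_THRESHOLD = 0.6  # value given in the text; adjustable per scenario


def kalman_smooth(series: List[float], process_var: float = 1e-3,
                  measurement_var: float = 0.1) -> List[float]:
    """Scalar Kalman filter with an assumed constant-state model."""
    if not series:
        return []
    estimate, error = series[0], 1.0
    smoothed = [estimate]
    for z in series[1:]:
        error += process_var                      # predict step
        gain = error / (error + measurement_var)  # Kalman gain
        estimate += gain * (z - estimate)         # update with measurement z
        error *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


def build_high_quality_sets(data_sets: Dict[str, List[List[float]]],
                            qualities: Dict[str, float]) -> Dict[str, List[List[float]]]:
    """Denoise only the sets at or below the threshold; keep the rest as-is."""
    result = {}
    for period, series_list in data_sets.items():
        if qualities[period] <= QUALITY_THRESHOLD:
            result[period] = [kalman_smooth(s) for s in series_list]
        else:
            result[period] = series_list
    return result
```

Because only the data sets at or below the threshold pass through the filter, the cost of denoising scales with the number of noise data sets rather than with the total number of data sets.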

All performance parameters under a given process flow can be analyzed in this way to obtain the high-quality data sets of all performance parameters, after which the performance test of the electric control cabinet can be performed based on these high-quality data sets.

Preferably, in one embodiment of the present invention, the performance test of the electric control cabinet is performed based on all high quality data sets corresponding to all performance parameters, including:

A simulation test model corresponding to the electric control cabinet is established according to the design drawings and specifications of the electric control cabinet. All high-quality data sets corresponding to all performance parameters are used as the database of the simulation test model and input into it, so that the behavior and output of the system can be simulated and observed in a software environment. The performance test of the electric control cabinet is then performed by simulating various fault conditions, such as component damage, power supply fluctuation, and sudden load changes, to obtain the test results.

Finally, the control strategy of the electric control cabinet is adjusted based on the deviation between the test results and the expected results.

In summary, in the embodiment of the present invention, for any performance parameter, the parameter data sets of the performance parameter in a plurality of test periods are acquired, each parameter data set including a plurality of parameter time series data. Because noise data is an occasional phenomenon compared with normal data, the parameter time series data in each parameter data set can be clustered based on the similarity between the data to obtain a clustering result; the clusters in the clustering result then separate the normal data from the noise data in the parameter data set to a certain extent. Further, based on the numerical characteristics of all parameter time series data in the parameter data set and the FP-Growth algorithm, the clusters in the clustering result are fused and analyzed to obtain a frequent pattern model, which represents the difference between normal data and noise data more distinctly. Further, the frequent pattern model is analyzed with respect to the relevance among the data and the position distribution of the frequent patterns and frequent item sets in the model, so that the possibility that noise data exists is quantified and the noise existence probability of the parameter data set is obtained, providing a reference for the subsequent evaluation of the quality of the parameter data set. Further, since noise data shows characteristic behavior in data fluctuation and trend change, the change trend and fluctuation of the parameter time series data are considered together with the noise existence probability, so that the data quality of the parameter data set can be accurately evaluated. This helps to accurately identify and screen out the noise data sets, which are then denoised to obtain the high-quality data sets.
Finally, the performance test of the electric control cabinet is performed based on the high-quality data sets corresponding to all the performance parameters to obtain the test results. According to the embodiment of the present invention, the noise data sets can be accurately screened, so that only the noise data sets among a large number of data sets are denoised; this improves the efficiency of data denoising while ensuring the accuracy and reliability of the test results.

The embodiment also provides a comprehensive automated testing system for an electric control cabinet, which comprises a processor, a memory, and a computer program, wherein the memory is used for storing the computer program, the processor is used for running the computer program, and the computer program, when run on the processor, implements the steps of any one of the above comprehensive automated testing methods for an electric control cabinet.

Referring to fig. 6, a system block diagram of a comprehensive automated testing system for an electric control cabinet according to an embodiment of the present invention is shown. The system includes a data acquisition module 601, a frequent pattern model building module 602, a data quality analysis module 603, and a performance test module 604, which are used for implementing step S1, step S2, step S3, and step S4, respectively.

Referring to fig. 7, a schematic diagram of the system architecture of a comprehensive automated testing system for an electric control cabinet according to an embodiment of the present invention is shown. The system includes a processor 700, a memory 701, a bus 702, and a communication interface 703, where the processor 700, the communication interface 703, and the memory 701 are connected by the bus 702. The memory 701 may include a high-speed random access memory; the bus 702 may be an ISA bus, a PCI bus, or an EISA bus; and the processor 700 may be an integrated circuit chip with signal processing capability.

The embodiment of the present invention further provides a computer-readable storage medium corresponding to the method provided in the foregoing embodiments. Referring to fig. 8, the storage medium is shown as an optical disc on which a computer program (i.e., a program product) is stored; when executed by a processor, the computer program performs the method provided in any of the foregoing embodiments.

It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), read-only memory (ROM), and other optical and magnetic storage media, which are not described in detail herein.

It should be noted that the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.

Claims

1. A comprehensive automated testing method for an electric control cabinet, characterized in that the method comprises:

For any performance parameter of the electric control cabinet, a parameter data set of the performance parameter in each test period is obtained, each of the parameter data sets including parameter time series data of multiple data cycles;

According to the similarity between the parameter time series data in each parameter data set, all parameter time series data are clustered to obtain clustering results; according to the numerical characteristics of all parameter time series data in each parameter data set and a frequent pattern mining algorithm, the parameter time series data in each cluster in the clustering result are fused and analyzed to obtain the frequent pattern model corresponding to each parameter data set;

In the frequent pattern model, based on the distribution of each item in all parameter time series data, the similarity between frequent item sets, and the position of frequent item sets in the frequent pattern model, the probability of noise existence in each parameter data set is determined; according to the probability of noise existence in each parameter data set, the change trend of the value in each parameter time series data in the parameter data set, and the value fluctuation, the data quality of each parameter data set is obtained;

According to the data quality corresponding to the parameter data sets of all test periods, the noise data sets are screened from all parameter data sets and the parameter time series data in the noise data sets are denoised to obtain high-quality data sets corresponding to all test periods; the performance test of the electric control cabinet is performed based on all high-quality data sets corresponding to all performance parameters;

The method for acquiring the frequent pattern model includes:

For any parameter data set, a numerical range is determined based on the maximum and minimum values in the parameter data set, the numerical range is evenly divided to obtain a preset number of interval ranges, and different interval ranges are marked with different labels;

Using the label to replace each value in each parameter time series data, to obtain a label sequence corresponding to each parameter time series data;

The frequency of occurrence of each label in the parameter data set is obtained; for each cluster, an importance index of the cluster is obtained according to the frequencies of occurrence of all labels and the number of label sequences in the cluster, wherein the number of label sequences in each cluster and the frequency of occurrence of the labels are both positively correlated with the importance index;

In the clustering result corresponding to the parameter data set, all label sequences in each cluster are analyzed based on the FP-Growth algorithm to obtain the frequent pattern tree corresponding to each cluster; the frequent pattern trees corresponding to all clusters are arranged in descending order of the importance index to obtain an arrangement sequence, in which the first frequent pattern tree is used as the target tree and the frequent pattern tree after the target tree is used as the tree to be analyzed; the part in which the tree to be analyzed differs from the target tree is added to the target tree to obtain a new target tree; the frequent pattern tree after the tree to be analyzed is used as the new tree to be analyzed, and the part in which the new tree to be analyzed differs from the new target tree is added to the new target tree; new target trees are determined in this way until all frequent pattern trees in the arrangement sequence have been traversed, and the new target tree at that time is used as the frequent pattern model corresponding to the parameter data set;

The method for obtaining the noise existence probability includes:

For any parameter data set, obtain all frequent item sets in the corresponding frequent pattern model; the frequency of occurrence of each item in the frequent item set is the frequency of occurrence of the label corresponding to the item in the parameter data set;

For any frequent item set, based on the occurrence frequency of each item in the frequent item set, determine the first overall noise factor of the frequent item set, and the first overall noise factor is negatively correlated with the occurrence frequency; based on the number of bifurcation points of the path corresponding to the frequent item set in the frequent pattern model, determine the second overall noise factor of the frequent item set, and the second overall noise factor is negatively correlated with the number of bifurcation points;

The first overall noise factor and the second overall noise factor of all frequent item sets are merged and averaged to obtain a first noise index of the parameter data set, wherein the first overall noise factor and the second overall noise factor are both positively correlated with the first noise index;

Based on the similarities between all frequent item sets, determine the second noise index of the parameter data set;

According to a first noise index and a second noise index of the parameter data set, a noise existence probability of the parameter data set is obtained, wherein the first noise index and the second noise index are both positively correlated with the noise existence probability;

The method for obtaining the data quality includes:

The parameter data set with a noise existence probability greater than or equal to a preset noise threshold is taken as the target data set, the parameter data set with a noise existence probability less than the preset noise threshold is taken as the normal data set, and the data quality of the normal data set is set to a fixed value not less than 1;

For any target data set, based on the change trend of each parameter time series data in the target data set, determine the first quality parameter of the target data set;

In each parameter time series data corresponding to the target data set, based on the deviation between each value and the mean of all values, a deviation factor of each value is obtained, and based on the deviation factors of all values, an overall quality factor of each parameter time series data is determined, and the overall quality factor is negatively correlated with the deviation factor; the overall quality factors of all parameter time series data are averaged to obtain a second quality parameter of the target data set;

The data quality of the target data set is obtained according to the noise existence probability, the first quality parameter and the second quality parameter of the target data set. The noise existence probability, the first quality parameter and the second quality parameter are all negatively correlated with the data quality, and the value of the data quality is a normalized value.
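For illustration only, the interval labelling step recited in claim 1 (dividing the numerical range evenly and replacing each value with the label of its interval) can be sketched as follows. The number of intervals, the use of letters as labels, and the function names are assumptions made for this sketch; the sketch also assumes at most 26 intervals.

```python
from collections import Counter
from typing import Dict, List


def label_sequences(data_set: List[List[float]], n_intervals: int = 5) -> List[List[str]]:
    """Replace every value with the label of the equal-width interval it falls in."""
    lo = min(min(s) for s in data_set)
    hi = max(max(s) for s in data_set)
    width = (hi - lo) / n_intervals or 1.0  # guard against a constant data set

    def label(value: float) -> str:
        # The maximum value is placed in the last interval.
        index = min(int((value - lo) / width), n_intervals - 1)
        return chr(ord("A") + index)

    return [[label(v) for v in s] for s in data_set]


def label_frequencies(sequences: List[List[str]]) -> Dict[str, int]:
    """Count how often each label occurs in the whole parameter data set."""
    return dict(Counter(lab for seq in sequences for lab in seq))
```

The resulting label sequences are the transactions that a frequent pattern mining algorithm such as FP-Growth would consume in the later steps.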

2. The comprehensive automated testing method for an electric control cabinet according to claim 1, wherein the method for obtaining the clustering results comprises:

In each parameter data set, for any two parameter time series data in different data periods, based on the difference between the numerical values and the difference between the slope values of the numerical values, determine the difference characteristic value between the two parameter time series data;

The difference characteristic values between the parameter time series data are used as the distance metric, and K-means clustering analysis is performed on all the parameter time series data in the parameter data set based on a preset K value to obtain the clustering result.
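As a hedged sketch of the difference characteristic value in claim 2, the function below combines the difference between the values and the difference between the slopes of two time series. The claim does not state how the two differences are combined; summing the mean absolute value difference and the mean absolute slope difference is an assumption of this sketch, as are the function name and the use of first differences as slopes.

```python
from typing import Sequence


def difference_feature(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed distance: mean |value difference| + mean |slope difference|."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have equal length of at least 2")
    value_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    # First differences stand in for the slope values.
    slopes_a = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    slopes_b = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    slope_diff = sum(abs(p - q) for p, q in zip(slopes_a, slopes_b)) / len(slopes_a)
    return value_diff + slope_diff
```

The sketch covers only the distance itself; how it is plugged into the K-means procedure with the preset K value is not shown here.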

3. The comprehensive automated testing method for an electric control cabinet according to claim 1, wherein the method for obtaining the second noise index comprises:

Among all frequent item sets, the frequent item sets with the same number of items are regarded as a class of frequent item sets;

In any type of frequent item sets, any two frequent item sets are combined to obtain all non-repeating combinations; for any combination, the independence value of the frequent item sets in the combination is determined based on the differences in the items at the same position in the two frequent item sets;

The mean of the independence values corresponding to all combinations of frequent item sets of all classes is taken as the second noise index of the parameter data set.
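The grouping and pairing in claim 3 can be sketched as follows. The claim only says that the independence value is determined from the differences between items at the same position; taking it as the fraction of positions at which the two frequent item sets differ is an assumption of this sketch, and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, Sequence


def second_noise_index(itemsets: Iterable[Sequence[str]]) -> float:
    """Mean independence value over all pairs of equally sized item sets."""
    by_size = defaultdict(list)
    for itemset in itemsets:
        if itemset:  # frequent item sets are assumed to be non-empty
            by_size[len(itemset)].append(tuple(itemset))

    values = []
    for group in by_size.values():
        # combinations() yields every non-repeating pair within one class.
        for a, b in combinations(group, 2):
            differing = sum(x != y for x, y in zip(a, b))
            values.append(differing / len(a))  # assumed independence value
    return sum(values) / len(values) if values else 0.0
```

With this definition, item sets of the same size that share few items in the same positions push the index toward 1, and a class containing a single item set contributes nothing.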

4. The comprehensive automated testing method for an electric control cabinet according to claim 1, wherein the method for obtaining the first quality parameter comprises:

The number of extreme value points in each parameter time series data in the target data set is obtained, and the first quality parameter of the target data set is determined based on the proportion of the sum of the number of extreme value points of all parameter time series data in the total number of values of all parameter time series data.
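A minimal sketch of the proportion in claim 4 is given below. It assumes that an extreme value point is a strict local maximum or minimum in the interior of a series, so plateaus and end points are not counted, and that the first quality parameter equals the proportion itself; the claim says only that the parameter is determined based on that proportion. The function names are hypothetical.

```python
from typing import List, Sequence


def count_extrema(series: Sequence[float]) -> int:
    """Count interior points where the direction of change reverses."""
    return sum(
        1
        for i in range(1, len(series) - 1)
        if (series[i] - series[i - 1]) * (series[i + 1] - series[i]) < 0
    )


def first_quality_parameter(data_set: List[Sequence[float]]) -> float:
    """Total number of extreme value points divided by the total number of values."""
    total_values = sum(len(s) for s in data_set)
    if total_values == 0:
        return 0.0
    return sum(count_extrema(s) for s in data_set) / total_values
```

A series that oscillates at almost every sample yields a value close to 1, while a monotonic series yields 0.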

5. The comprehensive automated testing method for an electric control cabinet according to claim 1, wherein the method for acquiring the high-quality data set comprises:

Among all parameter data sets, the parameter data sets whose data quality is less than or equal to the preset quality threshold are regarded as noise data sets;

Based on the Kalman filtering method, all parameter time series data in each noise data set are denoised to obtain a denoised data set corresponding to each noise data set;

The parameter data sets with data quality greater than the preset quality threshold and all denoised data sets are regarded as high-quality data sets corresponding to all test periods.

6. The comprehensive automated testing method for an electric control cabinet according to claim 1, wherein the performance test of the electric control cabinet based on all high-quality data sets corresponding to all performance parameters comprises:

A simulation test model corresponding to the electric control cabinet is established, all high-quality data sets corresponding to all performance parameters are used as the database of the simulation test model, and the performance test of the electric control cabinet is performed to obtain the test results.

7. A comprehensive automated testing system for an electric control cabinet, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method as claimed in any one of claims 1 to 6 when executing the computer program.