Intelligent segmentation method for damage in aero-engine hole detection images based on a context encoding network
Technical Field
The invention belongs to the technical field of aero-engine hole detection (borescope inspection), and in particular relates to an intelligent segmentation method, based on a context encoding network, for damage in aero-engine hole detection images. It is an engineering application, in the field of flaw detection, of a deep neural network structure and a data set preprocessing method.
Background
As the core component of an aircraft, the engine has a significant impact on flight safety. During operation, the internal temperature and pressure are high, so damage such as cracks and burn-through often occurs in the internal structure. If such damage is not found in time, the safety of civil aviation flight is seriously threatened. Civil aviation companies therefore use various inspection methods to discover potential safety hazards in the engine structure in time.
Engine hole detection is one of the most important of these methods. A technician extends a borescope camera into the engine, captures pictures and videos of the interior, searches them for cracks, burn-through, and other damage, and finally compiles an inspection report that guides further maintenance and repair. However, the procedure is time-consuming and labor-intensive: inspecting one engine takes tens of hours, and the accuracy is limited by the subjective judgment of the inspector. With economic development and accelerating urbanization in China, the number of domestic and international air routes has grown rapidly in recent years. Traditional hole detection, with its limited efficiency and precision and high labor cost, can no longer meet the current growing demand for engine inspection.
Disclosure of Invention
The invention aims to provide an intelligent segmentation method for damage in aircraft engine hole detection images that achieves higher precision and speed while occupying less memory and fewer processor resources. The design of the technical scheme is as follows:
(1) collecting aero-engine hole detection image samples, labeling each sample, constructing an aero-engine hole detection image semantic segmentation data set, and dividing it into a training set, a validation set, and a test set in a certain proportion (such as 8:1:1);
(2) building a deep convolutional neural network, wherein the deep convolutional neural network consists of three parts, the first part is a feature extraction sub-network, the second part is a multi-scale context information extraction sub-network, and the third part is a feature expansion sub-network;
(3) preprocessing an aeroengine hole detection image to be detected;
(4) training a deep convolutional neural network by using the data set in the step (1), evaluating the network performance by using a performance evaluation function, and storing convolutional neural network parameters which reach preset indexes and have the best performance;
(5) inputting the image processed in the step (3) into a feature extraction sub-network for feature extraction to obtain a high-level feature vector capable of representing the input image;
(6) inputting the high-level feature vector obtained in the step (5) into a multi-scale context information extraction sub-network;
(7) inputting the feature vector obtained in the step (6) into a feature expansion sub-network to obtain a feature vector with the same spatial size as the input image in the step (5);
(8) generating a prediction label image from the feature vector obtained in step (7).
The multi-scale context information extraction sub-network consists of two parts:
(1) a dilated convolution module. The dimensions of the module's input and output feature vectors are identical, and five parallel paths lead from the input to the output.
The first path applies a 3x3 convolution with dilation rate 1;
the second path applies, in sequence, a 3x3 convolution with dilation rate 3 and a 1x1 convolution with dilation rate 1;
the third path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, and a 1x1 convolution with dilation rate 1;
the fourth path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, a 3x3 convolution with dilation rate 5, and a 1x1 convolution with dilation rate 1;
the fifth path is an identity mapping that passes the input through unchanged;
all convolutions in the five paths use a stride of 1.
(2) a multi-scale pooling module. The spatial dimensions of the module's input and output feature vectors are the same, and the output feature vector has more channels than the input (in this scheme the difference in channel count is 4). Five pooling operations are applied to the input feature vector in parallel; their pooling windows are 1/1, 1/2, 1/3, 1/4, and 1/7 of the spatial size of the input, and the stride of each pooling operation equals its window size. The pooled feature vectors are each upsampled so that their spatial dimensions are restored to those of the input, and they are then stacked with the input feature vector along the channel dimension.
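As a sanity check on the multi-scale design, the receptive field of each path of the dilated convolution module can be computed: for a chain of stride-1 convolutions, each k x k kernel with dilation d enlarges the receptive field by (k - 1) * d. The following pure-Python sketch is illustrative only, not part of the patented scheme:

```python
# Receptive-field check for the five parallel paths (stride-1 chains).
# Each k x k convolution with dilation d grows the receptive field by (k-1)*d.

def receptive_field(layers):
    """layers: sequence of (kernel_size, dilation) pairs applied in order."""
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

paths = {
    "path1": [(3, 1)],                          # 3x3, dilation 1
    "path2": [(3, 3), (1, 1)],                  # 3x3 d=3, then 1x1
    "path3": [(3, 1), (3, 3), (1, 1)],
    "path4": [(3, 1), (3, 3), (3, 5), (1, 1)],
    "path5": [],                                # identity mapping
}
rfs = {name: receptive_field(p) for name, p in paths.items()}
# receptive fields 3, 7, 9, 19, 1: several context scales captured in parallel
```

The widely spaced receptive fields (3 up to 19 pixels) are what lets the module aggregate context at several scales without reducing the spatial resolution.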
The feature extraction sub-network comprises a plurality of convolution blocks. Each of the first two convolution blocks comprises two convolution layers using the rectified linear unit (ReLU) activation function and a max pooling layer; each subsequent convolution block comprises three convolution layers using the ReLU activation function and a max pooling layer.
The feature expansion sub-network comprises a plurality of convolution blocks. Each convolution block comprises an upsampling operation and a stacking operation: the feature vector obtained by upsampling is stacked, along the channel dimension, with the output of the corresponding-level convolution block in the feature extraction sub-network, and is then passed through two convolution layers using the rectified linear unit (ReLU) activation function. The final layer of the sub-network is a convolution layer with 1x1 kernels and a number of output channels equal to the number of damage categories plus one, paired with a softmax activation function.
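The upsample-and-stack step of an expansion block can be sketched in NumPy. This is a simplified illustration with nearest-neighbor upsampling and hypothetical array shapes; the two ReLU convolution layers that follow the stacking are omitted:

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbor upsampling of a (C, H, W) feature map:
    repeat every pixel factor x factor times."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def expand_block(deep, skip):
    """One expansion step: upsample the deeper feature map 2x and stack it,
    along the channel axis, with the matching encoder output `skip`."""
    return np.concatenate([upsample_nearest(deep, 2), skip], axis=0)

deep = np.zeros((8, 16, 16))    # hypothetical decoder-side input
skip = np.zeros((4, 32, 32))    # hypothetical encoder output at the same level
out = expand_block(deep, skip)  # shape (12, 32, 32)
```

The channel-wise stacking is what reinjects the encoder's high-resolution detail before the subsequent convolutions refine it.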
The data preprocessing comprises various affine transformations; brightness, saturation, and contrast adjustment; overall linear and nonlinear transformations of images with dark brightness; histogram equalization of unevenly exposed images; and image fusion using the mixup method.
The deep neural network is trained by dividing the training set into a plurality of batches and feeding each batch into the network; the network output and the corresponding label images are then fed into a Dice loss function based on the Dice coefficient:

L_dice = 1 - (2 * Σ p·q) / (Σ p + Σ q)

where p denotes the predicted class probabilities of all pixels in all the aero-engine hole detection images in each batch, and q denotes the true classes of those pixels in the corresponding label images.

An l2 regularization term is added to the loss function:

(λ / 2m) * Σ_{l=1}^{L} ||W^(l)||²

The objective function after adding the l2 regularization term is:

J = L_dice + (λ / 2m) * Σ_{l=1}^{L} ||W^(l)||²

where J denotes the objective function, L_dice the Dice loss, m the number of all pixels in all the aero-engine hole detection images in each batch, λ the l2 regularization hyper-parameter, L the number of convolution layers in the deep neural network model, and W^(l) the weights of the l-th convolution layer.
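A NumPy sketch of such an objective, assuming the standard Dice formulation 1 - 2Σpq/(Σp + Σq) and an l2 penalty of (λ/2m)Σ||W||²; the function names and the small epsilon guard are illustrative, not the patented implementation:

```python
import numpy as np

def dice_loss(p, q, eps=1e-7):
    """Dice loss: p = predicted class probabilities, q = one-hot true classes
    (same shape); eps guards against an empty denominator."""
    return 1.0 - (2.0 * np.sum(p * q) + eps) / (np.sum(p) + np.sum(q) + eps)

def objective(p, q, weights, lam, m):
    """Dice loss plus the l2 penalty (lam / 2m) * sum of squared conv weights."""
    l2 = (lam / (2.0 * m)) * sum(np.sum(w ** 2) for w in weights)
    return dice_loss(p, q) + l2
```

A perfect prediction drives the Dice term to zero while the l2 term still penalizes large weights, which is the intended regularizing effect.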
The gradient of each model parameter in the deep neural network model is calculated from the objective function by back propagation, and the value of each parameter is adjusted according to the calculated gradients using an optimization method.
the performance evaluation function includes, but is not limited to, three performance evaluation indexes, namely, a pixel accuracy PA, an average coincidence ratio MIOU, and a frequency weighted coincidence ratio FWIOU. In the prior art, there are many types of performance evaluation functions, and the above three types are selected in the technical scheme.
PA = Σ_i p_ii / Σ_i Σ_j p_ij

MIoU = (1/k) Σ_i [ p_ii / (Σ_j p_ij + Σ_j p_ji - p_ii) ]

FWIoU = (1 / Σ_i Σ_j p_ij) Σ_i [ (Σ_j p_ij) * p_ii / (Σ_j p_ij + Σ_j p_ji - p_ii) ]

In the three formulas, k denotes the number of pixel categories in the aero-engine hole detection images (the number of damage categories plus one); p_ii denotes the number of pixels whose predicted class (the class with the maximum predicted probability) is i and whose true class in the corresponding label image is also i; p_ij denotes the number of pixels whose true class is i and whose predicted class is j; and p_ji denotes the number of pixels whose true class is j and whose predicted class is i.
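These indexes can be computed from a class confusion matrix. The following NumPy sketch is a hypothetical helper that follows the definitions of p_ii, p_ij, and p_ji above (IoU is averaged only over classes that actually occur):

```python
import numpy as np

def segmentation_metrics(pred, true, k):
    """pred, true: integer label arrays of equal shape with values 0..k-1.
    Returns (PA, MIoU, FWIoU) computed from the confusion matrix."""
    conf = np.zeros((k, k), dtype=np.int64)
    for i in range(k):          # conf[i, j]: true class i, predicted class j
        for j in range(k):
            conf[i, j] = np.sum((true == i) & (pred == j))
    diag = np.diag(conf)        # p_ii
    rows = conf.sum(axis=1)     # sum_j p_ij: pixels per true class
    cols = conf.sum(axis=0)     # sum_j p_ji: pixels per predicted class
    union = rows + cols - diag
    present = union > 0
    iou = diag[present] / union[present]
    pa = diag.sum() / conf.sum()
    miou = iou.mean()
    fwiou = np.sum(rows[present] / conf.sum() * iou)
    return pa, miou, fwiou
```

FWIoU weights each class's IoU by its pixel frequency, so rare damage classes influence it less than they influence MIoU.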
The invention has the beneficial effects that:
the technical scheme has higher precision and speed, and occupies less memory and processor resources.
Drawings
FIG. 1 is a schematic flow diagram of an embodiment of the method.
Detailed Description
As shown in FIG. 1, two embodiments of the method are described below:
example 1
This embodiment is divided into two stages, a training stage and a use stage. Note that the damage categories below include damage types such as cracks and burn-through, and also a no-damage category, i.e., the background.
The training phase is divided into the following steps:
Step (1.1): acquiring aero-engine hole detection image samples. The acquired samples cover all positions of the engine and include images containing one or more types of damage at the same time as well as images without damage; the acquired images may be in any color mode and have one or more channels.
Step (1.2): image preprocessing. The images obtained in step (1.1) are converted to the same storage format to facilitate uniform processing, and then cleaned to remove abnormal shots; for example, if two or more images are heavily blurred and poorly focused, only one is kept. Images with dark overall brightness are selected and their pixel values redistributed by histogram equalization, so that the number of pixels at each brightness level in each color channel becomes approximately equal.
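The histogram equalization step can be sketched for one channel as follows (NumPy, classic CDF remapping; illustrative only, not the patented implementation):

```python
import numpy as np

def equalize_channel(img):
    """Histogram equalization of one uint8 channel: remap intensities so the
    cumulative distribution of pixel values becomes approximately uniform."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # first nonzero CDF value
    scale = max(cdf[-1] - cdf_min, 1)  # avoid dividing by zero on flat images
    lut = np.clip(np.round((cdf - cdf_min) / scale * 255), 0, 255)
    return lut.astype(np.uint8)[img]
```

On a dark image the mapping stretches the occupied low intensities across the full 0..255 range, which is the brightening effect the preprocessing relies on.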
Step (1.3): image labeling. All images obtained in step (1.2) are labeled one by one using any image labeling tool (such as labelme). The total number N of damage categories is determined before labeling, and each damage category is assigned a unique class label value from 1 to N. When labeling an image, all pixels in non-damage areas are labeled 0, and all pixels in each damage area are labeled with the class label value of that damage category. A label image is generated by the method provided by the labeling tool; the file names under which the label image and the original image are stored should correspond.
Step (1.4): dividing the data set. An original image and its corresponding label image are regarded as the minimum unit of division, and all minimum units are divided into a training set, a validation set, and a test set in a certain proportion (such as 8:1:1).
Step (1.5): building the deep neural network using any deep learning framework. The deep neural network comprises three parts: a feature extraction sub-network, a multi-scale context information extraction sub-network, and a feature expansion sub-network.
The feature extraction sub-network comprises a plurality of convolution blocks. Each of the first two convolution blocks comprises two convolution layers using the rectified linear unit (ReLU) activation function and a max pooling layer; each subsequent convolution block comprises three convolution layers using the ReLU activation function and a max pooling layer.
The multi-scale context information extraction sub-network consists of two parts: (1) a dilated convolution module. The dimensions of the module's input and output feature vectors are identical, and five parallel paths lead from the input to the output. The first path applies a 3x3 convolution with dilation rate 1; the second path applies, in sequence, a 3x3 convolution with dilation rate 3 and a 1x1 convolution with dilation rate 1; the third path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, and a 1x1 convolution with dilation rate 1; the fourth path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, a 3x3 convolution with dilation rate 5, and a 1x1 convolution with dilation rate 1; the fifth path is an identity mapping that passes the input through unchanged. All convolutions in the five paths use a stride of 1.
(2) a multi-scale pooling module. The spatial dimensions of the module's input and output feature vectors are the same, and the output feature vector has more channels than the input (in this scheme the difference in channel count is 4). Five pooling operations are applied to the input feature vector in parallel; their pooling windows are 1/1, 1/2, 1/3, 1/4, and 1/7 of the spatial size of the input, and the stride of each pooling operation equals its window size. The pooled feature vectors are each upsampled so that their spatial dimensions are restored to those of the input, and they are then stacked with the input feature vector along the channel dimension.
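The parallel pooling, upsampling, and stacking just described can be sketched in NumPy. This simplified version keeps every branch at full channel width (so the output here has six times the input channels, whereas the module in this scheme reduces the branch channels), and it assumes the spatial size is divisible by each pooling grid:

```python
import numpy as np

def avg_pool_grid(x, g):
    """Average-pool each channel of x (C, H, W) onto a g x g grid;
    H and W are assumed divisible by g (window = stride = H/g)."""
    c, h, w = x.shape
    return x.reshape(c, g, h // g, g, w // g).mean(axis=(2, 4))

def nearest_upsample(x, fh, fw):
    """Nearest-neighbor upsampling by integer factors fh, fw."""
    return x.repeat(fh, axis=1).repeat(fw, axis=2)

def multiscale_pool(x, grids=(1, 2, 3, 4, 7)):
    """Pool onto each grid, restore the spatial size, then stack all
    branches with the input along the channel axis."""
    c, h, w = x.shape
    branches = [nearest_upsample(avg_pool_grid(x, g), h // g, w // g)
                for g in grids]
    return np.concatenate([x] + branches, axis=0)

x = np.random.rand(2, 84, 84)   # 84 is divisible by 1, 2, 3, 4, and 7
out = multiscale_pool(x)        # shape (12, 84, 84)
```

The grid-1 branch reduces to global average pooling, so the stacked output mixes global context with progressively finer regional statistics.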
The feature expansion sub-network comprises a plurality of convolution blocks. Each convolution block comprises an upsampling operation and a stacking operation: the feature vector obtained by upsampling is stacked, along the channel dimension, with the output of the corresponding-level convolution block in the feature extraction sub-network, and is then passed through two convolution layers using the rectified linear unit (ReLU) activation function. The final layer of the sub-network is a convolution layer with 1x1 kernels and a number of output channels equal to the number of damage categories plus one, paired with a softmax activation function.
The feature extraction sub-network may include three or more convolution blocks; further convolution blocks may be appended one after another, provided that the length and width of the feature vector output by each block remain at least two. The feature expansion sub-network comprises the same number of convolution blocks as the feature extraction sub-network. The upsampling operation in the feature expansion sub-network may be bilinear interpolation, nearest-neighbor interpolation, or transposed convolution.
The number of convolution blocks in the feature extraction and feature expansion sub-networks is a hyper-parameter; it is positively correlated with the number of images in the data set, the number of damage categories, and the difficulty of detecting the damage in the images.
Step (1.6): training the deep neural network. All images in the training set divided in step (1.4) are split into a plurality of batches, each containing N samples in total; data augmentation is applied to the images of each batch and their corresponding label images, and the label images are then one-hot encoded.
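The one-hot encoding of a label image can be sketched as follows (NumPy, illustrative only):

```python
import numpy as np

def onehot(label, num_classes):
    """label: (H, W) integer label image with values 0..num_classes-1.
    Returns an (H, W, num_classes) one-hot encoded array."""
    return np.eye(num_classes, dtype=np.float32)[label]

lab = np.array([[0, 1], [2, 0]])  # hypothetical 2x2 label image, 3 classes
enc = onehot(lab, 3)              # enc[0, 1] is the one-hot vector for class 1
```

Each pixel's integer label becomes a probability-like vector, matching the per-class channels that the softmax output layer produces.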
All samples of one batch are fed into the deep neural network built in step (1.5) to obtain the output feature vectors. The output feature vectors and the one-hot encoded label images of the batch are then fed into the loss function to obtain an error; the gradients of the trainable parameters of each layer in the network are computed; and the parameters are updated by an optimizer with a set learning rate.
When all batches have passed through the above process, one training round is completed. After each round, all images in the validation set are divided into batches of M samples each, the label image in each sample is one-hot encoded, and all samples of one batch are fed into the deep neural network built in step (1.5) to obtain the output feature vectors.
The output feature vectors and the one-hot encoded label images of the batch are then fed into the loss function and the performance evaluation function to obtain an error and performance indexes, which are stored in arrays. All batches in the validation set are processed in this way. The means of the error array and the performance index array are then computed, and the model and the parameters with the best performance are saved. A maximum number of training rounds is preset, and training stops once it is reached. An automatic learning rate decay strategy is used during training.
The data augmentation includes random shuffling of the samples, various random affine transformations, random brightness, saturation, and contrast adjustment within a range (for example, 1 ± 0.4), and mixup image fusion. Note that brightness, saturation, and contrast adjustment are applied to the original image alone, while the other operations must be applied to the original image and the label image simultaneously; in practice, the same random seed is set for the random transformations, ensuring that the same random operation is applied to the original image and its corresponding label image in each sample.
The mixup image fusion method operates as follows: first, N random numbers λ (N being the total number of samples in each batch) are drawn from a Beta(α, β) distribution with α = β = 1 (other values of α may be used); then all samples of the current batch are cloned, the samples in the clone are randomly shuffled, and finally each pair of samples is fused according to the following formula.
x̃ = λ·x_i + (1 - λ)·x_j
ỹ = λ·y_i + (1 - λ)·y_j

In the above formulas, λ is one of the random numbers described above; (x_i, y_i), i = 1, 2, …, N, is a sample in the current batch; (x_j, y_j), j = 1, 2, …, N, is a sample in the shuffled clone of the batch; and (x̃, ỹ) is the new sample generated by fusion.
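The mixup fusion can be sketched in NumPy as follows (array shapes are hypothetical; the labels are assumed one-hot encoded, so they blend the same way as the images):

```python
import numpy as np

def mixup_batch(xs, ys, alpha=1.0, rng=None):
    """mixup for one batch: blend every sample with a partner drawn from a
    shuffled copy of the batch, using Beta(alpha, alpha) weights.
    xs: (N, H, W, C) images; ys: (N, H, W, K) one-hot label images."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(xs)
    lam = rng.beta(alpha, alpha, size=n).reshape(n, 1, 1, 1)
    perm = rng.permutation(n)          # the shuffled clone of the batch
    x_new = lam * xs + (1.0 - lam) * xs[perm]
    y_new = lam * ys + (1.0 - lam) * ys[perm]
    return x_new, y_new
```

Blending the one-hot label images with the same λ keeps each fused pixel's class vector consistent with its fused appearance.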
The testing stage is divided into the following steps:
and (2.1) loading the best-performance network and parameters stored in the step (1.6) and loading the parameters into the network.
Step (2.2): all images in the test set are divided into batches of M samples each, the label image in each sample is one-hot encoded, and all samples of one batch are fed into the deep neural network built in step (1.5) to obtain the output feature vectors. The output feature vectors and the one-hot encoded label images of the batch are then fed into the loss function and the performance evaluation function to obtain an error and performance indexes, which are stored in arrays. All batches in the test set are processed in this way.
The means of the error array and the performance index array are computed, and it is judged whether the performance indexes of the network reach the preset standard. If they do, the method is ready for use; if not, the process returns to step (1.5), the hyper-parameters are adjusted, and the above process is repeated until the performance indexes on the test set meet the standard.
Example 2
This embodiment is divided into two stages, a training stage and a use stage. Note that the damage categories below include damage types such as cracks and burn-through, and also a no-damage category, i.e., the background.
The training phase is divided into the following steps:
Step (1.1): acquiring aero-engine hole detection image samples. The acquired samples cover all positions of the engine and include images containing one or more types of damage at the same time as well as images without damage; the acquired images may be in any color mode and have one or more channels.
Step (1.2): image preprocessing. The images obtained in step (1.1) are converted to the same storage format to facilitate uniform processing, and then cleaned to remove abnormal shots; for example, if two or more images are heavily blurred and poorly focused, only one is kept. Images with dark overall brightness are selected and their pixel values redistributed by histogram equalization, so that the number of pixels at each brightness level in each color channel becomes approximately equal.
Step (1.3): image labeling. All images obtained in step (1.2) are labeled one by one using any image labeling tool (such as labelme). The total number N of damage categories is determined before labeling, and each damage category is assigned a unique class label value from 1 to N. When labeling an image, all pixels in non-damage areas are labeled 0, and all pixels in each damage area are labeled with the class label value of that damage category. A label image is generated by the method provided by the labeling tool; the file names under which the label image and the original image are stored should correspond.
Step (1.4): dividing the data set. An original image and its corresponding label image are regarded as the minimum unit of division, and all minimum units are divided into a training set, a validation set, and a test set in a certain proportion (such as 8:1:1).
Step (1.5): building the deep neural network using any deep learning framework. The deep neural network comprises three parts: a feature extraction sub-network, a multi-scale context information extraction sub-network, and a feature expansion sub-network.
The feature extraction sub-network comprises a plurality of convolution blocks. Each of the first two convolution blocks comprises two convolution layers using the rectified linear unit (ReLU) activation function, each followed by a batch normalization layer, and ends with a max pooling layer; each subsequent convolution block comprises three convolution layers using the ReLU activation function, each followed by a batch normalization layer, and ends with a max pooling layer.
The multi-scale context information extraction sub-network consists of two parts: (1) a dilated convolution module. The dimensions of the module's input and output feature vectors are identical, and five parallel paths lead from the input to the output. The first path applies a 3x3 convolution with dilation rate 1; the second path applies, in sequence, a 3x3 convolution with dilation rate 3 and a 1x1 convolution with dilation rate 1; the third path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, and a 1x1 convolution with dilation rate 1; the fourth path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, a 3x3 convolution with dilation rate 5, and a 1x1 convolution with dilation rate 1; the fifth path is an identity mapping that passes the input through unchanged. All convolutions in the five paths use a stride of 1.
(2) a multi-scale pooling module. The spatial dimensions of the module's input and output feature vectors are the same, and the output feature vector has more channels than the input (in this scheme the difference in channel count is 4). Five pooling operations are applied to the input feature vector in parallel; their pooling windows are 1/1, 1/2, 1/3, 1/4, and 1/7 of the spatial size of the input, and the stride of each pooling operation equals its window size. The pooled feature vectors are each upsampled so that their spatial dimensions are restored to those of the input, and they are then stacked with the input feature vector along the channel dimension.
The feature expansion sub-network comprises a plurality of convolution blocks. Each convolution block comprises an upsampling operation and a stacking operation: the feature vector obtained by upsampling is stacked, along the channel dimension, with the output of the corresponding-level convolution block in the feature extraction sub-network, and is then passed through two convolution layers using the rectified linear unit (ReLU) activation function, each followed by a batch normalization layer. The last layer of the expansion sub-network is a convolution layer with 1x1 kernels and a number of output channels equal to the number of damage categories plus one, paired with a softmax activation function.
The feature extraction sub-network may include three or more convolution blocks; further convolution blocks may be appended one after another, provided that the length and width of the feature vector output by each block remain at least two. The feature expansion sub-network comprises the same number of convolution blocks as the feature extraction sub-network. The upsampling operation in the feature expansion sub-network may be bilinear interpolation, nearest-neighbor interpolation, or transposed convolution.
The number of convolution blocks in the feature extraction and feature expansion sub-networks is a hyper-parameter; it is positively correlated with the number of images in the data set, the number of damage categories, and the difficulty of detecting the damage in the images.
Step (1.6): training the deep neural network. All images in the training set divided in step (1.4) are split into a plurality of batches, each containing N samples in total; data augmentation is applied to the images of each batch and their corresponding label images, and the label images are then one-hot encoded. All samples of one batch are fed into the deep neural network built in step (1.5) to obtain the output feature vectors. The output feature vectors and the one-hot encoded label images of the batch are then fed into the loss function to obtain an error; the gradients of the trainable parameters of each layer are computed; and the parameters are updated by an optimizer with a set learning rate. When all batches have passed through the above process, one training round is completed. After each round, all images in the validation set are divided into batches of M samples each, the label image in each sample is one-hot encoded, and all samples of one batch are fed into the network to obtain the output feature vectors. These and the one-hot encoded label images of the batch are fed into the loss function and the performance evaluation function to obtain an error and performance indexes, which are stored in arrays. All batches in the validation set are processed in this way. The means of the error array and the performance index array are then computed, and the model and the parameters with the best performance are saved. A maximum number of training rounds is preset, and training stops once it is reached. An automatic learning rate decay strategy is used during training.
The data augmentation includes random shuffling of the samples, various random affine transformations, random brightness, saturation, and contrast adjustment within a range (for example, 1 ± 0.4), and mixup image fusion. Note that brightness, saturation, and contrast adjustment are applied to the original image alone, while the other operations must be applied to the original image and the label image simultaneously; in practice, the same random seed is set for the random transformations, ensuring that the same random operation is applied to the original image and its corresponding label image in each sample.
The mixup image fusion method operates as follows: first, N random numbers λ (N being the total number of samples in each batch) are drawn from a Beta(α, β) distribution with α = β = 1 (other values of α may be used); then all samples of the current batch are cloned, the samples in the clone are randomly shuffled, and finally each pair of samples is fused according to the following formula.
x̃ = λ·x_i + (1 - λ)·x_j
ỹ = λ·y_i + (1 - λ)·y_j

In the above formulas, λ is one of the random numbers described above; (x_i, y_i), i = 1, 2, …, N, is a sample in the current batch; (x_j, y_j), j = 1, 2, …, N, is a sample in the shuffled clone of the batch; and (x̃, ỹ) is the new sample generated by fusion.
The testing stage is divided into the following steps:
and (2.1) loading the best-performance network and parameters stored in the step (1.6) and loading the parameters into the network.
Step (2.2): all images in the test set are divided into batches of M samples each, the label image in each sample is one-hot encoded, and all samples of one batch are fed into the deep neural network built in step (1.5) to obtain the output feature vectors. The output feature vectors and the one-hot encoded label images of the batch are then fed into the loss function and the performance evaluation function to obtain an error and performance indexes, which are stored in arrays. All batches in the test set are processed in this way. The means of the error array and the performance index array are computed, and it is judged whether the performance indexes of the network reach the preset standard. If they do, the method is ready for use; if not, the process returns to step (1.5), the hyper-parameters are adjusted, and the process is repeated until the performance indexes on the test set meet the standard.
Matters not described in detail in the present invention are known in the prior art.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.