
CN110895814A - An intelligent segmentation method of aero-engine borehole image damage based on context coding network - Google Patents


Info

Publication number
CN110895814A
CN110895814A
Authority
CN
China
Prior art keywords
network
convolution
image
images
feature
Prior art date
Legal status
Granted
Application number
CN201911209120.1A
Other languages
Chinese (zh)
Other versions
CN110895814B (en)
Inventor
管昕洁
菅政
万夕里
Current Assignee
Nanjing Tech University
Original Assignee
Nanjing Tech University
Priority date
Filing date
Publication date
Application filed by Nanjing Tech University filed Critical Nanjing Tech University
Priority to CN201911209120.1A priority Critical patent/CN110895814B/en
Publication of CN110895814A publication Critical patent/CN110895814A/en
Application granted granted Critical
Publication of CN110895814B publication Critical patent/CN110895814B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods


Abstract



An intelligent segmentation method for damage in aero-engine borescope (hole detection) images based on a context encoding network. The steps include: (1) collecting aero-engine hole detection image samples, labeling each sample, constructing a semantic segmentation data set of aero-engine hole detection images, and partitioning the data set; (2) building a deep convolutional neural network comprising a feature extraction sub-network, a multi-scale context information extraction sub-network, and a feature expansion sub-network; (3) preprocessing the aero-engine hole detection image to be inspected; (4) training the deep convolutional neural network on the data set, evaluating network performance with a performance evaluation function, and saving the parameters of the best-performing convolutional neural network that reaches the preset targets; (5) feeding the image preprocessed in (3) through the feature extraction sub-network, the multi-scale context information extraction sub-network, and the feature expansion sub-network in turn, obtaining a feature vector with the same spatial size as the input image; (6) generating a predicted label image from the feature vector obtained in (5).


Description

Intelligent segmentation method for aero-engine hole detection image damage based on context coding network
Technical Field
The invention belongs to the technical field of aero-engine hole detection (borescope inspection), and in particular relates to an intelligent segmentation method for damage in aero-engine hole detection images based on a context coding network. It is an engineering application, in the field of flaw detection, of a deep neural network structure together with a preprocessing method for the data set.
Background
As the core component of an aircraft, the engine has a major impact on flight safety. During operation, the temperature and pressure inside the engine are high, so its internal structure often suffers damage such as cracks and burn-through. If such damage is not found in time, it can seriously threaten the safety of civil aviation flight. Civil aviation companies therefore use various inspection methods to discover potential safety hazards in the engine structure in time.
Engine hole probing (borescope inspection) is one of the important inspection methods. A technician inserts a borescope camera into the engine, captures pictures and videos of its interior, searches them for cracks, burn-through, and other damage, and finally compiles an inspection report to guide further maintenance and repair. However, the technique is time-consuming and labor-intensive: inspecting one engine takes tens of hours. It is also influenced by the inspector's subjective judgment, which limits its accuracy. With economic development and accelerating urbanization, Chinese domestic and international air routes have grown rapidly in recent years. Traditional hole detection, with its limited efficiency and precision and high labor cost, can no longer meet the rapidly growing demand for engine inspection.
Disclosure of Invention
The invention aims to provide an intelligent segmentation method for aero-engine hole detection image damage that offers higher precision and speed while occupying less memory and fewer processor resources. The design of the technical scheme is as follows:
(1) collecting aero-engine hole detection image samples, labeling each sample, constructing an aero-engine hole detection image semantic segmentation data set, and dividing the data set into a training set, a validation set, and a test set in a fixed proportion (e.g., 8:1:1);
(2) building a deep convolutional neural network, wherein the deep convolutional neural network consists of three parts, the first part is a feature extraction sub-network, the second part is a multi-scale context information extraction sub-network, and the third part is a feature expansion sub-network;
(3) preprocessing an aeroengine hole detection image to be detected;
(4) training a deep convolutional neural network by using the data set in the step (1), evaluating the network performance by using a performance evaluation function, and storing convolutional neural network parameters which reach preset indexes and have the best performance;
(5) inputting the image processed in the step (3) into a feature extraction sub-network for feature extraction to obtain a high-level feature vector capable of representing the input image;
(6) inputting the high-level feature vector obtained in the step (5) into a multi-scale context information extraction sub-network;
(7) inputting the feature vector obtained in the step (6) into a feature expansion sub-network to obtain a feature vector with the same spatial size as the input image in the step (5);
(8) and (4) generating a prediction label image by using the feature vector obtained in the step (7).
The multi-scale context information extraction sub-network consists of two parts: (1) a dilated (atrous) convolution module, whose input and output feature vectors have identical dimensions in every axis; five parallel paths lead from the input feature vector to the output feature vector.
The first path applies one 3x3 convolution with dilation rate 1;
the second path applies, in sequence, a 3x3 convolution with dilation rate 3 and a 1x1 convolution with dilation rate 1;
the third path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, and a 1x1 convolution with dilation rate 1;
the fourth path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, a 3x3 convolution with dilation rate 5, and a 1x1 convolution with dilation rate 1;
the fifth path is an identity mapping that passes the input straight through;
all convolutions in the five paths use stride 1;
(2) a multi-scale pooling module. The spatial dimensions of the module's input and output feature vectors are the same, and the channel counts of the input and output feature vectors differ by 4. Five pooling operations run in parallel on the input feature vector, with pooling windows of 1/1, 1/2, 1/3, 1/4, and 1/7 of the input's spatial dimension and stride equal to the window size. Each pooled feature vector is then upsampled to restore its spatial dimensions to those of the input, and the upsampled feature vectors are stacked with the input feature vector along the channel dimension.
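The multi-scale behavior of the dilated-convolution module can be checked by computing the receptive field of each path: for a stack of stride-1 convolutions, each layer enlarges the receptive field by (kernel size − 1) × dilation. The sketch below is illustrative (the helper name and dictionary layout are ours, not from the patent); the path definitions follow the text above.

```python
def receptive_field(layers):
    """Receptive field of a stack of stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs applied in sequence.
    Each layer adds (kernel_size - 1) * dilation to the receptive field.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += (kernel - 1) * dilation
    return rf

# The five parallel paths of the dilated-convolution module,
# given as (kernel size, dilation rate) pairs, all with stride 1:
paths = {
    "path1": [(3, 1)],
    "path2": [(3, 3), (1, 1)],
    "path3": [(3, 1), (3, 3), (1, 1)],
    "path4": [(3, 1), (3, 3), (3, 5), (1, 1)],
    "path5": [],  # identity mapping
}

fields = {name: receptive_field(p) for name, p in paths.items()}
print(fields)  # {'path1': 3, 'path2': 7, 'path3': 9, 'path4': 19, 'path5': 1}
```

The five paths thus cover receptive fields from 1 up to 19 pixels, which is what lets the module aggregate context at several scales in parallel.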
The feature extraction sub-network comprises several convolution blocks. Each of the first two convolution blocks contains two convolution layers with rectified linear unit (ReLU) activations followed by a max pooling layer; each subsequent convolution block contains three convolution layers with ReLU activations followed by a max pooling layer.
The feature expansion sub-network comprises several convolution blocks. Each block performs an upsampling operation and a stacking operation: the feature vector produced by upsampling is stacked, along the channel dimension, with the output of the corresponding-level convolution block of the feature extraction sub-network, and is then passed through two convolution layers with ReLU activations. The final layer is a convolution layer with 1x1 kernels, a number of output channels equal to the number of damage categories plus one, and a softmax activation function.
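One upsample-and-stack step of the feature expansion sub-network can be sketched in NumPy; this is a minimal illustration with assumed shapes and nearest-neighbour upsampling (the patent also allows bilinear interpolation or transposed convolution), and the function name is ours.

```python
import numpy as np

def upsample_and_stack(decoder_feat, encoder_feat):
    """One step of the feature-expansion sub-network: 2x nearest-neighbour
    upsampling of the decoder feature map, then stacking it with the
    encoder output of the corresponding level along the channel axis.
    Shapes are (channels, height, width)."""
    up = decoder_feat.repeat(2, axis=1).repeat(2, axis=2)  # 2x upsample
    assert up.shape[1:] == encoder_feat.shape[1:], "spatial sizes must match"
    return np.concatenate([up, encoder_feat], axis=0)      # stack on channels

dec = np.ones((8, 4, 4))    # low-resolution decoder features
enc = np.zeros((8, 8, 8))   # skip connection from the encoder
out = upsample_and_stack(dec, enc)
print(out.shape)  # (16, 8, 8)
```

The stacked result then feeds the block's two ReLU convolution layers described above.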
The data preprocessing includes various affine transformations; brightness, saturation, and contrast adjustment; overall linear and nonlinear transformations of images with dark brightness; histogram equalization of unevenly exposed images; and image fusion with the mixup method.
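The histogram equalization step can be sketched per colour channel with the classic cumulative-distribution mapping; this is a generic textbook implementation under our own naming, not code from the patent.

```python
import numpy as np

def equalize_channel(channel):
    """Histogram equalization of one 8-bit colour channel: redistribute
    pixel values so each brightness level holds roughly the same count."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # classic equalization mapping via the cumulative distribution
    lut = np.round((cdf - cdf_min) / (channel.size - cdf_min) * 255)
    lut = lut.astype(np.uint8)
    return lut[channel]

# a dark image: values clustered in [0, 60]
dark = np.random.default_rng(0).integers(0, 61, (32, 32)).astype(np.uint8)
eq = equalize_channel(dark)
print(dark.max(), eq.max())  # equalization stretches the range toward 255
```

For a multi-channel image, the same mapping is applied to each colour channel independently, as the text describes.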
To train the deep neural network, the training set is divided into several batches that are fed into the network to obtain its output; the network output and the corresponding label images are then fed into a dice loss function based on the dice coefficient:
$$\mathrm{Dice}(p,q) = \frac{2\sum_{i=1}^{m} p_i q_i}{\sum_{i=1}^{m} p_i + \sum_{i=1}^{m} q_i}$$
$$L_{dice} = 1 - \mathrm{Dice}(p,q)$$
In the formula, p represents the prediction class probability of all pixels in all the aeroengine hole detection images in each batch, and q represents the real class of all the pixels in the label images corresponding to all the aeroengine hole detection images in each batch;
adding an l2 regularization term to the loss function, the l2 regularization term being:
$$\Omega = \frac{\lambda}{2m}\sum_{l=1}^{L}\lVert w^{[l]}\rVert_2^2$$
the objective function after adding the l2 regularization term is:
$$J = L_{dice} + \frac{\lambda}{2m}\sum_{l=1}^{L}\lVert w^{[l]}\rVert_2^2$$
where J is the objective function, $L_{dice}$ is the dice loss function, m is the total number of pixels in all the aero-engine hole detection images in each batch, λ is the l2 regularization hyper-parameter, and L is the number of convolution layers in the deep neural network model;
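The objective above can be sketched in NumPy over flattened per-pixel probabilities; the small epsilon guard and the function names are our additions, and the weight list stands in for the network's convolution-layer weights.

```python
import numpy as np

def dice_loss(p, q, eps=1e-7):
    """Dice loss over a batch: p are predicted class probabilities,
    q the one-hot true classes, both flattened to the same shape."""
    dice = 2.0 * np.sum(p * q) / (np.sum(p) + np.sum(q) + eps)
    return 1.0 - dice

def objective(p, q, weights, lam, m):
    """Dice loss plus the l2 term lam/(2m) * sum_l ||w^[l]||^2,
    where weights is the list of convolution-layer weight tensors."""
    l2 = sum(np.sum(w ** 2) for w in weights)
    return dice_loss(p, q) + lam / (2.0 * m) * l2

# toy check: a perfect prediction gives (near) zero dice loss
q = np.eye(3)[np.array([0, 1, 2, 1])]     # one-hot true classes, 4 "pixels"
p_perfect = q.astype(float)
print(round(dice_loss(p_perfect, q), 6))
```

The l2 term penalizes large weights exactly as in the objective J above, shrinking the regularized loss toward the plain dice loss as λ goes to zero.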
Based on the back-propagation method, the gradient of each model parameter in the deep neural network model is computed from the objective function, and an optimization method adjusts the value of each model parameter according to the computed gradient;
the performance evaluation function includes, but is not limited to, three performance evaluation indexes, namely, a pixel accuracy PA, an average coincidence ratio MIOU, and a frequency weighted coincidence ratio FWIOU. In the prior art, there are many types of performance evaluation functions, and the above three types are selected in the technical scheme.
$$PA = \frac{\sum_{i=1}^{k} p_{ii}}{\sum_{i=1}^{k}\sum_{j=1}^{k} p_{ij}}$$
$$MIOU = \frac{1}{k}\sum_{i=1}^{k}\frac{p_{ii}}{\sum_{j=1}^{k} p_{ij} + \sum_{j=1}^{k} p_{ji} - p_{ii}}$$
$$FWIOU = \frac{1}{\sum_{i=1}^{k}\sum_{j=1}^{k} p_{ij}}\sum_{i=1}^{k}\frac{\left(\sum_{j=1}^{k} p_{ij}\right) p_{ii}}{\sum_{j=1}^{k} p_{ij} + \sum_{j=1}^{k} p_{ji} - p_{ii}}$$
In the three formulas, k is the number of pixel categories in the aero-engine hole detection images (the number of damage categories plus one); p_ii is the number of pixels whose predicted category (the category with the maximum predicted probability) is i and whose true category in the corresponding label image is also i; p_ij is the number of pixels whose true category is i but whose predicted category is j; and p_ji is the number of pixels whose true category is j but whose predicted category is i.
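With the p_ij counts gathered into a confusion matrix, the three indexes can be computed as below; this is a standard formulation matching the formulas above, with our own function and variable names.

```python
import numpy as np

def segmentation_metrics(conf):
    """PA, MIOU and FWIOU from a k x k confusion matrix where
    conf[i, j] counts pixels with true class i predicted as class j."""
    total = conf.sum()
    diag = np.diag(conf)                 # correctly classified pixels p_ii
    rows = conf.sum(axis=1)              # pixels of true class i
    cols = conf.sum(axis=0)              # pixels predicted as class i
    union = rows + cols - diag
    iou = diag / union
    pa = diag.sum() / total              # pixel accuracy
    miou = iou.mean()                    # mean intersection over union
    fwiou = (rows / total * iou).sum()   # frequency-weighted IOU
    return pa, miou, fwiou

# toy 2-class confusion matrix (background vs. one damage type)
conf = np.array([[8.0, 2.0],
                 [1.0, 9.0]])
pa, miou, fwiou = segmentation_metrics(conf)
print(round(pa, 3), round(miou, 3), round(fwiou, 3))  # 0.85 0.739 0.739
```

MIOU and FWIOU coincide in this toy case because both classes have the same number of true pixels; on real borescope data, FWIOU downweights rare damage classes.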
The invention has the beneficial effects that:
the technical scheme has higher precision and speed, and occupies less memory and processor resources.
Drawings
FIG. 1 is a schematic flow diagram of an embodiment of the method.
Detailed Description
As shown in FIG. 1, two examples of the embodiment are given below:
example 1
This example is divided into two stages, a training stage and a use stage. Note that the categories below include damage types such as cracks and burn-through, and also a no-damage category, i.e., background.
The training phase is divided into the following steps:
Step (1.1): acquiring aero-engine hole detection image samples. The collected samples cover all positions of the engine and include images containing one or more damages simultaneously as well as images without damage. The acquired images may be in any color mode with one or more channels;
the image preprocessing of the step (1.2) converts the image obtained in the step (1.1) into the same storage format, so as to facilitate the following unified processing, and then performs image cleaning to remove abnormal shot images, for example: if there are two or more images with high blur degree and not focused sufficiently, only one image is kept. Selecting an image with darker overall brightness, and redistributing image pixel values through histogram equalization to enable the number of pixels of each brightness level in each color channel to be approximately the same;
and (3) image labeling, namely labeling all the images obtained in the step (1.2) one by using any image labeling tool (such as labelme), determining the total number N of the damage types before labeling, giving a unique class label value to each damage type from 1 to N, labeling the labels of all the pixels in the non-damage area in the image as 0 when labeling one image, and labeling the labels of all the pixels in each damage type area as respective class label values. And generating a label image according to a corresponding method provided by the marking tool. The tag image and the original image storage file name should correspond.
Step (1.4): dividing the data set. An original image and its corresponding label image are treated as the minimum unit of division, and all minimum units are divided into a training set, a validation set, and a test set in a fixed proportion (e.g., 8:1:1).
Step (1.5): building the deep neural network, using any deep learning framework. The deep neural network comprises three parts: a feature extraction sub-network, a multi-scale context information extraction sub-network, and a feature expansion sub-network.
The feature extraction sub-network comprises several convolution blocks. Each of the first two convolution blocks contains two convolution layers with rectified linear unit (ReLU) activations followed by a max pooling layer; each subsequent convolution block contains three convolution layers with ReLU activations followed by a max pooling layer.
The multi-scale context information extraction sub-network consists of two parts: (1) a dilated (atrous) convolution module, whose input and output feature vectors have identical dimensions in every axis; five parallel paths lead from the input feature vector to the output feature vector. The first path applies one 3x3 convolution with dilation rate 1; the second path applies, in sequence, a 3x3 convolution with dilation rate 3 and a 1x1 convolution with dilation rate 1; the third path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, and a 1x1 convolution with dilation rate 1; the fourth path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, a 3x3 convolution with dilation rate 5, and a 1x1 convolution with dilation rate 1; the fifth path is an identity mapping that passes the input straight through. All convolutions in the five paths use stride 1;
(2) a multi-scale pooling module. The spatial dimensions of the module's input and output feature vectors are the same, and the channel counts of the input and output feature vectors differ by 4. Five pooling operations run in parallel on the input feature vector, with pooling windows of 1/1, 1/2, 1/3, 1/4, and 1/7 of the input's spatial dimension and stride equal to the window size. Each pooled feature vector is then upsampled to restore its spatial dimensions to those of the input, and the upsampled feature vectors are stacked with the input feature vector along the channel dimension.
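The multi-scale pooling module can be sketched in NumPy with adaptive average pooling followed by nearest-neighbour upsampling. This is a simplified illustration under our own naming: it assumes a square input divisible by each grid size and keeps every branch at full channel width (the patent's exact channel bookkeeping may differ).

```python
import numpy as np

def adaptive_avg_pool(x, out_size):
    """Average-pool a (channels, H, W) map to (channels, out_size, out_size);
    assumes H and W are divisible by out_size (window == stride)."""
    c, h, w = x.shape
    win = h // out_size
    return x.reshape(c, out_size, win, out_size, win).mean(axis=(2, 4))

def multi_scale_pool(x, grids=(1, 2, 3, 4, 7)):
    """Pool at windows of 1/1, 1/2, 1/3, 1/4, 1/7 of the spatial size,
    upsample each result back to the input size, then stack everything
    (including the input itself) along the channel axis."""
    c, h, w = x.shape
    branches = [x]
    for g in grids:
        pooled = adaptive_avg_pool(x, g)
        up = pooled.repeat(h // g, axis=1).repeat(w // g, axis=2)
        branches.append(up)
    return np.concatenate(branches, axis=0)

x = np.random.default_rng(1).random((4, 84, 84))  # 84 divides by 1,2,3,4,7
y = multi_scale_pool(x)
print(y.shape)  # (24, 84, 84)
```

The grid-1 branch reduces each channel to its global average, giving the coarsest context; finer grids preserve progressively more spatial detail.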
The feature expansion sub-network comprises several convolution blocks. Each block performs an upsampling operation and a stacking operation: the feature vector produced by upsampling is stacked, along the channel dimension, with the output of the corresponding-level convolution block of the feature extraction sub-network, and is then passed through two convolution layers with ReLU activations. The final layer is a convolution layer with 1x1 kernels, a number of output channels equal to the number of damage categories plus one, and a softmax activation function.
The feature extraction sub-network may include three or more convolution blocks, and further convolution blocks may be appended as long as the spatial length and width of each block's output feature vector both remain at least two. The feature expansion sub-network contains the same number of convolution blocks as the feature extraction sub-network. The upsampling operation in the feature expansion sub-network may be bilinear interpolation, nearest-neighbor interpolation, or transposed convolution.
The number of convolution blocks in the feature extraction and feature expansion sub-networks is a hyper-parameter, positively correlated with the number of images in the data set, the number of damage types, and the difficulty of detecting the damage in the images.
Step (1.6): training the deep neural network. All images in the training set divided in step (1.4) are split into several batches with N samples each; data augmentation is applied to the images of each batch and their corresponding label images, and the label images are then one-hot encoded.
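One-hot encoding of a label image maps each integer class label to a unit vector along a new channel axis; a minimal sketch (function name ours):

```python
import numpy as np

def one_hot(label_img, num_classes):
    """One-hot encode a (H, W) label image into (H, W, num_classes),
    where num_classes = number of damage types + 1 (background is 0)."""
    return np.eye(num_classes, dtype=np.float32)[label_img]

labels = np.array([[0, 1],
                   [2, 1]])            # toy 2x2 label image, 2 damage types
encoded = one_hot(labels, num_classes=3)
print(encoded.shape)                    # (2, 2, 3)
print(encoded[0, 1])                    # [0. 1. 0.]
```

The encoded tensor has the same channel layout as the softmax output of the network's final layer, which is what lets the dice loss compare them pixel by pixel.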
And (4) sending all samples of one batch into the deep neural network built in the step (1.5) to obtain the output feature vector.
Then inputting the output characteristic vector and the onehot coded label images of the batch into a loss function together to obtain an error;
then calculating the gradient of trainable parameters of each layer in the deep neural network;
then, optimization is performed by using an optimizer with a set learning rate.
When all batches have passed through the above process, one round (epoch) is completed. After each round, all images in the validation set are divided into several batches with M samples each; the label image in each sample is one-hot encoded, and all samples of a batch are fed into the deep neural network built in step (1.5) to obtain the output feature vectors.
And then inputting the output characteristic vector and the onehot coded label images of the batch into a loss function and a performance evaluation function together to obtain an error and a performance index, and storing the error and the performance index into an array.
All batches in the validation set are processed in this way.
The mean of the error array and of the performance index array is computed, and the best-performing parameters and model are saved. A maximum number of training rounds is preset; after multi-round training, training stops when this maximum is reached. An automatic learning rate decay strategy is used during training.
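The patent does not fix a particular decay schedule; one common choice, shown here purely as an illustration (the function name and constants are ours), multiplies the learning rate by a constant factor each round:

```python
def decayed_lr(initial_lr, decay_rate, epoch):
    """Exponential learning-rate decay: lr = initial_lr * decay_rate**epoch.
    Illustrative only; any automatic decay strategy satisfies the text."""
    return initial_lr * decay_rate ** epoch

print(decayed_lr(1e-3, 0.9, 0))   # 0.001
print(decayed_lr(1e-3, 0.9, 10))  # about 0.000349
```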
Data augmentation includes random shuffling of the samples; various random affine transformations; random brightness, saturation, and contrast adjustment within a range (for example 1 ± 0.4); and mixup image fusion. Note that brightness, saturation, and contrast adjustment are applied to the original image only, while the other operations must be applied to the original image and the label image simultaneously; in practice the same random seed is set for the random transformations so that the original image and its corresponding label image undergo identical random operations.
The mixup image fusion method operates as follows. First, N random numbers λ (N being the total number of samples per batch) are drawn from a beta distribution with α = 1 and β = 1 (other values of α may be used). Then the current batch is cloned, the samples in the clone are randomly shuffled, and finally corresponding samples are fused according to the following formulas.
$$\tilde{x} = \lambda x_i + (1-\lambda) x_j$$
$$\tilde{y} = \lambda y_i + (1-\lambda) y_j$$
In the above formulas, λ is the random number described above; (x_i, y_i), i = 1, 2, …, N, is a sample in the current batch; (x_j, y_j), j = 1, 2, …, N, is a sample in the shuffled clone of the batch; and (x̃, ỹ) is the new sample generated by fusion.
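The mixup fusion above can be sketched in NumPy; the batch layout (N, H, W, C), per-sample λ, and function name are our assumptions for illustration, following the text's choice of one random λ per fused pair.

```python
import numpy as np

def mixup_batch(images, labels, alpha=1.0, rng=None):
    """mixup fusion: clone the batch, shuffle the clone, and blend each
    pair with a weight lambda ~ Beta(alpha, alpha), applied identically
    to the images and their one-hot label maps."""
    rng = rng if rng is not None else np.random.default_rng()
    n = images.shape[0]
    lam = rng.beta(alpha, alpha, size=n).reshape(n, 1, 1, 1)
    perm = rng.permutation(n)                 # shuffled clone of the batch
    mixed_x = lam * images + (1 - lam) * images[perm]
    mixed_y = lam * labels + (1 - lam) * labels[perm]
    return mixed_x, mixed_y

rng = np.random.default_rng(0)
x = rng.random((4, 8, 8, 3))   # N=4 toy images
y = rng.random((4, 8, 8, 2))   # matching one-hot label maps
mx, my = mixup_batch(x, y, alpha=1.0, rng=rng)
print(mx.shape, my.shape)      # (4, 8, 8, 3) (4, 8, 8, 2)
```

Because the same λ blends an image and its label map, every fused sample remains a valid (soft-labeled) training pair.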
The testing stage is divided into the following steps:
and (2.1) loading the best-performance network and parameters stored in the step (1.6) and loading the parameters into the network.
Step (2.2) dividing all images in the test set into a plurality of batches, wherein the total number of samples in each batch is M, performing onehot coding on the label image in each sample, and sending all samples in one batch into the deep neural network built in the step (1.5) to obtain output characteristic vectors;
and then inputting the output characteristic vector and the onehot coded label images of the batch into a loss function and a performance evaluation function together to obtain an error and a performance index, and storing the error and the performance index into an array.
All batches in the test set are finished after the above process.
The mean of the error array and the performance index array is computed, and whether the network's performance indexes reach the preset standard is judged. If the standard is met, the procedure ends; if not, return to step (1.5), adjust the hyper-parameters, and repeat the above process until the performance indexes on the test set meet the standard.
Example 2
This example is divided into two stages, a training stage and a use stage. Note that the categories below include damage types such as cracks and burn-through, and also a no-damage category, i.e., background.
The training phase is divided into the following steps:
Step (1.1): acquiring aero-engine hole detection image samples. The collected samples cover all positions of the engine and include images containing one or more damages simultaneously as well as images without damage. The acquired images may be in any color mode with one or more channels;
the image preprocessing of the step (1.2) converts the image obtained in the step (1.1) into the same storage format, so as to facilitate the following unified processing, and then performs image cleaning to remove abnormal shot images, for example: if there are two or more images with high blur degree and not focused sufficiently, only one image is kept. Selecting an image with darker overall brightness, and redistributing image pixel values through histogram equalization to enable the number of pixels of each brightness level in each color channel to be approximately the same;
and (3) image labeling, namely labeling all the images obtained in the step (1.2) one by using any image labeling tool (such as labelme), determining the total number N of the damage types before labeling, giving a unique class label value to each damage type from 1 to N, labeling the labels of all the pixels in the non-damage area in the image as 0 when labeling one image, and labeling the labels of all the pixels in each damage type area as respective class label values. And generating a label image according to a corresponding method provided by the marking tool. The tag image and the original image storage file name should correspond.
Step (1.4): dividing the data set. An original image and its corresponding label image are treated as the minimum unit of division, and all minimum units are divided into a training set, a validation set, and a test set in a fixed proportion (e.g., 8:1:1).
Step (1.5): building the deep neural network, using any deep learning framework. The deep neural network comprises three parts: a feature extraction sub-network, a multi-scale context information extraction sub-network, and a feature expansion sub-network.
The feature extraction sub-network comprises several convolution blocks. Each of the first two convolution blocks contains two convolution layers with rectified linear unit (ReLU) activations, each convolution layer followed by a batch normalization layer, with a max pooling layer at the end of the block; each subsequent convolution block contains three convolution layers with ReLU activations, each followed by a batch normalization layer, with a max pooling layer at the end of the block.
The multi-scale context information extraction sub-network consists of two parts: (1) a dilated (atrous) convolution module, whose input and output feature vectors have identical dimensions in every axis; five parallel paths lead from the input feature vector to the output feature vector. The first path applies one 3x3 convolution with dilation rate 1; the second path applies, in sequence, a 3x3 convolution with dilation rate 3 and a 1x1 convolution with dilation rate 1; the third path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, and a 1x1 convolution with dilation rate 1; the fourth path applies, in sequence, a 3x3 convolution with dilation rate 1, a 3x3 convolution with dilation rate 3, a 3x3 convolution with dilation rate 5, and a 1x1 convolution with dilation rate 1; the fifth path is an identity mapping that passes the input straight through. All convolutions in the five paths use stride 1;
(2) a multi-scale pooling module. The spatial dimensions of the module's input and output feature vectors are the same, and the channel counts of the input and output feature vectors differ by 4. Five pooling operations run in parallel on the input feature vector, with pooling windows of 1/1, 1/2, 1/3, 1/4, and 1/7 of the input's spatial dimension and stride equal to the window size. Each pooled feature vector is then upsampled to restore its spatial dimensions to those of the input, and the upsampled feature vectors are stacked with the input feature vector along the channel dimension.
The feature expansion sub-network comprises a plurality of convolution blocks. Each convolution block comprises an upsampling operation and a stacking operation: the feature vector produced by upsampling is stacked along the channel dimension with the output of the convolution block at the corresponding level of the feature extraction sub-network, and is followed by two convolution layers using the rectified linear unit (ReLU) activation function, each convolution layer being followed by a batch normalization layer. The last convolution layer of the expansion sub-network uses a 1x1 convolution kernel, its number of output channels equals the number of damage categories plus one, and it is paired with a softmax activation function.
The feature extraction sub-network may include three or more convolution blocks; a further convolution block may be appended after an existing one provided that both the height and the width of the spatial scale of the feature vector output by the preceding block are greater than or equal to two. The feature expansion sub-network contains the same number of convolution blocks as the feature extraction sub-network. The upsampling operation in the feature expansion sub-network may be bilinear interpolation, nearest-neighbor interpolation, or transposed convolution.
The number of convolution blocks in the feature extraction and feature expansion sub-networks is a hyper-parameter; it is positively correlated with the number of images in the data set, the number of damage categories, and the difficulty of detecting the damage in the images.
And (1.6) training the deep neural network. All images in the training set divided in step (1.4) are split into a number of batches, each batch containing N samples. Data augmentation is applied to the images of each batch and their corresponding label images, and the label images are then one-hot encoded. All samples of one batch are fed into the deep neural network built in step (1.5) to obtain the output feature vector. The output feature vector and the one-hot encoded label images of the batch are then passed to the loss function to obtain an error; the gradients of the trainable parameters of each layer in the deep neural network are computed, and an optimizer with a set learning rate performs the update. One round (epoch) is complete when every batch has passed through this process. After each round, all images in the validation set are split into batches of M samples each; the label image of each sample is one-hot encoded, and all samples of a batch are fed into the network built in step (1.5) to obtain the output feature vector. The output feature vector and the one-hot encoded label images of the batch are then passed to the loss function and the performance evaluation function to obtain an error and a performance index, which are stored in arrays. Once every batch in the validation set has been processed, the means of the error array and of the performance-index array are computed, and the best-performing parameters and model are saved. A maximum number of training rounds is preset, and training stops when that number of rounds is reached. An automatic learning-rate decay strategy is used during training.
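The one-hot encoding of a label image mentioned in this step can be sketched as follows (illustrative only; `num_classes` is the number of damage categories plus one for the background class labeled 0):

```python
import numpy as np

def one_hot(label_img, num_classes):
    """One-hot encode an integer label image.

    label_img: (H, W) array with values in {0, ..., num_classes - 1},
    where 0 is background and 1..K are damage classes.
    Returns an (H, W, num_classes) float array.
    """
    return np.eye(num_classes, dtype=np.float32)[label_img]

labels = np.array([[0, 1],
                   [2, 0]])
enc = one_hot(labels, num_classes=3)
assert enc.shape == (2, 2, 3)
assert enc[0, 1].tolist() == [0.0, 1.0, 0.0]   # pixel of class 1
```

The fancy-indexing trick `np.eye(C)[labels]` maps each integer label to the corresponding row of the identity matrix, which is exactly its one-hot vector.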
The data augmentation includes random shuffling of the samples, various random affine transformations, random brightness, saturation and contrast adjustment within a range (for example 1 ± 0.4), and mixup image fusion. Note that the brightness, saturation and contrast adjustments are applied to the original image only, while the other operations must be applied to the original image and the label image simultaneously; in practice this requires setting the same random seed for the random transformations, so that the same random operation is applied to the original image and the corresponding label image in each sample.
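The shared-random-seed idea for geometric transforms can be sketched as follows; the horizontal flip stands in for the various affine transformations, and `paired_augment` is a hypothetical helper, not part of the patent:

```python
import numpy as np

def paired_augment(image, label, seed):
    """Apply the same random geometric transform to an image and its label.

    Photometric changes (brightness etc.) would hit the image only;
    geometric ones must hit both. Constructing two generators from the
    same seed guarantees identical random decisions for the pair.
    """
    rng_img = np.random.default_rng(seed)
    rng_lbl = np.random.default_rng(seed)   # same seed -> same draws
    img_out = image[:, ::-1] if rng_img.random() < 0.5 else image
    lbl_out = label[:, ::-1] if rng_lbl.random() < 0.5 else label
    return img_out, lbl_out

img = np.arange(6).reshape(2, 3)
lbl = np.arange(6).reshape(2, 3)
img_a, lbl_a = paired_augment(img, lbl, seed=7)
assert (img_a == lbl_a).all()   # image and label flipped identically
```

In a real pipeline one would usually draw the transform parameters once and apply them to both arrays, but the seed-sharing version shown here matches the wording of the description.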
The mixup image fusion operates as follows. First, N random numbers λ (N being the total number of samples in each batch) are generated from a beta distribution with α = 1 and β = 1 (other values of α may be used). Then a copy of all samples in the current batch is cloned, and all samples in the copy are randomly shuffled. Finally, fusion is carried out according to the following formulas.
x' = λ·x_i + (1 − λ)·x_j

y' = λ·y_i + (1 − λ)·y_j

In the above formulas, λ is one of the random numbers described above; (x_i, y_i) is a sample in the current batch, i = 1, 2, …, N; (x_j, y_j) is a sample in the shuffled clone of the current batch, j = 1, 2, …, N; and (x', y') is the new sample generated by the fusion.
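An illustrative numpy sketch of the mixup fusion described above (the function name and batch shapes are assumptions):

```python
import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=None):
    """mixup fusion: blend each sample with one from a shuffled clone
    of the batch, using a per-sample lambda ~ Beta(alpha, alpha)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = x.shape[0]
    lam = rng.beta(alpha, alpha, size=n)       # N random numbers lambda
    perm = rng.permutation(n)                  # randomly shuffled clone
    lam_x = lam.reshape((n,) + (1,) * (x.ndim - 1))
    lam_y = lam.reshape((n,) + (1,) * (y.ndim - 1))
    x_new = lam_x * x + (1.0 - lam_x) * x[perm]   # x' = λ·x_i + (1−λ)·x_j
    y_new = lam_y * y + (1.0 - lam_y) * y[perm]   # y' = λ·y_i + (1−λ)·y_j
    return x_new, y_new

x = np.random.rand(4, 8, 8, 3)   # batch of N=4 images
y = np.random.rand(4, 8, 8, 2)   # corresponding one-hot label images
x_new, y_new = mixup_batch(x, y)
assert x_new.shape == x.shape and y_new.shape == y.shape
```

Because each λ lies in [0, 1], every fused pixel is a convex combination of the two source pixels, so labels and images stay consistently blended.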
The testing stage is divided into the following steps:
and (2.1) loading the best-performance network and parameters stored in the step (1.6) and loading the parameters into the network.
And (2.2) dividing all images in the test set into a number of batches, each containing M samples; one-hot encoding the label image of each sample; and feeding all samples of a batch into the deep neural network built in step (1.5) to obtain the output feature vector. The output feature vector and the one-hot encoded label images of the batch are then passed to the loss function and the performance evaluation function to obtain an error and a performance index, which are stored in arrays. Once every batch in the test set has been processed, the means of the error array and the performance-index array are computed, and it is judged whether the performance index of the network reaches a preset standard. If it does, the model is accepted; if not, the procedure returns to step (1.5), the hyper-parameters are adjusted, and the process is repeated until the performance index on the test set meets the standard.
Parts not described in detail in the present invention are the same as the prior art or may be implemented using the prior art.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims (5)

1. An intelligent segmentation method for damage in aero-engine borescope inspection images based on a context coding network, characterized in that it comprises 1) a training phase and 2) a testing phase;
1) the training phase comprises the following steps:
1.1) acquiring borescope inspection images of the aero-engine as samples;
1.2) image preprocessing: converting the images obtained in step 1.1) into the same storage format; then cleaning the images to remove abnormally captured images;
1.3) image annotation: labeling all the images obtained in step 1.2) one by one;
before labeling, determining the total number of damage categories and giving each damage category a unique class label value;
when an image is annotated, first the labels of all pixels in the non-damage area of the image are set to 0, and then the labels of all pixels in each damage area are set to their respective class label values; finally, a label image is generated;
1.4) data set partitioning: constructing a semantic segmentation data set of aero-engine borescope inspection images, treating each label image obtained in step 1.3) together with its corresponding original image as the minimum unit of division, and dividing all minimum units into a training set, a validation set and a test set;
1.5) building a deep neural network:
the deep neural network comprises three parts, namely a feature extraction sub-network, a multi-scale context information extraction sub-network and a feature expansion sub-network in sequence;
1.6) training a deep neural network:
1.6.1) dividing all images in the training set into a plurality of batches;
the following operations are performed for each batch of images:
sending all samples of a batch into the deep neural network to obtain an output feature vector; then, inputting the output feature vector and the label images of the batch into a loss function together to obtain an error; then, calculating the gradients of the trainable parameters of each layer in the deep neural network; then, optimizing with an optimizer having a set learning rate;
one round of training is completed when all batches in the training set have undergone the above process;
1.6.2) dividing all images in the validation set into a plurality of batches;
the following operations are performed for each batch of images:
sending all samples of a batch into the deep neural network to obtain an output feature vector; then, inputting the output feature vector and the label images of the batch into a loss function and a performance evaluation function together to obtain an error and a performance index, and storing them in an error array and a performance-index array respectively;
after all batches in the validation set have undergone the above process,
calculating the means of the error array and the performance-index array respectively, and saving the convolutional neural network parameters with the best performance;
presetting a maximum number of training rounds, and stopping training when, after multiple rounds, the number of training rounds reaches that maximum;
2) the testing stage comprises the following steps:
2.1) loading the network and the parameters with the best performance stored in the step 1.6), and loading the parameters into the deep neural network built in the step 1.5);
2.2) first, inputting the images in the test set into the feature extraction sub-network for feature extraction, obtaining high-level feature vectors representing the input images;
then, inputting the high-level feature vectors obtained in the previous step into the multi-scale context information extraction sub-network; next, inputting the resulting feature vectors into the feature expansion sub-network to obtain feature vectors with the same spatial size as the input sample images;
and finally, generating a predicted label image from the feature vectors obtained in the previous step.
2. The intelligent segmentation method for damage in aero-engine borescope inspection images based on a context coding network as claimed in claim 1, wherein in step 1.5):
the feature extraction sub-network comprises a plurality of convolution blocks; each of the first two convolution blocks comprises two convolution layers using the rectified linear unit activation function and one maximum pooling layer, and each subsequent convolution block comprises three convolution layers using the rectified linear unit activation function and one maximum pooling layer;
a multi-scale context information extraction sub-network, comprising:
a. dilated convolution module
every dimension of the input feature vector and of the output feature vector of the dilated convolution module is the same;
five paths lead from the input feature vector to the output feature vector, and the five paths are connected in parallel;
the first path is convolved with a 3x3 convolution kernel with a dilation rate of 1;
the second path is convolved sequentially with a 3x3 convolution kernel with a dilation rate of 3 and a 1x1 convolution kernel with a dilation rate of 1;
the third path is convolved sequentially with a 3x3 convolution kernel with a dilation rate of 1, a 3x3 convolution kernel with a dilation rate of 3, and a 1x1 convolution kernel with a dilation rate of 1;
the fourth path is convolved sequentially with a 3x3 convolution kernel with a dilation rate of 1, a 3x3 convolution kernel with a dilation rate of 3, a 3x3 convolution kernel with a dilation rate of 5, and a 1x1 convolution kernel with a dilation rate of 1;
the fifth path is an identity mapping that outputs the input directly;
all convolution operations in the five paths use a stride of 1;
b. multi-scale pooling module:
the spatial dimensions of the input and output feature vectors of the multi-scale pooling module are the same; the number of channels of the input feature vector is 4 more than the number of channels of the output feature vector;
pooling operations with window sizes equal to 1/1, 1/2, 1/3, 1/4 and 1/7 of the spatial size of the module's input feature vector, and with stride equal to the window size, are applied directly to the input feature vector, the 5 pooling operations being connected in parallel; each pooled feature vector is upsampled to restore its spatial size to that of the input feature vector, and the results are stacked with the input feature vector along the channel dimension.
The feature expansion sub-network comprises a plurality of convolution blocks; each convolution block comprises an upsampling operation and a stacking operation, the feature vector obtained by upsampling being stacked along the channel dimension with the output of the convolution block at the corresponding level of the feature extraction sub-network and followed by two convolution layers using the rectified linear unit activation function; the last convolution layer uses a 1x1 convolution kernel, its number of output channels equals the number of damage categories plus one, and it is paired with a softmax activation function.
3. The intelligent segmentation method for damage in aero-engine borescope inspection images based on a context coding network as claimed in claim 2, wherein the feature extraction sub-network comprises three or more convolution blocks, and the precondition for appending a further convolution block after an existing one is that both the height and the width of the spatial scale of the feature vector output by the preceding convolution block are greater than or equal to two;
the number of convolution blocks in the feature expansion sub-network is the same as in the feature extraction sub-network;
the upsampling operation in the feature expansion sub-network is bilinear interpolation, nearest-neighbor interpolation or transposed convolution;
the number of convolution blocks in the feature extraction and feature expansion sub-networks is a hyper-parameter, positively correlated with the number of images in the data set, the number of damage categories and the difficulty of detecting the damage in the images.
4. The intelligent segmentation method for damage in aero-engine borescope inspection images based on a context coding network as claimed in claim 1, wherein in step 1.6) an automatic learning-rate decay strategy is used during training.
5. The intelligent segmentation method for damage in aero-engine borescope inspection images based on a context coding network as claimed in claim 1, wherein the loss function is a dice loss function based on the dice coefficient:
L_dice = 1 − (2·Σ p·q) / (Σ p + Σ q), the sums running over all pixels of all images in the batch
In the formula:
p represents the predicted class probabilities of all pixels in all aero-engine borescope inspection images in each batch,
q represents the true class of all pixels in the label images corresponding to all the aero-engine borescope inspection images in each batch;
an l2 regularization term is added to the loss function;
the l2 regularization term for a single convolution layer is:
‖w^(l)‖² = Σ_i (w_i^(l))², where w^(l) denotes the weights of the l-th convolution layer
the objective function after adding the l2 regularization term is:
J = L_dice + (λ / (2m)) · Σ_{l=1}^{L} ‖w^(l)‖², where ‖w^(l)‖² is the sum of squared weights of the l-th convolution layer
in the formula:
J represents the value of the objective function,
L_dice is the aforementioned dice loss function,
m represents the number of all pixels in all aero-engine borescope inspection images in each batch, λ represents the l2 regularization hyper-parameter, and L represents the number of convolution layers in the deep neural network model;
the gradient of each model parameter in the deep neural network model with respect to the objective function J is calculated by back-propagation, and the value of each model parameter is adjusted according to its gradient;
the performance evaluation function comprises a pixel accuracy (PA) function, a mean intersection-over-union (MIoU) function, and a frequency-weighted intersection-over-union (FWIoU) function;
PA = Σ_{i=0}^{k} p_ii / Σ_{i=0}^{k} Σ_{j=0}^{k} p_ij

MIoU = (1 / (k + 1)) · Σ_{i=0}^{k} p_ii / (Σ_{j=0}^{k} p_ij + Σ_{j=0}^{k} p_ji − p_ii)

FWIoU = (1 / Σ_{i=0}^{k} Σ_{j=0}^{k} p_ij) · Σ_{i=0}^{k} (Σ_{j=0}^{k} p_ij) · p_ii / (Σ_{j=0}^{k} p_ij + Σ_{j=0}^{k} p_ji − p_ii)
in the formula:
k represents the number of damage categories for pixels in the aero-engine borescope inspection images (the classes 0 to k comprise the background class plus the k damage classes);
p_ii represents the number of correctly predicted pixels, i.e. pixels whose most probable predicted class in each batch of borescope inspection images is the same as the true class of the pixel in the corresponding label image;
p_ij represents the number of pixels whose most probable predicted class is j while the true class of the pixel in the corresponding label image is i;
p_ji represents the number of pixels whose most probable predicted class is i while the true class of the pixel in the corresponding label image is j.
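As an illustrative (non-claimed) numpy sketch of the dice loss and the three evaluation functions defined above, assuming the usual convention that conf[i, j] counts pixels of true class i predicted as class j:

```python
import numpy as np

def dice_loss(p, q, eps=1e-7):
    """Dice loss over per-pixel predicted probabilities p and one-hot truth q."""
    inter = (p * q).sum()
    return 1.0 - (2.0 * inter + eps) / (p.sum() + q.sum() + eps)

def pixel_accuracy(conf):
    """PA from a (k+1)x(k+1) confusion matrix: conf[i, j] = number of
    pixels with true class i and predicted class j."""
    return np.trace(conf) / conf.sum()

def mean_iou(conf):
    """MIoU: mean over classes of p_ii / (row_i + col_i - p_ii)."""
    inter = np.diag(conf).astype(float)
    union = conf.sum(axis=1) + conf.sum(axis=0) - inter
    return float(np.mean(inter / union))

def fw_iou(conf):
    """FWIoU: per-class IoU weighted by that class's pixel frequency."""
    inter = np.diag(conf).astype(float)
    union = conf.sum(axis=1) + conf.sum(axis=0) - inter
    freq = conf.sum(axis=1) / conf.sum()
    return float((freq * inter / union).sum())

# toy 2-class confusion matrix: rows = true class, columns = predicted class
conf = np.array([[3, 1],
                 [0, 2]])
assert pixel_accuracy(conf) == 5 / 6
assert abs(mean_iou(conf) - (3 / 4 + 2 / 3) / 2) < 1e-12
```

The epsilon in the dice loss is a common smoothing assumption to avoid division by zero on empty masks; it is not stated in the claim.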
CN201911209120.1A 2019-11-30 2019-11-30 Aero-engine hole-finding image damage segmentation method based on context coding network Active CN110895814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911209120.1A CN110895814B (en) 2019-11-30 2019-11-30 Aero-engine hole-finding image damage segmentation method based on context coding network


Publications (2)

Publication Number Publication Date
CN110895814A true CN110895814A (en) 2020-03-20
CN110895814B CN110895814B (en) 2023-04-18

Family

ID=69786781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911209120.1A Active CN110895814B (en) 2019-11-30 2019-11-30 Aero-engine hole-finding image damage segmentation method based on context coding network

Country Status (1)

Country Link
CN (1) CN110895814B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104067314A (en) * 2014-05-23 2014-09-24 中国科学院自动化研究所 Human-shaped image segmentation method
CN108985343A (en) * 2018-06-22 2018-12-11 深源恒际科技有限公司 Automobile damage detecting method and system based on deep neural network
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network
CN110020652A (en) * 2019-01-07 2019-07-16 新而锐电子科技(上海)有限公司 The dividing method of Tunnel Lining Cracks image
CN110059698A (en) * 2019-04-30 2019-07-26 福州大学 The semantic segmentation method and system based on the dense reconstruction in edge understood for streetscape
CN110135379A (en) * 2019-05-21 2019-08-16 中电健康云科技有限公司 Tongue Image Segmentation Method and Device


Non-Patent Citations (2)

Title
Z. SHEN, X. WAN, F. YE, X. GUAN AND S. LIU: "Deep Learning based Framework for Automatic Damage Detection in Aircraft Engine Borescope Inspection" *
李冰涛等: "基于成像测井的裂缝智能识别新方法" *

Cited By (15)

Publication number Priority date Publication date Assignee Title
CN111583215A (en) * 2020-04-30 2020-08-25 平安科技(深圳)有限公司 Intelligent damage assessment method and device for damage image, electronic equipment and storage medium
CN111899169A (en) * 2020-07-02 2020-11-06 佛山市南海区广工大数控装备协同创新研究院 Network segmentation method of face image based on semantic segmentation
CN111899169B (en) * 2020-07-02 2024-01-26 佛山市南海区广工大数控装备协同创新研究院 A method of segmentation network for face images based on semantic segmentation
CN111798469A (en) * 2020-07-13 2020-10-20 珠海函谷科技有限公司 A Semantic Segmentation Method for Small Datasets of Digital Images Based on Deep Convolutional Neural Networks
CN112232349A (en) * 2020-09-23 2021-01-15 成都佳华物链云科技有限公司 Model training method, image segmentation method and device
CN112232349B (en) * 2020-09-23 2023-11-03 成都佳华物链云科技有限公司 Model training method, image segmentation method and device
CN112818822B (en) * 2021-01-28 2022-05-06 中国空气动力研究与发展中心超高速空气动力研究所 Automatic identification method for damaged area of aerospace composite material
CN112818822A (en) * 2021-01-28 2021-05-18 中国空气动力研究与发展中心超高速空气动力研究所 Automatic identification method for damaged area of aerospace composite material
CN113838014B (en) * 2021-09-15 2023-06-23 南京工业大学 Video detection method of aero-engine damage based on dual space warping
CN113838014A (en) * 2021-09-15 2021-12-24 南京工业大学 Aircraft engine damage video detection method based on double spatial distortion
CN115035291A (en) * 2022-04-12 2022-09-09 江汉大学 Semantic segmentation-based sand and gravel image segmentation method, device, equipment and medium
CN115035291B (en) * 2022-04-12 2024-12-03 江汉大学 Sand and gravel image segmentation method, device, equipment and medium based on semantic segmentation
CN114842192A (en) * 2022-04-15 2022-08-02 南京航空航天大学 Aero-engine blade damage identification model, damage identification method and system
CN114842192B (en) * 2022-04-15 2024-09-27 南京航空航天大学 A damage identification model, damage identification method and system for aircraft engine blades
CN114998255A (en) * 2022-05-31 2022-09-02 南京工业大学 A lightweight deployment method based on aero-engine hole detection and crack detection

Also Published As

Publication number Publication date
CN110895814B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN110895814B (en) Aero-engine hole-finding image damage segmentation method based on context coding network
CN113486865B (en) Power transmission line suspended foreign object target detection method based on deep learning
CN117173449A (en) Aeroengine blade defect detection method based on multi-scale DETR
CN111126134B (en) Deep learning identification method of radar radiation source based on non-fingerprint signal canceller
CN111798469A (en) A Semantic Segmentation Method for Small Datasets of Digital Images Based on Deep Convolutional Neural Networks
CN114021704B (en) An AI neural network model training method and related device
CN109886947A (en) High-voltage wire defect detection method based on region convolutional neural network
CN112132086B (en) Multi-scale martensite microstructure aging and damage grading method
CN111161224A (en) Classification and evaluation system and method of casting internal defects based on deep learning
CN114970240B (en) Method and equipment for rapidly evaluating load state of multiphase composite structural image
CN117451716A (en) A method for detecting surface defects of industrial products
CN116740037B (en) Concrete multi-label defect identification method based on deep learning
CN117726627B (en) Chip surface defect detection method and equipment
CN120070401A (en) Progressive industrial product surface defect detection method based on decoupling characterization
CN113516652A (en) Battery surface defect and adhesive detection method, device, medium and electronic equipment
CN117830888A (en) Visual recognition method for missing bolts in tower based on multi-order deformation lightweight structure
CN114596244A (en) Infrared image recognition method and system based on vision processing and multi-feature fusion
CN118628499A (en) A method for aircraft engine blade defect detection based on network architecture search
CN119625390A (en) A metal surface defect detection method based on adversarial and residual network
CN118864425A (en) Fault detection method, device, computer equipment and storage medium for converter valve
CN111382787A (en) Target detection method based on deep learning
CN118038271A (en) Hyperspectral target detection method based on attention mechanism
CN112700425B (en) A method for determining the quality of X-ray images of power equipment
CN117094225A (en) A heat conduction solution method and system based on deep learning
CN114998611A (en) Target contour detection method based on structure fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant