
CN114092458A - Engine dense smoke and light smoke automatic detection method based on improved NanoDet deep network - Google Patents


Info

Publication number
CN114092458A
CN114092458A
Authority
CN
China
Prior art keywords
smoke
engine
nanodet
improved
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111428973.1A
Other languages
Chinese (zh)
Other versions
CN114092458B (en)
Inventor
王静静
张聪
张新曼
罗智元
陈冕
程昭晖
赵红超
贾士凡
王书琴
毛乙舒
陆罩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
AECC Sichuan Gas Turbine Research Institute
Original Assignee
Xian Jiaotong University
AECC Sichuan Gas Turbine Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University and AECC Sichuan Gas Turbine Research Institute
Priority: CN202111428973.1A
Publication of CN114092458A
Application granted
Publication of CN114092458B
Legal status: Expired - Fee Related
Anticipated expiration


Classifications

    • G06T 7/0002 — Image analysis; inspection of images, e.g. flaw detection
    • G06N 3/045 — Neural networks; combinations of networks
    • G06N 3/08 — Neural networks; learning methods
    • G06T 2207/10004 — Image acquisition modality: still image; photographic image
    • G06T 2207/10016 — Image acquisition modality: video; image sequence
    • G06T 2207/20021 — Dividing image into blocks, subimages or windows
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/20132 — Image cropping


Abstract



An automatic detection method for engine dense smoke and light smoke based on an improved NanoDet deep network: engine smoke pictures are collected to form a data set; the improved NanoDet deep network uses only the C5 feature layer while preserving the same receptive field, reducing network parameters; a bounding box is predicted for each pixel of the feature layer; positive and negative samples are screened by an adaptive training sample selection algorithm; and the detection head, composed of a classification branch, a box-regression branch and an implicit unsupervised objectness prediction sub-branch, improves detection accuracy. The network output determines whether the engine produces smoke and, if so, the type of smoke. If smoke is detected, its type is further judged from the chromaticity difference between the smoke region and the background region: light smoke triggers an immediate alarm so the problem is caught early, while dense smoke triggers emergency measures in addition to the alarm. If no smoke is detected, detection continues. The method can detect dense smoke and light smoke from automobile and aircraft engines.


Description

Engine dense smoke and light smoke automatic detection method based on improved NanoDet deep network
Technical Field
The invention belongs to the technical field of automatic control, relates to automatic detection of engine smoke, and particularly relates to an automatic detection method of engine dense smoke and light smoke based on an improved NanoDet deep network.
Background
To provide high efficiency, compact size and similar capabilities, engine structures have become increasingly complex, and long-term operation under severe conditions such as high temperature, high pressure and high rotational speed raises the probability of failure. On one hand, smoke is one of the important image features appearing at the initial stage of abnormalities such as fire and explosion, so accurately detecting smoke in engine test videos in various environments in real time is an important means of reducing economic loss and ensuring user safety. On the other hand, with the birth and development of deep learning, video understanding and intelligent analysis have been studied extensively and deeply, which provides an effective means for smoke detection and effectively solves the smoke detection problem in complex environments with different angles, illumination, backgrounds and so on.
At present, smoke detection falls into three main directions: smoke detection sensors, traditional hand-crafted feature extraction, and deep learning. Sensors are cheap and easy to install, but a smoke sensor requires an enclosed space, oxidizes from large-area contact with air, and has limited sensitivity; traditional algorithms are highly unstable because the color, texture and shape features of smoke images vary with factors such as illumination conditions.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide an automatic detection method for dense smoke and light smoke of an engine based on an improved NanoDet deep network. Video is intelligently analyzed with the improved NanoDet deep network, so that smoke generated by the engine can be effectively detected and an alarm given, providing automatic smoke detection support for engine tests. The method realizes smoke detection and event alarming on test monitoring video, frees human resources, has a wide application range, produces accurate and reliable detection results, and is highly robust.
In order to achieve the purpose, the invention adopts the technical scheme that:
an automatic detection method for dense smoke and light smoke of an engine based on an improved NanoDet deep network comprises the following steps:
step 1), collecting engine smoke pictures to form a data set;
step 2), labeling dense smoke and light smoke in the smoke pictures of the data set, dividing them into a training set and a testing set, and inputting them into the improved NanoDet deep network for training: the training set is input into a ShuffleNet V2 backbone network for convolution, generating feature layers of different sizes at downsampling rates of 8:1, 16:1 and 32:1; the C5 feature layer obtained at the 32:1 downsampling rate is then input into an expansion encoder, namely a projection layer and four expansion residual blocks; multiple bounding-box predictions are then made for each pixel of the C5 feature layer processed by the expansion encoder, while positive and negative samples are screened with the adaptive training sample selection algorithm (ATSS); finally, the detection head computes the position regression loss and the classification loss;
and 3), taking the trained improved NanoDet deep network as a smoke detector, and judging the dense smoke and the light smoke of the engine, wherein the method comprises the following steps:
collecting video frames and inputting them into the trained improved NanoDet deep network; after feature enhancement, feature conversion and normalization, judging from the output smoke-class probability and smoke box position whether the engine produces smoke and over what range; computing the chromaticity of the detected smoke image, taking the mean chromaticity of the smoke region and the mean chromaticity of the surrounding background region, and computing their difference: if the difference exceeds a set threshold the smoke is judged to be dense smoke, otherwise light smoke; if the probability of engine smoke exceeds the preset threshold, an alarm is raised and different actions are taken for dense smoke and light smoke; otherwise, monitoring continues.
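The decision logic of step 3) can be sketched in Python as follows. The threshold values and the scalar chromaticity representation are illustrative assumptions for the sketch, not values given by the patent:

```python
def classify_smoke(smoke_chroma, background_chroma, chroma_thresh=0.15,
                   det_prob=0.0, prob_thresh=0.5):
    """Dense/light smoke decision as described in step 3): compare the
    mean chromaticity of the detected smoke region with that of the
    surrounding background. Thresholds here are illustrative."""
    if det_prob < prob_thresh:
        return "no smoke"  # keep monitoring
    mean = lambda xs: sum(xs) / len(xs)
    diff = abs(mean(smoke_chroma) - mean(background_chroma))
    return "dense smoke" if diff > chroma_thresh else "light smoke"

# strongly contrasting smoke region -> dense smoke (alarm + emergency)
print(classify_smoke([0.6, 0.7], [0.2, 0.3], det_prob=0.9))
# faint contrast -> light smoke (alarm only)
print(classify_smoke([0.30, 0.32], [0.25, 0.27], det_prob=0.9))
```

A low detection probability short-circuits the chromaticity comparison, matching the "otherwise, continue monitoring" branch.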
Preferably, in the step 1), the acquired engine smoke image is subjected to data enhancement to amplify the data set.
The data enhancement method can be as follows:
performing Mosaic data enhancement on the original data set, and randomly splicing the four pictures to form a picture;
and performing one or more operations of rotation, brightness adjustment, scaling, random occlusion and random clipping on the original data set.
The Mosaic data enhancement method can be as follows: take a batch of pictures (batch size) from the data set, randomly extract four pictures each time, crop them, and stitch them into an output picture from left to right and top to bottom, the output picture having the same size as the original samples. The cropping method is: randomly generate cropping position parameters cut_x and cut_y, which give the width and height of the crop taken from the top-left of picture 1. Since the output picture size is fixed, this determines the crop width and height for the top-right of picture 2, the bottom-left of picture 3 and the bottom-right of picture 4. After the four pictures are cropped, the cropped sub-pictures 1, 2, 3 and 4 are stitched into a new picture in top-left, top-right, bottom-left, bottom-right order, keeping the bounding boxes of the original pictures. Repeating the Mosaic enhancement process batch-size times yields a new batch of data.
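The cut_x/cut_y tiling described above can be sketched in pure Python. Image contents and bounding-box bookkeeping are omitted; the output size and the range of the random cut points are illustrative assumptions:

```python
import random

def mosaic_regions(out_w, out_h, seed=None):
    """Pick random cut points (cut_x, cut_y) and derive the four
    sub-regions that are stitched top-left, top-right, bottom-left,
    bottom-right. Each region is (x0, y0, x1, y1) in output-image
    coordinates."""
    rng = random.Random(seed)
    # keep the cut away from the borders so every region is non-empty
    cut_x = rng.randint(out_w // 4, 3 * out_w // 4)
    cut_y = rng.randint(out_h // 4, 3 * out_h // 4)
    return {
        "pic1_top_left":     (0,     0,     cut_x, cut_y),
        "pic2_top_right":    (cut_x, 0,     out_w, cut_y),
        "pic3_bottom_left":  (0,     cut_y, cut_x, out_h),
        "pic4_bottom_right": (cut_x, cut_y, out_w, out_h),
    }

regions = mosaic_regions(320, 320, seed=0)
# whatever the cut points, the four regions exactly tile the output
area = sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in regions.values())
print(area)  # 102400, i.e. 320 * 320
```

Because the output size is fixed, choosing cut_x and cut_y for picture 1 fully determines the other three crops, exactly as the text states.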
Preferably, in the step 2), the aspect ratio of the smoke pictures in the data set is kept uniform, and the dense smoke and the light smoke are marked by adopting rectangular boxes.
Preferably, the improved NanoDet deep network consists of three parts: a ShuffleNet V2 backbone network, an expansion encoder and a detection head. The basic unit of the ShuffleNet V2 backbone first splits the input channels: one part passes through a 1 × 1 convolution, a 3 × 3 depthwise convolution and a 1 × 1 convolution, the other part is left untouched, and finally the two parts are concatenated and a channel shuffle is applied. The ShuffleNet V2 backbone performs convolution on the input, downsamples to obtain feature layers of different scales, and delivers the C5 feature layer to the expansion encoder, which consists of a projection layer and four cascaded expansion residual blocks. The projection layer consists of 1 × 1 and 3 × 3 convolution kernels; the four expansion residual blocks use different expansion factors in order to enlarge the receptive field, making up for the overly small receptive field that results from using only the C5 feature layer. Multiple box predictions are made for each pixel of the C5 feature layer processed by the expansion encoder, i.e., multiple predicted values of the distances from the pixel coordinate to the four sides of a bounding box (multiple candidate boxes), and positive and negative samples are screened with the adaptive training sample selection algorithm (ATSS). The detection head consists of two branches, responsible for computing the classification loss and performing box regression respectively; a sub-branch led out of the box-regression branch computes, without supervision, the probability that the region of the regressed box contains a target (smoke), and this probability is multiplied by the classification score to obtain the final classification probability, thereby filtering out the background.
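A minimal sketch of the score fusion performed by the objectness sub-branch described above. The function name and the numeric values are illustrative; the patent does not give an implementation:

```python
def fuse_scores(cls_scores, objectness):
    """Multiply per-class classification scores by the objectness
    probability produced by the implicit unsupervised sub-branch.
    Background regions get low objectness, so all their class
    probabilities are pushed toward zero."""
    return [c * objectness for c in cls_scores]

# a region the sub-branch deems background (objectness 0.05): even a
# confident classification score is suppressed below any sane threshold
print(fuse_scores([0.8, 0.2], 0.05))
```

The product acts as a background filter: only regions where both the classifier and the objectness sub-branch agree retain a high final probability.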
Preferably, in the step 2), the C5 feature layer is input into 1 × 1 convolution, then input into 3 × 3 convolution, and then input into 4 residual expansion blocks.
Preferably, the detection head uses focal loss for the classification loss calculation.
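For reference, focal loss for a single binary prediction has the standard form FL(p_t) = −α_t (1 − p_t)^γ log(p_t). A small sketch follows; the α = 0.25, γ = 2 defaults are the common choices from the focal loss literature, not values specified by the patent:

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one binary prediction with probability p and
    label y (1 = foreground/smoke, 0 = background). The (1 - p_t)**gamma
    factor down-weights easy, well-classified samples so that the many
    background samples do not dominate training."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# an easy, confidently-classified background pixel contributes far
# less loss than a hard foreground (smoke) pixel
easy_bg = focal_loss(0.01, 0)   # background predicted with p = 0.01
hard_fg = focal_loss(0.3, 1)    # smoke predicted with only p = 0.3
print(easy_bg, hard_fg)
```

This is exactly the suppression of background-sample training described in the "beneficial effects" section: easy background samples are damped while hard foreground samples keep a large gradient.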
Preferably, in the step 3), whether the engine generates smoke is judged from the output smoke-class probability: if the probability is greater than a set threshold, the engine is producing smoke, and whether the detected smoke is dense or light is judged by comparing the mean chromaticity of the smoke region with that of the surrounding background; if the probability is below the threshold, detection continues.
Compared with the prior art, the invention has the following beneficial effects:
Mosaic data enhancement is performed during network training on the automobile and aircraft engine smoke data sets collected from monitoring, so that each new training sample contains information from four original training samples, making the trained network more robust. Various other data enhancement methods, such as rotation, brightness adjustment, scaling, random occlusion and random cropping, are also applied, which aids the detection of smoke at different scales and under different backgrounds and illumination.
When detecting engine smoke images, real-time, efficient, high-frame-rate engine smoke detection and alarming can be achieved through improved NanoDet deep network detection and output, raising detection accuracy and practicality. With the improved NanoDet deep network, pixel-by-pixel box prediction mitigates the loss of prediction accuracy caused by smoke diffusion, so the diffusion range of smoke can be predicted more accurately. The original path aggregation network structure is replaced by a single feature-layer input, and the receptive field is enlarged with the projection layer and expansion residual structure, improving the network's detection efficiency while preserving detection accuracy and improving the neural network's engine smoke detection to some extent. The detection head consists of a classification branch and a box-regression branch; the box-regression branch leads out an implicit unsupervised objectness prediction sub-branch that generates an objectness score, so the background can be filtered better. Focal loss is used as the loss function, suppressing training on samples generated by the background without affecting training on foreground samples, which improves both detection accuracy and speed. The method can accurately detect engine smoke, with accurate and reliable results, strong robustness and a wide application range, and has broad application prospects in social services, public safety, intelligent analysis, industrial control safety and other intelligent security fields.
Furthermore, the method detects the input engine smoke gray level image by improving the NanoDet deep network, and the improved NanoDet deep network is optimized on the basis of the original NanoDet network, so that the engine smoke dangerous image can be detected better and faster, and the use safety is improved. In particular, compared with some classical target detection networks, the invention uses few parameters of the improved NanoDet deep network, has light model weight, saves training cost and is convenient to transplant to a mobile terminal. Inputting a video stream to be detected into a trained improved NanoDet deep network, performing corresponding convolution processing, feature extraction and feature enhancement, and finally outputting the detection type and position of a target, wherein if the probability of detecting smoke exceeds a preset threshold value, the smoke emission of an engine is true, the type of the detected smoke is further judged, if light smoke is detected, an alarm is given, and if dense smoke is detected, emergency measures and an alarm are started; otherwise, the detection is continued.
The deep learning algorithm makes a major breakthrough in time series data analysis, and effectively solves the detection problems caused by video frame distortion, blurring, zooming, illumination change, occlusion and the like. The deep learning network can simultaneously carry out a large amount of data learning, and the detection efficiency of the deep learning network exceeds many traditional algorithms. Among them, the convolutional neural network has been widely used in the fields of target detection, human detection, and motion recognition.
The NanoDet deep neural network is a lightweight target detection network, adopts an anchor-free target detection framework to perform frame prediction on each pixel of a characteristic layer, and is suitable for engine smoke detection. The diffuse nature of smoke results in low accuracy of anchored detection networks using anchor frame regression, and pixel-by-pixel prediction can improve the accuracy of smoke detection and the accuracy of regression of the position of the border. Meanwhile, the light weight structure of the NanoDet deep neural network is not only suitable for transplantation and deployment at a mobile end, but also is friendly to training, and the detection speed is high. The characteristics are very in line with the requirement of the smoke detection of the engine on timeliness.
On the basis of the original NanoDet deep neural network, the method removes the path aggregation network (PAN) structure of the original NanoDet and uses only a single feature layer for regression and prediction, with dilated (hole) convolution layers enlarging that layer's receptive field, further improving detection speed without losing accuracy. The detection head of the improved NanoDet consists of a classification branch and a box-regression branch; to reduce background interference with smoke detection, an unsupervised objectness prediction sub-branch screens out the background, improving detection accuracy. Finally, classification and box regression are performed with focal loss, suppressing learning on background samples. Compared with anchor-based target detection networks, the improved NanoDet neural network has low training cost, easy portability, high detection speed and high detection accuracy; compared with the original NanoDet neural network, it is lighter and faster, and achieves good results in engine smoke detection experiments.
Drawings
Fig. 1 is a schematic block diagram of an engine smoke and light smoke detection method based on an improved NanoDet deep network.
Fig. 2 is a schematic diagram of a method for enhancing a Mosaic image.
Fig. 3 is a diagram of the original NanoDet network structure.
Fig. 4 is a diagram illustrating the single-input, single-output architecture.
fig. 5 is a structure diagram of a modified NanoDet deep network.
Fig. 6 is a graph of the detection result of the engine smoke, wherein (a) is a graph of the detection result of the real automobile engine smoke, (b) is a graph of the detection result of the simulated aircraft engine smoke, and (c) and (d) are graphs of the detection result of the network aircraft engine test video smoke.
Fig. 7 is a graph of the results of a laboratory simulated aero-engine smoke reduction experiment, wherein graphs (a) - (d) represent simulated experiments for different locations and types.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
The invention provides an automatic detection method for dense smoke and light smoke of an engine based on an improved NanoDet deep network, which is used for judging whether the engine generates smoke in a complex environment or not, and is mainly characterized by comprising a training stage and a detection stage with reference to figure 1.
As shown in fig. 1, in the network training stage, a data set is formed by collecting engine smoke pictures, dense smoke and light smoke are labeled, the data set is divided into a training set and a testing set, the improved NanoDet deep network is trained, and the trained improved NanoDet deep network can then be used as a smoke detector.
Specifically, the engine can be an automobile engine and an aeroengine.
According to the invention, the acquired engine smoke picture can be subjected to data enhancement so as to amplify the data set. By way of example, the method of data enhancement may be:
Mosaic data enhancement is performed on the original data set, randomly stitching and fusing four pictures into a new picture; one or more of rotation, brightness adjustment, scaling, random occlusion and random cropping is applied to the pictures in the original data set; and these two processed data sets constitute a new data set. In the new data set, pictures are resized to a uniform size (e.g., 320 × 320), and rectangular-box labeling is performed for dense smoke and light smoke.
To address the large parameter count and high training cost of predicting on multiple feature layers, the invention adopts the improved NanoDet deep network, using a single-feature-layer-input, single-layer-output structure in place of the path aggregation network (PAN) structure of conventional NanoDet. To compensate for the overly small receptive field of a single feature layer, the receptive field is enlarged by one projection layer and four expansion residual blocks. The improved NanoDet deep network predicts a bounding box directly at every pixel of the feature layer and screens positive and negative samples with the adaptive training sample selection algorithm (ATSS), improving both the effectiveness and the speed of the neural network. The detection head consists of a classification branch and a box-regression branch, and an implicit unsupervised objectness prediction sub-branch is applied to filter out the background.
The training set is input into a ShuffleNet v2 backbone network to carry out convolution calculation to generate feature layers with different scales, then the C5 feature layer is input into an expansion encoder, namely a projection layer and four expansion residual blocks, then a plurality of frame predictions of a target are carried out on each pixel in the C5 feature layer processed by the expansion encoder, meanwhile, positive and negative samples are screened by an adaptive training sample selection Algorithm (ATSS), and finally, the loss of position regression and the classification loss are calculated by a detection head.
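The ATSS screening step in the pipeline above can be illustrated with a simplified, single-feature-level sketch in pure Python. The value of k and the candidate boxes are illustrative; the full algorithm gathers its top-k candidates per pyramid level:

```python
import math

def iou(a, b):
    """Intersection-over-union of two boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def atss_select(candidates, gt, k=9):
    """Simplified ATSS: take the k candidates whose centers are closest
    to the ground-truth center, set the IoU threshold to mean + std of
    their IoUs, and keep candidates at or above the threshold whose
    center lies inside the ground-truth box."""
    gcx, gcy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    center = lambda b: ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)
    dist = lambda b: math.hypot(center(b)[0] - gcx, center(b)[1] - gcy)
    topk = sorted(candidates, key=dist)[:k]
    ious = [iou(b, gt) for b in topk]
    mu = sum(ious) / len(ious)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in ious) / len(ious))
    inside = lambda b: gt[0] < center(b)[0] < gt[2] and gt[1] < center(b)[1] < gt[3]
    return [b for b, v in zip(topk, ious) if v >= mu + sigma and inside(b)]

gt = (10, 10, 50, 50)
cands = [(8, 8, 48, 48), (30, 30, 70, 70), (100, 100, 140, 140)]
positives = atss_select(cands, gt, k=3)
print(positives)  # only the well-overlapping candidate survives
```

The adaptive mean + std threshold is what lets ATSS pick positives without a hand-tuned global IoU cutoff.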
Video frames are collected and input into the trained improved NanoDet deep network; after feature enhancement, feature conversion and normalization, whether the engine produces smoke, and over what range, is judged from the output smoke-class probability and smoke box position. The chromaticity of the detected smoke image is computed: the mean chromaticity of the smoke region and the mean chromaticity of the surrounding background region are calculated and their difference taken; if the difference exceeds a set threshold the smoke is judged to be dense smoke, otherwise light smoke. If the probability of engine smoke exceeds the preset threshold, an alarm is raised and different actions are taken for dense smoke and light smoke; otherwise, monitoring continues.
In conclusion, the method inputs the video of the engine to be detected into the improved NanoDet deep network, performs corresponding convolution processing, feature extraction and feature enhancement, outputs the detection category probability and the frame coordinate, and judges whether the engine generates smoke, dense smoke or light smoke according to the detected category and position. The method comprises the following specific steps:
1. engine smoke image enhancement
Referring to fig. 2, the specific method for enhancing the Mosaic data is as follows:
Take a batch of pictures (batch size) from the data set and randomly extract four pictures each time; after cropping, stitch them into an output picture from left to right and top to bottom, the output picture having the same size as the original pictures. The cropping method is: randomly generate cropping position parameters cut_x and cut_y, which give the width and height of the crop taken from the top-left of picture 1. Because the output picture size is fixed, this determines the crop width and height for the top-right of picture 2, the bottom-left of picture 3 and the bottom-right of picture 4. After the four pictures are cropped, the cropped sub-pictures 1, 2, 3 and 4 are stitched into a new picture in top-left, top-right, bottom-left, bottom-right order, keeping the bounding boxes of the original pictures. The Mosaic data enhancement process is repeated batch-size times to obtain a new batch of data, denoted data set 1.
To enhance the network's robustness to changes in external conditions such as background, illumination and occlusion, each smoke picture in the original data set undergoes operations such as rotation, brightness adjustment, scaling, random cropping and random occlusion, yielding data set 2.
The new data set is composed of a data set 1 and a data set 2, a uniform length-width ratio is kept for smoke pictures in the new data set, dense smoke and light smoke are marked by adopting rectangular boxes, and the smoke pictures are divided into a training set and a testing set.
2. Improving NanoDet deep network training
Fig. 3 shows the structure of the original NanoDet deep network. NanoDet is an anchor-free one-stage detection network using ShuffleNet V2 as its backbone. The original NanoDet deep network performs box prediction for every pixel of the C5 feature layer, so no large set of candidate anchor boxes needs to be generated at the feature layer. Let \(F_i\) denote the i-th feature layer and \(s\) the cumulative convolution stride. A ground-truth box of the input image is represented as

\(B_i = \big(x_0^{(i)},\, y_0^{(i)},\, x_1^{(i)},\, y_1^{(i)},\, c^{(i)}\big) \in \mathbb{R}^4 \times \{1, 2, \dots, C\},\)

where the first four items are the coordinates of the top-left and bottom-right corners of the box and the last item is the class of the box; C is the total number of classes.
Each pixel location (x, y) of \(F_i\) can be mapped onto the input image as

\(\big(\lfloor s/2 \rfloor + xs,\; \lfloor s/2 \rfloor + ys\big).\)
For each pixel (x, y), the network learns four parameters \((l^*, t^*, r^*, b^*)\), which define the bounding box corresponding to this position (x, y). The four parameters are defined as follows:

\(l^* = x - x_0^{(i)}, \quad t^* = y - y_0^{(i)},\)
\(r^* = x_1^{(i)} - x, \quad b^* = y_1^{(i)} - y,\)

where (x, y) denotes the coordinates after mapping to the original image. The final outputs of the network are a C-dimensional vector \(p_{x,y}\) of class probabilities and a 4-dimensional vector \(t_{x,y} = (l, t, r, b)\) indicating the position of the box predicted at that pixel.
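A small sketch of the pixel-to-image mapping and the (l*, t*, r*, b*) regression targets defined above; the stride and coordinates are illustrative:

```python
def pixel_to_image(x, y, s):
    """Map feature-map pixel (x, y) back to input-image coordinates
    for a cumulative stride s: (floor(s/2) + x*s, floor(s/2) + y*s)."""
    return s // 2 + x * s, s // 2 + y * s

def regression_targets(px, py, box):
    """Distances (l*, t*, r*, b*) from image point (px, py) to the
    four sides of a ground-truth box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return px - x0, py - y0, x1 - px, y1 - py

# stride 32 (the C5 layer): feature pixel (3, 2) maps to image (112, 80)
px, py = pixel_to_image(3, 2, 32)
print(px, py)                                          # 112 80
print(regression_targets(px, py, (64, 48, 160, 144)))  # (48, 32, 48, 64)
```

All four targets are positive exactly when the mapped point lies inside the ground-truth box, which is why only such pixels are used as positive samples in anchor-free regression.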
Fig. 4 is a structure diagram using a single input and a single output. The single-input single-output structure uses only the C5 feature layer, which reduces the parameters of the network and improves the detection speed. Since only one feature layer is used, the receptive field cannot cover targets of all scales; to avoid losing detection accuracy, the receptive field is enlarged using dilated (void) convolution layers. Specifically, a projection layer and four cascaded expansion residual blocks are attached behind the feature layer. The projection layer consists of a 1 × 1 convolution kernel and a 3 × 3 convolution kernel: the 1 × 1 kernel is used to reduce the channel dimension and the 3 × 3 kernel is used to extract semantic information. The overall structure of the expansion residual block is a residual structure whose main path is, in order, a 1 × 1 convolution kernel, a 3 × 3 dilated convolution layer and a 1 × 1 convolution kernel. The dilated convolution layer allows the receptive field to be expanded exponentially without loss of resolution. Let

F : Z^2 → R

be a discrete function, let

Ω_r = [−r, r]^2 ∩ Z^2

and let

k : Ω_r → R

be a discrete filter. The discrete convolution operator ∗ can then be defined as:

(F ∗ k)(p) = Σ_{s+t=p} F(s) k(t)

Generalizing this operator, let l be the expansion factor; the dilated convolution operator ∗_l is defined as:

(F ∗_l k)(p) = Σ_{s+l·t=p} F(s) k(t)
the four expansion residual blocks adopt different expansion factors l, so that the receptive field covers all the targets to be detected in the picture as much as possible.
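The dilated convolution operator defined above can be sketched directly from the formula (F ∗_l k)(p) = Σ_{s+l·t=p} F(s) k(t). The following is a minimal 1-D NumPy illustration (the patent applies the 2-D case inside the expansion residual blocks); zero padding outside F is an assumption made here for simplicity.

```python
import numpy as np

def dilated_conv1d(F, k, l):
    """Discrete dilated convolution (F *_l k)(p) = sum over s + l*t = p of
    F(s) k(t), with the filter k indexed by t in [-r, r],
    r = (len(k) - 1) // 2, and positions outside F treated as zero.
    l = 1 recovers the ordinary discrete convolution operator."""
    n, r = len(F), (len(k) - 1) // 2
    out = np.zeros(n)
    for p in range(n):
        for t in range(-r, r + 1):
            s = p - l * t                  # enforce s + l*t = p
            if 0 <= s < n:
                out[p] += F[s] * k[t + r]
    return out

F = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
k = np.array([1.0, 0.0, 1.0])              # taps at t = -1 and t = +1
result = dilated_conv1d(F, k, 2)           # each output samples F at p-2 and p+2
# result equals [2, 3, 4, 6, 2, 3] for this input
```

With expansion factor l the same 3-tap filter reaches 2l positions apart, which is why stacking blocks with increasing l grows the receptive field exponentially at constant resolution.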
Referring to fig. 5, the improved NanoDet deep network is likewise composed of three parts: a ShuffleNetV2 backbone network, an expansion encoder and a detection head. The basic unit of the ShuffleNetV2 backbone first splits the input channels: one part of the channels passes through a 1 × 1 convolution, a 3 × 3 depthwise convolution and a 1 × 1 convolution, the other part is left unchanged, and finally the two parts of channels are concatenated and a channel random mixing (shuffle) operation is performed. The ShuffleNetV2 backbone is responsible for carrying out convolution calculation on the input and downsampling to obtain feature layers of different scales, and delivers the C5 feature layer to the expansion encoder for processing. The expansion encoder is composed of a projection layer and four cascaded expansion residual blocks; the projection layer is composed of 1 × 1 and 3 × 3 convolution kernels, so the C5 feature layer is first passed through the 1 × 1 convolution, then the 3 × 3 convolution, and then the four expansion residual blocks. The four expansion residual blocks use different expansion factors, which enlarges the receptive field and solves the problem that the receptive field is too small when only the C5 feature layer is used.
Multiple frame predictions are performed for each pixel on the C5 feature layer processed by the expansion encoder, i.e., multiple predicted values of the distances between the pixel coordinate and the four boundaries of the bounding frame are generated, giving multiple candidate frames; positive and negative samples are then screened with the adaptive training sample selection algorithm (ATSS). The detection head is composed of two branches, responsible for calculating the classification loss and performing frame regression, respectively. A sub-branch is led out of the frame regression branch; from the frame position obtained by the regression, this sub-branch computes, in an unsupervised way, the probability that a target, namely smoke, is detected in the region. This probability is multiplied by the classification score (the two scores are independent of each other) to obtain the final classification probability, which plays the role of filtering out the background.
The improved NanoDet deep network employs the adaptive training sample selection algorithm (ATSS) to screen positive and negative samples. The algorithm is nearly hyper-parameter-free and robust, and can improve the speed and stability of smoke detection. The algorithm traverses each real frame on the input image: for one real frame it creates an empty candidate set, selects the k candidate frames in the feature layer whose centers are closest to the center of the real frame, and adds them to the candidate set. Next, the intersection over union (IoU) between each candidate frame in the candidate set and the real frame is calculated, together with the IoU mean m_g and standard deviation v_g of the set. A threshold t_g = m_g + v_g is defined; this definition takes into account the influence of the average level of the candidate set on the threshold, which should be high if the average level is high. Traversing the candidate set, candidate frames whose IoU is greater than or equal to the threshold and whose center lies inside the real frame are taken as positive samples; the others are defined as negative samples. This operation is performed for each real frame on the input image. The algorithm has only one hyper-parameter k, whose optimal value is robust; experiments show the optimum is around k = 9. The training sample selection algorithm therefore has few hyper-parameters, which simplifies the parameter tuning of the deep network.
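The ATSS procedure above can be sketched as follows for a single ground-truth frame on a single feature level. This is an illustrative NumPy sketch, not the patent's implementation; the multi-level case repeats the k-nearest selection per feature layer.

```python
import numpy as np

def iou(a, b):
    """IoU of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def atss_positives(candidates, gt, k=9):
    """Adaptive training sample selection for one real frame: take the k
    candidates whose centers are closest to the gt center, threshold their
    IoU at t_g = m_g + v_g (mean + std), and keep those whose center also
    lies inside the real frame. Returns the indices of positive samples."""
    candidates = np.asarray(candidates, dtype=float)
    gt = np.asarray(gt, dtype=float)
    centers = (candidates[:, :2] + candidates[:, 2:]) / 2.0
    gt_center = (gt[:2] + gt[2:]) / 2.0
    dists = np.linalg.norm(centers - gt_center, axis=1)
    top = np.argsort(dists)[:k]                      # k closest candidates
    ious = np.array([iou(candidates[i], gt) for i in top])
    t_g = ious.mean() + ious.std()                   # adaptive IoU threshold
    inside = ((centers[top] >= gt[:2]) & (centers[top] <= gt[2:])).all(axis=1)
    return top[(ious >= t_g) & inside]

cands = [(0, 0, 10, 10), (2, 2, 8, 8), (20, 20, 30, 30), (1, 1, 11, 11)]
positives = atss_positives(cands, (0, 0, 10, 10), k=4)
```

Because the threshold adapts to the mean and spread of the candidate set, only the candidate that exactly matches the real frame survives in this toy example.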
The improved NanoDet deep network uses the focal loss as its loss function. The focal loss addresses the problem that, owing to their quantity difference, the negative samples generated by the background crowd out the positive samples generated by the foreground in the loss function and account for most of the gradient-descent computation; with the focal loss, more computing resources are used for learning the foreground instead of the background. The cross-entropy loss commonly used by deep learning networks is defined as follows:
CE(p, y) = −log(p), if y = 1; −log(1 − p), otherwise
where y ∈ {±1} represents the real category and p ∈ [0, 1] is the probability value of the class predicted by the network; p_t is defined as:

p_t = p, if y = 1; 1 − p, otherwise
The cross-entropy loss can then be written as CE(p, y) = CE(p_t) = −log(p_t). The drawback of the cross-entropy loss is that the loss of easy-to-learn samples, i.e., the negative samples, tends to account for a large portion of the total loss, resulting in meaningless computation during gradient descent. In order to focus the network on learning the positive samples, the focal loss determines the training difficulty of a sample from the level of its classification probability and suppresses samples that are easy to learn. Its basic definition is as follows (omitting the parameter α_t):
FL(p_t) = −(1 − p_t)^γ log(p_t)
Here y represents the true category, 0 or 1; p is the predicted probability value, p ∈ (0, 1); γ is an adjustable focusing parameter. Negative samples are easy to learn and have a low probability of being misclassified, i.e., p_t is large, so the focusing factor (1 − p_t)^γ tends to 0, the loss value tends to 0, and learning of such samples is suppressed. Positive samples are harder to learn, so p_t is small compared with the negative samples, the focusing factor approaches 1, and the overall loss calculation is not suppressed. Adding the hyper-parameter weight coefficient α_t can improve the detection accuracy of the network; it is defined as follows:
CE(p_t) = −α_t log(p_t)
where the weight is α_t = α ∈ [0, 1] for positive samples and α_t = 1 − α for negative samples. The complete focal loss is then defined as follows:
FL(p_t) = −α_t (1 − p_t)^γ log(p_t)
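The complete focal loss FL(p_t) = −α_t (1 − p_t)^γ log(p_t) can be evaluated directly. In this sketch the defaults α = 0.25 and γ = 2 are the commonly used values from the focal loss literature, not values specified by the patent.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for a single prediction.

    p : predicted probability of the positive class, p in (0, 1)
    y : true label, 1 for positive, 0 for negative
    """
    p_t = p if y == 1 else 1 - p            # p_t as defined above
    a_t = alpha if y == 1 else 1 - alpha    # alpha_t weighting
    return -a_t * (1 - p_t) ** gamma * np.log(p_t)

easy = focal_loss(0.1, 0)   # confidently classified negative: tiny loss
hard = focal_loss(0.1, 1)   # badly missed positive: large loss remains
```

The easy negative is suppressed by the focusing factor (1 − p_t)^γ, so the hard positive dominates the gradient, which is exactly the behavior described above.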
the improved NanoDet deep network uses two branches of classification and frame regression as detection heads, and the structure of the improved NanoDet deep network is shown in FIG. 5. The classification detection head consists of 2 3 multiplied by 3 convolution kernels, a batch normalization layer and a ReLU activation function; the frame regression detection head is normalized in batch by four 3 multiplied by 3 convolution kernelsLayer and ReLU activation function. Meanwhile, an implicit unsupervised target predication sub-branch is added to each predicated frame in the frame regression branch, the sub-branch structure is consistent with the centralized branch structure of the FCOS, target index calculation can be carried out on pixels in the currently detected frame, and the degree of the frame containing targets is obtained through the index calculation. The target degree index may be measured by the following index: the scale saliency is that the saliency of the pixel in the frame relative to the whole picture is calculated, and the saliency relative to the whole picture at the pixel p is available
Figure BDA0003379521560000121
Calculating, wherein the larger the value is, the more likely the frame contains the target; the greater the color contrast, i.e., the contrast of the color inside the border to its surrounding colors, the higher the probability that the target is, the border w and its background Surr (w, θ)CC) The contrast ratio of (c) can be CC (w, theta)CC)=χ2(h(w),h(Surr(w,θCC) )); super-pixel cross-boundary: defining superpixels as blocks of pixels with almost the same color, if the majority contained in a frame is a target, the superpixel cross boundary is rare, if the majority is a background, the superpixel cross boundary is many, and the degree of the superpixel cross boundary frame w can be used
Figure BDA0003379521560000122
It is calculated that if this value is high, the higher the probability of being a target within the box. By calculating the above-described index of the frame of the regression branch, the target of the image in the frame, that is, the possibility of including the target can be obtained. Multiplying the targetability and classification score will achieve the effect of filtering the background.
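The color-contrast index CC(w, θ_CC) = χ²(h(w), h(Surr(w, θ_CC))) above compares the color histogram inside a frame with that of its surrounding ring. A minimal single-channel sketch follows; a full implementation would histogram all color channels and build the surround ring geometrically, and the 0.5 factor in the χ² distance is one common convention among several.

```python
import numpy as np

def chi2(h1, h2):
    """Chi-squared distance between two histograms."""
    h1, h2 = np.asarray(h1, float), np.asarray(h2, float)
    denom = h1 + h2
    mask = denom > 0                         # skip empty bins
    return 0.5 * np.sum((h1[mask] - h2[mask]) ** 2 / denom[mask])

def color_contrast(inner, surround, bins=8):
    """CC between pixel values inside a frame and in its surround,
    computed as the chi-squared distance of their histograms."""
    r = (0, 256)
    h_in, _ = np.histogram(inner, bins=bins, range=r, density=True)
    h_out, _ = np.histogram(surround, bins=bins, range=r, density=True)
    return chi2(h_in, h_out)

smoke = np.full(100, 200.0)   # bright pixels inside the frame
bg = np.full(100, 40.0)       # dark surround: high contrast, likely a target
same = np.full(100, 200.0)    # identical surround: zero contrast
```

A frame tightly enclosing a smoke plume against a distinct background yields a large CC value, while a frame dropped on uniform background yields a value near zero.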
3. Detecting engine smoke, and judging whether the detected smoke is dense smoke or light smoke according to the chromaticity difference
The trained improved NanoDet deep network is used as a smoke detector to distinguish the dense smoke and the light smoke of the engine. First, whether the engine generates smoke is judged from the output probability of the smoke category: if the probability is larger than a set threshold, the engine is generating smoke, and whether the detected smoke is dense or light is then judged by calculating the average chroma of the smoke region and of the surrounding background region; if the probability is less than the set threshold, detection continues.
In the YUV color space, Y represents luminance, and U and V both represent chrominance. When the smoke thickens, the gray level of the image increases and the chroma decreases. The U and V values of the detected smoke region are calculated pixel by pixel:
Y_S = 0.299 R'_S + 0.587 G'_S + 0.114 B'_S

U_S = 0.492 (B'_S − Y_S)

V_S = 0.877 (R'_S − Y_S)
where the subscript S denotes the smoke region. Averaging over all pixels in the smoke region yields the average chromaticity index:
C_S = (1 / N_S) Σ_{p∈S} √(U_S(p)² + V_S(p)²)
and (3) carrying out the same operation on the pixels outside the smoke boundary frame to obtain a background average chromaticity index:
C_B = (1 / N_B) Σ_{p∈B} √(U_B(p)² + V_B(p)²)
and (5) solving the difference between the two indexes.
T=|CB-CS|
Figure BDA0003379521560000132
T is compared with a preset threshold T_0: if T is greater than T_0, the smoke is judged to be dense smoke; otherwise, it is judged to be light smoke.
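The dense/light decision above can be sketched end to end with the Y/U/V conversion formulas given earlier. Two points in this sketch are assumptions: the threshold value T0 = 20.0 is illustrative (the patent leaves it as a preset parameter), and averaging the chroma magnitude √(U² + V²) is one plausible reading of the "average chromaticity index".

```python
import numpy as np

def mean_chroma(rgb):
    """Average chroma magnitude of an N x 3 RGB array (values in [0, 255]),
    using Y = 0.299R + 0.587G + 0.114B, U = 0.492(B - Y), V = 0.877(R - Y)."""
    R, G, B = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    Y = 0.299 * R + 0.587 * G + 0.114 * B
    U = 0.492 * (B - Y)
    V = 0.877 * (R - Y)
    return float(np.mean(np.sqrt(U ** 2 + V ** 2)))

def classify_smoke(smoke_rgb, bg_rgb, T0=20.0):
    """Return 'dense' if T = |C_B - C_S| exceeds the threshold T0, else
    'light'; T0 is an illustrative value, not one fixed by the patent."""
    T = abs(mean_chroma(bg_rgb) - mean_chroma(smoke_rgb))
    return "dense" if T > T0 else "light"

gray = np.tile([128.0, 128.0, 128.0], (50, 1))  # near-achromatic dense smoke
sky = np.tile([80.0, 120.0, 200.0], (50, 1))    # bluish background
```

Dense smoke is nearly achromatic, so its chroma average drops well below that of a colored background and the difference T crosses the threshold; light smoke lets the background color show through, keeping T small.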
The above embodiments are merely exemplary embodiments of the present invention, which is not intended to limit the present invention, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the scope of the present invention.
FIG. 6 shows engine smoke detection results, wherein (a) is a smoke detection result on an actual automobile engine, (b) is a smoke detection result on a simulated aircraft engine, and (c) is a smoke detection result on an aircraft engine test video from the Internet. In FIG. 7, (a), (b), (c) and (d) are result graphs of simulated aero-engine smoke detection, which demonstrate that the invention can detect both the dense smoke and the light smoke of an engine.

Claims (9)

1. An automatic detection method for dense smoke and light smoke of an engine based on an improved NanoDet deep network is characterized by comprising the following steps:
step 1), collecting engine smoke pictures to form a data set;
step 2), labeling dense smoke and light smoke in the smoke pictures of the data set, dividing the smoke pictures into a training set and a testing set, and inputting them into the improved NanoDet deep network for training, wherein the training set is input into a ShuffleNetV2 backbone network for convolution calculation to generate feature layers of different scales; the C5 feature layer is input into an expansion encoder, namely a projection layer and four expansion residual blocks; multi-frame target prediction is then performed for each pixel of the C5 feature layer processed by the expansion encoder; positive and negative samples are screened by an adaptive training sample selection algorithm (ATSS); and finally the position regression loss and the classification loss are calculated by a detection head;
and 3), taking the trained improved NanoDet deep network as a smoke detector, and judging the dense smoke and the light smoke of the engine, wherein the method comprises the following steps:
collecting video frames and inputting them into the trained improved NanoDet deep network; after feature enhancement, feature conversion and standardization processing, judging, from the output probability of the smoke category and the position of the smoke frame, whether the engine generates smoke and the range of the generated smoke; calculating the chromaticity of the detected smoke image, calculating the average chromaticity of the smoke region and of the surrounding background region, and taking the difference between the two values: if the difference exceeds a set threshold, the generated smoke is judged to be dense smoke, otherwise light smoke; if the probability of detecting engine smoke exceeds the preset threshold, alarming and taking different actions according to dense smoke and light smoke; otherwise, continuing monitoring.
2. The method for automatically detecting the engine dense smoke and the light smoke based on the improved NanoDet deep network as claimed in claim 1, wherein in the step 1), the collected engine smoke images are subjected to data enhancement to amplify the data set.
3. The method for automatically detecting the engine dense smoke and the light smoke based on the improved NanoDet deep network according to claim 2, wherein the data enhancement method comprises the following steps:
performing Mosaic data enhancement on the original data set, and randomly splicing the four pictures to form a picture;
and performing one or more operations of rotation, brightness adjustment, scaling, random occlusion and random clipping on the original data set.
4. The method for automatically detecting the engine dense smoke and the light smoke based on the improved NanoDet deep network according to claim 3, wherein the Mosaic data enhancement method is as follows: a batch (batchsize) of pictures is taken from the data set, four pictures are randomly extracted each time and, after cutting, are spliced from left to right and from top to bottom into an output picture whose size is consistent with the size of an original sample; the cutting method comprises: randomly generating cutting position parameters cut_x and cut_y, where cut_x and cut_y represent the width and height cut from the upper left corner of picture 1; since the size of the output picture is determined, the width and height cut from the upper right corner of picture 2, the lower left corner of picture 3 and the lower right corner of picture 4 are thereby determined; after the four pictures are cut, the cut sub-pictures 1, 2, 3 and 4 are spliced into a new picture in the order upper left, upper right, lower left, lower right, and the bounding frames of the original pictures are retained; the Mosaic data enhancement process is repeated batchsize times to obtain a new batch of data.
5. The method for automatically detecting the engine dense smoke and the light smoke based on the improved NanoDet deep network as claimed in claim 1, wherein in the step 2), the smoke pictures in the data set are kept with a uniform length-width ratio, and the dense smoke and the light smoke are marked by adopting rectangular boxes.
6. The method for automatically detecting the engine dense smoke and the light smoke based on the improved NanoDet deep network according to claim 1, wherein the improved NanoDet deep network consists of three parts: a ShuffleNetV2 backbone network, an expansion encoder and a detection head; the basic unit of the ShuffleNetV2 backbone network first splits the input channels, one part of the channels passes through a 1 × 1 convolution, a 3 × 3 depthwise convolution and a 1 × 1 convolution, the other part is not operated on, and finally the two parts of channels are concatenated and a channel random mixing operation is performed; the ShuffleNetV2 backbone network is responsible for carrying out convolution calculation on the input, downsampling to obtain feature layers of different scales, and delivering the C5 feature layer to the expansion encoder for processing, wherein the expansion encoder is composed of a projection layer and four cascaded expansion residual blocks, the projection layer is composed of 1 × 1 and 3 × 3 convolution kernels, and the four expansion residual blocks have different expansion factors in order to enlarge the receptive field and solve the problem that the receptive field is too small when only the C5 feature layer is used; multi-frame prediction is performed for each pixel on the C5 feature layer processed by the expansion encoder, i.e., multiple predicted values of the distances between the pixel coordinate and the four boundaries of the bounding frame, namely multiple candidate frames, are generated, and positive and negative samples are screened with the adaptive training sample selection algorithm (ATSS); the detection head is composed of two branches responsible for calculating the classification loss and performing frame regression respectively, a sub-branch is led out of the frame regression branch, the sub-branch computes, in an unsupervised way from the frame position obtained by frame regression, the probability that a target, namely smoke, is detected in the region, and this probability is multiplied by the classification score to obtain the final classification probability so as to achieve the function of filtering the background.
7. The method for automatically detecting the engine dense smoke and the light smoke based on the improved NanoDet deep network as claimed in claim 6, wherein in the step 2), the C5 feature layer is first input into the 1 × 1 convolution, then into the 3 × 3 convolution, and then into the 4 expansion residual blocks.
8. The method for automatically detecting the engine dense smoke and the light smoke based on the improved NanoDet deep network as claimed in claim 6, wherein the detection head calculates the classification loss by using the focal loss.
9. The method for automatically detecting the dense smoke and the light smoke of the engine based on the improved NanoDet deep network as claimed in claim 1, wherein in the step 3), whether the engine generates smoke is first judged from the output probability of the smoke category: if the probability is greater than a set threshold, the engine is generating smoke, and whether the detected smoke is dense smoke or light smoke is judged by calculating the average chromaticity of the smoke region and the surrounding background region; if the probability is less than the set threshold, detection continues.
CN202111428973.1A 2021-11-29 2021-11-29 An automatic detection method for heavy and light engine smoke based on improved NanoDet deep network Expired - Fee Related CN114092458B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111428973.1A CN114092458B (en) 2021-11-29 2021-11-29 An automatic detection method for heavy and light engine smoke based on improved NanoDet deep network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111428973.1A CN114092458B (en) 2021-11-29 2021-11-29 An automatic detection method for heavy and light engine smoke based on improved NanoDet deep network

Publications (2)

Publication Number Publication Date
CN114092458A true CN114092458A (en) 2022-02-25
CN114092458B CN114092458B (en) 2024-02-27

Family

ID=80305264

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111428973.1A Expired - Fee Related CN114092458B (en) 2021-11-29 2021-11-29 An automatic detection method for heavy and light engine smoke based on improved NanoDet deep network

Country Status (1)

Country Link
CN (1) CN114092458B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114723017A (en) * 2022-04-27 2022-07-08 广东星芯智能科技有限公司 A Quantization Method Based on Human Foot Detection Network Model
CN116052053A (en) * 2023-01-17 2023-05-02 中南民族大学 A method and device for improving the accuracy of surveillance images in a smart museum
CN118010579A (en) * 2024-04-03 2024-05-10 山东科技大学 A marine plastic microparticle primary screening system for ships and image detection method thereof
CN119702097A (en) * 2024-12-05 2025-03-28 中国科学技术大学 End-side microcarrier sorting system based on knowledge distillation improvement NanoDet

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101869442B1 (en) * 2017-11-22 2018-06-20 공주대학교 산학협력단 Fire detecting apparatus and the method thereof
US20200309923A1 (en) * 2019-03-27 2020-10-01 Panosense Inc. Identifying and/or removing false positive detections from lidar sensor output
CN112767645A (en) * 2021-02-02 2021-05-07 南京恩博科技有限公司 Smoke identification method and device and electronic equipment
CN112861635A (en) * 2021-01-11 2021-05-28 西北工业大学 Fire and smoke real-time detection method based on deep learning
WO2021212443A1 (en) * 2020-04-20 2021-10-28 南京邮电大学 Smoke video detection method and system based on lightweight 3d-rdnet model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
林作永;谌瑶;: "基于深度卷积神经网络的火灾预警算法研究", 信息通信, no. 05, 15 May 2018 (2018-05-15) *
袁梅;全太锋;黄俊;黄洋;胡嘉豪;: "基于卷积神经网络的烟雾检测", 重庆邮电大学学报(自然科学版), no. 04, 15 August 2020 (2020-08-15) *


Also Published As

Publication number Publication date
CN114092458B (en) 2024-02-27

Similar Documents

Publication Publication Date Title
CN114092458B (en) An automatic detection method for heavy and light engine smoke based on improved NanoDet deep network
CN113537099B (en) Dynamic detection method for fire smoke in highway tunnel
CN108830285B (en) Target detection method for reinforcement learning based on fast-RCNN
CN103971386B (en) A kind of foreground detection method under dynamic background scene
US20200410212A1 (en) Fast side-face interference resistant face detection method
CN114241340B (en) Image target detection method and system based on double-path depth residual error network
CN108198207A (en) Multiple mobile object tracking based on improved Vibe models and BP neural network
US9904868B2 (en) Visual attention detector and visual attention detection method
JP2006350645A (en) Object detection device and learning device thereof
KR102107334B1 (en) Method, device and system for determining whether pixel positions in an image frame belong to a background or a foreground
CN110807384A (en) Small target detection method and system under low visibility
CN113256624A (en) Continuous casting round billet defect detection method and device, electronic equipment and readable storage medium
CN116229376B (en) Crowd early warning method, counting system, computing device and storage medium
CN113065379A (en) Image detection method, device and electronic device for fused image quality
CN107944403A (en) Pedestrian's attribute detection method and device in a kind of image
CN113724286A (en) Method and device for detecting saliency target and computer-readable storage medium
CN117789077A (en) Method for predicting people and vehicles for video structuring in general scene
CN111368625B (en) Pedestrian target detection method based on cascade optimization
CN117746111B (en) A method for intelligently judging the black smoke level of diesel vehicles using monitoring images of a ring inspection line
CN106228577A (en) A kind of dynamic background modeling method and device, foreground detection method and device
US11908298B2 (en) Smoke detection system and smoke detection method
CN116110030A (en) Target detection method, target detection device, electronic equipment and storage medium
CN114565764A (en) Port panorama sensing system based on ship instance segmentation
CN113642527A (en) Abnormal human behavior detection in video based on YOLOv3 and C3D neural network
CN113284135A (en) SAR ship detection method based on global and local context information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20240227