Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to give a better understanding of the present application. The technical solutions claimed in the present application can nevertheless be implemented without these technical details, and with various changes and modifications based on the following embodiments.
A first embodiment of the invention relates to a method of detecting standing water. The specific flow is shown in figure 1.
The method comprises the following steps:
101: acquiring a video image frame to be detected and a water accumulation detection model;
102: carrying out water accumulation detection on the video image frame to be detected through the water accumulation detection model to obtain a detection result;
103: and if the detection result is that the accumulated water exists, acquiring the position attribute information of the accumulated water detection area of the target image.
It should be noted that the position attribute information of the ponding detection area includes: the detection frame of the water accumulation area, and the coordinates and categories of its position points; the method further comprises the following steps:
acquiring a labeled sample set annotated with detection frames of water accumulation areas and position point coordinates;
performing data enhancement processing on the labeled sample set to obtain an expanded labeled sample set;
and training the ponding detection model according to the expanded labeled sample set.
It should be further noted that, the method further includes:
acquiring the accumulated water detection model;
and converting the single-precision accumulated water detection model into a half-precision model.
It should be further noted that, the step of performing data enhancement processing on the labeled sample set to obtain an expanded labeled sample set includes:
acquiring a ponding region picture set from the labeled sample set according to the detection frames of the water accumulation areas and the position point coordinates;
performing local data enhancement processing on the ponding region picture set to obtain an expanded ponding region picture set;
and acquiring the expanded labeled sample set according to the expanded ponding region picture set. Specifically, after local data enhancement, the pictures in the expanded ponding region picture set are randomly pasted onto road pictures to simulate real ponding, and label information for the ponding region is generated on each composite road picture. In other words, the expanded ponding region picture set is combined with random road pictures to form the expanded labeled sample set.
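The label-generation part of this pasting step can be sketched as follows. This is a minimal illustration only: the function name `paste_label`, the pixel-index coordinate convention, and the uniform random placement are assumptions, not details given in the embodiment.

```python
import random

def paste_label(polygon, road_size, rng):
    """Translate a ponding polygon to a random spot on a road picture.

    polygon   -- (x, y) pixel points of the cut-out ponding patch,
                 relative to the patch's own top-left corner (assumed)
    road_size -- (width, height) of the road picture in pixels
    rng       -- random.Random instance, so placement is reproducible
    Returns (box, points): the detection frame (x_min, y_min, x_max, y_max)
    and the translated polygon, both in road-picture coordinates.
    """
    patch_w = max(x for x, _ in polygon) + 1
    patch_h = max(y for _, y in polygon) + 1
    road_w, road_h = road_size
    if patch_w > road_w or patch_h > road_h:
        raise ValueError("patch does not fit on the road picture")
    # Random top-left offset that keeps the whole patch inside the picture.
    dx = rng.randint(0, road_w - patch_w)
    dy = rng.randint(0, road_h - patch_h)
    points = [(x + dx, y + dy) for x, y in polygon]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    box = (min(xs), min(ys), max(xs), max(ys))
    return box, points
```

The returned frame and points would then be appended to the label file of the composite road picture; the pixel compositing itself is omitted here.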
It should be further noted that the detection result includes: the positions and the scores of the detection frames of the ponding detection model and the positions and the scores of the masks of the ponding detection model; the method further comprises the following steps:
presetting a ponding detection threshold;
comparing the detection frame score and the mask score output by the ponding detection model with the ponding detection threshold;
and if the detection frame score and the mask score both exceed the ponding detection threshold, determining that the detection frame and the mask are an effective detection frame and an effective mask, and outputting the detection frame coordinates and the mask coordinates, namely the ponding area. In this step, the detection result includes not only the score and position of the detection frame (the frame enclosing the accumulated water) but also the coordinates and score of the accumulated water area (the mask); if both scores are greater than the preset ponding detection threshold, the detection frame and the accumulated water area are output.
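The threshold judgment described above can be sketched as follows. The `Detection` record and `filter_detections` function are hypothetical names introduced for illustration; the embodiment does not prescribe a data structure.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    box: Tuple[int, int, int, int]       # detection frame (x_min, y_min, x_max, y_max)
    box_score: float                     # detection frame score
    mask_points: List[Tuple[int, int]]   # coordinates of the ponding area
    mask_score: float                    # mask score

def filter_detections(detections, threshold):
    """Keep only detections whose frame score AND mask score exceed the threshold."""
    return [d for d in detections
            if d.box_score > threshold and d.mask_score > threshold]
```

A detection with a high frame score but a low mask score (or the reverse) is discarded, since both conditions must hold.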
A second embodiment of the invention relates to a water accumulation detection device, as shown in Fig. 2. The device includes:
an information obtaining unit 201, configured to obtain a video image frame to be detected and a water accumulation detection model;
the detection unit 202 is configured to perform water accumulation detection on the video image frame to be detected through the water accumulation detection model to obtain a detection result;
and the information output unit 203 is used for acquiring the position attribute information of the ponding detection area of the target image if the detection result is that ponding exists.
It should be noted that the position attribute information of the ponding detection area includes: the detection frame of the water accumulation area, and the coordinates and categories of its position points; the device also includes:
the sample information acquisition unit is used for acquiring a labeled sample set annotated with detection frames of ponding areas and position point coordinates;
the data enhancement unit is used for carrying out data enhancement processing on the labeled sample set to obtain an expanded labeled sample set;
and the model training unit is used for training the accumulated water detection model according to the expanded labeled sample set.
It should be further noted that the apparatus further includes:
and the model optimization unit is used for converting the single-precision accumulated water detection model into a half-precision model.
It should be further noted that the data enhancement unit is further configured to: obtain a ponding region picture set from the labeled sample set according to the detection frames of the water accumulation areas and the position point coordinates; perform local data enhancement processing on the ponding region picture set to obtain an expanded ponding region picture set; and acquire the expanded labeled sample set according to the expanded ponding region picture set.
It should be further noted that the detection result includes: the positions and the scores of the detection frames of the ponding detection model and the positions and the scores of the masks of the ponding detection model; the device also includes:
the presetting unit is used for presetting a ponding detection threshold;
the judging unit is used for comparing the detection frame score and the mask score output by the ponding detection model with the ponding detection threshold;
and the output unit is used for determining that the detection frame and the mask are an effective detection frame and an effective mask if the detection frame score and the mask score both exceed the ponding detection threshold, and for outputting the detection frame coordinates and the mask coordinates, namely the ponding area.
It should be understood that this embodiment is a device example corresponding to the first embodiment, and may be implemented in cooperation with the first embodiment. The related technical details mentioned in the first embodiment remain valid in this embodiment and are not repeated here. Accordingly, the related technical details mentioned in this embodiment can also be applied to the first embodiment.
It should be noted that each module referred to in this embodiment is a logical module; in practical applications, one logical unit may be one physical unit, may be a part of one physical unit, or may be implemented by a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, elements that are not closely related to solving the technical problems addressed by the present invention are not introduced in this embodiment, but this does not mean that no other elements are present in this embodiment.
It will be understood by those skilled in the art that all or part of the steps of the method of the above embodiments may be implemented by a program instructing related hardware; the program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
For convenience of description, the above device is described in terms of units/modules divided by function. Of course, when the invention is implemented, the functions of the units/modules may be realized in one or more pieces of software and/or hardware.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Based on the above embodiments, and in line with the principle of ponding detection, a convolutional neural network is used to train the ponding detection model, which outputs the detection frame of the ponding detection area and the coordinates of its position points. A virtual data enhancement method effectively increases the training data of the model and thereby improves the detection rate. TensorRT can be used to convert the single-precision model into a half-precision model, which reduces the size of the ponding detection model and improves detection efficiency. In implementation, a camera collects video of a target area (the source of the video image frames to be detected) and transmits it back to a terminal processor through a network cable. The terminal processor runs the water accumulation detection model on each single video frame (the video image frame to be detected); if the confidences of the detection frame and the mask exceed a set value (the ponding detection threshold), the detection frame is taken as a road ponding detection frame and the mask coordinates give the area where the road ponding is located. The method can be used in any scene where a camera is present, including but not limited to highways, street roads, commercial districts, parks, and scenic spots, and is therefore extensible and portable. Repeated training and TensorRT-based model optimization effectively reduce the false detection rate of ponding detection and improve detection efficiency, giving the method high reliability and stability. The technical scheme of the invention is concretely realized as follows:
First, the ponding detection model is trained. Before the model can be trained, a training sample set must be collected; the training sample set is a labeled sample set annotated with detection frames of ponding areas and position point coordinates. The training method of the ponding detection model comprises: acquiring the labeled sample set annotated with detection frames of ponding areas and position point coordinates; performing data enhancement processing on the labeled sample set to obtain an expanded labeled sample set; and training the ponding detection model according to the expanded labeled sample set.
The data enhancement processing uses a virtual data enhancement method to increase the training data, and a convolutional neural network is used to train the ponding detection model. The trained ponding detection model can then be optimized with TensorRT.
Based on the trained ponding detection model, the ponding detection method disclosed by the invention is specifically realized by the following steps:
Step 1: the terminal processor obtains a video image frame to be detected and the water accumulation detection model;
Step 2: the video collected by the camera is transmitted back to the terminal processor through a network cable (this is the video image frame to be detected obtained in step 1); the terminal processor performs ponding detection on the water in the road area of the single video frame (the video image frame to be detected), and outputs the detection frame of the ponding area and the coordinates and categories of its position points;
Step 3: the detection result, together with the time, the place, and other information, is automatically stored and transmitted to a terminal display screen.
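The three steps can be sketched for a single frame as follows. Everything here is a hypothetical stand-in: `process_frame`, the dictionary keys, and the `model` callable are illustrative names, and the real model, storage, and display are not specified at this level of the description.

```python
from datetime import datetime, timezone

def process_frame(frame, model, location, now=None):
    """Run steps 1 to 3 on one frame; return a record to store, or None.

    model    -- any callable returning a dict with "has_water" and, when
                water is found, "box", "points" and "category" (assumed shape)
    location -- where the camera is installed (free-form text)
    """
    result = model(frame)                 # step 2: ponding detection
    if not result["has_water"]:
        return None                       # nothing to store or display
    now = now or datetime.now(timezone.utc)
    return {                              # step 3: record to store and display
        "box": result["box"],
        "points": result["points"],
        "category": result["category"],
        "time": now.isoformat(),
        "location": location,
    }
```

Passing the model in as a callable keeps the sketch independent of any particular inference framework.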
It should be noted that, in step 2, the newer object detection architecture CenterNet is used as the detection network, and the ponding detection model is trained as follows:
a. collecting 6000 pictures of road surface water in different scenes, randomly selecting 4000 pictures as the training set and using the remaining 2000 pictures as the validation set;
b. marking the ponding regions in each picture, annotating both the detection frame and the coordinate points of each ponding region;
c. to adapt to different scenes, in addition to ordinary data enhancement of the training pictures, expanding the training data by a virtual data enhancement method, specifically:
cutting out each ponding area in the training set with a polygonal frame, obtaining a clean ponding area with a background removal algorithm, applying local data enhancement to the ponding area (stretching, compression, and brightness, contrast, and color adjustment, among others), randomly pasting the adjusted ponding area onto other pictures containing road surfaces as a simulated ponding target, and adding a label for the simulated ponding target to the original label file. The original pictures are randomly cropped to 500x500 centered on the water, and pictures that contain no target (that is, no water) serve as negative samples. Through the above procedure the original training set is expanded to 30000 pictures.
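The 500x500 crop centered on the water can be sketched as a window computation. This is an illustrative assumption about how the crop is positioned: the function name `crop_window` is hypothetical, the window is shifted when the water lies near a picture border, and the random jitter mentioned in the text is left out.

```python
def crop_window(center, image_size, crop=500):
    """Return (left, top, right, bottom) of a crop x crop window.

    The window is centered on `center` where possible and shifted
    as needed so that it stays entirely inside the image.
    """
    cx, cy = center
    width, height = image_size
    if width < crop or height < crop:
        raise ValueError("image is smaller than the crop size")
    left = min(max(cx - crop // 2, 0), width - crop)
    top = min(max(cy - crop // 2, 0), height - crop)
    return left, top, left + crop, top + crop
```

For a 1920x1080 picture with water centered at (960, 540), this gives the window (710, 290, 1210, 790).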
The ponding detection model is an improved CenterNet model. The improved CenterNet model adds a mask prediction module to the original CenterNet model, where the mask is the position point coordinates and the category of the water accumulation area. As shown in Fig. 3, the improved CenterNet model proposed in the technical solution of the present invention includes: a feature extraction network module, a detection frame prediction module, a mask prediction module, and a coupling output module. The feature extraction network module is a typical hourglass network; the detection frame prediction module is the detection head of the original CenterNet; the mask prediction module is the mask prediction head added by the present method; and the coupling output module is added in the technical solution of the present invention so that the detection frame and the position point coordinates of the water accumulation area are output at the same time. The video image frame to be detected passes through the feature extraction network module to produce a feature map, which is then fed to both the detection frame prediction module and the mask prediction module for detection frame prediction and mask prediction, respectively.
In the original CenterNet, after passing through the detection frame prediction module, the feature map yields only the coordinates, category, and confidence of the detection frame of the water accumulation region, not the specific position of the water accumulation region itself. The mask prediction module of the invention converts the feature map into a mask image and outputs the corresponding water accumulation area as position point coordinates. The mask image has the same size as the original image, and each of its values is the confidence that the corresponding location of the original image belongs to a water accumulation area; the higher the confidence, the more likely it is that a water accumulation area exists at that location. The coupling output module performs a logical judgment on the outputs of the detection frame prediction module and the mask prediction module: if the confidence output by the detection frame prediction module and the confidence output by the mask prediction module are both greater than a set threshold, it outputs the position of the detection frame and the corresponding position coordinates in the mask image.
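The coupling output logic can be sketched on plain Python lists. Two points are assumptions rather than statements from the text: the mask head is treated as having passed when at least one mask value exceeds the threshold, and the output coordinates are taken to be exactly those above-threshold locations. The name `couple_outputs` is hypothetical.

```python
def couple_outputs(box, box_score, mask, threshold):
    """Combine the detection head and mask head outputs.

    box       -- detection frame (x_min, y_min, x_max, y_max)
    box_score -- confidence from the detection frame prediction module
    mask      -- 2D list of confidences, same size as the original image
    Returns (box, points) when both heads pass the threshold, else None.
    """
    # Locations whose mask confidence exceeds the threshold (assumed rule).
    points = [(x, y)
              for y, row in enumerate(mask)
              for x, conf in enumerate(row)
              if conf > threshold]
    if box_score > threshold and points:
        return box, points
    return None
```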
The ponding detection model is trained with the PyTorch training framework using SGD, with a base learning rate of 0.001, a weight decay of 0.0005, and a momentum of 0.9, for 100 epochs, reducing the learning rate by a factor of ten every 40 epochs.
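The learning rate schedule implied by these settings can be written out directly. This sketch assumes a plain step decay with zero-based epoch numbering; the description does not name the scheduler that was actually used.

```python
def learning_rate(epoch, base_lr=0.001, step=40, factor=0.1):
    """Step decay: multiply the base rate by `factor` once every `step` epochs."""
    return base_lr * factor ** (epoch // step)

# Learning rate at the start of each stage of the 100-epoch run.
schedule = {epoch: learning_rate(epoch) for epoch in (0, 40, 80)}
```

Under this reading, epochs 0 to 39 use 0.001, epochs 40 to 79 use 0.0001, and epochs 80 to 99 use 0.00001. In PyTorch the same behavior would typically be obtained with `torch.optim.SGD` (`lr`, `momentum`, `weight_decay`) together with `torch.optim.lr_scheduler.StepLR(step_size=40, gamma=0.1)`.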
The ponding detection model is then applied to videos of different scenes; pictures in which a ponding area was missed and pictures in which a ponding area was falsely detected are picked out and added to the training set as positive and negative samples, respectively, for retraining.
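Selecting the pictures for retraining can be sketched as follows, assuming each reviewed picture carries a ground-truth flag and the model's prediction. The tuple layout and the name `collect_hard_examples` are illustrative assumptions.

```python
def collect_hard_examples(results):
    """Split reviewed pictures into missed and falsely detected cases.

    results -- iterable of (picture_id, has_water_truth, has_water_predicted)
    Returns (missed, false_alarms): missed pictures go back into the
    training set as positive samples, false alarms as negative samples.
    """
    results = list(results)
    missed = [pid for pid, truth, pred in results if truth and not pred]
    false_alarms = [pid for pid, truth, pred in results if pred and not truth]
    return missed, false_alarms
```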
The resulting single-precision detection model is converted into a half-precision detection model through TensorRT, which reduces the model size and memory usage and improves ponding detection efficiency.
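The storage effect of moving from single to half precision can be illustrated with the standard library alone. This is not the TensorRT conversion itself; it only shows, on three made-up weight values, that each value shrinks from 4 bytes to 2 and that some values are rounded.

```python
import struct

def to_half(weights):
    """Pack floats as IEEE 754 half precision (2 bytes each, little-endian)."""
    return struct.pack(f"<{len(weights)}e", *weights)

def from_half(data):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(data) // 2}e", data))

weights = [0.5, -1.25, 0.1]                           # illustrative values only
single = struct.pack(f"<{len(weights)}f", *weights)   # 4 bytes per weight
half = to_half(weights)                               # 2 bytes per weight
restored = from_half(half)
```

Values such as 0.5 and -1.25 survive the round trip exactly, while 0.1 comes back slightly rounded, which is the precision traded for the smaller footprint.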
The camera transmits the video back to the terminal processor through the network cable, and the terminal processor analyzes the video with the ponding detection model. A ponding detection threshold is preset, and the detection frame score and the mask score output by the model are compared with this threshold. If both the detection frame score and the mask score exceed the threshold, the detection frame and the mask are judged to be an effective detection frame and an effective mask, and the coordinates of the detection frame and the coordinates of the mask (the specific position coordinates of the water accumulation area) are output.
The detection picture is automatically stored, and the coordinates of the detection frame, the detection time, and the detection location are shown on the display screen as text and pictures, so that workers can conveniently browse and check them.
The technical scheme of the invention can be used in any scene with cameras, including but not limited to highways, street roads, commercial districts, parks, and scenic spots. Only a network cable is needed to return the video; the road water accumulation area is obtained by having the terminal processor analyze the video and apply the threshold judgment, so the method has strong extensibility and portability. Repeated training on wrongly detected samples effectively reduces the false detection rate, so the method has high reliability and stability. In addition, comprehensive information is stored automatically, which makes it convenient for staff to browse and check.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.