YOLOv8-based intelligent safety monitoring system
Technical Field
The invention relates to the technical field of safety monitoring, and in particular to an intelligent safety monitoring system based on YOLOv8.
Background
With the continuous development of the information society, security at certain sites has become an increasingly prominent problem. Traditional security measures such as manual patrols and closed-circuit monitoring respond slowly to emergencies, depend on manpower, and cannot analyze and judge events in real time. In particular, when unauthorized outsiders conceal themselves in a complex environment, traditional techniques have low recognition accuracy and a high risk of false alarms and missed detections. Target detection technology can classify the objects in a video by category, but because of the wide variety of camera equipment currently on the market, relying on video-based target detection alone to monitor camera equipment in use at a site leads to an extremely high missed-detection rate, so the security of site information cannot be guaranteed. Meanwhile, current target detection technology can only detect persons as a category and cannot identify who they are, so it cannot determine whether a person at the site holds the relevant permissions. Traditional permission management only suits fixed access-controlled areas; the periphery of a large site is supervised mainly by security staff, which consumes a large amount of manpower and leaves significant gaps in security monitoring.
Disclosure of Invention
The invention aims to provide an intelligent safety monitoring system based on YOLOv8 that solves the problems described in the background by combining deep learning, target detection, sensors and radio frequency technology.
The technical scheme is as follows: an intelligent safety monitoring system based on YOLOv8, comprising:
the RFID work card identification module, which is used for identifying an RFID work card worn by a person entering the monitoring area and judging the identity and access authority of the person according to the work card information;
the camera equipment detection module, which is used for detecting whether camera equipment in use exists in the monitoring area and marking the abnormal area;
the feature precision module, which is integrated in the YOLOv8 target detection model and is used for enhancing the detection precision of the model on camera equipment targets;
the central processing unit, which is used for receiving the detection results of the RFID work card identification module and the camera equipment detection module, and triggering the corresponding response mechanisms according to the detection results, including activity track recording, alarming and response processing.
Further, the RFID work card identification module comprises an RFID tag, a reader, a camera and a buzzer, wherein the reader reads the unique electronic product code (EPC) stored in the tag and matches it against the identity information in an identity recognition database to determine the identity and access authority of the person. The camera equipment detection module comprises a FLIR Lepton thermal imaging infrared sensor and a HackRF One software-defined radio module, which are used for detecting a high-temperature target area and confirming whether that area emits a wireless signal. The feature precision module processes the feature map through a depthwise separable spatial attention block (DS-SAM) and a dual-pooling squeeze-and-excitation attention block (DP-SE).
Further, the RFID tag is a passive tag operating in the ultra-high-frequency band of 860-960 MHz, with the identification distance set to 10 meters. When the reader detects an unauthorized person, it triggers the corresponding buzzer to give a warning and records the movement track of that person; when the reader detects an authorized person, it starts the corresponding camera to perform device target detection.
Further, the camera equipment detection module screens out high-temperature target areas by means of a set temperature threshold and uses radio frequency signal detection to confirm whether each target area emits a wireless signal; if a wireless signal is detected, the area is marked as an abnormal area and its position information is recorded.
Furthermore, the feature precision module weights the feature map in the spatial dimension through the depthwise separable spatial attention block DS-SAM and enhances interaction among channels through channel shuffling; through the dual-pooling squeeze-and-excitation attention block DP-SE, it fuses the global spatial information of the feature map to generate channel weights and multiplies these weights element by element with the original feature map to obtain the final feature map.
Further, in the feature precision module, the input feature map of the i-th batch is denoted $X_i \in \mathbb{R}^{C \times H \times W}$, where C is the number of input channels, H is the height and W is the width. Passing $X_i$ through DS-SAM and DP-SE yields the depth feature map $F_i^{ds}$ and the pooling feature map $F_i^{dp}$ respectively; $F_i^{ds}$ and $F_i^{dp}$ are then added element by element to obtain the final discriminative feature $F_i$, according to the formula:
$$F_i = F_i^{ds} + F_i^{dp}.$$
Further, the DS-SAM comprises a spatial attention transform and a channel shuffle. The spatial attention transform passes the input feature map $X_i$ of the i-th batch through two depthwise separable convolution layers in sequence to weight the importance of the feature map in the spatial dimension: the first depthwise separable convolution layer reduces the number of channels, and the second depthwise separable convolution layer raises it back to the original number of channels $C$. After batch normalization and the activation functions, the spatial importance representation $A_i$ of the feature map is obtained, according to the formula:
$$A_i = \sigma\Big(\mathrm{BN}\Big(f_2\big(\delta\big(\mathrm{BN}(f_1(X_i))\big)\big)\Big)\Big);$$
where $f_1(\cdot)$ and $f_2(\cdot)$ denote the feature maps obtained after the first and second depthwise separable convolutions respectively, $\mathrm{BN}$ denotes batch normalization, $\delta$ denotes the operation that sets all negative values in the output to zero, and the sigmoid function $\sigma$ limits the output to the range [0, 1];
Meanwhile, the DS-SAM uses depthwise separable convolutions in place of standard convolutions. The feature map $A_i$ produced by the spatial attention transform enters the channel shuffle, which divides the channels into 4 groups and rearranges the channels of the 4 groups to obtain the feature map $\tilde{A}_i$. Finally, the original feature map $X_i$ and the channel-shuffled feature map $\tilde{A}_i$ are multiplied element by element to obtain the final feature map $F_i^{ds} = X_i \otimes \tilde{A}_i$.
Further, the DP-SE proceeds as follows: the global spatial information of the i-th batch feature map $X_i$ is compressed by adaptive average pooling and adaptive max pooling respectively to obtain $z_i^{avg}$ and $z_i^{max}$; the two pooling results are fused and passed through the fully connected layer to generate the channel weights $w_i$; finally, the original feature map and the channel weights are multiplied element by element to obtain the final feature map $F_i^{dp}$, according to the formula:
$$F_i^{dp} = X_i \otimes w_i, \qquad w_i = \mathrm{FC}\big(z_i^{avg} \oplus z_i^{max}\big);$$
where $\mathrm{FC}$ denotes the fully connected layer operation and $\oplus$ denotes the fusion of the two pooling results.
Further, the central processing unit triggers response mechanisms of different levels according to the detection results, namely first-level, second-level and third-level responses: the first-level response handles persons who are not wearing a work card and around whom camera equipment is detected, the second-level response handles unauthorized persons, and the third-level response handles the case in which camera equipment is detected around an authorized person.
Compared with the prior art, the invention has the following advantages: camera equipment is preliminarily detected by a module integrating the FLIR Lepton thermal imaging infrared sensor and the HackRF One software-defined radio module; the identity of persons in the area is confirmed by the RFID work card identification module; device target detection is performed by the YOLOv8 target detection model augmented with the feature precision module; and the corresponding alarms and responses are issued. The system is designed to identify camera equipment in use and personnel identity in real-time scenes, and further to monitor abnormal conditions cooperatively across multiple scenes.
Drawings
FIG. 1 is a schematic flow diagram of the system of the present invention;
FIG. 2 is a flowchart of the camera equipment detection module of the present invention;
FIG. 3 is a schematic diagram of the feature precision module of the present invention;
FIG. 4 is a schematic diagram of the RFID work card identification module of the present invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings.
As shown in FIG. 1, an embodiment of the present invention provides an intelligent safety monitoring system based on YOLOv8, including:
As shown in FIG. 4, the RFID work card identification module is used for identifying an RFID work card worn by a person entering the monitoring area and judging the identity and access authority of the person according to the work card information. It comprises an RFID tag, a reader, a camera and a buzzer. The RFID tag is embedded in the work card and contains a unique electronic product code (EPC) corresponding to the person to whom the work card belongs; it is a passive tag operating in the ultra-high-frequency band of 860-960 MHz, with the identification distance set to 10 meters. The reader comprises an antenna and a decoding module, and is used for reading the tag data of the work card and transmitting the decoded ID information to the system.
When a person enters the monitoring range of a reader, if no work card information is detected, the process proceeds directly to the subsequent stage of device target detection; if work card information is detected, the RFID work card recognition function is started. When the work card comes close to the reader antenna, the RFID tag receives the radio frequency signal sent by the antenna, the passive tag draws its energy from that signal, and the tag wirelessly transmits the stored unique EPC to the decoding module for decoding. The decoded information is transmitted to the system and matched against the identity information in the identity recognition database, and the identity and access authority of the person are determined from the matching result, completing identity verification. If the work card wearer is identified as an authorized person, the camera corresponding to the reader is started, device target detection is carried out on the area, and the security condition of the area is further recorded.
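The matching step above can be sketched as a lookup from decoded EPC to an access decision. This is only an illustrative sketch: the EPC strings, the `IDENTITY_DB` table, the `Person` record and the returned action names are hypothetical, and treating an EPC that is missing from the database as unauthorized is an assumption, as is sending a person with no readable work card straight to device target detection.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Person:
    name: str
    authorized: bool


# Hypothetical identity recognition database keyed by decoded EPC (hex string).
IDENTITY_DB: Dict[str, Person] = {
    "3034257BF400B7800004CB2F": Person("staff_a", True),
    "3034257BF400B7800004CB30": Person("visitor_b", False),
}


def check_work_card(epc: Optional[str]) -> str:
    """Return the next action for a person inside a reader's range."""
    if epc is None:
        # Assumed: no work card read, so go straight to device target detection.
        return "detect_devices"
    person = IDENTITY_DB.get(epc.upper())
    if person is None or not person.authorized:
        # Assumed: unknown EPCs are handled like unauthorized persons.
        return "buzzer_and_track"
    # Authorized wearer: start the camera paired with this reader.
    return "start_camera"
```

In this sketch the three return values correspond to the three paths in the text: skipping to device detection, warning by buzzer with track recording, and starting the camera for an authorized wearer.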
Meanwhile, each reader is assigned a number and position information at deployment. When an abnormal condition is detected, the positions of the readers and the read times can be used to plot the movement track of the person, which makes later review and tracing easier.
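A minimal sketch of how reader numbers, positions and read times could be turned into a movement track follows. The `ReadEvent` record, the `READER_POSITIONS` table and its coordinates are hypothetical; the sketch simply sorts one person's reads by time and keeps one point each time the reader changes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ReadEvent:
    epc: str
    reader_id: int
    timestamp: float  # seconds since some reference time


# Hypothetical deployment table: reader number -> (x, y) position in meters.
READER_POSITIONS: Dict[int, Tuple[float, float]] = {
    1: (0.0, 0.0),
    2: (10.0, 0.0),
    3: (10.0, 8.0),
}


def build_track(events: List[ReadEvent], epc: str) -> List[Tuple[float, Tuple[float, float]]]:
    """Return (time, position) points for one work card, ordered by time."""
    own = sorted((e for e in events if e.epc == epc), key=lambda e: e.timestamp)
    track: List[Tuple[float, Tuple[float, float]]] = []
    last_reader: Optional[int] = None
    for event in own:
        if event.reader_id != last_reader:  # keep only changes of reader
            track.append((event.timestamp, READER_POSITIONS[event.reader_id]))
            last_reader = event.reader_id
    return track
```

Repeated reads by the same reader are collapsed into the first one, so each point in the track marks the time the person arrived in a new reader's range.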
The camera equipment detection module is used for detecting whether camera equipment in use exists in the monitoring area and marking the abnormal area. It comprises a FLIR Lepton thermal imaging infrared sensor and a HackRF One software-defined radio module, which are used for detecting a high-temperature target area and confirming whether that area emits a wireless signal. The module screens out high-temperature target areas by means of a set temperature threshold and uses radio frequency signal detection to confirm whether each target area emits a wireless signal; if a wireless signal is detected, the area is marked as an abnormal area and its position information is recorded.
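The two-stage decision described above (temperature screening, then radio frequency confirmation) can be sketched as follows. The frame is a plain grid of temperatures in degrees Celsius, and the per-cell RF power table and both threshold values are hypothetical inputs chosen for illustration; how RF measurements are associated with thermal cells is not specified in the text.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) in the thermal frame


def hot_cells(frame: List[List[float]], threshold_c: float) -> List[Cell]:
    """Screen out cells whose temperature reaches the set threshold."""
    return [
        (r, c)
        for r, row in enumerate(frame)
        for c, temp in enumerate(row)
        if temp >= threshold_c
    ]


def abnormal_cells(
    frame: List[List[float]],
    threshold_c: float,
    rf_power_dbm: Dict[Cell, float],
    rf_threshold_dbm: float,
) -> List[Cell]:
    """Keep only hot cells for which a wireless signal is also detected."""
    return [
        cell
        for cell in hot_cells(frame, threshold_c)
        if rf_power_dbm.get(cell, float("-inf")) >= rf_threshold_dbm
    ]
```

A hot cell with no RF measurement is treated as having no signal, so it is screened as a high-temperature target but not marked abnormal.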
Specifically, as shown in FIG. 2, the FLIR Lepton thermal imaging infrared sensor scans the real-time scene area; thermal image data are obtained over the SPI interface and processed with OpenCV, and high-temperature target areas are screened out according to the set temperature threshold and marked. The HackRF One software-defined radio module, driven through a Python interface, performs spectrum analysis on the radio frequency signals of the areas marked by the infrared sensor, using the fast Fourier transform (FFT) to detect strong signals in the 2.4 GHz band and thereby confirm whether the target area emits a wireless signal. If a wireless signal is detected, the area is marked as an abnormal area and its position information is recorded.
The feature precision module is integrated in the YOLOv8 target detection model and is used for enhancing the detection precision of the model on camera equipment targets; it processes the feature map through the depthwise separable spatial attention block DS-SAM and the dual-pooling squeeze-and-excitation attention block DP-SE.
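The strong-signal check by spectrum analysis described above for the 2.4 GHz band can be illustrated with a small sketch. To stay within the standard library it computes a direct discrete Fourier transform, which gives the same spectrum an FFT would; the complex baseband samples, the normalization and the decibel threshold are all assumptions, and capturing samples from the HackRF One is not shown.

```python
import cmath
import math
from typing import List


def power_spectrum_db(samples: List[complex]) -> List[float]:
    """Power per frequency bin in dB, via a direct DFT (same result as an FFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = (abs(acc) / n) ** 2
        spectrum.append(10.0 * math.log10(power + 1e-12))  # avoid log(0)
    return spectrum


def has_strong_signal(samples: List[complex], threshold_db: float) -> bool:
    """True if any bin of the captured band reaches the threshold."""
    return max(power_spectrum_db(samples)) >= threshold_db
```

With this normalization a unit-amplitude tone that falls exactly on a bin has a power of about 0 dB, so the threshold is expressed relative to full scale.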
The feature precision module weights the feature map in the spatial dimension through the depthwise separable spatial attention block DS-SAM and enhances interaction among channels through channel shuffling; through the dual-pooling squeeze-and-excitation attention block DP-SE, it fuses the global spatial information of the feature map to generate channel weights and multiplies these weights element by element with the original feature map to obtain the final feature map.
The specific process of the feature precision module is as follows. The input feature map of the i-th batch is denoted $X_i \in \mathbb{R}^{C \times H \times W}$, where C is the number of input channels, H is the height and W is the width. Passing $X_i$ through DS-SAM and DP-SE yields the depth feature map $F_i^{ds}$ and the pooling feature map $F_i^{dp}$ respectively; $F_i^{ds}$ and $F_i^{dp}$ are then added element by element to obtain the final discriminative feature $F_i$, according to the formula:
$$F_i = F_i^{ds} + F_i^{dp}.$$
The DS-SAM comprises a spatial attention transform and a channel shuffle. The spatial attention transform passes the input feature map $X_i$ of the i-th batch through two depthwise separable convolution layers in sequence to weight the importance of the feature map in the spatial dimension: the first depthwise separable convolution layer reduces the number of channels, and the second depthwise separable convolution layer raises it back to the original number of channels $C$. After batch normalization and the activation functions, the spatial importance representation $A_i$ of the feature map is obtained, according to the formula:
$$A_i = \sigma\Big(\mathrm{BN}\Big(f_2\big(\delta\big(\mathrm{BN}(f_1(X_i))\big)\big)\Big)\Big);$$
where $f_1(\cdot)$ and $f_2(\cdot)$ denote the feature maps obtained after the first and second depthwise separable convolutions respectively, $\mathrm{BN}$ denotes batch normalization, $\delta$ denotes the operation that sets all negative values in the output to zero, and the sigmoid function $\sigma$ limits the output to the range [0, 1];
Meanwhile, the DS-SAM uses depthwise separable convolutions in place of standard convolutions. The feature map $A_i$ produced by the spatial attention transform enters the channel shuffle, which divides the channels into 4 groups and rearranges the channels of the 4 groups to obtain the feature map $\tilde{A}_i$. Finally, the original feature map $X_i$ and the channel-shuffled feature map $\tilde{A}_i$ are multiplied element by element to obtain the final feature map $F_i^{ds} = X_i \otimes \tilde{A}_i$.
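The channel shuffle step can be illustrated on a plain list of channels. The text only states that the channels are divided into 4 groups and rearranged; the interleaving used below (view the channels as a groups-by-channels-per-group grid, transpose, flatten) is the common ShuffleNet-style shuffle and is an assumption about the intended rearrangement.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def channel_shuffle(channels: Sequence[T], groups: int = 4) -> List[T]:
    """Interleave channels across groups (assumed ShuffleNet-style order)."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Output position j * groups + g takes input channel g * per_group + j.
    return [
        channels[g * per_group + j]
        for j in range(per_group)
        for g in range(groups)
    ]
```

With 8 channels and 4 groups, the groups are (0, 1), (2, 3), (4, 5), (6, 7), and the shuffle takes the first channel of every group before the second, so neighbouring output channels come from different groups.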
The DP-SE proceeds as follows: the global spatial information of the i-th batch feature map $X_i$ is compressed by adaptive average pooling and adaptive max pooling respectively to obtain $z_i^{avg}$ and $z_i^{max}$; the two pooling results are fused and passed through the fully connected layer to generate the channel weights $w_i$; finally, the original feature map and the channel weights are multiplied element by element to obtain the final feature map $F_i^{dp}$, according to the formula:
$$F_i^{dp} = X_i \otimes w_i, \qquad w_i = \mathrm{FC}\big(z_i^{avg} \oplus z_i^{max}\big);$$
where $\mathrm{FC}$ denotes the fully connected layer operation and $\oplus$ denotes the fusion of the two pooling results.
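A dependency-free sketch of the DP-SE data flow on a single C x H x W feature map is given below. Several choices are assumptions not fixed by the text: the two pooling results are fused by addition, a single fully connected layer without bias is used, a sigmoid maps its output to channel weights, and the weight matrix is supplied by the caller rather than learned.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][column]


def dp_se(x: FeatureMap, fc_weight: List[List[float]]) -> FeatureMap:
    """Dual-pooling channel attention on one feature map (illustrative only)."""
    # Global average pooling and global max pooling: one value per channel.
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(map(max, ch)) for ch in x]
    fused = [a + m for a, m in zip(avg, mx)]  # assumed fusion: addition
    # Fully connected layer: fc_weight has one row per output channel.
    logits = [sum(w * f for w, f in zip(row, fused)) for row in fc_weight]
    weights = [1.0 / (1.0 + math.exp(-z)) for z in logits]  # assumed sigmoid
    # Scale every value of each channel by that channel's weight.
    return [[[v * w for v in row] for row in ch] for ch, w in zip(x, weights)]
```

In a real model the same flow would run on batched tensors inside the detection network; the nested lists here only make each step of the computation visible.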
The central processing unit is used for receiving the detection results of the RFID work card identification module and the camera equipment detection module and triggering the corresponding response mechanisms according to those results, including activity track recording, alarming and response processing. The central processing unit triggers response mechanisms of different levels according to the detection results: the first-level response handles persons who are not wearing a work card and around whom camera equipment is detected, the second-level response handles unauthorized persons, and the third-level response handles the case in which camera equipment is detected around an authorized person. Specifically, if camera equipment is detected around a person who is not wearing a work card, a first-level response is triggered immediately; if camera equipment is detected around a person wearing an unauthorized work card, a second-level response is triggered immediately; if camera equipment is detected around a person wearing an authorized work card, a third-level response is triggered immediately. The identities of the persons who trigger second-level and third-level responses are locked, the position of the work card is recorded dynamically across multiple readers, and the timestamps are used to analyze when the person entered and left specific areas, generating an activity track.
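The three-level rule can be written as a small decision function. The `CardStatus` enumeration and the function name are hypothetical, and returning `None` when no camera equipment is detected is an assumption: the text only defines the response level for cases in which equipment is found around the person.

```python
from enum import Enum
from typing import Optional


class CardStatus(Enum):
    NONE = "none"                  # person is not wearing a work card
    UNAUTHORIZED = "unauthorized"  # work card read, but no access rights
    AUTHORIZED = "authorized"      # work card read, access rights confirmed


def response_level(card: CardStatus, device_detected: bool) -> Optional[int]:
    """Map a detection result to response level 1, 2 or 3."""
    if not device_detected:
        return None  # assumed: no leveled response without detected equipment
    if card is CardStatus.NONE:
        return 1
    if card is CardStatus.UNAUTHORIZED:
        return 2
    return 3
```

Identity locking and track generation would then apply only when the returned level is 2 or 3, since a level 1 case has no work card to follow between readers.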
When a second-level response is triggered, the relevant information is sent to the person's direct superior so that the person's movement track can be reviewed; if the person has accumulated three or more abnormal events, a middle-level manager is notified by phone call, text message or similar means to carry out the review. When a third-level response is triggered, the person must report to the relevant security department on their own initiative so that the captured footage can be checked. The abnormal video frames of each response level are stored temporarily; after the response ends, the frames are reassembled into an abnormal-event video, which is stored in the system database according to time and camera position information for later review and retrieval.
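The temporary storage and later reassembly of abnormal frames can be sketched as a per-camera buffer that is flushed when the response ends. The class names, the use of raw bytes for frames and the in-memory dictionary standing in for the system database are all hypothetical; encoding the frames into an actual video file is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class AbnormalClip:
    camera_position: str
    start_time: float
    end_time: float
    frames: List[bytes]


class AbnormalRecorder:
    """Buffers frames while a response is active, then stores one clip."""

    def __init__(self) -> None:
        self._buffers: Dict[str, List[Tuple[float, bytes]]] = {}
        # Stand-in for the system database, keyed by (start time, camera position).
        self.database: Dict[Tuple[float, str], AbnormalClip] = {}

    def on_frame(self, camera_position: str, timestamp: float,
                 frame: bytes, response_active: bool) -> None:
        if response_active:
            self._buffers.setdefault(camera_position, []).append((timestamp, frame))
        elif camera_position in self._buffers:
            self._flush(camera_position)  # response ended: assemble the clip

    def _flush(self, camera_position: str) -> None:
        buffered = sorted(self._buffers.pop(camera_position), key=lambda item: item[0])
        clip = AbnormalClip(
            camera_position,
            buffered[0][0],
            buffered[-1][0],
            [frame for _, frame in buffered],
        )
        self.database[(clip.start_time, camera_position)] = clip
```

Keying the stored clip by start time and camera position mirrors the retrieval criteria named in the text, so a reviewer can look up footage by when and where the event occurred.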