CN119810988A

CN119810988A - An intelligent security monitoring system based on YOLOv8

Info

Publication number: CN119810988A
Application number: CN202510288489.5A
Authority: CN
Inventors: 庄建军; 王楠; 庄宇辰
Original assignee: Nanjing University of Information Science and Technology
Current assignee: Nanjing University of Information Science and Technology
Priority date: 2025-03-12
Filing date: 2025-03-12
Publication date: 2025-04-11

Abstract

The present invention discloses an intelligent safety monitoring system based on YOLOv8, comprising: an RFID badge recognition module, used for recognizing the RFID badge worn by a person entering a monitoring area, and judging the identity and access rights of the person according to the badge information; a camera detection module, used for detecting whether there is a camera being used in the monitoring area, and marking abnormal areas; a feature precision module, integrated into a YOLOv8 target detection model, used for enhancing the detection accuracy of the model for camera targets; a central processing unit, used for receiving the detection results of the RFID badge recognition module and the camera detection module, and triggering a corresponding response mechanism according to the detection results, including activity track recording, alarming and response processing; the present invention effectively improves the accuracy and efficiency of staff supervision work.

Description

YOLOv8-based intelligent safety monitoring system

Technical Field

The invention relates to the technical field of safety monitoring, and in particular to an intelligent safety monitoring system based on YOLOv8.

Background

With the continuing development of the information society, the problem of guaranteeing security at certain sites has become increasingly prominent. Traditional security measures such as manual patrols and closed-circuit monitoring respond slowly to emergencies, depend on manpower, and cannot analyze and judge events in real time. In particular, when unauthorized outsiders conceal themselves in a complex environment, the recognition accuracy of traditional techniques is low, and the risk of false alarms and missed detections is high. Object detection technology can classify objects in video by category, but because of the wide variety of camera equipment currently on the market, monitoring the camera equipment in use at a site by video-based object detection alone leads to a very high miss rate, so the security of site information cannot be guaranteed. Meanwhile, current object detection technology can only detect persons as a category and cannot identify individual identities, so it cannot determine whether a person at the site holds the relevant access rights. Traditional rights management is suited only to fixed access-controlled areas; the periphery of a large site is supervised mainly by security personnel, which consumes substantial human resources and leaves large gaps in security monitoring.

Disclosure of Invention

The invention aims to provide an intelligent safety monitoring system based on YOLOv8 that solves the problems described in the Background by combining deep learning, target detection, sensors and radio frequency technology.

The technical scheme is as follows. The intelligent safety monitoring system based on YOLOv8 comprises:

an RFID work badge identification module, used for identifying the RFID work badge worn by a person entering the monitoring area and determining the identity and access rights of the person according to the badge information;

a camera equipment detection module, used for detecting whether camera equipment in use is present in the monitoring area and marking abnormal areas;

a feature precision module, integrated in the YOLOv8 target detection model and used for enhancing the detection precision of the model for camera equipment targets;

a central processing unit, used for receiving the detection results of the RFID work badge identification module and the camera equipment detection module and triggering the corresponding response mechanisms according to the detection results, including activity track recording, alarming and response processing.

Further, the RFID work badge identification module comprises an RFID work badge, a reader, a camera and a buzzer, wherein the reader is used for reading the unique Electronic Product Code (EPC) in the work badge and matching it against the identity information in an identity recognition database to determine the identity and access rights of the person. The camera equipment detection module comprises a FLIR Lepton thermal imaging infrared sensor and a HackRF One software-defined radio module, which are used for detecting high-temperature target areas and confirming whether they emit wireless signals. The feature precision module processes the feature map through a depthwise separable spatial attention block DS-SAM and a dual-pooling squeeze-and-excitation attention block DP-SE.

Further, the RFID work badge uses a passive tag, the working frequency band is the ultra-high-frequency band of 860-960 MHz, and the identification distance is set to 10 meters. When the reader detects an unauthorized person, it triggers the corresponding buzzer to give a warning and records the activity track of the person; when the reader detects an authorized person, it starts the corresponding camera to perform equipment target detection.

Further, the camera equipment detection module screens high-temperature target areas by setting a temperature threshold and uses radio frequency signal detection to confirm whether a target area emits a wireless signal; if a wireless signal is detected, the area is marked as an abnormal area and its position information is recorded.

Further, the feature precision module weights the feature map in the spatial dimension through the depthwise separable spatial attention block DS-SAM and enhances the interaction among channels through channel shuffling; the feature precision module fuses the global spatial information of the feature map through the dual-pooling squeeze-and-excitation attention block DP-SE to generate channel weights, and multiplies the channel weights element by element with the original feature map to obtain the final feature map.

Further, the specific process of the feature precision module is as follows: let the input $i$-th batch feature map be $X_i \in \mathbb{R}^{C \times H \times W}$, where $C$ is the number of input channels, $H$ is the height and $W$ is the width. Passing $X_i$ through DS-SAM and DP-SE yields the depth feature map $F_{ds}$ and the pooling feature map $F_{dp}$ respectively; $F_{ds}$ and $F_{dp}$ are then added element by element to obtain the final discriminative feature $F_{out}$. The formula is as follows:

$$F_{out} = F_{ds} + F_{dp}.$$

Further, the DS-SAM comprises a spatial attention transformation and channel shuffling. The spatial attention transformation passes the input $i$-th batch feature map $X_i$ through two depthwise separable convolution layers in sequence to weight the importance of the feature map in the spatial dimension: the first depthwise separable convolution layer reduces the number of channels to $C'$ (with $C' < C$), and the second depthwise separable convolution layer raises the number of channels back to the original number $C$. After batch normalization and the activation functions, the spatial importance representation $F_s$ of the feature map is obtained. The formula is as follows:

$$F_1 = \mathrm{DSConv}_1(X_i), \quad F_2 = \mathrm{DSConv}_2\big(\mathrm{ReLU}(\mathrm{BN}(F_1))\big), \quad F_s = \sigma\big(\mathrm{BN}(F_2)\big);$$

where $F_1$ and $F_2$ denote the feature maps obtained from the input feature map after the 1st and 2nd depthwise separable convolutions $\mathrm{DSConv}_1$ and $\mathrm{DSConv}_2$ respectively, $\mathrm{BN}$ denotes batch normalization, $\mathrm{ReLU}$ sets all negative values in the output to zero, and the sigmoid function $\sigma$ limits the output to $[0, 1]$.

Meanwhile, the DS-SAM replaces standard convolution with depthwise separable convolution. The feature map $F_s$ obtained from the spatial attention transformation enters the channel shuffling part, where the channels are divided into 4 groups and the channels of the 4 groups are rearranged to obtain the feature map $F_{cs}$. Finally, the original feature map $X_i$ and the channel-shuffled feature map $F_{cs}$ are multiplied element by element to obtain the final feature map.

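Further, the specific process of the DP-SE is as follows: the global spatial information of the $i$-th batch feature map $X_i$ is compressed by adaptive average pooling and adaptive max pooling into $z_{avg}$ and $z_{max}$ respectively; the two pooling results are fused and passed through the fully connected layer to generate the channel weights $w$; finally, the original feature map and the channel weights are multiplied element by element to obtain the final feature map $F_{dp}$. The formula is as follows:

$$z_{avg} = \mathrm{AvgPool}(X_i), \quad z_{max} = \mathrm{MaxPool}(X_i), \quad w = \mathrm{FC}(z_{avg} \oplus z_{max}), \quad F_{dp} = X_i \otimes w;$$

where $\mathrm{FC}$ denotes the fully connected layer operation, $\oplus$ denotes the fusion of the two pooling results and $\otimes$ denotes element-by-element multiplication.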

Further, the central processing unit triggers response mechanisms of different levels according to the detection results, comprising a primary response, a secondary response and a tertiary response. The primary response handles persons who are not wearing a work badge and around whom camera equipment is detected, the secondary response handles unauthorized persons, and the tertiary response handles cases in which camera equipment is detected around authorized persons.

Compared with the prior art, the invention has the following advantages. Camera equipment is preliminarily detected by a module integrating the FLIR Lepton thermal imaging infrared sensor and the HackRF One software-defined radio module; the identity of persons in the area is confirmed by the RFID work badge identification module; equipment target detection is performed by the YOLOv8 target detection model augmented with the feature precision module; and the relevant alarms and responses are then issued. The system is designed to identify camera equipment in use and the identity of persons in real-time scenes, and further to monitor abnormal conditions cooperatively across multiple scenes.

Drawings

FIG. 1 is a schematic flow diagram of the system of the present invention;

FIG. 2 is a flowchart of the camera equipment detection module of the present invention;

FIG. 3 is a schematic diagram of the feature precision module of the present invention;

FIG. 4 is a schematic diagram of the RFID work badge identification module of the present invention.

Detailed Description

The technical scheme of the invention is further described below with reference to the accompanying drawings.

As shown in FIG. 1, an embodiment of the present invention provides an intelligent safety monitoring system based on YOLOv8, comprising the following modules.

As shown in FIG. 4, the RFID work badge identification module is used for identifying the RFID work badge worn by a person entering the monitoring area and determining the identity and access rights of the person according to the badge information. It specifically comprises an RFID work badge, a reader, a camera and a buzzer. An RFID tag is embedded in the work badge and contains a unique Electronic Product Code (EPC) corresponding to the person to whom the badge belongs. The tag is passive, operates in the ultra-high-frequency band of 860-960 MHz, and has its identification distance set to 10 meters. The reader comprises an antenna and a decoding module; it reads the tag data of the work badge and transmits the decoded ID information to the system.

When a person enters the monitoring range of the reader, if no work badge information is detected, the process proceeds directly to the subsequent equipment target detection; if work badge information is detected, the RFID work badge identification function is started. When the work badge approaches the reader antenna, the RFID tag receives the radio frequency signal sent by the antenna, and the passive tag harvests energy from that signal. The tag then wirelessly transmits the stored unique EPC to the decoding module for decoding. The decoded information is transmitted to the system and matched against the identity information in the identity recognition database, and the identity and access rights of the person are determined from the matching result, completing identity verification. If the badge wearer is identified as an authorized person, the camera corresponding to the reader is started, equipment target detection is performed on the area, and the security condition of the area is further recorded.
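The matching step can be illustrated with a minimal sketch. It is not the implementation of the invention: the database contents, the EPC strings and the names `Person`, `IDENTITY_DB` and `handle_read` are hypothetical. It also assumes that a read without any badge goes straight to equipment target detection, and that an EPC missing from the database is treated as unauthorized.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str
    authorized: bool


# Hypothetical identity recognition database keyed by EPC (values are made up).
IDENTITY_DB = {
    "EPC-0001": Person("staff_a", authorized=True),
    "EPC-0002": Person("visitor_b", authorized=False),
}


def handle_read(epc: Optional[str]) -> str:
    """Return the next action for one reader event."""
    if epc is None:
        # Assumption: no badge detected, go directly to equipment target detection.
        return "device_detection"
    person = IDENTITY_DB.get(epc)
    if person is None or not person.authorized:
        # Unauthorized person: sound the buzzer and record the activity track.
        return "buzzer_and_track"
    # Authorized person: start the camera paired with this reader.
    return "start_camera"
```

Returning an action name rather than driving the buzzer or camera directly keeps the decision logic testable without hardware.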

Meanwhile, each reader is assigned a number and position information at deployment. When an abnormal condition is detected, the positions of the readers and the time information can be used to draw the activity track of the person, which facilitates review and tracing.
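A track of this kind can be assembled from reader events, as in the following sketch. The reader numbers, coordinates and the `(timestamp, reader_id, epc)` event layout are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical reader numbers mapped to deployment positions in meters.
READER_POSITIONS = {"R1": (0.0, 0.0), "R2": (10.0, 0.0), "R3": (10.0, 8.0)}


def build_tracks(events):
    """Group (timestamp, reader_id, epc) events into one time-ordered track per EPC."""
    tracks = defaultdict(list)
    for timestamp, reader_id, epc in sorted(events):
        position = READER_POSITIONS[reader_id]
        track = tracks[epc]
        # Skip repeated reads at the same reader so the track only records moves.
        if not track or track[-1][1] != position:
            track.append((timestamp, position))
    return dict(tracks)
```

Each track entry pairs the first time a badge was seen at a reader with that reader's position, which is enough to draw a path and to read off entry times for an area.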

The camera equipment detection module is used for detecting whether camera equipment in use is present in the monitoring area and marking abnormal areas. It comprises a FLIR Lepton thermal imaging infrared sensor and a HackRF One software-defined radio module, which are used to detect high-temperature target areas and confirm whether they emit wireless signals. The module screens high-temperature target areas by setting a temperature threshold and uses radio frequency signal detection to confirm whether a target area emits a wireless signal; if a wireless signal is detected, the area is marked as an abnormal area and its position information is recorded.
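The screening logic can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the invention: the thermal frame is assumed to be already converted to degrees Celsius and held in a NumPy array, the radio frequency result is passed in as a boolean, and the function names and the single-bounding-box simplification are hypothetical.

```python
import numpy as np


def hot_region_bbox(temps_c: np.ndarray, threshold_c: float):
    """Return (row_min, col_min, row_max, col_max) of pixels above the threshold, or None."""
    mask = temps_c > threshold_c
    if not mask.any():
        return None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])


def mark_area(temps_c: np.ndarray, threshold_c: float, rf_signal_present: bool) -> dict:
    """Mark the area abnormal only if it is hot AND a wireless signal was confirmed."""
    bbox = hot_region_bbox(temps_c, threshold_c)
    abnormal = bbox is not None and rf_signal_present
    return {"abnormal": abnormal, "bbox": bbox}
```

Requiring both conditions mirrors the two-stage check in the text: a hot region alone, such as a lamp, is not marked unless the radio frequency stage also confirms a transmission.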

Specifically, as shown in FIG. 2, the FLIR Lepton thermal imaging infrared sensor scans the real-time scene area, and thermal image data are obtained through the SPI interface. The thermal image is processed with OpenCV, and high-temperature target areas are screened out according to the set temperature threshold and marked. The HackRF One software-defined radio module performs spectrum analysis, through a Python interface, on the radio frequency signals of the areas marked by the infrared sensor, and a fast Fourier transform (FFT) is used to detect strong signals in the 2.4 GHz band, confirming whether the target area emits a wireless signal. If a wireless signal is detected, the area is marked as an abnormal area and its position information is recorded.

The feature precision module is integrated in the YOLOv8 target detection model and used for enhancing the detection precision of the model for camera equipment targets; it processes the feature map through a depthwise separable spatial attention block DS-SAM and a dual-pooling squeeze-and-excitation attention block DP-SE.
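The FFT-based check for strong signals in the 2.4 GHz band can be sketched with NumPy as follows. The sketch assumes that a block of complex IQ samples has already been captured from the software-defined radio; reading samples from the HackRF One is outside its scope. The function name, the 20 dB margin and the use of the median as a noise-floor estimate are illustrative assumptions.

```python
import numpy as np


def detect_strong_signal(iq, sample_rate_hz, center_hz, margin_db=20.0):
    """Return (detected, peak_frequency_hz) for one block of complex IQ samples."""
    iq = np.asarray(iq, dtype=np.complex128)
    window = np.hanning(iq.size)  # taper the block to reduce spectral leakage
    spectrum = np.fft.fftshift(np.fft.fft(iq * window))
    power_db = 10.0 * np.log10(np.abs(spectrum) ** 2 + 1e-12)
    freqs = np.fft.fftshift(np.fft.fftfreq(iq.size, d=1.0 / sample_rate_hz))
    peak = int(np.argmax(power_db))
    noise_floor_db = float(np.median(power_db))  # robust estimate of the noise level
    detected = bool(power_db[peak] - noise_floor_db > margin_db)
    return detected, center_hz + float(freqs[peak])
```

A block is reported as containing a strong signal when its highest spectral peak exceeds the median bin power by the margin; the returned frequency is the baseband peak offset added to the tuner center frequency.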

The feature precision module weights the feature map in the spatial dimension through the depthwise separable spatial attention block DS-SAM and enhances the interaction among channels through channel shuffling. It fuses the global spatial information of the feature map through the dual-pooling squeeze-and-excitation attention block DP-SE to generate channel weights, and multiplies the channel weights element by element with the original feature map to obtain the final feature map.
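Channel shuffling with 4 groups, as used in DS-SAM, can be sketched as below. The text does not specify the exact rearrangement, so the sketch assumes the common interleaving in which the channels are reshaped into groups, the group axis is transposed, and the result is flattened again; the function name is hypothetical.

```python
import numpy as np


def channel_shuffle(x: np.ndarray, groups: int = 4) -> np.ndarray:
    """Interleave the channels of a (C, H, W) feature map across `groups` groups."""
    c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    # (C, H, W) -> (groups, C // groups, H, W) -> swap group axes -> (C, H, W)
    return x.reshape(groups, c // groups, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)


# Usage: weight the original feature map with the shuffled attention map.
x = np.random.rand(8, 4, 4)           # stand-in for the original feature map
attention = np.random.rand(8, 4, 4)   # stand-in for the spatial attention output
weighted = x * channel_shuffle(attention, groups=4)
```

After the shuffle, neighbouring channels come from different groups, which is what lets information mix across groups in the element-wise product.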

The specific process of the feature precision module is as follows. Let the input $i$-th batch feature map be $X_i \in \mathbb{R}^{C \times H \times W}$, where $C$ is the number of input channels, $H$ is the height and $W$ is the width. Passing $X_i$ through DS-SAM and DP-SE yields the depth feature map $F_{ds}$ and the pooling feature map $F_{dp}$ respectively; $F_{ds}$ and $F_{dp}$ are then added element by element to obtain the final discriminative feature $F_{out}$. The formula is as follows:

$$F_{out} = F_{ds} + F_{dp}.$$

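The DS-SAM comprises a spatial attention transformation and channel shuffling. The spatial attention transformation passes the input $i$-th batch feature map $X_i$ through two depthwise separable convolution layers in sequence to weight the importance of the feature map in the spatial dimension: the first depthwise separable convolution layer reduces the number of channels to $C'$ (with $C' < C$), and the second depthwise separable convolution layer raises the number of channels back to the original number $C$. After batch normalization and the activation functions, the spatial importance representation $F_s$ of the feature map is obtained. The formula is as follows:

$$F_1 = \mathrm{DSConv}_1(X_i), \quad F_2 = \mathrm{DSConv}_2\big(\mathrm{ReLU}(\mathrm{BN}(F_1))\big), \quad F_s = \sigma\big(\mathrm{BN}(F_2)\big);$$

where $F_1$ and $F_2$ denote the feature maps obtained after the 1st and 2nd depthwise separable convolutions $\mathrm{DSConv}_1$ and $\mathrm{DSConv}_2$ respectively, $\mathrm{BN}$ denotes batch normalization, $\mathrm{ReLU}$ sets all negative values in the output to zero, and the sigmoid function $\sigma$ limits the output to $[0, 1]$.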
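To make the benefit of depthwise separable convolution concrete, the following sketch compares weight counts (biases ignored) of a standard convolution and of its depthwise separable counterpart, which splits the operation into a per-channel spatial convolution followed by a 1 x 1 pointwise convolution. The kernel size and channel counts in the usage lines are arbitrary illustrative values, not values taken from the invention.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a standard k x k convolution: every output channel sees every input channel."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative values only: 3 x 3 kernel, 64 input channels reduced to 32.
standard = standard_conv_params(64, 32, 3)
separable = depthwise_separable_params(64, 32, 3)
```

For these values the standard layer needs 18432 weights and the depthwise separable layer 2624, roughly a sevenfold reduction.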

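The specific process of the DP-SE is as follows: the global spatial information of the $i$-th batch feature map $X_i$ is compressed by adaptive average pooling and adaptive max pooling into $z_{avg}$ and $z_{max}$ respectively; the two pooling results are fused and passed through the fully connected layer to generate the channel weights $w$; finally, the original feature map and the channel weights are multiplied element by element to obtain the final feature map $F_{dp}$. The formula is as follows:

$$z_{avg} = \mathrm{AvgPool}(X_i), \quad z_{max} = \mathrm{MaxPool}(X_i), \quad w = \mathrm{FC}(z_{avg} \oplus z_{max}), \quad F_{dp} = X_i \otimes w;$$

where $\mathrm{FC}$ denotes the fully connected layer operation, $\oplus$ denotes the fusion of the two pooling results and $\otimes$ denotes element-by-element multiplication.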
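The dual-pooling channel weighting can be illustrated with a small NumPy sketch. Several details are assumptions rather than statements from the text: the two pooled vectors are fused by addition, a single fully connected layer is used, a sigmoid bounds the weights, and the names `dp_se`, `w_fc` and `b_fc` are hypothetical.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def dp_se(x: np.ndarray, w_fc: np.ndarray, b_fc: np.ndarray) -> np.ndarray:
    """Apply dual-pooling channel weighting to a (C, H, W) feature map."""
    z_avg = x.mean(axis=(1, 2))             # global average pooling, shape (C,)
    z_max = x.max(axis=(1, 2))              # global max pooling, shape (C,)
    fused = z_avg + z_max                   # assumption: fusion by addition
    weights = sigmoid(w_fc @ fused + b_fc)  # assumption: one FC layer plus sigmoid
    return x * weights[:, None, None]       # rescale each channel of the input


# Usage with random stand-in parameters for a 4-channel feature map.
rng = np.random.default_rng(0)
feature_map = rng.random((4, 3, 3))
output = dp_se(feature_map, rng.standard_normal((4, 4)), np.zeros(4))
```

Average pooling summarizes the overall response of a channel while max pooling keeps its strongest activation, so fusing both gives the fully connected layer a richer descriptor than either alone.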

The central processing unit is used for receiving the detection results of the RFID work badge identification module and the camera equipment detection module and triggering the corresponding response mechanism according to the detection results, including activity track recording, alarming and response processing. The central processing unit triggers response mechanisms of different levels according to the detection results: the primary response handles persons who are not wearing a work badge and around whom camera equipment is detected, the secondary response handles unauthorized persons, and the tertiary response handles cases in which camera equipment is detected around authorized persons. Specifically, if camera equipment is detected around a person not wearing a work badge, the primary response is triggered immediately; if camera equipment is detected around a person wearing an unauthorized work badge, the secondary response is triggered immediately; and if camera equipment is detected around a person wearing an authorized work badge, the tertiary response is triggered immediately. The identities of the persons who trigger the secondary and tertiary responses are locked, the position of the work badge is dynamically recorded across multiple readers, and the timestamps are used to analyze when the person enters and leaves specific areas, generating an activity track.

When the secondary response is triggered, the relevant information is sent to the person's direct superior, who reviews the person's activity track; if the person has three or more abnormal records, a middle-level leader is notified by telephone, text message or similar means to conduct a review. When the tertiary response is triggered, the person must report to the relevant security department on their own initiative so that the captured footage can be checked. The abnormal video frames of each response level are stored temporarily; after the response ends, the frames are recombined into an abnormal video, which is stored in the system database by time and camera position for later review and retrieval.
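The three response levels and the escalation rule can be summarized in a short sketch. The names are hypothetical, and the sketch assumes that the primary response applies to a person without a work badge and that no response is raised when no camera equipment is detected.

```python
from enum import Enum


class Response(Enum):
    NONE = 0
    PRIMARY = 1
    SECONDARY = 2
    TERTIARY = 3


def response_level(badge_present: bool, authorized: bool, device_detected: bool) -> Response:
    """Map one detection result to a response level."""
    if not device_detected:
        return Response.NONE        # assumption: nothing to respond to
    if not badge_present:
        return Response.PRIMARY     # no work badge, camera equipment nearby
    if not authorized:
        return Response.SECONDARY   # unauthorized badge, camera equipment nearby
    return Response.TERTIARY        # authorized badge, camera equipment nearby


def notify_middle_level_leader(abnormal_count: int) -> bool:
    """Escalate once a person has three or more abnormal records."""
    return abnormal_count >= 3
```

Keeping the mapping in one pure function makes the precedence explicit: badge presence is checked before authorization, so a person without a badge is never classified by an authorization flag that has no meaning for them.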

Claims

1. An intelligent safety monitoring system based on YOLOv8, characterized in that it includes: an RFID badge recognition module, which is used to identify the RFID badge worn by the person entering the monitoring area, and judge the identity and access rights of the person according to the badge information; a camera detection module, which is used to detect whether there is a camera in use in the monitoring area and mark the abnormal area; a feature precision module, which is integrated into the YOLOv8 target detection model, and is used to enhance the detection accuracy of the model for the camera target; a central processing unit, which is used to receive the detection results of the RFID badge recognition module and the camera detection module, and trigger the corresponding response mechanism according to the detection results, including activity trajectory recording, alarm and response processing.

2. The YOLOv8-based intelligent safety monitoring system according to claim 1, characterized in that the RFID work badge recognition module includes an RFID work badge, a reader, a camera and a buzzer, the reader being used to read the unique electronic product code (EPC) in the work badge and match it with the identity information in the identity recognition database to determine the identity and access rights of the person; the camera equipment detection module includes a FLIR Lepton thermal imaging infrared sensor and a HackRF One software-defined radio module, which are used to detect high-temperature target areas and confirm whether they emit wireless signals; and the feature precision module processes feature maps through a depthwise separable spatial attention block DS-SAM and a dual-pooling squeeze-and-excitation attention block DP-SE.

3. The YOLOv8-based intelligent safety monitoring system according to claim 1, characterized in that the RFID work badge adopts a passive tag, the working frequency band is the ultra-high-frequency band of 860-960 MHz, and the recognition distance is set to 10 meters; when the reader detects an unauthorized person, it triggers the corresponding buzzer to issue a warning and records the activity trajectory of the person; when the reader detects an authorized person, it starts the corresponding camera to perform equipment target detection.

4. The YOLOv8-based intelligent safety monitoring system according to claim 1, characterized in that the camera equipment detection module screens the high-temperature target area by setting a temperature threshold, and confirms whether the target area emits a wireless signal in combination with radio frequency signal detection; if a wireless signal is detected, the area is marked as an abnormal area and its location information is recorded.

5. The YOLOv8-based intelligent safety monitoring system according to claim 1, characterized in that the feature precision module performs spatial dimension weighting on the feature map through the depthwise separable spatial attention block DS-SAM, and enhances the interaction between channels through channel shuffling; the feature precision module performs global spatial information fusion on the feature map through the dual-pooling squeeze-and-excitation attention block DP-SE, generates channel weights, and multiplies them element by element with the original feature map to obtain the final feature map.

6. The YOLOv8-based intelligent safety monitoring system according to claim 5, characterized in that the specific process of the feature precision module is as follows: let the input $i$-th batch feature map be $X_i \in \mathbb{R}^{C \times H \times W}$, where $C$ is the number of input channels, $H$ is the height and $W$ is the width; after DS-SAM and DP-SE, the depth feature map $F_{ds}$ and the pooling feature map $F_{dp}$ are obtained respectively; $F_{ds}$ and $F_{dp}$ are then added element by element to obtain the final discriminative feature $F_{out}$. The formula is as follows:

$$F_{out} = F_{ds} + F_{dp}.$$

7. The YOLOv8-based intelligent safety monitoring system according to claim 6, characterized in that DS-SAM includes two parts: spatial attention transformation and channel shuffling; wherein the spatial attention transformation successively passes the input $i$-th batch feature map $X_i$ through two depthwise separable convolutional layers to weight the importance of the spatial dimension of the feature map, wherein the first depthwise separable convolutional layer reduces the number of channels to $C'$ (with $C' < C$), the second depthwise separable convolutional layer increases the number of channels back to the original number of channels $C$, and after batch normalization and the activation function operations, the spatial importance representation $F_s$ of the feature map is obtained; the formula is as follows:

$$F_1 = \mathrm{DSConv}_1(X_i), \quad F_2 = \mathrm{DSConv}_2\big(\mathrm{ReLU}(\mathrm{BN}(F_1))\big), \quad F_s = \sigma\big(\mathrm{BN}(F_2)\big);$$

where $F_1$ and $F_2$ represent the feature maps of the input feature map after the first and second depthwise separable convolutions $\mathrm{DSConv}_1$ and $\mathrm{DSConv}_2$ respectively, $\mathrm{BN}$ denotes batch normalization, $\mathrm{ReLU}$ indicates that all negative values in the output are set to zero, and the sigmoid function $\sigma$ limits the output to the range $[0, 1]$.

At the same time, DS-SAM replaces the standard convolution with a depthwise separable convolution; the feature map $F_s$ after the spatial attention transformation enters the channel shuffling part, where the channels are divided into 4 groups and the channels of the 4 groups are rearranged to obtain the feature map $F_{cs}$; finally, the original feature map $X_i$ and the channel-shuffled feature map $F_{cs}$ are multiplied element by element to obtain the final feature map.

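8. The YOLOv8-based intelligent safety monitoring system according to claim 6, characterized in that the specific process of DP-SE is as follows: the global spatial information of the $i$-th batch feature map $X_i$ is compressed by adaptive average pooling and adaptive max pooling into $z_{avg}$ and $z_{max}$ respectively; the dual-pooling results are fused and the channel weights $w$ are generated through the fully connected layer; finally, the original feature map and the channel weights are multiplied element by element to obtain the final feature map $F_{dp}$; the formula is as follows:

$$z_{avg} = \mathrm{AvgPool}(X_i), \quad z_{max} = \mathrm{MaxPool}(X_i), \quad w = \mathrm{FC}(z_{avg} \oplus z_{max}), \quad F_{dp} = X_i \otimes w;$$

where $\mathrm{FC}$ represents the fully connected layer operation, $\oplus$ represents the fusion of the two pooling results and $\otimes$ represents element-by-element multiplication.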

9. The YOLOv8-based intelligent safety monitoring system according to claim 1, characterized in that the central processing unit triggers different levels of response mechanisms according to the detection results, including a primary response, a secondary response and a tertiary response; the primary response is used to deal with persons who do not wear work badges and have camera equipment detected around them, the secondary response is used to deal with unauthorized persons, and the tertiary response is used to deal with situations where camera equipment is detected around authorized persons.