Safety management camera module and method based on visual model
Technical Field
The invention belongs to the field of safety monitoring, and particularly relates to a safety management camera module and method based on a visual model.
Background
With the development of society and growing public awareness of safety, safety management cameras are widely used in residential communities, business office areas, industrial parks, and similar places. A traditional safety management camera provides only simple video image acquisition and has limited ability to analyze and judge the acquired images. It generally relies on preset fixed rules to identify basic conditions such as a moving target or an object of a specific shape, so it struggles to cope with complex and changeable real monitoring scenes and is prone to false alarms and missed alarms.
When facing massive volumes of monitoring video data, the traditional camera lacks effective data integration and deep mining capability. Valuable safety-related information cannot be extracted in time to help management personnel make efficient decisions, which greatly limits the intelligence level and practical effect of safety management. A new camera technology is therefore urgently needed that has stronger image analysis capability, can mine the value of the data in depth, and can make accurate safety management judgments.
To this end, the invention provides a safety management camera module and method based on a visual model.
Disclosure of Invention
To overcome the deficiencies of the prior art, the invention solves at least one of the technical problems set out in the background.
The technical solution adopted by the invention is a safety management camera module based on a visual model, comprising the following components:
The image acquisition module consists of a high-definition optical lens and an image sensor component and is responsible for acquiring real-time video image information in the monitoring area;
The Doubao large vision model embedding module receives the video image data from the image acquisition module, and its large vision model can accurately identify and classify the objects, people and behaviors in the image;
Qwen2-VL recognizes object relationships in complex scenes as well as handwritten and multilingual text in images, has strong visual reasoning ability and can solve problems through chart analysis, understands long videos, supports real-time dialogue for a range of applications, supports multiple languages for the convenience of users worldwide, processes images of arbitrary resolution, integrates multi-dimensional information through its architecture, and extends to applications such as photo-based question answering;
The data processing and analyzing module works with the Doubao large vision model embedding module to further sort, judge and integrate that module's output, and compares real-time and historical monitoring data through an algorithm to mine safety trends and anomalies, generate an assessment report and assist safety management decisions;
The communication module supports multiple communication protocols such as Wi-Fi, Ethernet and 4G/5G, and can transmit in real time the video image data collected by the camera, the analysis results of the Doubao large vision model and the safety risk assessment reports generated by the data processing and analyzing module to a remote monitoring management platform or to the mobile terminals of security personnel, ensuring timely delivery of safety information so that management personnel can keep track of the monitoring area remotely and respond quickly;
The power management module provides a stable and reliable power supply for the whole camera system, can be connected to a mains supply, has a built-in standby power supply such as a lithium battery, and provides power monitoring, intelligent switching and energy-saving control;
The storage module is equipped with a large-capacity storage medium, such as a solid state disk, and is mainly used for storing the original video image data acquired by the camera together with the key information and analysis results produced at each processing stage;
The alarm module works with the data processing and analyzing module to monitor the safety condition of the monitoring area and triggers multiple alarm modes: it deters potentially dangerous persons by emitting sound and light signals, and pushes messages, voice prompts and the like to the mobile terminals of management personnel.
Preferably, the image acquisition module can adjust focal length, aperture and the like so as to capture clear images at different distances and under different illumination conditions, and the acquired image data is transmitted in real time, as a digital signal, to the subsequent modules for processing.
Preferably, the Doubao large vision model embedding module has a built-in pre-trained Doubao large vision model. The model is deeply trained on massive image data covering different scenes, objects and people, and has strong image feature extraction, semantic understanding and pattern recognition capability. After receiving the video image data from the image acquisition module, it can accurately identify and classify the objects, people and behaviors in the image: for example, it distinguishes normal pedestrians and workers from suspicious persons, recognizes specific dangerous objects and abnormal scene arrangements such as a blocked fire passage, and analyzes the movement trajectories and postures of people to judge whether abnormal behaviors such as climbing or fighting are present.
Preferably, Qwen2-VL has the following characteristics:
Strong recognition capability: it can accurately recognize multiple objects and their relationships in a complex scene, and can recognize handwritten text and multilingual text in images, including most European languages, Japanese, Korean and Arabic;
Excellent visual reasoning capability: it can solve complex mathematical problems through chart analysis, extract information from real-world images and charts, and follow instructions better to solve practical problems;
Long video understanding and real-time dialogue: it can understand videos longer than 20 minutes, can continuously provide information and support in a real-time dialogue, and can be applied to video-based question answering, dialogue, content creation and the like;
Multi-language support: in addition to the common English and Chinese, it understands text in images in multiple languages, which is convenient for users worldwide;
Innovative architecture: a serial structure of ViT plus Qwen is adopted, with support for native dynamic resolution and multimodal rotary position embedding (M-RoPE), so that image input of any resolution can be processed and multi-dimensional position information can be captured and integrated at the same time; applications include photo-based question answering, AI picture generation, making phone calls and sending messages.
Preferably, the data processing and analyzing module works in cooperation with the Doubao large vision model embedding module. On the one hand, it performs further data sorting and logical judgment on the recognition and analysis results output by the large model, and associates and integrates related information acquired at different moments and from different angles. On the other hand, it uses a built-in algorithm to compare real-time monitoring data with historical monitoring data and mine potential safety trends and abnormal changes, such as an abnormal change in how often people have recently gathered in a certain area, and generates a safety risk assessment report accordingly, providing detailed data support for subsequent safety management decisions.
Preferably, the power management module is responsible for providing a stable and reliable power supply for the whole camera system. It can be connected to a mains supply and has a built-in standby power supply, such as a lithium battery, and it provides power monitoring, intelligent switching and energy-saving control: when the mains supply fails, it automatically switches to the standby power supply to keep the camera working normally, and it can dynamically adjust the power supplied to each module according to actual monitoring requirements.
Preferably, the storage module is equipped with a large-capacity storage medium, such as a solid state disk, and is used for storing the collected original video image data, the processed key information and the analysis results. It supports classified storage by time, event type and other criteria, which facilitates subsequent query, playback and data tracing, and the stored data can be backed up periodically to prevent loss.
A method of using the safety management camera based on the visual model comprises the following steps:
S1, installation and initialization: the camera is installed at a suitable position requiring safety monitoring, such as a building entrance, a corridor passage or a key passage area of a park, so that the field of view of the image acquisition module's lens covers the target monitoring area; the mains supply is connected and the camera is started; the power management module then performs a self-check and supplies power to each module normally; the camera system starts its initialization program; the communication module automatically connects to a preset network, for example a local area network, or accesses the Internet through a 4G/5G network; the storage module completes its formatting preparation and waits for data; and the Doubao large vision model embedding module loads the pre-trained model parameters and enters a ready state;
S2, image acquisition and analysis: the image acquisition module continuously acquires video images of the monitoring area at a set frame rate, such as 25 frames per second, and transmits the real-time image data to the Doubao large vision model embedding module; after receiving the images, the embedding module performs feature extraction, object identification and behavior analysis on each frame, for example identifying the people in the picture by comparison with a pre-stored image database of authorized personnel and judging whether their walking directions and behaviors are compliant, while also identifying the objects in the scene and their states, such as fire-fighting facilities and vehicles, and outputs the corresponding analysis results to the data processing and analyzing module;
S3, data processing and risk assessment: after the data processing and analyzing module collects the analysis results of the large vision model, on the one hand it integrates the related data collected at the same moment by cameras at different angles to construct a complete view of the monitored scene; on the other hand it combines historical monitoring data and judges the safety state of the current monitoring area through a built-in risk assessment algorithm, for example by calculating the probability of abnormal behavior by the people present and the level of potential safety hazards in the environment, and generates a safety risk assessment report;
S4, information transmission and remote monitoring: the communication module sends the collected original video image data, the analysis results of the large vision model and the generated safety risk assessment reports to the remote monitoring management platform and the mobile terminals of security personnel at a set period, such as every 1 minute, or in real time for an urgent high-risk event;
S5, alarm triggering and response: when the data processing and analyzing module judges that a safety event reaching a preset alarm threshold has occurred in the monitoring area, for example an unauthorized person is detected trying to break into a restricted area or fire smoke is detected, the alarm module is started immediately;
S6, data storage and management: the storage module continuously stores the original video image data acquired by the image acquisition module and archives it by time sequence and event label, for example labelling each alarm event and each daily inspection period, which facilitates subsequent query and playback; it also stores the processed analysis results and the key information of the safety risk assessment reports, so that management personnel can perform data statistics and trend analysis and use the data as a basis for safety management decisions; and the stored data is backed up periodically to an external storage device or the cloud to ensure the safety and integrity of the data.
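The transmission rule in step S4 (send at a set period, such as every 1 minute, or immediately for an urgent high-risk event) can be illustrated with a minimal sketch. This is not the implementation of the invention; the names `should_transmit`, `transmission_times` and `PERIOD_SECONDS`, and the representation of an event as a `(timestamp, urgent)` pair, are assumptions introduced only for illustration.

```python
PERIOD_SECONDS = 60.0  # "every 1 minute" from step S4; the constant name is hypothetical


def should_transmit(now: float, last_sent: float, urgent: bool,
                    period: float = PERIOD_SECONDS) -> bool:
    """Return True when a report should be sent.

    An urgent high-risk event is sent immediately; otherwise a report is
    sent only once the set period has elapsed since the last send.
    """
    if urgent:
        return True
    return (now - last_sent) >= period


def transmission_times(events, period: float = PERIOD_SECONDS):
    """Given (timestamp, urgent) pairs in time order, return the timestamps sent."""
    sent = []
    last_sent = float("-inf")  # nothing sent yet, so the first report always goes out
    for timestamp, urgent in events:
        if should_transmit(timestamp, last_sent, urgent, period):
            sent.append(timestamp)
            last_sent = timestamp
    return sent
```

In this sketch an urgent send also restarts the period, so a routine report is not sent again until a full period after the urgent one; a real device might keep the two schedules independent.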
The beneficial effects of the invention are as follows:
1. According to the safety management camera module and method based on the visual model, the strong image understanding capability of the Doubao large vision model allows the various elements in a monitored scene to be identified with high precision, greatly reducing misjudgments caused by factors such as environmental interference and similar-looking objects; key safety-related information can be captured accurately under different conditions such as daytime, night, and complex indoor and outdoor scenes, which improves the effectiveness of safety monitoring.
2. According to the safety management camera module and method based on the visual model, the data processing and analyzing module makes combined use of the output of the large vision model and the historical data, so that the current safety condition can be understood and potential safety hazard trends can be mined; the generated safety risk assessment report helps management personnel prepare a response strategy in advance, turning passive response into active prevention and making overall safety management more forward-looking and scientific.
3. According to the safety management camera module and method based on the visual model, the communication module ensures that monitoring data and analysis results can be transmitted to remote terminals in real time, so that management personnel can follow the situation and issue instructions at any time and from any place without being present at the monitored site; this greatly improves the convenience and timeliness of safety management work and is particularly suitable for centralized safety management of large or multiple areas.
4. According to the safety management camera module and method based on the visual model, the power management module ensures that the camera works stably in various power supply environments and avoids monitoring gaps caused by power failure, while the storage module provides convenient data storage and query functions that support subsequent incident review, evidence retrieval and similar work, enhancing the practicability and reliability of the whole safety management system.
5. According to the safety management camera module and method based on the visual model, the alarm module can quickly start the corresponding alarm mode on the basis of an accurate safety event judgment and notify the relevant personnel at once, so that safety risks are kept to a minimum, safety incidents are prevented from escalating, and the safety of people and property in the monitoring area is protected.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a block diagram of a camera in the present invention;
FIG. 2 is a flowchart of a method of using a camera according to the present invention.
Detailed Description
To make the technical means, inventive features, objectives and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
As shown in FIG. 1 and FIG. 2, a safety management camera module based on a visual model according to an embodiment of the present invention comprises:
The image acquisition module consists of a high-definition optical lens and an image sensor component and is responsible for acquiring real-time video image information in the monitoring area;
The Doubao large vision model embedding module receives the video image data from the image acquisition module, and its large vision model can accurately identify and classify the objects, people and behaviors in the image;
Qwen2-VL recognizes object relationships in complex scenes as well as handwritten and multilingual text in images, has strong visual reasoning ability and can solve problems through chart analysis, understands long videos, supports real-time dialogue for a range of applications, supports multiple languages for the convenience of users worldwide, processes images of arbitrary resolution, integrates multi-dimensional information through its architecture, and extends to applications such as photo-based question answering;
The data processing and analyzing module works with the Doubao large vision model embedding module to further sort, judge and integrate that module's output, and compares real-time and historical monitoring data through an algorithm to mine safety trends and anomalies, generate an assessment report and assist safety management decisions;
The communication module supports multiple communication protocols such as Wi-Fi, Ethernet and 4G/5G, and can transmit in real time the video image data collected by the camera, the analysis results of the Doubao large vision model and the safety risk assessment reports generated by the data processing and analyzing module to a remote monitoring management platform or to the mobile terminals of security personnel, ensuring timely delivery of safety information so that management personnel can keep track of the monitoring area remotely and respond quickly;
The power management module provides a stable and reliable power supply for the whole camera system, can be connected to a mains supply, has a built-in standby power supply such as a lithium battery, and provides power monitoring, intelligent switching and energy-saving control;
The storage module is equipped with a large-capacity storage medium, such as a solid state disk, and is mainly used for storing the original video image data acquired by the camera together with the key information and analysis results produced at each processing stage;
The alarm module works with the data processing and analyzing module to monitor the safety condition of the monitoring area and triggers multiple alarm modes: it deters potentially dangerous persons by emitting sound and light signals, and pushes messages, voice prompts and the like to the mobile terminals of management personnel.
The image acquisition module can adjust focal length, aperture and the like so as to capture clear images at different distances and under different illumination conditions, and the acquired image data is transmitted in real time, as a digital signal, to the subsequent modules for processing.
The Doubao large vision model embedding module has a built-in pre-trained Doubao large vision model. The model is deeply trained on massive image data covering different scenes, objects and people, and has strong image feature extraction, semantic understanding and pattern recognition capability. After receiving the video image data from the image acquisition module, it can accurately identify and classify the objects, people and behaviors in the image: for example, it distinguishes normal pedestrians and workers from suspicious persons, recognizes specific dangerous objects and abnormal scene arrangements such as a blocked fire passage, and analyzes the movement trajectories and postures of people to judge whether abnormal behaviors such as climbing or fighting are present.
Qwen2-VL has the following characteristics:
Strong recognition capability: it can accurately recognize multiple objects and their relationships in a complex scene, and can recognize handwritten text and multilingual text in images, including most European languages, Japanese, Korean and Arabic;
Excellent visual reasoning capability: it can solve complex mathematical problems through chart analysis, extract information from real-world images and charts, and follow instructions better to solve practical problems;
Long video understanding and real-time dialogue: it can understand videos longer than 20 minutes, can continuously provide information and support in a real-time dialogue, and can be applied to video-based question answering, dialogue, content creation and the like;
Multi-language support: in addition to the common English and Chinese, it understands text in images in multiple languages, which is convenient for users worldwide;
Innovative architecture: a serial structure of ViT plus Qwen is adopted, with support for native dynamic resolution and multimodal rotary position embedding (M-RoPE), so that image input of any resolution can be processed and multi-dimensional position information can be captured and integrated at the same time; applications include photo-based question answering, AI picture generation, making phone calls and sending messages.
The data processing and analyzing module works in cooperation with the Doubao large vision model embedding module. It performs further data sorting and logical judgment on the recognition and analysis results output by the large model, and associates and integrates related information acquired at different moments and from different angles. It also uses a built-in algorithm to compare real-time monitoring data with historical monitoring data and mine potential safety trends and abnormal changes, such as an abnormal change in how often people have recently gathered in a certain area, and generates a safety risk assessment report accordingly, providing detailed data support for subsequent safety management decisions.
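The comparison of real-time data with historical data described above can be sketched as a simple baseline check. The text does not specify the built-in algorithm, so the z-score test below is a swapped-in, commonly used technique chosen for illustration; the function name `gathering_anomaly`, the per-window count representation and the threshold of 3.0 are all assumptions.

```python
from statistics import mean, pstdev


def gathering_anomaly(history, current, threshold=3.0):
    """Compare the current gathering count with a historical baseline.

    history: past gathering counts per time window for one area.
    current: the count for the latest window.
    Returns (z_score, is_abnormal).
    """
    baseline = mean(history)
    spread = pstdev(history)  # population standard deviation of the history
    if spread == 0:
        # Flat history: any departure from the baseline counts as abnormal,
        # and the z-score is reported as 0.0 or infinity.
        is_abnormal = current != baseline
        return (float("inf") if is_abnormal else 0.0), is_abnormal
    z = (current - baseline) / spread
    return z, abs(z) > threshold
```

For example, with a history of `[2, 4, 4, 4, 5, 5, 7, 9]` (mean 5, standard deviation 2), a current count of 12 gives a z-score of 3.5 and is flagged, while a count of 6 gives 0.5 and is not.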
The power management module is responsible for providing a stable and reliable power supply for the whole camera system. It can be connected to a mains supply and has a built-in standby power supply, such as a lithium battery, and it provides power monitoring, intelligent switching and energy-saving control: when the mains supply fails, it automatically switches to the standby power supply to keep the camera working normally, and it can dynamically adjust the power supplied to each module according to actual monitoring requirements.
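The switching and energy-saving behaviour described above can be sketched as two small decision functions. This is an illustrative sketch only: the names `Source`, `select_source` and `module_power_plan`, the battery percentage input and the specific energy-saving policy (pausing analysis on standby power when nothing is moving) are assumptions, not details given in the text.

```python
from enum import Enum
from typing import Optional


class Source(Enum):
    MAINS = "mains"
    BACKUP = "backup"  # e.g. the built-in lithium battery


def select_source(mains_ok: bool, battery_percent: float) -> Optional[Source]:
    """Pick the supply: mains when available, otherwise the standby battery.

    Returns None when neither supply can power the camera.
    """
    if mains_ok:
        return Source.MAINS
    if battery_percent > 0:
        return Source.BACKUP
    return None


def module_power_plan(source: Optional[Source], motion_detected: bool) -> dict:
    """Energy-saving control: decide which modules stay powered.

    Hypothetical policy: on standby power with no activity in view,
    the analysis module is paused to save energy.
    """
    if source is None:
        return {"acquisition": False, "analysis": False, "communication": False}
    saving = source is Source.BACKUP and not motion_detected
    return {"acquisition": True, "analysis": not saving, "communication": True}
```

Keeping source selection separate from the per-module plan means the failover rule can be checked on its own, independently of whichever energy-saving policy is chosen.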
The storage module is equipped with a large-capacity storage medium, such as a solid state disk, and is used for storing the collected original video image data, the processed key information and the analysis results. It supports classified storage by time, event type and other criteria, which facilitates subsequent query, playback and data tracing, and the stored data can be backed up periodically to prevent loss.
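Classified storage by time and event type can be sketched as a path layout plus a simple index. The directory scheme (date first, then event type), the function names `archive_path` and `index_by_event`, and the example file names are assumptions made for illustration; the text does not prescribe a layout.

```python
from collections import defaultdict
from datetime import datetime
from pathlib import PurePosixPath


def archive_path(root: str, captured_at: datetime, event_type: str,
                 filename: str) -> PurePosixPath:
    """Build a storage path classified by date and then by event type."""
    return PurePosixPath(root) / captured_at.strftime("%Y/%m/%d") / event_type / filename


def index_by_event(records):
    """Group (captured_at, event_type, filename) records by event type, oldest first."""
    index = defaultdict(list)
    for captured_at, event_type, filename in sorted(records):
        index[event_type].append(filename)
    return dict(index)
```

For example, `archive_path("/data", datetime(2025, 1, 15, 8, 30), "alarm", "clip_001.mp4")` yields `/data/2025/01/15/alarm/clip_001.mp4`, so a query by date or by event type reduces to listing one directory level.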
The modules of the camera system work cooperatively: the image acquisition module continuously acquires images; after in-depth analysis by the Doubao large vision model embedding module and the data processing and analyzing module, the communication module transmits information outward, the storage module stores the data, the power management module guarantees the power supply, and the alarm module raises an alarm promptly when necessary, together forming a complete intelligent safety management monitoring system.
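The cooperation between modules can be pictured as a per-frame data flow. The sketch below is a hypothetical wiring, not the device firmware: the class `CameraPipeline`, its field names and the use of plain Python callables and lists to stand in for the vision model, the processing module, storage, communication and alarm are all assumptions. Power management is omitted because it does not sit on the data path.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CameraPipeline:
    analyze: Callable[[bytes], dict]      # stands in for the large vision model
    assess: Callable[[dict], dict]        # stands in for data processing and analysis
    should_alarm: Callable[[dict], bool]  # alarm decision supplied by the caller
    stored: List[dict] = field(default_factory=list)       # storage module
    transmitted: List[dict] = field(default_factory=list)  # communication module
    alarms: List[dict] = field(default_factory=list)       # alarm module

    def handle_frame(self, frame: bytes) -> dict:
        """Run one acquired frame through analysis, storage, transmission and alarm."""
        analysis = self.analyze(frame)
        report = self.assess(analysis)
        self.stored.append(report)
        self.transmitted.append(report)
        if self.should_alarm(report):
            self.alarms.append(report)
        return report
```

Passing the analysis and assessment stages in as callables keeps the flow testable with stubs, which is useful because the real model and risk algorithm are not specified here.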
A method of using the safety management camera based on the visual model comprises the following steps:
S1, installation and initialization: the camera is installed at a suitable position requiring safety monitoring, such as a building entrance, a corridor passage or a key passage area of a park, so that the field of view of the image acquisition module's lens covers the target monitoring area; the mains supply is connected and the camera is started; the power management module then performs a self-check and supplies power to each module normally; the camera system starts its initialization program; the communication module automatically connects to a preset network, for example a local area network, or accesses the Internet through a 4G/5G network; the storage module completes its formatting preparation and waits for data; and the Doubao large vision model embedding module loads the pre-trained model parameters and enters a ready state;
S2, image acquisition and analysis: the image acquisition module continuously acquires video images of the monitoring area at a set frame rate, such as 25 frames per second, and transmits the real-time image data to the Doubao large vision model embedding module; after receiving the images, the embedding module performs feature extraction, object identification and behavior analysis on each frame, for example identifying the people in the picture by comparison with a pre-stored image database of authorized personnel and judging whether their walking directions and behaviors are compliant, while also identifying the objects in the scene and their states, such as fire-fighting facilities and vehicles, and outputs the corresponding analysis results to the data processing and analyzing module;
S3, data processing and risk assessment: after the data processing and analyzing module collects the analysis results of the large vision model, on the one hand it integrates the related data collected at the same moment by cameras at different angles to construct a complete view of the monitored scene; on the other hand it combines historical monitoring data and judges the safety state of the current monitoring area through a built-in risk assessment algorithm, for example by calculating the probability of abnormal behavior by the people present and the level of potential safety hazards in the environment, and generates a safety risk assessment report;
S4, information transmission and remote monitoring: the communication module sends the collected original video image data, the analysis results of the large vision model and the generated safety risk assessment reports to the remote monitoring management platform and the mobile terminals of security personnel at a set period, such as every 1 minute, or in real time for an urgent high-risk event;
S5, alarm triggering and response: when the data processing and analyzing module judges that a safety event reaching a preset alarm threshold has occurred in the monitoring area, for example an unauthorized person is detected trying to break into a restricted area or fire smoke is detected, the alarm module is started immediately;
S6, data storage and management: the storage module continuously stores the original video image data acquired by the image acquisition module and archives it by time sequence and event label, for example labelling each alarm event and each daily inspection period, which facilitates subsequent query and playback; it also stores the processed analysis results and the key information of the safety risk assessment reports, so that management personnel can perform data statistics and trend analysis and use the data as a basis for safety management decisions; and the stored data is backed up periodically to an external storage device or the cloud to ensure the safety and integrity of the data.
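Steps S3 and S5 together amount to computing a risk value and comparing it with a preset alarm threshold. The sketch below shows one way this could look; the risk assessment algorithm itself is not disclosed, so the weighted sum, the weight of 0.6, the five-level hazard scale, the threshold of 0.7 and the names `risk_score` and `alarm_actions` are all assumptions for illustration.

```python
def risk_score(abnormal_behavior_prob: float, hazard_level: int,
               max_hazard_level: int = 5, behavior_weight: float = 0.6) -> float:
    """Combine behaviour probability and hazard level into a score in [0, 1]."""
    if not 0.0 <= abnormal_behavior_prob <= 1.0:
        raise ValueError("abnormal_behavior_prob must be in [0, 1]")
    if not 0 <= hazard_level <= max_hazard_level:
        raise ValueError("hazard_level out of range")
    hazard = hazard_level / max_hazard_level  # normalise the level to [0, 1]
    return behavior_weight * abnormal_behavior_prob + (1 - behavior_weight) * hazard


def alarm_actions(score: float, threshold: float = 0.7) -> list:
    """Return the alarm modes to trigger once the score reaches the threshold."""
    if score < threshold:
        return []
    # Alarm modes named in the description: sound and light on site,
    # pushed messages and voice prompts to management terminals.
    return ["sound_and_light", "push_message", "voice_prompt"]
```

A weighted sum is used only because it is easy to inspect; any scoring rule that maps the analysis results to a value comparable with the preset threshold would fit the same two-step structure.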
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions merely illustrate the principles of the invention. Various changes and modifications may be made without departing from the spirit and scope of the invention, and all such changes and modifications fall within the scope of the invention, which is defined by the appended claims and their equivalents.