
CN120075566A - Safety management camera module and method based on visual model - Google Patents


Publication number
CN120075566A
Authority
CN
China
Prior art keywords
module
visual
data
security
monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510148783.6A
Other languages
Chinese (zh)
Inventor
王波
张腾予
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Baisnake Technology Co ltd
Original Assignee
Chengdu Baisnake Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Baisnake Technology Co ltd
Priority to CN202510148783.6A
Publication of CN120075566A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/50 Constructional details
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/72 Data preparation, e.g. statistical preprocessing of image or video features
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B25/00 Alarm systems in which the location of the alarm condition is signalled to a central station, e.g. fire or police telegraphic systems
    • G08B25/01 Alarm systems in which the location of the alarm condition is signalled to a central station, e.g. fire or police telegraphic systems characterised by the transmission medium
    • G08B25/08 Alarm systems in which the location of the alarm condition is signalled to a central station, e.g. fire or police telegraphic systems characterised by the transmission medium using communication transmission lines
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B7/00 Signalling systems according to more than one of groups G08B3/00 - G08B6/00; Personal calling systems according to more than one of groups G08B3/00 - G08B6/00
    • G08B7/06 Signalling systems according to more than one of groups G08B3/00 - G08B6/00; Personal calling systems according to more than one of groups G08B3/00 - G08B6/00 using electric transmission, e.g. involving audible and visible signalling through the use of sound and light sources
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/50 Constructional details
    • H04N23/54 Mounting of pick-up tubes, electronic image sensors, deviation or focusing coils
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/50 Constructional details
    • H04N23/55 Optical parts specially adapted for electronic image sensors; Mounting thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/65 Control of camera operation in relation to power supply
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/69 Control of means for changing angle of the field of view, e.g. optical zoom objectives or electronic zooming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Emergency Management (AREA)
  • Artificial Intelligence (AREA)
  • Alarm Systems (AREA)

Abstract

The present invention belongs to the field of security monitoring and specifically provides a security management camera module and method based on a visual model. The security management camera module comprises an image acquisition module, a Doubao visual large model embedding module, Qwen2-VL, a data processing and analysis module, and a communication module. The image acquisition module continuously acquires images; after in-depth analysis and processing by the Doubao visual large model embedding module and the data processing and analysis module, the communication module transmits the information outward, the storage module saves the data, the power management module ensures power supply, and the alarm module promptly issues an alarm when necessary, thereby forming a complete intelligent security management and monitoring system. By incorporating advanced visual large models, the invention achieves more accurate, intelligent and efficient security monitoring and management, effectively reduces the false alarm and missed alarm rates, and helps security management personnel better perform their duties and keep the monitored area safe.

Description

Safety management camera module and method based on visual model
Technical Field
The invention belongs to the field of safety monitoring, and particularly relates to a safety management camera module and method based on a visual model.
Background
With the development of society and the continuous improvement of people's safety awareness, safety management cameras are widely used in places such as residential communities, business office areas and industrial parks. A traditional safety management camera only provides simple video image acquisition and has limited capability to analyze and judge the acquired images. It generally relies on preset fixed rules to identify basic conditions such as a moving target or an object of a specific shape, struggles to cope accurately with complex and changeable real monitoring scenes, and is prone to false alarms and missed alarms.
Faced with massive monitoring video data, a traditional camera lacks effective data integration and deep mining capability, so valuable safety-related information cannot be extracted in time to help managers make efficient decisions, which greatly limits the intelligence level and practical effect of safety management. A new camera technology with stronger image analysis capability, able to mine the value of the data in depth and make accurate safety management judgments, is therefore urgently needed.
Therefore, the invention provides a safety management camera module and a safety management camera method based on a visual model.
Disclosure of Invention
To overcome the deficiencies of the prior art, the invention solves at least one of the technical problems presented in the background.
The technical solution adopted by the invention to solve these problems is a safety management camera module based on a visual model, comprising:
The image acquisition module, which consists of a high-definition optical lens and an image sensor component and is responsible for acquiring real-time video image information in the monitoring area;
The Doubao visual large model embedding module, which receives the video image data from the image acquisition module, the visual large model accurately identifying and classifying the objects, people and behaviors in the image;
Qwen2-VL, which recognizes object relations in complex scenes as well as handwritten and multilingual text in images, has strong visual reasoning capability and can solve problems through chart analysis, understands long videos and supports real-time dialogue for multiple applications, supports multiple languages for the convenience of global users, processes images of arbitrary resolution, integrates multi-dimensional information through its architecture, and extends to applications such as photo-based question answering;
The data processing and analysis module, which cooperates with the Doubao visual large model embedding module to further sort, judge and integrate that module's output, compares real-time and historical monitoring data through an algorithm, mines safety trends and anomalies, generates an assessment report and assists safety management decisions;
The communication module, which supports multiple communication protocols such as Wi-Fi, Ethernet and 4G/5G and can transmit, in real time, the video image data collected by the camera, the analysis results of the Doubao visual large model and the safety risk assessment reports generated by the data processing and analysis module to a remote monitoring management platform or to the mobile terminals of security personnel, ensuring timely delivery of safety information and allowing managers to keep track of the monitored area remotely and respond quickly;
The power management module, which provides a stable and reliable power supply for the whole camera system, can be connected to the mains supply, has a built-in standby power supply such as a lithium battery, and provides power monitoring, intelligent switching and energy-saving control;
The storage module, which is equipped with a large-capacity storage medium such as a solid state disk and mainly stores the original video image data acquired by the camera together with the key information and analysis results produced at each processing stage;
The alarm module, which cooperates with the data processing and analysis module to monitor the safety condition of the monitoring area, triggers multiple alarm modes, deters potentially dangerous persons by emitting sound and light signals, and pushes messages, voice prompts and the like to the manager's mobile terminal.
Preferably, the image acquisition module can adjust focal length, aperture and the like, so that it can acquire clear images at different distances and under different lighting conditions, and the acquired image data is transmitted to the subsequent module in real time as a digital signal for processing.
Preferably, the Doubao visual large model embedding module contains a pre-trained Doubao visual large model. The model is obtained by deep training on massive image data covering different scenes, objects and people, and has strong image feature extraction, semantic understanding and pattern recognition capability. After receiving the video image data from the image acquisition module, it can accurately identify and classify the objects, people and behaviors in the image, for example distinguishing normal pedestrians, workers and suspicious persons, recognizing specific dangerous objects and abnormal scene arrangements such as a blocked fire escape route, and analyzing the movement trajectory and posture of each person to judge whether abnormal behaviors such as climbing or fighting are present.
Preferably, Qwen2-VL provides:
Strong recognition capability: it can accurately recognize multiple objects and their relations in a complex scene, and can recognize handwritten text and multilingual text in images, including most European languages, Japanese, Korean and Arabic;
Strong visual reasoning capability: it can solve complex mathematical problems through chart analysis, extract information from real-world images and charts, and follow instructions more closely to solve practical problems;
Long video understanding and real-time dialogue: it can understand video content longer than 20 minutes, can continuously provide information and support in a real-time dialogue, and can be applied to video question answering, dialogue, content creation and the like;
Multilingual support: besides common English and Chinese, it supports understanding image text in multiple languages, which is convenient for global users;
Architectural innovation: a serial structure of ViT plus Qwen2 is adopted, with support for native dynamic resolution and multimodal rotary position embedding, so that image input of any resolution can be processed and multi-dimensional position information can be captured and integrated at the same time; applications such as photo-based question answering, AI image generation, making phone calls and sending messages are also supported.
Preferably, the data processing and analysis module works in cooperation with the Doubao visual large model embedding module. On the one hand, it performs further data organization and logical judgment on the recognition and analysis results output by the large model, and associates and integrates related information acquired at different moments and from different angles. On the other hand, it compares real-time monitoring data with historical monitoring data through a built-in algorithm and mines potential safety trends and abnormal changes, such as an abnormal change in how often people have recently gathered in a certain area. A safety risk assessment report is generated on this basis, providing detailed data support for subsequent safety management decisions.
Preferably, the power management module is responsible for providing a stable and reliable power supply for the whole camera system, can be connected to the mains supply and has a built-in standby power supply such as a lithium battery. It provides power monitoring, intelligent switching and energy-saving control: when the mains supply fails it automatically switches to the standby power supply so that the camera keeps working normally, and it can dynamically adjust the power supplied to each module according to actual monitoring requirements.
Preferably, the storage module is equipped with a large-capacity storage medium such as a solid state disk and stores the collected original video image data, the processed key information and the analysis results. It supports classified storage by time, event type and other criteria, which facilitates subsequent query, playback and data tracing, and the stored data can be backed up periodically to prevent loss.
The method of using the safety management camera based on the visual model comprises the following steps:
S1, installation and initialization: the camera is installed at a suitable position that requires safety monitoring, such as a building entrance, a corridor or a key passage of a park, so that the lens field of view of the image acquisition module covers the target monitoring area; the mains supply is connected and the camera is started, whereupon the power management module performs a self-check and supplies power to each module; the camera system starts its initialization program, the communication module automatically connects to a preset network, for example a local area network or the Internet through a 4G/5G network, the storage module completes its formatting preparation and waits for data, and the Doubao visual large model embedding module loads the pre-trained model parameters and enters the ready state;
S2, image acquisition and analysis: the image acquisition module continuously acquires video images of the monitoring area at a set frame rate, such as 25 frames per second, and transmits the real-time image data to the Doubao visual large model embedding module; after receiving the images, the module performs feature extraction, object identification and behavior analysis on each frame, for example identifying the persons in the picture and judging whether their walking direction and behavior are compliant by comparison with a pre-stored authorized-personnel image database, while also identifying the objects in the scene and their states, such as fire-fighting facilities and vehicles, and outputs the corresponding analysis results to the data processing and analysis module;
S3, after the data processing and analysis module collects the analysis results of the visual large model, it on the one hand integrates the related data collected at the same moment by cameras at different angles to construct a complete view of the monitored scene, and on the other hand combines historical monitoring data to judge the safety state of the current monitoring area through a built-in risk assessment algorithm, for example by calculating the probability of abnormal behavior by the persons present and the level of potential safety hazards in the environment, and generates a safety risk assessment report;
S4, information transmission and remote monitoring: the communication module sends the collected original video image data, the analysis results of the visual large model and the generated safety risk assessment reports to the remote monitoring management platform and the mobile terminals of security personnel, either at a set period, such as every 1 minute, or in real time for urgent high-risk events;
S5, alarm triggering and response: when the data processing and analysis module judges that a safety event reaching the preset alarm threshold has occurred in the monitoring area, for example an unauthorized person is detected trying to break into a restricted area or fire smoke appears, the alarm module is started immediately;
S6, data storage and management: the storage module continuously stores the original video image data acquired by the image acquisition module and archives it by time sequence and event label, for example using each alarm event and each daily inspection period as labels, which facilitates subsequent query and playback; it also stores the key information of the processed analysis results and safety risk assessment reports, so that managers can carry out data statistics and trend analysis and use them as the basis for safety management decisions; the stored data is periodically backed up to an external storage device or the cloud to ensure its safety and integrity.
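The timing logic of steps S4 and S5 can be illustrated with a minimal Python sketch. It assumes that the risk assessment of S3 is reduced to a single score between 0 and 1. The names `Assessment` and `Dispatcher`, the 0.8 alarm threshold and the action strings are hypothetical; the method itself only specifies a set period (such as every 1 minute), immediate handling of urgent high-risk events, and a preset alarm threshold.

```python
from dataclasses import dataclass
from typing import List, Optional

REPORT_PERIOD_S = 60.0   # "every 1 minute" from step S4
ALARM_THRESHOLD = 0.8    # hypothetical value for the "preset alarm threshold"


@dataclass
class Assessment:
    timestamp: float  # seconds since start
    risk: float       # hypothetical risk score in [0, 1]
    event: str        # free-text event label


class Dispatcher:
    """Decides, per assessment, whether to raise an alarm and/or transmit."""

    def __init__(self, period_s: float = REPORT_PERIOD_S,
                 threshold: float = ALARM_THRESHOLD) -> None:
        self.period_s = period_s
        self.threshold = threshold
        self.last_sent: Optional[float] = None

    def handle(self, a: Assessment) -> List[str]:
        actions: List[str] = []
        urgent = a.risk >= self.threshold
        due = (self.last_sent is None
               or a.timestamp - self.last_sent >= self.period_s)
        if urgent:
            actions.append("alarm")       # S5: start the alarm module at once
        if urgent or due:
            actions.append("transmit")    # S4: periodic or urgent transmission
            self.last_sent = a.timestamp
        return actions
```

In this sketch an urgent event also resets the reporting period, so a routine report is not sent again immediately after an alarm; a real deployment might prefer to keep the two schedules separate.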
The beneficial effects of the invention are as follows:
1. With the strong image understanding capability of the Doubao visual large model, the safety management camera module and method based on a visual model can identify the elements of a monitored scene with high precision, greatly reducing misjudgments caused by environmental interference, similar-looking objects and the like. Safety-related key information can be captured accurately under different conditions such as daytime, night, and complex indoor and outdoor scenes, which improves the effectiveness of safety monitoring.
2. Through the data processing and analysis module's combined use of the visual large model's output and historical data, the module and method not only reveal the current safety condition but also uncover potential safety hazard trends. The generated safety risk assessment report helps managers prepare a response strategy in advance, turning passive response into active prevention and improving the foresight and rigor of overall safety management.
3. The communication module ensures that monitoring data and analysis results are transmitted to remote terminals in real time, so a manager can follow the situation and issue instructions at any time and from any place without being on site. This greatly improves the convenience and timeliness of safety management work and is particularly suitable for centralized safety management of large or multiple areas.
4. The power management module ensures that the camera works stably in various power supply environments and avoids monitoring gaps caused by power failure, while the storage module provides convenient data storage and query functions that support post-event review, evidence collection and similar work, enhancing the practicality and reliability of the whole safety management system.
5. The alarm module can quickly start the corresponding alarm mode based on accurate safety event judgment and notify the relevant personnel immediately, so that safety risks are kept to a minimum, safety incidents are prevented from escalating, and the safety of people and property in the monitoring area is protected.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a block diagram of the camera of the present invention;
FIG. 2 is a flowchart of the method of using the camera of the present invention.
Detailed Description
To make the technical means, inventive features, objectives and effects of the invention easy to understand, the invention is further described below in connection with specific embodiments.
As shown in FIG. 1 and FIG. 2, a safety management camera module based on a visual model according to an embodiment of the present invention comprises:
The image acquisition module, which consists of a high-definition optical lens and an image sensor component and is responsible for acquiring real-time video image information in the monitoring area;
The Doubao visual large model embedding module, which receives the video image data from the image acquisition module, the visual large model accurately identifying and classifying the objects, people and behaviors in the image;
Qwen2-VL, which recognizes object relations in complex scenes as well as handwritten and multilingual text in images, has strong visual reasoning capability and can solve problems through chart analysis, understands long videos and supports real-time dialogue for multiple applications, supports multiple languages for the convenience of global users, processes images of arbitrary resolution, integrates multi-dimensional information through its architecture, and extends to applications such as photo-based question answering;
The data processing and analysis module, which cooperates with the Doubao visual large model embedding module to further sort, judge and integrate that module's output, compares real-time and historical monitoring data through an algorithm, mines safety trends and anomalies, generates an assessment report and assists safety management decisions;
The communication module, which supports multiple communication protocols such as Wi-Fi, Ethernet and 4G/5G and can transmit, in real time, the video image data collected by the camera, the analysis results of the Doubao visual large model and the safety risk assessment reports generated by the data processing and analysis module to a remote monitoring management platform or to the mobile terminals of security personnel, ensuring timely delivery of safety information and allowing managers to keep track of the monitored area remotely and respond quickly;
The power management module, which provides a stable and reliable power supply for the whole camera system, can be connected to the mains supply, has a built-in standby power supply such as a lithium battery, and provides power monitoring, intelligent switching and energy-saving control;
The storage module, which is equipped with a large-capacity storage medium such as a solid state disk and mainly stores the original video image data acquired by the camera together with the key information and analysis results produced at each processing stage;
The alarm module, which cooperates with the data processing and analysis module to monitor the safety condition of the monitoring area, triggers multiple alarm modes, deters potentially dangerous persons by emitting sound and light signals, and pushes messages, voice prompts and the like to the manager's mobile terminal.
The image acquisition module can adjust focal length, aperture and the like, so that it can acquire clear images at different distances and under different lighting conditions, and the acquired image data is transmitted to the subsequent module in real time as a digital signal for processing.
The Doubao visual large model embedding module contains a pre-trained Doubao visual large model. The model is obtained by deep training on massive image data covering different scenes, objects and people, and has strong image feature extraction, semantic understanding and pattern recognition capability. After receiving the video image data from the image acquisition module, it can accurately identify and classify the objects, people and behaviors in the image, for example distinguishing normal pedestrians, workers and suspicious persons, recognizing specific dangerous objects and abnormal scene arrangements such as a blocked fire escape route, and analyzing the movement trajectory and posture of each person to judge whether abnormal behaviors such as climbing or fighting are present.
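How a person in the frame might be separated into "authorized" and "unknown" can be sketched as follows. Step S2 of the method mentions comparison with a pre-stored authorized-personnel image database; the sketch assumes, hypothetically, that the visual large model yields one feature vector per detected person and that the database holds one reference vector per authorized person. Cosine similarity, the 0.9 threshold and the function names are illustrative choices and are not specified by the invention.

```python
import math
from typing import Dict, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_authorized(embedding: Sequence[float],
                     database: Dict[str, Sequence[float]],
                     min_similarity: float = 0.9) -> Optional[str]:
    """Return the best-matching authorized person ID, or None if nobody
    in the database is similar enough (the person is then 'unknown')."""
    best_id: Optional[str] = None
    best_score = min_similarity
    for person_id, reference in database.items():
        score = cosine(embedding, reference)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id


# Hypothetical two-dimensional embeddings, for illustration only.
authorized = {"worker-01": [1.0, 0.0], "worker-02": [0.0, 1.0]}
print(match_authorized([0.99, 0.05], authorized))  # close to worker-01
print(match_authorized([0.7, 0.7], authorized))    # no match: unknown person
```

Returning `None` rather than a default identity keeps the "unknown person" case explicit, so the downstream analysis can treat it as a candidate for further checks instead of silently accepting it.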
Qwen2-VL provides:
Strong recognition capability: it can accurately recognize multiple objects and their relations in a complex scene, and can recognize handwritten text and multilingual text in images, including most European languages, Japanese, Korean and Arabic;
Strong visual reasoning capability: it can solve complex mathematical problems through chart analysis, extract information from real-world images and charts, and follow instructions more closely to solve practical problems;
Long video understanding and real-time dialogue: it can understand video content longer than 20 minutes, can continuously provide information and support in a real-time dialogue, and can be applied to video question answering, dialogue, content creation and the like;
Multilingual support: besides common English and Chinese, it supports understanding image text in multiple languages, which is convenient for global users;
Architectural innovation: a serial structure of ViT plus Qwen2 is adopted, with support for native dynamic resolution and multimodal rotary position embedding, so that image input of any resolution can be processed and multi-dimensional position information can be captured and integrated at the same time; applications such as photo-based question answering, AI image generation, making phone calls and sending messages are also supported.
The data processing and analysis module works in cooperation with the Doubao visual large model embedding module. It performs further data organization and logical judgment on the recognition and analysis results output by the large model, and associates and integrates related information acquired at different moments and from different angles. It also compares real-time monitoring data with historical monitoring data through a built-in algorithm and mines potential safety trends and abnormal changes, such as an abnormal change in how often people have recently gathered in a certain area. A safety risk assessment report is generated on this basis, providing detailed data support for subsequent safety management decisions.
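The built-in comparison algorithm is not specified further. As one possible illustration, the sketch below flags an abnormal gathering frequency by computing a z-score of the current count against historical counts for the same area. The z-score test, the threshold of 3 and the function name `gathering_anomaly` are assumptions made for this example only.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def gathering_anomaly(history: Sequence[float], current: float,
                      z_threshold: float = 3.0) -> Tuple[bool, float]:
    """Compare the current gathering count with historical counts.

    Returns (is_anomalous, z), where z is the number of sample standard
    deviations by which `current` exceeds the historical mean.
    """
    if len(history) < 2:
        # Not enough history to estimate a spread; report nothing unusual.
        return False, 0.0
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly constant history: any increase counts as a deviation.
        return current > mu, float("inf") if current > mu else 0.0
    z = (current - mu) / sigma
    return z > z_threshold, z


# Hypothetical daily gathering counts for one area.
past_counts = [2, 3, 2, 3, 2, 3, 2, 3]
flag, z = gathering_anomaly(past_counts, 9)
print(flag, round(z, 2))
```

A flagged result would then feed into the safety risk assessment report; a production system would likely also account for time of day and day of week, which this sketch ignores.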
The power management module is responsible for providing a stable and reliable power supply for the whole camera system, can be connected to the mains supply and has a built-in standby power supply such as a lithium battery. It provides power monitoring, intelligent switching and energy-saving control: when the mains supply fails it automatically switches to the standby power supply so that the camera keeps working normally, and it can dynamically adjust the power supplied to each module according to actual monitoring requirements.
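The switching and energy-saving behavior can be pictured as a small decision function. This is a sketch under stated assumptions: the `PowerState` type, the 20 % low-battery level and the rule "save energy on battery when the scene is idle or the battery is low" are hypothetical and only illustrate the kind of policy the description allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    source: str      # "mains", "battery" or "none"
    low_power: bool  # True when non-essential load should be reduced


def decide_power(mains_ok: bool, battery_pct: float, activity: bool,
                 low_battery_pct: float = 20.0) -> PowerState:
    """Pick the power source and whether to enter energy-saving mode."""
    if mains_ok:
        # Mains available: run at full capability.
        return PowerState("mains", low_power=False)
    if battery_pct <= 0:
        # Neither mains nor standby power is available.
        return PowerState("none", low_power=True)
    # Mains failed: switch to the standby battery automatically and save
    # energy when the battery is low or nothing is happening in the scene.
    save = battery_pct < low_battery_pct or not activity
    return PowerState("battery", low_power=save)
```

For example, `decide_power(False, 80.0, True)` keeps full operation on battery while activity is detected, whereas `decide_power(False, 80.0, False)` switches to the energy-saving mode.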
The storage module is equipped with a large-capacity storage medium such as a solid state disk and stores the collected original video image data, the processed key information and the analysis results. It supports classified storage by time, event type and other criteria, which facilitates subsequent query, playback and data tracing, and the stored data can be backed up periodically to prevent loss.
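Classified storage by time and event type amounts to keeping an index over the recorded clips. The following minimal sketch uses an in-memory index; the class name `ClipIndex`, the event labels and the file names are hypothetical, and a real device would persist the index alongside the video files.

```python
from collections import defaultdict
from datetime import datetime
from typing import DefaultDict, List, Tuple


class ClipIndex:
    """In-memory index of recorded clips, keyed by event type and time."""

    def __init__(self) -> None:
        self._by_event: DefaultDict[str, List[Tuple[datetime, str]]] = (
            defaultdict(list)
        )

    def add(self, start: datetime, event_type: str, path: str) -> None:
        """Register a clip under its event label and start time."""
        self._by_event[event_type].append((start, path))

    def query(self, event_type: str, since: datetime,
              until: datetime) -> List[str]:
        """Return clip paths of one event type within [since, until],
        ordered by start time."""
        clips = sorted(self._by_event.get(event_type, []))
        return [path for start, path in clips if since <= start <= until]


index = ClipIndex()
index.add(datetime(2025, 1, 1, 8, 0), "alarm", "alarm_0800.mp4")
index.add(datetime(2025, 1, 1, 9, 0), "patrol", "patrol_0900.mp4")
index.add(datetime(2025, 1, 2, 8, 0), "alarm", "alarm_day2.mp4")
print(index.query("alarm", datetime(2025, 1, 1), datetime(2025, 1, 1, 23, 59)))
```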
The modules of the whole camera system work cooperatively: the image acquisition module continuously acquires images; after in-depth analysis by the Doubao visual large model embedding module and the data processing and analysis module, the communication module transmits information outward, the storage module stores data, the power management module guarantees the power supply, and the alarm module raises an alarm in time when necessary, thereby forming a complete intelligent safety management monitoring system.
A method of using the safety management camera based on a visual model, characterized by comprising the following steps:
S1, installation and initialization: the camera is installed at a suitable position requiring safety monitoring, such as a building entrance or exit, a corridor, or a key passage area of a park, ensuring that the lens field of view of the image acquisition module covers the target monitoring area; mains power is connected and the camera is switched on, whereupon the power management module performs a self-check and supplies power to each module normally; the camera system starts its initialization program, the communication module automatically connects to a preset network, for example a local area network or the Internet via a 4G/5G network, the storage module completes its formatting preparation and waits for data to store, and the Doubao visual large model embedding module loads the pre-trained model parameters and enters the ready state;
S2, image acquisition and analysis: the image acquisition module continuously acquires video images of the monitoring area at a set frame rate, such as 25 frames per second, and transmits the real-time image data to the Doubao visual large model embedding module; after receiving the images, the visual large model embedding module performs feature extraction, object recognition and behavior analysis on each frame, for example identifying the persons in the picture by comparison with a pre-stored image database of authorized personnel and judging whether their walking direction and behavior are compliant, while also identifying the various objects in the scene and their states, such as fire-fighting facilities and vehicles, and outputs the corresponding analysis results to the data processing and analysis module;
S3, data processing and comprehensive judgment: after the data processing and analysis module collects the analysis results of the visual large model, on the one hand it integrates the related data collected at the same moment by cameras at different angles to construct a complete view of the monitored scene; on the other hand it combines historical monitoring data and judges the safety state of the current monitoring area through a built-in risk assessment algorithm, for example by calculating the probability of abnormal behavior by persons currently present and the level of safety hazards in the environment, and generates a safety risk assessment report;
S4, information transmission and remote monitoring: the communication module sends the collected original video image data, the analysis results of the visual large model and the generated safety risk assessment report to a remote monitoring management platform and to the mobile terminals of security personnel, either according to a set period, such as every 1 minute, or in real time for urgent high-risk events;
S5, alarm triggering and response: when the data processing and analysis module determines that a safety event reaching a preset alarm threshold has occurred in the monitoring area, for example an unauthorized person attempting to break into a restricted area or the occurrence of fire or smoke, the alarm module is started immediately;
S6, data storage and management: the storage module continuously stores the original video image data acquired by the image acquisition module and archives it by time sequence and event label, for example labelled by alarm event or daily patrol period, to facilitate subsequent query and playback; it also stores the processed analysis results and the key information of the safety risk assessment reports, so that managers can carry out data statistics and trend analysis and use them as a basis for safety management decisions; the stored data is periodically backed up to an external storage device or the cloud to ensure the safety and integrity of the data.
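The decision logic of steps S2 to S5 can be sketched as below. This is a sketch under stated assumptions, not the method of the application: the cosine-similarity match against authorized personnel, the way the two risk inputs are combined, the thresholds `MIN_SIMILARITY` and `ALARM_THRESHOLD`, and all function names are hypothetical. Only the 1-minute reporting period comes from the text.

```python
import math
from dataclasses import dataclass

MIN_SIMILARITY = 0.8    # hypothetical match threshold for S2
ALARM_THRESHOLD = 0.8   # hypothetical preset alarm threshold for S5
REPORT_PERIOD_S = 60    # "every 1 minute" from step S4


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def is_authorized(features: list, authorized_db: list) -> bool:
    """S2: compare a person's features with the pre-stored authorized database."""
    return any(cosine(features, ref) >= MIN_SIMILARITY for ref in authorized_db)


@dataclass
class Assessment:
    abnormal_prob: float   # probability of abnormal behavior, 0 to 1
    hazard_level: int      # environmental hazard level, 0 (none) to 3 (severe)

    @property
    def risk(self) -> float:
        """S3: take the worse of the two indicators as the overall risk."""
        return max(self.abnormal_prob, self.hazard_level / 3)


def should_transmit(a: Assessment, seconds_since_last: float) -> bool:
    """S4: send immediately for high risk, otherwise once per reporting period."""
    return a.risk >= ALARM_THRESHOLD or seconds_since_last >= REPORT_PERIOD_S


def alarm_actions(a: Assessment, location: str, event_type: str) -> list:
    """S5: list the alarm actions to trigger, or nothing below the threshold."""
    if a.risk < ALARM_THRESHOLD:
        return []
    return ["sound_light_alarm", f"notify: {event_type} at {location}"]
```

For example, `alarm_actions(Assessment(0.9, 0), "gate 1", "intrusion")` yields both the sound and light alarm and a notification, while an assessment below the threshold yields no action and is only reported on the regular period.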
The foregoing has shown and described the basic principles, principal features and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above; the embodiments and the description merely illustrate the principles of the invention, and various changes and modifications may be made without departing from its spirit and scope. The scope of the invention is defined by the appended claims and their equivalents.

Claims (9)

1. A safety management camera module based on a visual model, characterized in that the safety management camera module comprises:
an image acquisition module, composed of a high-definition optical lens and image sensor components, responsible for collecting real-time video image information within the monitoring area;
a Doubao visual large model embedding module, in which, after video image data is received from the image acquisition module, the visual large model can accurately identify and classify the objects, persons and behavioral actions in the image;
Qwen2-VL, which recognizes object relationships in complex scenes as well as handwritten and multilingual image text, has excellent visual reasoning ability and can solve problems through chart analysis, can understand long videos and supports real-time dialogue for multiple applications, supports multiple languages for the convenience of global users, can process images of any resolution and integrate multi-dimensional information by virtue of its innovative architecture, and has extended application capabilities such as photo-based question answering;
a data processing and analysis module, which, in cooperation with the Doubao visual large model embedding module, further organizes and judges the output results of that module and associates and integrates information from multiple times and locations, and also compares real-time and historical monitoring data through algorithms, mines safety trends and anomalies, and generates assessment reports to assist safety management decision-making;
a communication module, which supports multiple communication protocols, such as Wi-Fi, Ethernet and 4G/5G, and can transmit in real time the video image data collected by the camera, the analysis results of the Doubao visual large model and the safety risk assessment report generated by the data processing and analysis module to a remote monitoring management platform or to the mobile terminals of security personnel, ensuring timely delivery of safety information and enabling managers to keep track of the monitoring area remotely and respond quickly;
a power management module, which provides a stable and reliable power supply for the entire camera system, can be connected to mains power and has a built-in backup power supply such as a lithium battery, and has power monitoring, intelligent switching and energy-saving control functions;
a storage module, equipped with a large-capacity storage medium, such as a solid-state drive, mainly used to store the original video image data collected by the camera as well as the key information and analysis results obtained after processing at each stage;
an alarm module, which cooperates with the data processing and analysis module to monitor the safety status of the monitoring area, and triggers multiple alarm modes, deterring potentially dangerous persons by emitting sound and light signals while pushing messages, voice prompts and the like to the mobile terminals of managers.

2. The safety management camera module based on a visual model according to claim 1, characterized in that the image acquisition module has functions such as adjustable focal length and aperture, can meet the need to collect clear images at different distances and under different lighting conditions, and transmits the collected image data in real time, in the form of digital signals, to the subsequent modules for processing.

3. The safety management camera module based on a visual model according to claim 2, characterized in that the Doubao visual large model embedding module has a built-in pre-trained Doubao visual large model, which is deeply trained on massive image data covering different scenes, various objects and human behaviors, and has powerful image feature extraction, semantic understanding and pattern recognition capabilities; after receiving the video image data from the image acquisition module, the visual large model can accurately identify and classify the objects, persons and behavioral actions in the image, for example accurately distinguishing normal pedestrians, staff and suspicious persons, and identifying specific dangerous items and abnormal scene arrangements, such as a blocked fire passage; it can also analyze the behavior trajectories and postures of persons to judge whether abnormal behavior, such as climbing or fighting, is present.

4. The safety management camera module based on a visual model according to claim 3, characterized in that Qwen2-VL comprises:
powerful recognition capability, able to accurately identify multiple objects and their relationships in complex scenes, and to recognize handwritten text and multilingual image text including most European languages, Japanese, Korean and Arabic;
excellent visual reasoning ability, able to solve complex mathematical problems through chart analysis, to extract information from real-world images and charts, and to follow instructions better when solving practical problems;
long video understanding and real-time dialogue, able to understand video content longer than 20 minutes and to provide information and support continuously in real-time dialogue, applicable to video question answering, dialogue and content creation;
multi-language support, understanding image text in multiple languages in addition to the common English and Chinese, for the convenience of global users;
architectural innovation, adopting a serial structure of ViT plus Qwen2 and supporting native dynamic resolution and multimodal rotary position embedding technology, able to process image input of any resolution and to simultaneously capture and integrate multi-dimensional position information, photo-based question answering, AI image generation, making phone calls and sending messages.

5. The safety management camera module based on a visual model according to claim 4, characterized in that the data processing and analysis module works together with the Doubao visual large model embedding module; on the one hand, it performs further data organization and logical judgment on the recognition and analysis results output by the large model, and associates and integrates related information collected at different times and from different angles; on the other hand, it compares real-time monitoring data with historical monitoring data through a built-in algorithm to uncover potential safety trends and abnormal changes, such as a recent change in the frequency of abnormal gatherings of people in a certain area, and generates a safety risk assessment report on that basis, providing detailed data support for subsequent safety management decisions.

6. The safety management camera module based on a visual model according to claim 5, characterized in that the power management module is responsible for providing a stable and reliable power supply for the entire camera system, can be connected to mains power and has a built-in backup power supply, such as a lithium battery, and has power monitoring, intelligent switching and energy-saving control functions; when mains power is cut off it automatically switches to the backup power supply to keep the camera working normally, and it can dynamically adjust the power supplied to each module according to actual monitoring requirements.

7. The safety management camera module based on a visual model according to claim 6, characterized in that the storage module is equipped with a large-capacity storage medium, such as a solid-state drive, for storing the collected original video image data as well as the processed key information and analysis results; it supports classified storage of data in multiple ways, such as by time and by event type, which facilitates subsequent query, playback and data tracing operations, and the stored data can be backed up periodically to prevent loss.

8. The safety management camera module based on a visual model according to claim 7, characterized in that the alarm module is connected to the data processing and analysis module; when, based on the analysis and judgment of the Doubao visual large model and the comprehensive data processing results, it is determined that a safety event reaching a preset danger level has occurred in the monitoring area, such as illegal intrusion or a fire hazard, the module can trigger multiple alarm modes, including emitting sound and light alarm signals to deter potentially dangerous persons and sending alarm notifications, such as push messages and voice prompts, to the mobile terminals of the relevant managers, ensuring that safety events are handled in time.

9. A method of using a safety management camera based on a visual model, employing the safety management camera module based on a visual model according to any one of claims 1 to 8, characterized in that the method comprises the following steps:
S1, installation and initialization: the camera is installed at a suitable position requiring safety monitoring, such as a building entrance or exit, a corridor, or a key passage area of a park, ensuring that the lens field of view of the image acquisition module covers the target monitoring area; mains power is connected and the camera is switched on, whereupon the power management module performs a self-check and supplies power to each module normally; the camera system starts its initialization program, the communication module automatically connects to a preset network, for example a local area network or the Internet via a 4G/5G network, the storage module completes its formatting preparation and waits for data to store, and the Doubao visual large model embedding module loads the pre-trained model parameters and enters the ready state;
S2, image acquisition and analysis: the image acquisition module continuously acquires video images of the monitoring area at a set frame rate, such as 25 frames per second, and transmits the real-time image data to the Doubao visual large model embedding module; after receiving the images, the visual large model embedding module performs feature extraction, object recognition and behavior analysis on each frame, for example identifying the persons in the picture by comparison with a pre-stored image database of authorized personnel and judging whether their walking direction and behavior are compliant, while also identifying the various objects in the scene and their states, such as fire-fighting facilities and vehicles, and outputs the corresponding analysis results to the data processing and analysis module;
S3, data processing and comprehensive judgment: after the data processing and analysis module collects the analysis results of the visual large model, on the one hand it integrates the related data collected at the same moment by cameras at different angles to construct a complete view of the monitored scene; on the other hand it combines historical monitoring data and judges the safety state of the current monitoring area through a built-in risk assessment algorithm, for example by calculating the probability of abnormal behavior by persons currently present and the level of safety hazards in the environment, and generates a safety risk assessment report; for example, if a large number of unfamiliar persons are found gathering in an area within a short time and behaving abnormally, the situation is compared with the historical normal foot-traffic data for that area, judged to be a higher-risk event and marked for priority attention;
S4, information transmission and remote monitoring: the communication module sends the collected original video image data, the analysis results of the visual large model and the generated safety risk assessment report to a remote monitoring management platform and to the mobile terminals of security personnel, either according to a set period, such as every 1 minute, or in real time for urgent high-risk events; through a mobile app or the interface of the monitoring management platform, security personnel view the monitoring picture, analysis results and risk assessment of each camera in real time and supervise the monitoring area remotely, and if a suspicious situation is found they can also remotely control the camera to zoom and turn in order to obtain clearer and more accurate image information;
S5, alarm triggering and response: when the data processing and analysis module determines that a safety event reaching a preset alarm threshold has occurred in the monitoring area, for example an unauthorized person attempting to break into a restricted area or the occurrence of fire or smoke, the alarm module is started immediately; the sound and light alarm emits a strong sound and light signal to deter potentially dangerous persons, and at the same time an alarm notification containing details of the event, such as its location and type, is sent through the communication module to the mobile terminals of security personnel and the relevant persons in charge, who can then quickly go to the scene to deal with it or remotely direct and dispatch the appropriate response measures;
S6, data storage and management: the storage module continuously stores the original video image data acquired by the image acquisition module and archives it by time sequence and event label, for example labelled by alarm event or daily patrol period, to facilitate subsequent query and playback; it also stores the processed analysis results and the key information of the safety risk assessment reports, so that managers can carry out data statistics and trend analysis and use them as a basis for safety management decisions; the stored data is periodically backed up to an external storage device or the cloud to ensure the safety and integrity of the data.
CN202510148783.6A 2025-02-11 2025-02-11 Safety management camera module and method based on visual model Pending CN120075566A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510148783.6A CN120075566A (en) 2025-02-11 2025-02-11 Safety management camera module and method based on visual model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510148783.6A CN120075566A (en) 2025-02-11 2025-02-11 Safety management camera module and method based on visual model

Publications (1)

Publication Number Publication Date
CN120075566A true CN120075566A (en) 2025-05-30

Family

ID=95799349

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510148783.6A Pending CN120075566A (en) 2025-02-11 2025-02-11 Safety management camera module and method based on visual model

Country Status (1)

Country Link
CN (1) CN120075566A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120957010A (en) * 2025-10-17 2025-11-14 北京中科辉丰科技有限公司 Environmental Risk AI Visual Early Warning Terminal
CN120957010B (en) * 2025-10-17 2026-02-10 北京中科辉丰科技有限公司 Environmental Risk AI Visual Early Warning Terminal
CN121259511A (en) * 2025-12-08 2026-01-02 北京淘车科技有限公司 A multi-dimensional method and system for quality inspection and review of vehicle images.

Similar Documents

Publication Publication Date Title
CN103268680B (en) A kind of family intelligent monitoring burglary-resisting system
CN100504942C (en) Intelligent video monitoring equipment module and system and monitoring method thereof
CN103108159B (en) Electric power intelligent video analyzing and monitoring system and method
CN120075566A (en) Safety management camera module and method based on visual model
CN104079874B (en) A kind of security protection integral system and method based on technology of Internet of things
WO2021253961A1 (en) Intelligent visual perception system
CN116129490A (en) Monitoring device and monitoring method for complex environment behavior recognition
CN112112629A (en) A safety business management system and method during drilling operation
KR101036947B1 (en) Automated security system for crime and accident prevention using computer image analysis technology
CN106059868A (en) Home intelligent video monitoring protection system
CN106408833A (en) Perimeter intrusion detection method and system
CN115660297A (en) A construction site safety automatic AI early warning system and method
CN112785809B (en) Fire re-ignition prediction method and system based on AI image recognition
CN106454253A (en) Method and system for detecting area wandering
KR20200052418A (en) Automated Violence Detecting System based on Deep Learning
CN111626636A (en) Industrial safety emergency management platform based on big data analysis technology
KR20200017594A (en) Method for Recognizing and Tracking Large-scale Object using Deep learning and Multi-Agent
CN112836689A (en) Dangerous area personnel management and control system and method based on image recognition
Dawwd Embedded real-time video surveillance system based on multi-sensor and visual tracking
CN113537790A (en) Super-fusion management platform
CN120166195A (en) An unattended system based on AI model and video analysis
CN113392706A (en) Device and method for detecting smoking and using mobile phone behaviors
KR20230097854A (en) Method and system for recognizing dangerous behavior of workers in power plant
Nagamani et al. Anti-theft monitoring for a smart home
KR102891317B1 (en) Water facility Intelligent video safety management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination