CN112200067B - Intelligent video event detection method, system, electronic equipment and storage medium

Info

Publication number: CN112200067B
Application number: CN202011072563.3A
Authority: CN
Inventors: 何颂颂; 陶剑文; 但雨芳; 季谋
Original assignee: Ningbo Polytechnic
Current assignee: Ningbo Huantong Information Technology Co ltd
Priority date: 2020-10-09
Filing date: 2020-10-09
Publication date: 2024-02-02
Anticipated expiration: 2040-10-09
Also published as: CN112200067A

Abstract

The invention provides an intelligent video event detection method, system, electronic device and storage medium. The method obtains a video source from a database, tags the video source, and determines the corresponding preset video template according to the tag of the video source. It then determines the key frames of the video source, splits the video source into several video clips with the same number of frames according to the key frames, scores and ranks the video clips against the preset video template using a preset detection model, and determines the highest-scoring video clip as the target video clip. When the highest score is greater than or equal to a preset threshold, the frames in the target video clip that have the same timestamps as the marked frames in the preset video template are determined as event evidence images, and the corresponding event type is determined at the same time. Events in a video can thus be detected automatically and the detection result output. Because the detection result contains key image evidence of the event, the efficiency of event detection is improved and the detection is more accurate and credible.

Description

Intelligent video event detection method, system, electronic device and storage medium

Technical field

The invention belongs to the field of computer technology, and in particular relates to an intelligent video event detection method, system, electronic device and storage medium.

Background

As social and economic conditions improve, intelligently detecting video content and applying the result to specific scenarios in order to judge events in those scenarios has become a popular technology. Examples include video detection of traffic accidents, video surveillance for theft prevention and public security, and traffic flow statistics. Intelligent video event detection can greatly improve detection efficiency.

At present, the recognition and detection of video events relies mainly on manual screening and browsing, or only part of the video events can be recognized automatically and the rest still has to be screened manually. The video content cannot be identified intelligently and quickly in order to determine the event information of a specific scene, so detection efficiency is low.

Summary of the invention

A first object of the embodiments of the present invention is to provide an intelligent video event detection method that addresses the low efficiency of current video event detection.

The embodiment of the present invention is implemented as follows. An intelligent video event detection method includes:

obtaining a video source from a database, tagging the video source, and determining the corresponding preset video template according to the tag of the video source;

determining the key frames of the video source, splitting the video source into several video clips with the same number of frames according to the key frames, scoring and ranking the video clips against the preset video template using a preset detection model, and determining the highest-scoring video clip as the target video clip;

when the highest score is greater than or equal to a preset threshold, determining the frames in the target video clip that have the same timestamps as the marked frames in the preset video template as event evidence images, and determining the corresponding event type at the same time.

In one embodiment, obtaining the video source from the database, tagging the video source, and determining the corresponding preset video template according to the tag of the video source includes: obtaining a video to be detected from the system database, tagging the video source according to the basic information of the video to be detected, and selecting the corresponding preset video template from a preset video template library according to the tag of the video source. The video to be detected is the video with the earliest shooting time among all stored videos awaiting detection. The basic information includes the video shooting location, the video shooting time, and the device information of the corresponding video shooting device.

In one embodiment, the preset video template includes several tagged videos that have the same number of frames as the target video clip, the frames of the preset video template have been converted to grayscale, and the tag of the preset video template includes location information.

In one embodiment, selecting the corresponding preset video template from the preset video template library according to the tag of the video source includes: traversing the preset video template library according to the location information of the video source, and determining the preset video template whose location information is the same as that of the video source as the corresponding preset video template.

In one embodiment, scoring and ranking the video clips against the preset video template using the preset detection model includes: converting each frame of the video clip to grayscale; using an image similarity algorithm model, computing the similarity between each frame of the video clip and the frame of the preset video template with the same timestamp to obtain a per-frame similarity; computing a weighted combination over all frames of the video clip to obtain the similarity between the video clip and the preset video template; and scoring and ranking the video clips according to that similarity. The higher the similarity, the higher the score, and the higher the score, the higher the rank.
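The scoring embodiment above can be illustrated with a minimal sketch. The text does not name the image similarity algorithm or the frame weights, so the sketch assumes a simple stand-in (one minus the normalized mean absolute difference of gray levels), treats the frame index within a clip as its timestamp, and takes the weights as a caller-supplied list. The names `to_gray`, `frame_similarity`, `clip_score` and `rank_clips` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]        # (R, G, B), each in 0..255
Frame = List[List[Pixel]]           # rows of RGB pixels
GrayFrame = List[List[float]]       # rows of gray levels in 0..255


def to_gray(frame: Frame) -> GrayFrame:
    """Grayscale conversion using the common BT.601 luma weights (an assumption)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in frame]


def frame_similarity(a: GrayFrame, b: GrayFrame) -> float:
    """Stand-in similarity in [0, 1]: 1 minus the mean absolute difference over 255."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one pixel")
    return 1.0 - total / (255.0 * count)


def clip_score(clip: Sequence[Frame], template: Sequence[GrayFrame],
               weights: Sequence[float]) -> float:
    """Weighted average of per-frame similarities between a clip and the template."""
    if not (len(clip) == len(template) == len(weights)):
        raise ValueError("clip, template and weights must have the same length")
    sims = [frame_similarity(to_gray(f), t) for f, t in zip(clip, template)]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)


def rank_clips(clips: Sequence[Sequence[Frame]], template: Sequence[GrayFrame],
               weights: Sequence[float]) -> List[Tuple[int, float]]:
    """Return (clip index, score) pairs, highest score first."""
    scored = [(i, clip_score(c, template, weights)) for i, c in enumerate(clips)]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

In practice a structural or histogram-based similarity measure could replace `frame_similarity` without changing the ranking logic, since only the per-frame score feeds the weighted combination.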

Another object of the embodiments of the present invention is to provide an intelligent video event detection system, including:

a video acquisition unit, configured to obtain a video source from a database, tag the video source, and determine the corresponding preset video template according to the tag of the video source;

a target video determination unit, configured to determine the key frames of the video source, split the video source into several video clips with the same number of frames according to the key frames, score and rank the video clips against the preset video template using a preset detection model, and determine the highest-scoring video clip as the target video clip;

a detection result determination unit, configured to, when the highest score is greater than or equal to a preset threshold, determine the frames in the target video clip that have the same timestamps as the marked frames in the preset video template as event evidence images, and determine the corresponding event type at the same time.

A further object of the embodiments of the present invention is to provide an electronic device including a memory and a processor. A computer program is stored in the memory, and when the computer program is executed by the processor, it causes the processor to perform the steps of the intelligent video event detection method.

Yet another object of the embodiments of the present invention is to provide a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, it causes the processor to perform the steps of the intelligent video event detection method.

The intelligent video event detection method provided by the embodiments of the present invention obtains a video source from a database, tags the video source, and determines the corresponding preset video template according to the tag of the video source; determines the key frames of the video source, splits the video source into several video clips with the same number of frames according to the key frames, scores and ranks the video clips against the preset video template using a preset detection model, and determines the highest-scoring video clip as the target video clip; and, when the highest score is greater than or equal to a preset threshold, determines the frames in the target video clip that have the same timestamps as the marked frames in the preset video template as event evidence images, and determines the corresponding event type at the same time. Events in a video can thus be detected automatically and the detection result output, and the detection result contains key image evidence of the event. This improves the efficiency of event detection on the one hand and makes the detection more accurate and credible on the other.

Brief description of the drawings

Figure 1 is a flowchart of an intelligent video event detection method provided by an embodiment of the present invention;

Figure 2 is a schematic diagram of the main modules of an intelligent video event detection system provided by an embodiment of the present invention;

Figure 3 is a diagram of an exemplary system architecture to which embodiments of the present invention can be applied;

Figure 4 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server according to an embodiment of the present invention.

Detailed description

To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.

The terminology used in the embodiments of the present invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in the embodiments and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, etc. may be used in the embodiments of the present invention to describe various information, the information should not be limited by these terms. These terms are used only to distinguish information of the same type from one another.

It should be noted that, provided there is no conflict, the embodiments of the present invention and the features of the embodiments can be combined with one another.

To further explain the technical means adopted by the present invention to achieve its intended purpose and their effects, the specific implementation, structure, features and effects of the invention are described in detail below with reference to the drawings and preferred embodiments.

Figure 1 shows the flow of an intelligent video event detection method provided by an embodiment of the present invention. For ease of explanation, only the parts related to the embodiment are shown. The details are as follows:

An intelligent video event detection method includes:

S101: obtaining a video source from a database, tagging the video source, and determining the corresponding preset video template according to the tag of the video source;

S102: determining the key frames of the video source, splitting the video source into several video clips with the same number of frames according to the key frames, scoring and ranking the video clips against the preset video template using a preset detection model, and determining the highest-scoring video clip as the target video clip;

S103: when the highest score is greater than or equal to a preset threshold, determining the frames in the target video clip that have the same timestamps as the marked frames in the preset video template as event evidence images, and determining the corresponding event type at the same time.

In step S101, a video source is obtained from the database, the video source is tagged, and the corresponding preset video template is determined according to the tag of the video source. In this way the video captured by a video shooting device is obtained for detection, and tagging the obtained video source makes it possible to select the preset video template that corresponds to the tag from the preset video template library.

In one embodiment, obtaining the video source from the database, tagging the video source, and determining the corresponding preset video template according to the tag of the video source includes: obtaining a video to be detected from the system database, tagging the video source according to the basic information of the video to be detected, and selecting the corresponding preset video template from the preset video template library according to the tag of the video source. The video to be detected is the video with the earliest shooting time among all stored videos awaiting detection. The basic information includes the video shooting location, the video shooting time, and the device information of the corresponding video shooting device. The tag therefore records the basic information of the video source, such as its origin (device information, that is, which device shot it), its geographical location and its shooting time. The tag can be used to select the matching preset video template from the preset video template library, to determine the specific geographical location associated with a detected event type, and, when an event needs to be traced back, to trace it to the origin of the video source recorded in the tag.

Specifically, suppose a video source is obtained that was shot by device A at intersection B of Street X at 9 a.m. on March 2, 2020. The video source can then be tagged "Street X intersection B, March 2, 9 a.m., A", and the video template in the preset video template library whose tag is "Street X intersection B" is matched as the corresponding preset template.
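The tagging and template lookup in this example can be sketched as follows. This is an illustration only: the record layouts `VideoTag` and `VideoTemplate`, the `event_type` field and the function names are hypothetical, and the lookup assumes that location strings are compared for exact equality, as the traversal embodiment suggests.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class VideoTag:
    location: str        # video shooting location
    shot_at: datetime    # video shooting time
    device: str          # device information of the shooting device


@dataclass(frozen=True)
class VideoTemplate:
    location: str        # location information carried in the template's tag
    event_type: str      # hypothetical field: event the template represents


def next_video(pending: List[VideoTag]) -> VideoTag:
    """Pick the pending video with the earliest shooting time."""
    return min(pending, key=lambda tag: tag.shot_at)


def select_template(tag: VideoTag, library: List[VideoTemplate]) -> Optional[VideoTemplate]:
    """Traverse the library and return the first template with the same location."""
    for template in library:
        if template.location == tag.location:
            return template
    return None  # no template is registered for this location


# Usage with the values from the example above.
tag = VideoTag("Street X intersection B", datetime(2020, 3, 2, 9, 0), "A")
library = [
    VideoTemplate("Street Y intersection C", "accident"),
    VideoTemplate("Street X intersection B", "line-crossing violation"),
]
matched = select_template(tag, library)
```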

Thus, a video source is obtained from the database, the video source is tagged, and the corresponding preset video template is determined according to the tag of the video source; the key frames of the video source are determined, the video source is split into several video clips with the same number of frames according to the key frames, the video clips are scored and ranked against the preset video template using a preset detection model, and the highest-scoring video clip is determined as the target video clip; and, when the highest score is greater than or equal to a preset threshold, the frames in the target video clip that have the same timestamps as the marked frames in the preset video template are determined as event evidence images, and the corresponding event type is determined at the same time. The intelligent video event detection method can therefore detect events in a video automatically and output the detection result, and the detection result contains key image evidence of the event. This improves the efficiency of event detection on the one hand and makes the detection more accurate and credible on the other.

In step S102, the key frames of the video source are determined, the video source is split into several video clips with the same number of frames according to the key frames, the video clips are scored and ranked against the preset video template using a preset detection model, and the highest-scoring video clip is determined as the target video clip. This makes it possible to determine whether the video source contains a video matching the event type, and prepares for the subsequent processing.
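The text does not spell out how the key frames determine the clip boundaries. As one possible reading, the sketch below assumes that each key frame index starts a clip, that every clip has a fixed length `clip_len` equal to the template's frame count, and that a clip running past the end of the video is discarded. The function name `split_by_keyframes` and these rules are assumptions made for illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_by_keyframes(frames: Sequence[T], keyframe_indices: Sequence[int],
                       clip_len: int) -> List[List[T]]:
    """Cut one clip of clip_len frames starting at each key frame index."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    clips: List[List[T]] = []
    for start in sorted(set(keyframe_indices)):
        end = start + clip_len
        # Keep only clips that fit entirely inside the video, so that
        # every clip has exactly the same number of frames.
        if start >= 0 and end <= len(frames):
            clips.append(list(frames[start:end]))
    return clips


# Usage: a 10-frame video with key frames at indices 0, 4 and 8.
example_clips = split_by_keyframes(list(range(10)), [0, 4, 8], clip_len=3)
```

Under these assumptions the key frame at index 8 yields no clip, because only two frames remain after it.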

In step S103, when the highest score is greater than or equal to the preset threshold, the frames in the target video clip that have the same timestamps as the marked frames in the preset video template are determined as event evidence images, and the corresponding event type is determined at the same time.

The preset threshold can be set according to the specific scenario. For example, for monitoring vehicle traffic violations the preset threshold can be set to 95%, and for accident detection it can be set to 80%.
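Step S103 can be sketched as a threshold check followed by a lookup of the marked frames. The sketch assumes that scores lie in [0, 1] (so 95% is written 0.95), that the ranking is a list of (clip index, score) pairs sorted from highest to lowest, and that the marked frames of the template are given as frame indices within a clip. The name `detect_event` and the returned tuple layout are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def detect_event(ranked: Sequence[Tuple[int, float]], clips: Sequence[Sequence[T]],
                 marked_indices: Sequence[int], event_type: str,
                 threshold: float) -> Optional[Tuple[str, List[T]]]:
    """Return (event type, evidence frames) if the best clip reaches the threshold."""
    if not ranked:
        return None
    best_index, best_score = ranked[0]      # highest-scoring clip is the target clip
    if best_score < threshold:              # "greater than or equal to" passes
        return None
    target = clips[best_index]
    # Evidence frames sit at the same positions as the template's marked frames.
    evidence = [target[i] for i in marked_indices]
    return event_type, evidence


# Usage: two 3-frame clips, the second one scoring 0.96 against the template.
demo_clips = [["a0", "a1", "a2"], ["b0", "b1", "b2"]]
demo_ranked = [(1, 0.96), (0, 0.50)]
result = detect_event(demo_ranked, demo_clips, [1], "line-crossing violation", 0.95)
```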

In one embodiment, an interface can be generated from the determined event evidence image and the corresponding event type and output to the client for display. The interface information can include the evidence image, the event time and the event type. For example, when the video source shows a vehicle driving over the lane line at intersection B of Street X at 9 a.m. on March 2, 2020, the interface information includes the evidence image of the vehicle crossing the line, the time of the crossing, and the event type "line-crossing violation".

Figure 2 shows a schematic diagram of the main modules of an intelligent video event detection system provided by an embodiment of the present invention. For ease of explanation, only the parts related to the embodiment are shown. The details are as follows:

An intelligent video event detection system 200 includes:

a video acquisition unit 201, configured to obtain a video source from a database, tag the video source, and determine the corresponding preset video template according to the tag of the video source;

a target video determination unit 202, configured to determine the key frames of the video source, split the video source into several video clips with the same number of frames according to the key frames, score and rank the video clips against the preset video template using a preset detection model, and determine the highest-scoring video clip as the target video clip;

a detection result determination unit 203, configured to, when the highest score is greater than or equal to a preset threshold, determine the frames in the target video clip that have the same timestamps as the marked frames in the preset video template as event evidence images, and determine the corresponding event type at the same time.

Thus, the intelligent video event detection system 200 provided by the embodiment of the present invention includes the video acquisition unit 201, the target video determination unit 202 and the detection result determination unit 203, configured as described above. The system can detect events in a video automatically and output the detection result, and the detection result contains key image evidence of the event. This improves the efficiency of event detection on the one hand and makes the detection more accurate and credible on the other.
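The division of system 200 into three units can be sketched as a small class that chains three callables, one per unit. This is only an illustration of the data flow between units 201, 202 and 203. The class name `EventDetectionSystem`, the callable signatures and the values passed between them are assumptions, not the interfaces of the patented system.

```python
from typing import Any, Callable, Optional, Tuple


class EventDetectionSystem:
    """Chains stand-ins for units 201, 202 and 203 in the order described above."""

    def __init__(self,
                 acquire: Callable[[], Tuple[Any, Any]],
                 determine_target: Callable[[Any, Any], Tuple[Any, float]],
                 determine_result: Callable[[Any, Any, float], Optional[Any]]) -> None:
        self.acquire = acquire                    # unit 201: source and template
        self.determine_target = determine_target  # unit 202: target clip and score
        self.determine_result = determine_result  # unit 203: evidence and event type

    def run(self) -> Optional[Any]:
        source, template = self.acquire()
        target_clip, best_score = self.determine_target(source, template)
        return self.determine_result(target_clip, template, best_score)


# Usage with trivial stand-in units.
system = EventDetectionSystem(
    acquire=lambda: ("video", "template"),
    determine_target=lambda source, template: ("clip-2", 0.97),
    determine_result=lambda clip, template, score: (clip, "violation") if score >= 0.95 else None,
)
outcome = system.run()
```

Passing the units in as callables keeps each stage replaceable, which matches the text's statement that the method may run on either the server or a terminal device.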

Figure 3 shows an exemplary system architecture 500 to which the detection method or detection apparatus of the embodiments of the present invention can be applied.

As shown in Figure 3, the system architecture 500 may include terminal devices 501, 502 and 503, a network 504 and a server 505. The network 504 is the medium that provides communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links or fiber optic cables.

Users can use the terminal devices 501, 502, 503 to interact with the server 505 through the network 504 in order to receive or send messages and the like. Various communication client applications can be installed on the terminal devices 501, 502, 503, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients and social platform software.

The terminal devices 501, 502, 503 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers and desktop computers.

The server 505 may be a server that provides various services, such as a back-end management server that supports the messages exchanged by users through the terminal devices 501, 502, 503. The back-end management server can analyze and otherwise process a request received from a terminal device and feed the processing result back to the terminal device.

It should be noted that the intelligent video event detection method provided by the embodiments of the present invention can be executed by the server 505 or by the terminal devices 501, 502, 503. Correspondingly, the intelligent video event detection system can run on the server 505 or on the terminal devices 501, 502, 503.

It should be understood that the numbers of terminal devices, networks and servers in Figure 3 are merely illustrative. There can be any number of terminal devices, networks and servers, depending on implementation needs.

Referring now to Figure 4, a schematic structural diagram is shown of a computer system 600 suitable for implementing an electronic device according to an embodiment of the present invention. The computer system shown in Figure 4 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.

As shown in Figure 4, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602 and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse and the like; an output section 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.

In particular, according to the disclosed embodiments of the present invention, the process described above with reference to the flowchart can be implemented as a computer software program. For example, the disclosed embodiments include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the system of the present invention are performed.

It should be noted that the computer-readable medium shown in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in conjunction with an instruction execution system, apparatus, or device. In the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units involved in the embodiments of the present invention may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including a determination unit, an extraction unit, a training unit, and a screening unit. In some cases the names of these units do not limit the units themselves; for example, the determination unit may also be described as "a unit that determines the candidate user set."

The above-mentioned embodiments express only several implementations of the present invention. Their descriptions are relatively specific and detailed, but they should not be construed as limiting the patent scope of the present invention. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.

The above descriptions are only preferred embodiments of the present invention and are not intended to limit it. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims

1. An intelligent video event detection method, characterized by comprising:

obtaining a video source from a database, tagging the video source, and determining the corresponding preset video template according to the tag of the video source;

determining the key frames of the video source, splitting the video source into several video clips with the same number of frames according to the key frames, scoring and ranking the several video clips against the preset video template based on a preset detection model, and determining the video clip with the highest score as the target video clip;

when the highest score is greater than or equal to the preset threshold, determining the frames in the target video clip that have the same timestamp as the marked frame in the preset video template as event evidence graphs, and determining the corresponding event type at the same time;

wherein obtaining the video source from the database, tagging the video source, and determining the corresponding preset video template according to the tag of the video source comprises: obtaining the video to be detected from the system database, tagging the video source based on the basic information of the video to be detected, and selecting the corresponding preset video template from the preset video template library according to the tag of the video source; the video to be detected is the video with the earliest shooting time among all the videos to be detected in the database; the basic information includes video shooting location information, video shooting time information, and equipment information of the corresponding video shooting device;

the preset video template includes several tagged videos that have the same frame as the target video clip; the frames of the preset video template are converted to grayscale, and the tags of the preset video template include location information;

selecting the corresponding preset video template from the preset video template library according to the tag of the video source comprises: traversing the preset video template library according to the location information of the video source, and determining the preset video template whose location information is the same as the location information of the video source as the corresponding preset video template;

scoring and ranking the several video clips against the preset video template based on the preset detection model comprises: converting each frame of the video clip to grayscale; calculating, based on an image similarity algorithm model, the similarity between each frame of the video clip and the frame of the preset video template with the same timestamp, to obtain the similarity of each frame; performing a weighted calculation over all frames of the video clip to obtain the similarity between the video clip and the preset video template; and scoring and ranking the several video clips according to the similarity, wherein the higher the similarity, the higher the score, and the higher the score, the higher the ranking.
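The scoring and ranking step of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not name the image similarity algorithm or the frame weights, so the metric below (one minus the normalized mean absolute pixel difference) and the default uniform weights are assumptions, as are all function names. Frames are plain nested lists of RGB tuples, the template frames are assumed to be grayscale already, and a frame's index within a clip stands in for its timestamp.

```python
from typing import List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]       # (R, G, B), each in 0..255
Frame = List[List[Pixel]]          # rows of RGB pixels
GrayFrame = List[List[float]]      # rows of gray levels in 0..255


def to_gray(frame: Frame) -> GrayFrame:
    """Convert an RGB frame to grayscale with the BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def frame_similarity(a: GrayFrame, b: GrayFrame) -> float:
    """Assumed stand-in metric: 1 - mean absolute difference / 255, in [0, 1]."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    if not diffs:
        raise ValueError("frames must contain at least one pixel")
    return 1.0 - (sum(diffs) / len(diffs)) / 255.0


def clip_score(clip: Sequence[Frame], template: Sequence[GrayFrame],
               weights: Optional[Sequence[float]] = None) -> float:
    """Weighted mean of per-frame similarities between a clip and a template."""
    if len(clip) != len(template):
        raise ValueError("clip and template must have the same number of frames")
    if weights is None:
        weights = [1.0] * len(clip)    # assumption: uniform frame weights
    sims = [frame_similarity(to_gray(c), t) for c, t in zip(clip, template)]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)


def rank_clips(clips: Sequence[Sequence[Frame]],
               template: Sequence[GrayFrame],
               weights: Optional[Sequence[float]] = None
               ) -> List[Tuple[int, float]]:
    """Return (clip index, score) pairs sorted from highest to lowest score."""
    scored = [(i, clip_score(c, template, weights)) for i, c in enumerate(clips)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The first entry of the list returned by `rank_clips` corresponds to the target video clip; its score would then be compared against the preset threshold.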

2. An intelligent video event detection system, characterized by including:

A video acquisition unit, configured to acquire a video source from a database, tag the video source, and determine a corresponding preset video template according to the tag of the video source;

A target video determination unit, configured to determine the key frames of the video source, split the video source into several video clips with the same number of frames according to the key frames, score and rank the several video clips against the preset video template based on a preset detection model, and determine the video clip with the highest score as the target video clip;

A detection result determination unit, configured to, when the highest score is greater than or equal to the preset threshold, determine the frames in the target video clip that have the same timestamp as the marked frame in the preset video template as event evidence graphs, while determining the corresponding event type;
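As a rough illustration of how the three units of claim 2 could cooperate, the following Python sketch wires them into one pipeline. Everything in it is hypothetical: the class, field, and function names, the idea that each template carries its own event type and a single marked-frame index, the injected scoring function, and the rule that each clip starts at a key-frame index and spans as many frames as the template. The claims do not specify these details.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence, Tuple

# Hypothetical signature: (clip frames, template frames) -> similarity score
ScoreFn = Callable[[Sequence[Any], Sequence[Any]], float]


@dataclass
class Template:
    location: str        # tag: location information
    event_type: str      # assumption: each template carries its event type
    frames: List[Any]    # grayscale frames, one per timestamp
    marked_index: int    # position (timestamp) of the marked frame


@dataclass
class Detection:
    event_type: str
    evidence_frame: Any
    score: float


class VideoAcquisitionUnit:
    def __init__(self, library: List[Template]):
        self.library = library

    def select_template(self, source_location: str) -> Optional[Template]:
        # Traverse the library and match on location information.
        for template in self.library:
            if template.location == source_location:
                return template
        return None


class TargetVideoDeterminationUnit:
    def __init__(self, score_fn: ScoreFn):
        self.score_fn = score_fn

    def best_clip(self, frames: Sequence[Any], key_frames: Sequence[int],
                  template: Template) -> Optional[Tuple[float, Sequence[Any]]]:
        # Assumption: one equal-length clip per key frame; short tails are dropped.
        n = len(template.frames)
        clips = [frames[k:k + n] for k in key_frames if k + n <= len(frames)]
        if not clips:
            return None
        scored = [(self.score_fn(c, template.frames), c) for c in clips]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[0]


class DetectionResultDeterminationUnit:
    def __init__(self, threshold: float):
        self.threshold = threshold

    def decide(self, score: float, clip: Sequence[Any],
               template: Template) -> Optional[Detection]:
        if score < self.threshold:
            return None
        # The frame at the marked frame's timestamp is the event evidence.
        return Detection(template.event_type, clip[template.marked_index], score)


def detect(frames: Sequence[Any], key_frames: Sequence[int], location: str,
           acquisition: VideoAcquisitionUnit,
           target: TargetVideoDeterminationUnit,
           result: DetectionResultDeterminationUnit) -> Optional[Detection]:
    template = acquisition.select_template(location)
    if template is None:
        return None
    best = target.best_clip(frames, key_frames, template)
    if best is None:
        return None
    score, clip = best
    return result.decide(score, clip, template)
```

Passing the scoring function in as a parameter keeps the unit structure independent of whichever image similarity model is actually used.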

3. An electronic device, characterized in that it includes a memory and a processor, and a computer program is stored in the memory; when the computer program is executed by the processor, it causes the processor to execute the steps of the intelligent video event detection method of claim 1.

4. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, it causes the processor to execute the steps of the intelligent video event detection method of claim 1.