CN118803385A - Video slicing method and device, electronic device and storage medium
- Publication number: CN118803385A
- Application number: CN202411060573.3A
- Authority: CN (China)
- Prior art keywords: video, frame, target, slice, file
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8458—Structuring of content, e.g. decomposing content into time segments involving uncompressed content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/432—Content retrieval operation from a local storage medium, e.g. hard-disk
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
- H04N21/4334—Recording operations
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
The embodiments of the present application provide a video slicing method and device, an electronic device, and a storage medium, belonging to the field of artificial intelligence technology. The method includes: obtaining a video file to be segmented; segmenting the video file according to a preset search duration and a plurality of target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame; and saving the video data of each video slice to obtain slice save files, each containing the corresponding target I frame. The embodiments can therefore cut a video into slices of the desired length, keeping the segmentation real-time, while preserving the information in the video. Because the video does not need to be decoded and re-encoded, a large amount of computation is saved and processing is faster. This is particularly useful in artificial intelligence video processing: since every slice retains its I frames, the frame content of each slice can be fully restored without information loss, which saves processing time and resources.
Description
Technical Field
The present application relates to the field of artificial intelligence technology, and in particular to a video slicing method and device, an electronic device, and a storage medium.
Background Art
In artificial intelligence video processing algorithms, a video is usually fed into a neural network, which splits the video into frames, extracts features from the information within each frame and the differences between frames, and performs regression calculations to produce its analysis result. In general, the longer the input video, the longer the neural network takes to process it. To keep the results timely, the length of the video usually has to be limited. Current video processing algorithms commonly use two methods for extracting and slicing video:
(1) Specify a start time and an end time, and extract the video with re-encoding.
The slice runs strictly from the start time to the end time. The video has to be re-encoded in H.264, with I frame information inserted as needed, to achieve a precise cut.
This method cuts precisely, but the encoding consumes a lot of time and resources.
(2) Use stream copy to cut quickly without decoding.
When seeking by time, the seek operation jumps between I frames, and the cut position does not land exactly on an I frame. Because nothing is re-encoded, playback is faulty until the first I frame is reached.
Summary of the Invention
The main purpose of the embodiments of the present application is to propose a video slicing method and device, an electronic device, and a storage medium that can cut a video into slices of the desired length, keeping the algorithm real-time, while preserving the information in the video. Because the video does not need to be decoded and re-encoded, a large amount of computation is saved and processing is faster.
To achieve the above purpose, a first aspect of the embodiments of the present application proposes a video slicing method, the method comprising:
obtaining a video file to be segmented;
segmenting the video file according to a preset search duration and a plurality of target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame;
saving the video data of each video slice to obtain slice save files, each containing the corresponding target I frame.
In some embodiments, segmenting the video file according to the preset search duration and the plurality of target I frames to obtain the video slices includes:
searching the video frames of the video file starting from its first frame, and reading the interval duration between each video frame and the first frame, the first frame being an I frame;
traversing the video frames of the video file and determining each target I frame within the preset search duration;
segmenting the video file according to the preset search duration, the interval durations, and the target I frames to obtain the video slices.
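The interval duration read in the first step can be illustrated with a short sketch. The application does not say how this duration is obtained; the sketch assumes it is derived from presentation timestamps expressed as integer counts of a time base, which is a common container convention. The names `interval_seconds`, `pts`, `first_pts`, and `time_base` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def interval_seconds(pts: int, first_pts: int, time_base: Fraction) -> Fraction:
    """Return the time elapsed between a frame and the first frame, in seconds.

    Assumption (not stated in the application): timestamps are integer counts
    of `time_base` units, so the elapsed time is the timestamp difference
    multiplied by the time base.
    """
    return (pts - first_pts) * time_base


# Example with a 90 kHz time base (one tick = 1/90000 s).
tb = Fraction(1, 90000)
print(interval_seconds(180000, 0, tb))      # 2
print(interval_seconds(135000, 90000, tb))  # 1/2
```

Using exact fractions rather than floats avoids rounding drift when many interval durations are compared against the preset search duration.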
In some embodiments, segmenting the video file according to the preset duration, the interval durations, and the target I frames to obtain the video slices includes:
determining, according to the preset duration and the interval durations, the next target I frame closest to the previous target I frame;
writing the video frames from the previous target I frame up to, but not including, the next target I frame into the corresponding slice save file;
traversing and segmenting the video frames of the video file until its last frame to obtain the video slices.
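The three steps above can be sketched as a single pass over the frames. This is one possible reading, not the applicant's implementation: it assumes the next target I frame is the first I frame whose distance from the previous target I frame is at least the preset duration. The `Packet` record and the function `split_at_keyframes` are hypothetical names, and frames are modelled as plain records rather than real compressed packets.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    interval: float  # seconds since the first frame (the "interval duration")
    is_key: bool     # True if the frame is an I frame


def split_at_keyframes(packets: List[Packet], preset: float) -> List[List[Packet]]:
    """Group frames into slices that each start at a target I frame.

    Assumes the first frame is an I frame. A new slice starts at the first
    I frame that lies at least `preset` seconds after the previous target.
    """
    slices: List[List[Packet]] = []
    current: List[Packet] = []
    target_time = 0.0
    for pkt in packets:
        if current and pkt.is_key and pkt.interval - target_time >= preset:
            slices.append(current)  # close the slice before the new target
            current = []
        if not current:
            target_time = pkt.interval  # this frame is the new target I frame
        current.append(pkt)
    if current:
        slices.append(current)  # the last slice runs to the final frame
    return slices


# Frames every 0.5 s, an I frame every 1 s, preset duration of 2 s.
frames = [Packet(i * 0.5, i % 2 == 0) for i in range(10)]
for s in split_at_keyframes(frames, 2.0):
    print(s[0].interval, len(s), s[0].is_key)
```

Because a cut is made only at an I frame, slices can be longer than the preset duration when the nearest I frame arrives late, but no slice ever begins with a frame that depends on data outside the slice (under the closed-GOP simplification used here).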
In some embodiments, determining, according to the preset duration and the interval durations, the next target I frame closest to the previous target I frame includes:
determining the video frames located at different interval durations from the previous target I frame;
searching those video frames, within the preset duration, for the next target I frame.
In some embodiments, searching the video frames within the preset duration for the next target I frame includes:
identifying the flag bit of each video frame within the preset duration to obtain an identification result;
when the identification result is AV_PKT_FLAG_KEY, determining that the video frame is the next target I frame.
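The flag test can be sketched as a bit-mask check. `AV_PKT_FLAG_KEY` is the name of an FFmpeg packet flag, defined there as 0x0001 and documented as marking a packet that contains a keyframe; the application treats such a packet as an I frame. The helper names `is_key_packet` and `key_indices` are hypothetical, and the flags are passed in as plain integers rather than read from a real demuxer.

```python
from typing import List

# Value of the FFmpeg constant AV_PKT_FLAG_KEY (bit 0 of the packet flags).
AV_PKT_FLAG_KEY = 0x0001


def is_key_packet(flags: int) -> bool:
    """Return True if the key-frame bit is set, whatever other bits are set."""
    return (flags & AV_PKT_FLAG_KEY) != 0


def key_indices(flags_per_packet: List[int]) -> List[int]:
    """Return the positions of the packets that carry the key-frame flag."""
    return [i for i, flags in enumerate(flags_per_packet) if is_key_packet(flags)]


print(key_indices([0x0001, 0x0000, 0x0000, 0x0001, 0x0002]))  # [0, 3]
```

Testing the bit with a mask, instead of comparing the whole flags field for equality, keeps the check correct when a packet carries additional flags alongside the key-frame bit.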
In some embodiments, saving the video data of each video slice to obtain the slice save files containing the corresponding target I frames includes:
creating a plurality of slice save files;
copying each of the segmented video slices;
writing the video data of each video slice into the corresponding slice save file to obtain slice save files, each containing the corresponding target I frame.
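The bookkeeping of the save step can be sketched as follows. This is only an illustration under strong simplifying assumptions: each frame is represented by an opaque `bytes` payload and the payloads are concatenated into a raw file, whereas a real implementation would pass the copied packets to a container muxer so that the output is a playable file. The function name `save_slices`, the `slice_NNN.bin` naming scheme, and the payload values are all hypothetical.

```python
import os
import tempfile
from typing import List


def save_slices(slices: List[List[bytes]], out_dir: str, prefix: str = "slice") -> List[str]:
    """Create one save file per slice and copy the slice's payloads into it.

    The payloads are written unchanged: nothing is decoded or re-encoded.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths: List[str] = []
    for index, payloads in enumerate(slices):
        path = os.path.join(out_dir, f"{prefix}_{index:03d}.bin")
        with open(path, "wb") as save_file:  # create the slice save file
            for payload in payloads:         # copy frames in their original order
                save_file.write(payload)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    written = save_slices([[b"I0", b"P1", b"B2"], [b"I3", b"P4"]], tmp)
    print([os.path.basename(p) for p in written])  # ['slice_000.bin', 'slice_001.bin']
```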
In some embodiments, the video file includes a plurality of groups of pictures (GOPs), each GOP including one I frame, at least one P frame, and at least one B frame.
To achieve the above purpose, a second aspect of the embodiments of the present application proposes a video slicing device, the device comprising:
an acquisition module, configured to obtain a video file to be segmented;
a slicing module, configured to segment the video file according to a preset search duration and target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame;
a saving module, configured to save the video data of each video slice to obtain slice save files, each containing the corresponding target I frame.
To achieve the above purpose, a third aspect of the embodiments of the present application proposes an electronic device comprising a memory and a processor, wherein the memory stores a computer program and the processor implements the method of the first aspect when executing the computer program.
To achieve the above purpose, a fourth aspect of the embodiments of the present application proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect.
In the video slicing method and device, electronic device, and storage medium proposed in the present application, a video file to be segmented is obtained; the video file is segmented according to a preset search duration and a plurality of target I frames to obtain video slices, the first frame of each video slice being a target I frame; and the video data of each video slice is saved to obtain slice save files, each containing the corresponding target I frame. The user can set the preset search duration according to the desired slice length. The video frames of the file are searched within the preset search duration, and the file is segmented at the target I frames found, so every resulting slice begins with a target I frame. Each slice therefore carries its own target I frame information, from which the subsequent frames in the slice can be fully decoded and their information extracted. The video data does not have to be decoded and re-encoded, which saves a large amount of computation and speeds up processing.
The embodiments of the present application can therefore cut a video into slices of the desired length, keeping the segmentation real-time, while preserving the information in the video. This is particularly useful in artificial intelligence video processing: since every slice retains its I frames, the frame content of each slice can be fully restored without information loss, which makes the slices well suited to artificial intelligence video algorithms. The segmentation process involves no extraction or computation of information inside the video frames; the data is simply copied and saved after cutting. No re-encoding is needed after segmentation, which saves processing time and resources and speeds up video slicing.
Brief Description of the Drawings
FIG. 1 is a flowchart of the video slicing method provided in an embodiment of the present application;
FIG. 2 is a flowchart of step S102 in FIG. 1;
FIG. 3 is a flowchart of step S203 in FIG. 2;
FIG. 4 is a flowchart of step S301 in FIG. 3;
FIG. 5 is a flowchart of step S402 in FIG. 4;
FIG. 6 is a flowchart of step S103 in FIG. 1;
FIG. 7 is a schematic structural diagram of the video slicing device provided in an embodiment of the present application;
FIG. 8 is a schematic diagram of the hardware structure of the electronic device provided in an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here serve only to explain the application and do not limit it.
Although the device diagrams divide the device into functional modules and the flowcharts show a logical order, in some cases the steps shown or described may be performed with a different module division or in a different order. The terms "first", "second", and the like in the description, the claims, and the drawings distinguish similar objects and do not necessarily describe a specific order or sequence.
Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the technical field of the present application. The terms are used only to describe the embodiments of the application and are not intended to limit it.
First, several terms used in the present application are explained:
Artificial intelligence (AI): a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that responds in a manner similar to human intelligence. Research in this field includes robotics, language recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking. It is also the theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
H.264 encoding: a new video compression coding standard that adopts a number of technical measures to improve image quality and increase the compression ratio, and can be used for SDTV, HDTV, DVD, and so on. H.264 is more economical with bit rate: it not only saves 50% of the bit rate compared with MPEG-4, but also has strong error resilience, so it can adapt to video transmission over wireless channels with high packet loss and severe interference and still deliver stable image quality. The H.264 standard raised moving-picture compression to a higher level; providing high-quality image transmission at lower bandwidth is its main application highlight.
GOP (Group of Pictures): the GOP strategy affects encoding quality. A GOP is a group of consecutive pictures, a set of pictures within a sequence that supports random access. The first picture of a GOP must be an I frame, which ensures that the GOP can be decoded independently without reference to other pictures. MPEG encoding divides pictures (that is, frames) into three types, I, P, and B: I is an intra-coded frame, P is a forward-predicted frame, and B is a bidirectionally interpolated frame. Put simply, an I frame is a key frame and can be understood as a complete picture, while P frames and B frames record changes relative to it: a P frame represents the difference from the preceding frame, and a B frame represents the differences from both the preceding and the following frames. Without the I frame, the P frames and B frames cannot be decoded. This is why MPEG formats are hard to edit precisely, and why the head and tail of a cut have to be fine-tuned.
I frame: also called a key frame or intra-coded frame, a complete image frame that exists independently of other frames. An I frame can be decoded without information from any other frame, much like a still image, and can be regarded as a reference point in the video sequence. Because it contains complete image information, its compression ratio is relatively low, but it is the simplest frame to decode since it depends on no other frame.
P frame: a forward-predicted frame (predictive frame), generated with reference to a preceding I frame or P frame. A P frame stores the change in the image relative to the preceding frame, so it usually compresses better than an I frame. To decode a P frame, the I frame or P frame it depends on must be decoded first, and the current picture is then reconstructed from that information. P frames reduce redundancy along the time dimension and improve the compression efficiency of the video.
B frame: a bidirectionally predicted, interpolated frame, generated with reference to the I frames or P frames before and after it. A B frame uses information from both the preceding and the following frames to predict the content of the current frame, achieving a higher compression ratio. Because decoding a B frame requires those neighbouring frames, it cannot be decoded independently and must be decoded together with the I frames and P frames in the decoding sequence.
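The frame-type definitions above can be made concrete with a small sketch that groups a sequence of frame types into GOPs and checks whether a group can be decoded on its own. It is deliberately simplified: frames are given as a string of the letters I, P, and B in decoding order, every I frame is assumed to start a new closed GOP, and references across GOP boundaries are ignored. The function names `split_into_gops` and `is_self_contained` are hypothetical.

```python
from typing import List


def split_into_gops(frame_types: str) -> List[str]:
    """Split a string such as "IBBPIBBP" into groups that each start at an I frame.

    Frames that appear before the first I frame form a leading group of
    their own, which has no I frame to decode from.
    """
    gops: List[str] = []
    for frame_type in frame_types:
        if frame_type == "I" or not gops:
            gops.append(frame_type)  # an I frame opens a new group
        else:
            gops[-1] += frame_type   # P and B frames join the current group
    return gops


def is_self_contained(gop: str) -> bool:
    """A group is decodable on its own only if it begins with an I frame."""
    return gop.startswith("I")


print(split_into_gops("IBBPBBPIBBP"))                             # ['IBBPBBP', 'IBBP']
print([is_self_contained(g) for g in split_into_gops("PBIPB")])  # [False, True]
```

The second example shows the situation the method avoids: a slice that starts with P or B frames has a leading group that cannot be decoded, because the I frame those frames depend on was left in the previous slice.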
In artificial intelligence video processing algorithms, a video is usually fed into a neural network, which splits the video into frames, extracts features from the information within each frame and the differences between frames, and performs regression calculations to produce its analysis result. In general, the longer the input video, the longer the neural network takes to process it. To keep the results timely, the length of the video usually has to be limited.
Current video processing algorithms commonly use two methods for extracting and slicing video:
(1) Specify a start time and an end time, and extract the video with re-encoding.
The slice runs strictly from the start time to the end time. The video has to be re-encoded in H.264, with I frame information inserted as needed, to achieve a precise cut.
This method cuts precisely, but the encoding consumes a lot of time and resources.
(2) Use stream copy to cut quickly without decoding.
When seeking by time, the seek operation jumps between I frames, and the cut position does not land exactly on an I frame. Because nothing is re-encoded, playback is faulty until the first I frame is reached.
Both traditional methods therefore have drawbacks. Method (1) does not locate I frames: before slicing, the content to be extracted is usually decoded and then re-encoded in H.264 to generate the slice file, and this encoding consumes a lot of time and resources. Method (2) does not stop exactly at an I frame when seeking by time, and because nothing is re-encoded, playback is faulty until the first I frame is reached.
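The drawback of method (2) can be quantified with a small sketch that counts how many frames at the start of a copied slice cannot be decoded when the cut lands on an arbitrary frame. As a simplification, frames are given as a string of the letters I, P, and B in decoding order, and a frame is treated as decodable only once an I frame has appeared in the slice. The function name `undecodable_prefix` is hypothetical.

```python
def undecodable_prefix(frame_types: str, cut_index: int) -> int:
    """Count the frames at the start of a copied slice that precede its first I frame.

    `cut_index` is the position at which the slice starts. If the slice
    contains no I frame at all, every frame in it is counted.
    """
    tail = frame_types[cut_index:]
    first_i = tail.find("I")
    return len(tail) if first_i == -1 else first_i


stream = "IBBPIBBP"
print(undecodable_prefix(stream, 2))  # 2: the slice starts with "BP" before the next I frame
print(undecodable_prefix(stream, 4))  # 0: the cut lands exactly on an I frame
```

A cut placed on an I frame gives a prefix of zero, which is the property the method in this application enforces for every slice.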
Based on this, the embodiments of the present application provide a video slicing method and device, an electronic device, and a storage medium, in which a video file to be segmented is obtained; the video file is segmented according to a preset search duration and a plurality of target I frames to obtain video slices, the first frame of each video slice being a target I frame; and the video data of each video slice is saved to obtain slice save files, each containing the corresponding target I frame. The user can set the preset search duration according to the desired slice length. The video frames of the file are searched within the preset search duration, and the file is segmented at the target I frames found, so every resulting slice begins with a target I frame. Each slice therefore carries its own target I frame information, from which the subsequent frames in the slice can be fully decoded and their information extracted. The video data does not have to be decoded and re-encoded, which saves a large amount of computation and speeds up processing.
The embodiments of the present application can therefore cut a video into slices of the desired length, keeping the segmentation real-time, while preserving the information in the video. This is particularly useful in artificial intelligence video processing: since every slice retains its I frames, the frame content of each slice can be fully restored without information loss, which makes the slices well suited to artificial intelligence video algorithms. The segmentation process involves no extraction or computation of information inside the video frames; the data is simply copied and saved after cutting. No re-encoding is needed after segmentation, which saves processing time and resources and speeds up video slicing.
The video slicing method and device, electronic device and storage medium provided in the embodiments of the present application are described through the following embodiments, beginning with the video slicing method.
The embodiments of the present application can acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results.
Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems and mechatronics. Artificial intelligence software technologies mainly include computer vision, robotics, biometrics, speech processing, natural language processing and machine learning/deep learning.
The video slicing method provided in the embodiments of the present application relates to the field of artificial intelligence technology. It can be applied in a terminal or on a server side, or can be software running in a terminal or on a server side. In some embodiments, the terminal may be a smartphone, tablet computer, laptop computer, desktop computer or the like; the server side may be configured as an independent physical server, as a server cluster or distributed system composed of multiple physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms; the software may be an application that implements the video slicing method, but is not limited to the above forms.
The present application can be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices. The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform specific tasks or implement specific abstract data types. The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
It should be noted that, in each specific implementation of the present application, whenever processing is required on data related to user identity or characteristics, such as user information, user behavior data, user historical data and user location information, the user's permission or consent is obtained first, and the collection, use and processing of such data comply with the relevant laws, regulations and standards. In addition, when an embodiment of the present application needs to obtain sensitive personal information of the user, the user's separate permission or consent is obtained through a pop-up window or a jump to a confirmation page, and only after that separate permission or consent has been explicitly obtained is the user-related data necessary for the normal operation of the embodiment acquired.
FIG. 1 is an optional flowchart of the video slicing method provided in an embodiment of the present application. The method in FIG. 1 may include, but is not limited to, steps S101 to S103.
Step S101: obtain a video file to be segmented;
Step S102: segment the video file according to a preset search duration and a plurality of target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame;
Step S103: save the video data of each video slice to obtain slice save files, each containing the corresponding target I frame.
In step S101 of some embodiments, the video file to be segmented is obtained. The video file may be one that a user has finished shooting or creating, and its encoding format includes but is not limited to H264. Taking a video file carrying an H264 stream as an example, the file contains multiple groups of pictures (GOPs), and each GOP contains multiple encoded video frames. A GOP is delimited by two adjacent I frames. Each GOP contains one I frame, at least one P frame and at least one B frame. Decoding a P frame or a B frame depends on preceding or following video frames, whereas an I frame can be decoded on its own. Therefore, if a GOP is missing its I frame, decoding of its P frames and B frames fails.
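The GOP structure described above can be illustrated with a small sketch. It is a simplified model, not the applicant's implementation: frames are represented only by a type letter, and the helper names `split_into_gops` and `is_decodable` are hypothetical. It shows why a cut that lands in the middle of a GOP leaves a leading group with no I frame to decode from.

```python
from typing import List


def split_into_gops(frame_types: str) -> List[str]:
    """Group a sequence of frame types ('I', 'P', 'B') into GOPs.

    A new group starts at every I frame. Frames that precede the first
    I frame form a leading group that contains no I frame at all.
    """
    gops: List[str] = []
    current = ""
    for frame_type in frame_types:
        if frame_type == "I" and current:
            gops.append(current)
            current = ""
        current += frame_type
    if current:
        gops.append(current)
    return gops


def is_decodable(gop: str) -> bool:
    """In this simplified model a group decodes only if it starts with an I frame."""
    return gop.startswith("I")


# A stream cut at an I frame: every group is decodable.
aligned = split_into_gops("IPBBPIBBPPIPB")
# A stream cut in the middle of a GOP: the leading group has lost its I frame.
misaligned = split_into_gops("BBPIPB")
print(aligned, [is_decodable(g) for g in misaligned])
```

Cutting only at I frames, as the method does, keeps every slice in the first situation.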
In step S102 of some embodiments, the video file is segmented according to the preset search duration and the plurality of target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame. The user can set the preset search duration according to the desired slice length. The video frames of the file are searched within the preset search duration, and the file is segmented at the target I frames found, so that every resulting video slice begins with a target I frame. Each slice therefore contains the I frame information needed to decode its subsequent frames completely and extract their information, and the video data does not need to be decoded and re-encoded, which saves a large amount of computation and speeds up processing.
In step S103 of some embodiments, the video data of each video slice is saved to obtain slice save files, each containing the corresponding target I frame. For example, in the field of artificial intelligence video processing, because the I frames in every video slice are fully preserved, the content of the cut video frames can be completely restored without loss of information, which makes the slices well suited to artificial intelligence video algorithms. The segmentation process involves no extraction or computation of information within video frames; the data is simply copied and saved after cutting. No re-encoding is required after segmentation, which saves processing time and resources and speeds up video slicing.
It should be noted that the video slicing method of the embodiments of the present invention is fast and computationally light, and can be applied to any algorithm scenario that places requirements on video length.
In steps S101 to S103 illustrated in the embodiments of the present application, a video file to be segmented is obtained; the video file is segmented according to a preset search duration and a plurality of target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame; and the video data of each video slice is saved to obtain slice save files, each containing the corresponding target I frame. The user can set the preset search duration according to the desired slice length. The video frames of the file are searched within the preset search duration, and the file is segmented at the target I frames found, so that every resulting video slice begins with a target I frame. Each slice therefore contains the I frame information needed to decode its subsequent frames completely and extract their information, and the video data does not need to be decoded and re-encoded, which saves a large amount of computation and speeds up processing. Accordingly, the embodiments of the present invention can cut a video into slices of the desired length while preserving the real-time performance of the segmentation, and can retain the information in the video without decoding and re-encoding. This is particularly useful in the field of artificial intelligence video processing: because the I frames in every video slice are fully preserved, the content of the cut video frames can be completely restored without loss of information, which makes the slices well suited to artificial intelligence video algorithms. The segmentation process involves no extraction or computation of information within video frames; the data is simply copied and saved after cutting. No re-encoding is required after segmentation, which saves processing time and resources and speeds up video slicing.
Referring to FIG. 2, in some embodiments, step S102 may include, but is not limited to, steps S201 to S203:
Step S201: search the video frames of the video file starting from its first frame, and read the interval between each video frame and the first frame, the first frame being an I frame;
Step S202: traverse the video frames of the video file and determine each target I frame within the preset search duration;
Step S203: segment the video file according to the preset search duration, the interval and the target I frames to obtain the video slices.
In some embodiments, taking a video file carrying an H264 stream as an example, the file contains multiple groups of pictures (GOPs), and each GOP contains multiple encoded video frames. A GOP is delimited by two adjacent I frames. Each GOP contains one I frame, at least one P frame and at least one B frame. Decoding a P frame or a B frame depends on preceding or following video frames, whereas an I frame can be decoded on its own. Therefore, if a GOP is missing its I frame, decoding of its P frames and B frames fails. Suppose the processing time of the video algorithm meets the business requirements as long as a single input video is no longer than 2.5 s; the user can then set the preset search duration to 2.5 s. Take a video file 50 s long, with the duration of a single video segment set to 2 s. The video frames are searched starting from the first frame of the file, which is an I frame. From that I frame the search proceeds forward through the file, and for each frame the interval from the first frame is read. For example, when a frame 2 s after the first I frame is reached, the search continues forward to the nearest I frame, with the search duration within 2.5 s. The data before that I frame forms one segment, giving one video slice that is subsequently written and saved to the corresponding slice save file. Then, starting from that I frame, the above steps are repeated: the video frames are traversed, each target I frame is found within the preset search duration, and the file is segmented accordingly until its last frame, yielding all the video slices. The user can set the preset search duration according to the desired slice length. The video frames of the file are searched within the preset search duration, and the file is segmented at the target I frames found, so that every resulting video slice begins with a target I frame. Each slice therefore contains the I frame information needed to decode its subsequent frames completely and extract their information, and the video data does not need to be decoded and re-encoded, which saves a large amount of computation and speeds up processing.
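The search in steps S201 to S203 can be sketched as a single pass over frame timestamps. This is a minimal illustration under stated assumptions, not the applicant's code: the `Frame` record, the function name `find_cut_points`, and the synthetic stream (10 frames per second, one I frame every 12 frames) are all hypothetical, and timestamps are kept as integer milliseconds to avoid rounding issues.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    pts_ms: int    # interval from the first frame of the file, in milliseconds
    is_key: bool   # True for an I frame


def find_cut_points(frames: List[Frame], target_ms: int = 2000) -> List[int]:
    """Return the index of the first frame of every slice.

    The first slice starts at frame 0, which must be an I frame. A new slice
    starts at the first I frame lying at least target_ms after the start of
    the current slice.
    """
    if not frames or not frames[0].is_key:
        raise ValueError("the file is expected to start with an I frame")
    cuts = [0]
    slice_start_ms = frames[0].pts_ms
    for index in range(1, len(frames)):
        frame = frames[index]
        if frame.is_key and frame.pts_ms - slice_start_ms >= target_ms:
            cuts.append(index)
            slice_start_ms = frame.pts_ms
    return cuts


# Synthetic 50 s stream: 10 frames per second, one I frame every 12 frames.
frames = [Frame(pts_ms=i * 100, is_key=(i % 12 == 0)) for i in range(500)]
cuts = find_cut_points(frames)
durations_ms = [frames[b].pts_ms - frames[a].pts_ms for a, b in zip(cuts, cuts[1:])]
print(len(cuts), max(durations_ms))
```

For this synthetic stream every full slice lasts 2.4 s, which stays inside the 2.5 s preset search duration of the example. A stream with sparser I frames could exceed it, and the sketch does not handle that case.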
Referring to FIG. 3, in some embodiments, step S203 may include, but is not limited to, steps S301 to S303:
Step S301: determine, according to the preset duration and the interval, the next target I frame closest to the previous target I frame;
Step S302: write the video frames from the previous target I frame up to, but not including, the next target I frame into the corresponding slice save file;
Step S303: traverse the video frames of the video file and continue segmenting until the last frame of the file, obtaining the video slices.
In some embodiments, the next target I frame closest to the previous target I frame is determined according to the preset duration and the interval. For example, suppose the processing time of the video algorithm meets the business requirements as long as a single input video is no longer than 2.5 s; the user can then set the preset search duration to 2.5 s. Take a video file 50 s long, with the duration of a single video segment set to 2 s. The video frames are searched starting from the first frame of the file, which is an I frame. From that I frame the search proceeds forward through the file, and for each frame the interval from the first frame is read. For example, when a frame 2 s after the first I frame is reached, the search continues forward to the nearest next I frame, with the search duration within 2.5 s. The data before that next I frame forms one segment, giving the first video slice, which is written and saved to the corresponding slice save file. Then, taking the I frame just found as the previous I frame, the above steps are repeated: the video frames are traversed, the next target I frame is found within the preset search duration, and the file is segmented accordingly until its last frame, yielding all the video slices. Accordingly, the embodiments of the present invention can cut a video into slices of the desired length while preserving the real-time performance of the segmentation, and can retain the information in the video without decoding and re-encoding, which saves a large amount of computation and speeds up processing.
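Steps S301 to S303 amount to a copy loop that closes the current slice whenever the next target I frame arrives. The sketch below is one possible reading, with several assumptions: a packet is modelled as a hypothetical `(pts_ms, is_key, payload)` tuple, slices are collected in memory rather than in files, and the first packet is taken to be an I frame.

```python
import io
from typing import Iterable, List, Optional, Tuple

# (pts_ms, is_key, payload): a hypothetical stand-in for an encoded packet.
Packet = Tuple[int, bool, bytes]


def write_slices(packets: Iterable[Packet], target_ms: int = 2000) -> List[bytes]:
    """Copy packet payloads into per-slice buffers, cutting only at I frames."""
    finished: List[bytes] = []
    buffer = io.BytesIO()
    slice_start_ms: Optional[int] = None
    for pts_ms, is_key, payload in packets:
        if slice_start_ms is None:
            slice_start_ms = pts_ms             # first target I frame (assumed)
        elif is_key and pts_ms - slice_start_ms >= target_ms:
            finished.append(buffer.getvalue())  # frames before the next target I frame
            buffer = io.BytesIO()
            slice_start_ms = pts_ms             # it now acts as the previous target I frame
        buffer.write(payload)
    if buffer.tell() > 0:
        finished.append(buffer.getvalue())      # the last slice runs to the final frame
    return finished


# 12 packets, 0.5 s apart, with an I frame every third packet.
packets = [(i * 500, i % 3 == 0, bytes([i])) for i in range(12)]
slices = write_slices(packets)
print([len(s) for s in slices])
```

Because payloads are only copied, joining the slices back together reproduces the input bytes exactly, which is the property the description relies on when it says no re-encoding is needed.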
Referring to FIG. 4, in some embodiments, step S301 may include, but is not limited to, steps S401 to S402:
Step S401: determine the video frames at different intervals from the previous target I frame;
Step S402: search those video frames for the next target I frame within the preset duration.
In some embodiments, during segmentation the first frame of each video slice is a target I frame. For example, when a frame 2 s after the previous I frame is reached, the search continues forward to the nearest I frame, with the search duration within 2.5 s. The data before that next I frame forms one segment, giving one video slice, which is written and saved to the corresponding slice save file. Each resulting video slice therefore contains the I frame information needed to decode its subsequent frames completely and extract their information, and the video data does not need to be decoded and re-encoded, which saves a large amount of computation and speeds up processing.
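A sketch of steps S401 and S402 follows, treating the 2 s segment duration and the 2.5 s preset duration as the two ends of a search window measured from the previous target I frame. The function name `find_next_target` and the `(pts_ms, is_key)` record are hypothetical. The text does not say what happens when no I frame falls inside the window, so returning `None` in that case is purely an assumption of this sketch.

```python
from typing import List, Optional, Tuple

# (pts_ms, is_key): a hypothetical minimal frame record.
FrameInfo = Tuple[int, bool]


def find_next_target(frames: List[FrameInfo], prev_index: int,
                     target_ms: int = 2000, search_ms: int = 2500) -> Optional[int]:
    """Return the index of the next target I frame, or None if there is none.

    Candidates are frames whose interval from the previous target I frame
    lies in [target_ms, search_ms]; the first I frame among them is chosen.
    """
    prev_pts = frames[prev_index][0]
    for index in range(prev_index + 1, len(frames)):
        pts_ms, is_key = frames[index]
        interval = pts_ms - prev_pts
        if interval > search_ms:
            return None       # assumption: give up once the window is exceeded
        if interval >= target_ms and is_key:
            return index
    return None               # reached the end of the file


# 10 frames per second; I frame every 1.2 s (dense) or every 3.0 s (sparse).
dense = [(i * 100, i % 12 == 0) for i in range(100)]
sparse = [(i * 100, i % 30 == 0) for i in range(100)]
print(find_next_target(dense, 0), find_next_target(sparse, 0))
```

With the dense stream an I frame appears 2.4 s after the previous one and is selected; with the sparse stream the window closes before any I frame is reached.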
Referring to FIG. 5, in some embodiments, step S402 may include, but is not limited to, steps S501 to S502:
Step S501: examine the flag bit of each video frame within the preset duration to obtain an identification result;
Step S502: when the identification result is AV_PKT_FLAG_KEY, determine that the video frame is the next target I frame.
In some embodiments, I frames are present in the video file; for example, an H264-encoded video file contains I frames, B frames and P frames. Extracting I frames requires examining the information of each H264-encoded video frame to determine the frame type. Specifically, this can be done through the flag bits in flags: when the flag bit is AV_PKT_FLAG_KEY, the video frame is an I frame.
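The flag test in steps S501 and S502 can be sketched as a bitmask check. `AV_PKT_FLAG_KEY` is the FFmpeg packet flag of that name, whose value is 0x0001; the helper `is_key_packet` is hypothetical, and using a bitwise AND rather than an equality comparison is an assumption made here because a packet's flags field may carry other bits as well.

```python
AV_PKT_FLAG_KEY = 0x0001      # FFmpeg: the packet contains a keyframe
AV_PKT_FLAG_CORRUPT = 0x0002  # FFmpeg: the packet content is corrupted


def is_key_packet(flags: int) -> bool:
    """Return True when the key flag bit is set, whatever other bits are present."""
    return (flags & AV_PKT_FLAG_KEY) != 0


print(is_key_packet(AV_PKT_FLAG_KEY), is_key_packet(AV_PKT_FLAG_CORRUPT))
```

Only the packet header is inspected, so the frame type is determined without decoding any picture data.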
Referring to FIG. 6, in some embodiments, step S103 may include, but is not limited to, steps S601 to S603:
Step S601: create a plurality of new slice save files;
Step S602: copy each segmented video slice;
Step S603: write the video data of each video slice into the corresponding slice save file to obtain slice save files, each containing the corresponding target I frame.
In some embodiments, a new slice save file is created for each segmented video slice, the segmented video slices are copied, and the video data of each slice is written into the corresponding slice save file, giving slice save files that each contain the corresponding target I frame. The segmentation process thus involves no extraction or computation of information within video frames; the data is simply copied and saved after cutting. Because each slice save file contains the corresponding target I frame, no re-encoding is required after segmentation, which saves processing time and resources and speeds up video slicing.
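Steps S601 to S603 can be sketched as one file per slice with a plain byte copy. This is an illustration only: the function name `save_slices`, the `slice_NNNN.bin` naming scheme and the placeholder byte strings are hypothetical, and a real implementation would normally write each slice through a container writer rather than as raw bytes.

```python
import tempfile
from pathlib import Path
from typing import List


def save_slices(slices: List[bytes], out_dir: Path, stem: str = "slice") -> List[Path]:
    """Create one file per slice and copy the slice bytes into it unchanged."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for number, data in enumerate(slices):
        path = out_dir / f"{stem}_{number:04d}.bin"  # S601: new slice save file
        path.write_bytes(data)                       # S602/S603: copy and write
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder payloads standing in for already segmented slice data.
    slices = [b"I0P1B2", b"I3P4", b"I5"]
    paths = save_slices(slices, Path(tmp))
    restored = [p.read_bytes() for p in paths]

print([p.name for p in paths], restored == slices)
```

The bytes read back are identical to the bytes written, reflecting that saving involves copying only.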
The video slicing method of the present application is further described below with reference to a specific embodiment.
Suppose the processing time of the algorithm meets the business requirements as long as a single input video is no longer than 2.5 s.
The method is used to segment a video 50 s long, with the duration of a single video segment set to 2 s.
The steps are as follows:
(1) Start searching from the beginning of the video. The first frame is an I frame; write it to the file.
(2) Starting from that I frame, search forward through the video, read the time of each frame relative to the first frame, and write the frame data to the file.
(3) When a frame 2 s after the first I frame is reached, continue searching forward to the nearest I frame and write the data before that I frame to the file.
(4) Create a new file, start writing from the I frame found last in (3), and repeat steps (2) and (3).
(5) Repeat the operation in (4) until the end of the video.
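As a worked check of the numbers in this example, assume (this is not stated in the text) that I frames occur at a constant spacing g. With segment duration T = 2 s, preset search duration L = 2.5 s and video length D = 50 s, each full slice has duration d, the smallest multiple of g that is at least T. Taking a hypothetical g = 1.2 s:

```latex
\begin{aligned}
d &= \left\lceil \frac{T}{g} \right\rceil g, \qquad T \le d < T + g,\\
d &= \left\lceil \frac{2}{1.2} \right\rceil \times 1.2\ \text{s} = 2 \times 1.2\ \text{s} = 2.4\ \text{s} \le L = 2.5\ \text{s},\\
N &= \left\lceil \frac{D}{d} \right\rceil = \left\lceil \frac{50}{2.4} \right\rceil = 21 .
\end{aligned}
```

Under this assumption the video yields 20 slices of 2.4 s and a final slice of 50 − 20 × 2.4 = 2.0 s. The bound d < T + g also shows that an I frame spacing of at most L − T = 0.5 s would keep every slice within the preset search duration regardless of where the I frames fall.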
Because traditional video slicing methods do not extract I frames, the content to be extracted is usually decoded and then H264-encoded before slicing in order to generate the slice files, and this encoding consumes a large amount of time and resources. In contrast, the video slicing method of the present invention reads data and writes files starting from an I frame, so each new slice file contains I frame information and its subsequent video data can be decoded completely. Because the video data does not need to be decoded and re-encoded, a large amount of computation and resource consumption is saved, which speeds up processing.
It should be noted that the video slicing method of the embodiments of the present invention is fast and computationally light, and can be applied to any algorithm scenario that places requirements on video length.
Based on this, in the embodiments of the present application, a video file to be segmented is obtained; the video file is segmented according to a preset search duration and a plurality of target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame; and the video data of each video slice is saved to obtain slice save files, each containing the corresponding target I frame. The user can set the preset search duration according to the desired slice length. The video frames of the file are searched within the preset search duration, and the file is segmented at the target I frames found, so that every resulting video slice begins with a target I frame. Each slice therefore contains the I frame information needed to decode its subsequent frames completely and extract their information, and the video data does not need to be decoded and re-encoded, which saves a large amount of computation and speeds up processing. Accordingly, the embodiments of the present invention can cut a video into slices of the desired length while preserving the real-time performance of the segmentation, and can retain the information in the video without decoding and re-encoding. This is particularly useful in the field of artificial intelligence video processing: because the I frames in every video slice are fully preserved, the content of the cut video frames can be completely restored without loss of information, which makes the slices well suited to artificial intelligence video algorithms. The segmentation process involves no extraction or computation of information within video frames; the data is simply copied and saved after cutting. No re-encoding is required after segmentation, which saves processing time and resources and speeds up video slicing.
Referring to FIG. 7, an embodiment of the present application further provides a video slicing device that can implement the above video slicing method. The device includes:
an acquisition module 710, configured to obtain a video file to be segmented;
a slicing module 720, configured to segment the video file according to a preset search duration and target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame;
a saving module 730, configured to save the video data of each video slice to obtain slice save files, each containing the corresponding target I frame.
In some embodiments of the present application, the acquisition module 710 obtains the video file to be segmented; the slicing module 720 segments the video file according to the preset search duration and the target I frames to obtain video slices, wherein the first frame of each video slice is a target I frame; and the saving module 730 saves the video data of each video slice to obtain slice save files, each containing the corresponding target I frame.
在本申请的一些实施例中,获取待切分的视频文件。待上传的视频文件可以为用户拍摄完成或者创作完成的视频文件,视频文件的编码格式包括但不限于H264编码。以H264码流的视频文件为例,视频文件包含多个GOP图像组,每个GOP图像组里面包含多个视频编码帧。GOP图像组的划分是两个临近I帧之间的图像。每个GOP图像组包含一个I帧、至少一个P帧和至少一个B帧。P帧或B帧的解码,需要依赖前面或者后面的视频帧图像,但是I帧可以单独一帧解码。因此,如果一个GOP图像组中缺少I帧,则会导致P帧、B帧解码失败。In some embodiments of the present application, a video file to be segmented is obtained. The video file to be uploaded may be a video file that has been shot or created by a user, and the encoding format of the video file includes but is not limited to H264 encoding. Taking a video file of an H264 code stream as an example, the video file contains multiple GOP image groups, and each GOP image group contains multiple video encoding frames. The division of the GOP image group is the image between two adjacent I frames. Each GOP image group contains an I frame, at least one P frame and at least one B frame. The decoding of P frames or B frames needs to rely on the previous or subsequent video frame images, but I frames can be decoded as a single frame. Therefore, if an I frame is missing in a GOP image group, the decoding of P frames and B frames will fail.
在本申请的一些实施例中,根据预设搜索时长和多个目标I帧对视频文件进行切分,得到各个视频切片,其中,各个视频切片的首帧为目标I帧。用户可以根据期限的视频切分长度对预设搜索时长进行设置,在预设搜索时长内对待切分的视频文件进行视频帧搜索,并基于搜索到的目标I帧对视频文件进行切分,得到的是首帧为目标I帧的各个视频切片,因此,得到的各个视频切片包含有目标I帧信息,基于I帧可以完整地对后续视频帧数据进行解码,提取到视频帧的信息,不需要对视频数据进行解码再编码,节省大量的计算量,从而加快处理速度。In some embodiments of the present application, the video file is segmented according to a preset search duration and multiple target I frames to obtain various video slices, wherein the first frame of each video slice is the target I frame. The user can set the preset search duration according to the video segmentation length of the period, perform a video frame search on the video file to be segmented within the preset search duration, and segment the video file based on the searched target I frame to obtain various video slices whose first frame is the target I frame. Therefore, each obtained video slice contains the target I frame information, and the subsequent video frame data can be completely decoded based on the I frame to extract the video frame information, without the need to decode and re-encode the video data, saving a lot of calculation, thereby speeding up the processing speed.
在本申请的一些实施例中,对各个视频切片的视频数据进行保存,得到包含对应目标I帧的各个切片保存文件。示例性地,在人工智能视频处理领域,由于各个视频切片中I帧保留齐全,切割后的视频帧内容可以完整复原,不会丢失信息,因此非常适合人工智能视频算法处理。整个视频切分过程不涉及到任何视频帧内的信息提取和运算,仅仅是切分后快速地拷贝并保存。视频切分后不需要进行视频重新编码,节省处理时间及资源消耗,从而加快视频切片处理速度。In some embodiments of the present application, the video data of each video slice is saved to obtain a save file of each slice containing the corresponding target I frame. For example, in the field of artificial intelligence video processing, since the I frames in each video slice are fully preserved, the content of the cut video frame can be completely restored without losing information, so it is very suitable for artificial intelligence video algorithm processing. The entire video segmentation process does not involve any information extraction and calculation within the video frame, but only quickly copying and saving after segmentation. After video segmentation, there is no need to re-encode the video, which saves processing time and resource consumption, thereby speeding up the processing speed of video slices.
Based on this, in the video slicing device of the embodiment of the present application, the acquisition module 710 acquires the video file to be segmented; the slicing module 720 segments the video file according to the preset search duration and the target I frames to obtain video slices, where the first frame of each video slice is a target I frame; and the saving module 730 saves the video data of each video slice to obtain slice files, each containing its corresponding target I frame. The user can set the preset search duration according to the desired slice length. Video frames of the file to be segmented are searched within the preset search duration, and the file is cut at the target I frames that are found, so every resulting slice begins with a target I frame. Because each slice contains its target I frame, the subsequent frame data can be fully decoded starting from that I frame and the frame information can be extracted, without decoding and re-encoding the video data, which saves a large amount of computation and speeds up processing. The embodiment can therefore cut the video into slices of the desired length while keeping the segmentation real-time, and it preserves the information in the video. This is particularly useful in artificial intelligence video processing: because the I frames in each slice are fully preserved, the content of the cut video frames can be restored completely without loss of information. The segmentation process involves no extraction of or computation on information inside the video frames; the data is simply copied and saved after cutting, and no re-encoding is required, which saves processing time and resources and speeds up video slicing.
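The division of work among the three modules can be sketched as three small classes wired in sequence. This is an illustrative in-memory sketch only: the class names, the `Frame` tuple layout, and the cut rule (a new slice opens at the first I frame at least the search duration after the previous cut) are assumptions, and real modules would read from and write to files.

```python
from typing import Dict, List, Tuple

# (timestamp in seconds, frame type, encoded payload) -- hypothetical layout
Frame = Tuple[float, str, bytes]


class AcquisitionModule:
    """Stands in for acquisition module 710: obtains the video to be cut."""

    def acquire(self, source: List[Frame]) -> List[Frame]:
        return list(source)


class SlicingModule:
    """Stands in for slicing module 720: cuts so each slice starts on an I frame."""

    def __init__(self, search_duration: float) -> None:
        self.search_duration = search_duration

    def slice(self, frames: List[Frame]) -> List[List[Frame]]:
        slices: List[List[Frame]] = []
        last_cut = None
        for frame in frames:
            ts, kind, _ = frame
            if kind == "I" and (last_cut is None or ts - last_cut >= self.search_duration):
                slices.append([])  # open a new slice at this target I frame
                last_cut = ts
            if slices:  # frames before the first I frame are skipped
                slices[-1].append(frame)
        return slices


class SavingModule:
    """Stands in for saving module 730: concatenates payloads without re-encoding."""

    def save(self, slices: List[List[Frame]]) -> Dict[str, bytes]:
        return {
            f"slice_{n}": b"".join(payload for _, _, payload in s)
            for n, s in enumerate(slices)
        }
```

A caller would chain them as `SavingModule().save(SlicingModule(3.0).slice(AcquisitionModule().acquire(frames)))`.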
The specific implementation of the video slicing device is substantially the same as that of the video slicing method described above and is not repeated here.
An embodiment of the present application further provides an electronic device comprising a memory and a processor. The memory stores a computer program, and the processor implements the video slicing method described above when executing the computer program. The electronic device may be any intelligent terminal, such as a tablet computer or an in-vehicle computer.
Referring to FIG. 8, which illustrates the hardware structure of an electronic device according to another embodiment, the electronic device includes:
The processor 801 may be implemented as a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes the relevant programs to implement the technical solutions provided in the embodiments of the present application.
The memory 802 may be implemented as a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 802 can store an operating system and other application programs. When the technical solutions provided in the embodiments of this specification are implemented in software or firmware, the relevant program code is stored in the memory 802 and is called by the processor 801 to execute the video slicing method of the embodiments of the present application: acquiring the video file to be segmented; segmenting the video file according to the preset search duration and a plurality of target I frames to obtain video slices, where the first frame of each video slice is a target I frame; and saving the video data of each video slice to obtain slice files, each containing its corresponding target I frame. The user can set the preset search duration according to the desired slice length. Video frames of the file to be segmented are searched within the preset search duration, and the file is cut at the target I frames that are found, so every resulting slice begins with a target I frame. Because each slice contains its target I frame, the subsequent frame data can be fully decoded starting from that I frame and the frame information can be extracted, without decoding and re-encoding the video data, which saves a large amount of computation and speeds up processing. The embodiment can therefore cut the video into slices of the desired length while keeping the segmentation real-time, and it preserves the information in the video. This is particularly useful in artificial intelligence video processing: because the I frames in each slice are fully preserved, the content of the cut video frames can be restored completely without loss of information. The segmentation process involves no extraction of or computation on information inside the video frames; the data is simply copied and saved after cutting, and no re-encoding is required, which saves processing time and resources and speeds up video slicing.
The input/output interface 803 is used to input and output information.
The communication interface 804 is used for communication between this device and other devices. Communication may be wired (for example, USB or a network cable) or wireless (for example, a mobile network, Wi-Fi, or Bluetooth).
The bus transmits information between the components of the device (for example, the processor 801, the memory 802, the input/output interface 803, and the communication interface 804).
The processor 801, the memory 802, the input/output interface 803, and the communication interface 804 are communicatively connected to one another inside the device via the bus.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the video slicing method described above.
The memory, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs and non-transitory computer-executable programs. The memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory may optionally include memory located remotely from the processor, and such remote memory may be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The video slicing method, video slicing device, electronic device, and storage medium provided in the embodiments of the present application acquire the video file to be segmented; segment the video file according to the preset search duration and a plurality of target I frames to obtain video slices, where the first frame of each video slice is a target I frame; and save the video data of each video slice to obtain slice files, each containing its corresponding target I frame. The user can set the preset search duration according to the desired slice length. Video frames of the file to be segmented are searched within the preset search duration, and the file is cut at the target I frames that are found, so every resulting slice begins with a target I frame. Because each slice contains its target I frame, the subsequent frame data can be fully decoded starting from that I frame and the frame information can be extracted, without decoding and re-encoding the video data, which saves a large amount of computation and speeds up processing. The embodiments can therefore cut the video into slices of the desired length while keeping the segmentation real-time, and they preserve the information in the video. This is particularly useful in artificial intelligence video processing: because the I frames in each slice are fully preserved, the content of the cut video frames can be restored completely without loss of information. The segmentation process involves no extraction of or computation on information inside the video frames; the data is simply copied and saved after cutting, and no re-encoding is required, which saves processing time and resources and speeds up video slicing.
Those of ordinary skill in the art will appreciate that all or some of the steps and systems in the methods disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable programs, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is known to those of ordinary skill in the art, communication media typically contain computer-readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The embodiments described herein are intended to explain the technical solutions of the present application more clearly and do not limit those technical solutions. Those skilled in the art will appreciate that, as technology evolves and new application scenarios emerge, the technical solutions provided in the embodiments of the present application are equally applicable to similar technical problems.
Those skilled in the art will appreciate that the technical solutions shown in the figures do not limit the embodiments of the present application, which may include more or fewer steps than illustrated, combine certain steps, or use different steps.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps in the methods disclosed above, and the functional modules/units in the systems and devices, may be implemented as software, firmware, hardware, or suitable combinations thereof.
The terms "first", "second", "third", "fourth", and the like (if any) in the specification of the present application and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments of the present application described herein can be implemented in orders other than those illustrated or described. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be understood that, in the present application, "at least one (item)" means one or more, and "a plurality of" means two or more. "And/or" describes the relationship between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" or similar expressions refers to any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or multiple.
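The seven readings of "at least one of a, b, or c" listed above are the non-empty combinations of the three items, which can be enumerated with a short sketch. The function name `at_least_one_of` is introduced here for illustration only.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def at_least_one_of(items: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return every non-empty combination of `items`, smallest first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


for combo in at_least_one_of(["a", "b", "c"]):
    print(" and ".join(combo))
```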
In the several embodiments provided in the present application, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes a plurality of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of the present application. The storage medium includes any medium that can store programs, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and this description does not limit the scope of rights of the embodiments of the present application. Any modification, equivalent substitution, or improvement made by those skilled in the art without departing from the scope and essence of the embodiments of the present application shall fall within that scope of rights.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411060573.3A CN118803385A (en) | 2024-08-02 | 2024-08-02 | Video slicing method and device, electronic device and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411060573.3A CN118803385A (en) | 2024-08-02 | 2024-08-02 | Video slicing method and device, electronic device and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN118803385A (en) | 2024-10-18 |
Family
ID=93031597
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411060573.3A Pending CN118803385A (en) | 2024-08-02 | 2024-08-02 | Video slicing method and device, electronic device and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118803385A (en) |
- 2024-08-02: CN application CN202411060573.3A, published as CN118803385A (en), status: Pending
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN109862391B (en) | Video classification method, medium, device and computing equipment | |
| US9418296B1 (en) | Detecting segments of a video program | |
| CN111670580B (en) | Progressive compressed domain computer vision and deep learning system | |
| EP3099074B1 (en) | Systems, devices and methods for video coding | |
| US9071814B1 (en) | Scene detection based on video encoding parameters | |
| US11086843B2 (en) | Embedding codebooks for resource optimization | |
| CN113704506B (en) | A media content deduplication method and related device | |
| US11600302B2 (en) | System and methods for autonomous synchronous rendering of media objects obtained from a plurality of media sources | |
| CN112291589A (en) | Video file structure detection method and device | |
| CN107370726B (en) | Virtual slicing method and system for distributed media file transcoding system | |
| US12143651B2 (en) | Method for on-demand video editing at transcode-time in a video streaming system | |
| WO2019127926A1 (en) | Calculation method and calculation device for sparse neural network, electronic device, computer readable storage medium, and computer program product | |
| WO2017162015A1 (en) | Data processing method and apparatus, and storage medium | |
| CN112383824A (en) | Video advertisement filtering method, device and storage medium | |
| US11743474B2 (en) | Shot-change detection using container level information | |
| CN112714336B (en) | Video segmentation method and device, electronic equipment and computer readable storage medium | |
| CN118803385A (en) | Video slicing method and device, electronic device and storage medium | |
| CN112019878B (en) | Video decoding and editing method, device, equipment and storage medium | |
| CN116132719A (en) | Video processing method, device, electronic equipment and readable storage medium | |
| CN106791909B (en) | Video data processing method and device and server | |
| WO2024138062A2 (en) | Co-optimization of hardware-based encoding and software-based encoding | |
| CN116916060A (en) | Video processing method and related equipment | |
| JP2006020330A (en) | Process and apparatus for compressing video documents | |
| CN114979643A (en) | Video coding method and device, electronic equipment and storage medium | |
| CN112218118A (en) | Audio and video clipping method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||