WO2019165723A1

WO2019165723A1 - Method and system for processing audio/video, and device and storage medium

Info

Publication number: WO2019165723A1
Application number: PCT/CN2018/091115
Authority: WO
Inventors: 袁晖
Original assignee: Shenzhen Ikmak Tech Co Ltd
Current assignee: Shenzhen Ikmak Tech Co Ltd
Priority date: 2018-02-28
Filing date: 2018-06-13
Publication date: 2019-09-06
Anticipated expiration: 2020-08-28
Also published as: CN108388649B; CN108388649A

Abstract

Disclosed are a method and system for processing an audio/video, and a device and a storage medium. The method comprises the steps of: generating a positioning information tag, an environmental meteorological information tag, and a time information tag of a location where an audio/video is recorded; matching to obtain a historical audio/video having information tags identical to the information tags above from a database; and processing and generating a target audio/video according to the recorded audio/video and the historical audio/video. By means of the method and system for processing the audio/video, and the device and the storage medium, the positioning information tag, the environmental meteorological information tag, and the time information tag of the location where the audio/video is recorded are added, so that the historical audio/video having identical information tags can be used as a reference of the recorded audio/video when the audio/video is processed, the audio/video can be edited and processed more accurately, the operation difficulty and the operation time of a user are reduced, and the authenticity of the target audio/video after processing is improved.

Description

Method, system, device and storage medium for processing audio and video

技术领域Technical field

本申请涉及到无线通讯领域，特别是涉及到一种处理音视频的方法、系统、设备及存储介质。The present application relates to the field of wireless communications, and in particular, to a method, system, device, and storage medium for processing audio and video.

背景技术Background technique

随着摄像技术的不断发展，越来越多的用户可以通过摄像机自行拍摄音视频，为了保证音视频的质量，在音视频拍摄完成后，通常还需要对拍摄的音视频进行编辑，例如对音视频中的部分片段进行提取、删除以及合并等。With the continuous development of camera technology, more and more users can shoot audio and video by themselves. In order to ensure the quality of audio and video, after the audio and video shooting is completed, it is usually necessary to edit the recorded audio and video, such as the sound. Some of the clips in the video are extracted, deleted, merged, and so on.

在现有技术中，当用户需要对音视频进行编辑时，用户通常需要通过个人计算机下载以及安装专业的音视频编辑软件，例如Adobe After Effects或Adobe Premiere等。In the prior art, when a user needs to edit audio and video, the user usually needs to download and install professional audio and video editing software through a personal computer, such as Adobe After. Effects or Adobe Premiere, etc.

现有的音视频编辑软件一般是根据内嵌的算法对音视频编辑部分及附近内容进行智能计算得出编辑后的音视频内容，该种方法对软件内嵌的算法依赖性强，音视频编辑后的真实性偏低，主要取决于算法的精确性以及使用者的个人能力水平，对使用者和编辑机器的要求高，极大的影响使用者的体验。The existing audio and video editing software generally performs intelligent calculation on the audio and video editing part and nearby content according to the embedded algorithm to obtain the edited audio and video content. This method has strong dependence on the software embedded algorithm, and the audio and video editing is strong. The low authenticity depends mainly on the accuracy of the algorithm and the level of personal ability of the user. The requirements on the user and the editing machine are high, which greatly affects the user experience.

发明内容Summary of the invention

本申请的主要目的为提供一种处理音视频的方法、系统、设备及存储介质，提高修改后音视频中场景的真实性。The main purpose of the present application is to provide a method, system, device, and storage medium for processing audio and video, which improves the authenticity of a scene in a modified audio and video.

为实现上述目的本申请提出一种处理音视频的方法，包括步骤：To achieve the above object, the present application provides a method for processing audio and video, including the steps of:

生成录制音视频所在位置的定位信息标签、环境气象信息标签和时间信息标签；Generating a positioning information label, an environmental weather information label, and a time information label at a position where the recorded audio and video are located;

从数据库中匹配出与上述定位信息标签、环境气象信息标签和时间信息标签相同或近似的历史音视频；Matching historical audio and video that is the same as or similar to the positioning information label, the environmental weather information label, and the time information label from the database;

根据上述录制音视频和历史音视频处理生成目标音视频。The target audio and video is generated according to the above-mentioned recorded audio and video and historical audio and video processing.

为实现上述目的本申请提出一种处理音视频的系统，包括：To achieve the above object, the present application proposes a system for processing audio and video, comprising:

生成模块，设置为生成录制音视频所在位置的定位信息标签、环境气象信息标签和时间信息标签；The generating module is configured to generate a positioning information label, an environmental weather information label, and a time information label at a position where the recorded audio and video are located;

匹配模块，设置为从数据库中匹配出上述信息标签完全相同的历史音视频；The matching module is configured to match the historical audio and video of the same information label from the database;

处理模块，设置为根据上述录制音视频和历史音视频处理生成目标音视频。The processing module is configured to generate a target audio and video according to the above-mentioned recorded audio and video and historical audio and video processing.

为实现上述目的本申请提出一种计算机设备，包括存储器、处理器以及存储在存储器上并可在处理器上运行的计算机程序，上述处理器执行上述程序时实现如下步骤：To achieve the above object, the present application provides a computer device including a memory, a processor, and a computer program stored on the memory and operable on the processor, and the processor performs the following steps when executing the program:

从数据库中匹配出与所述定位信息标签、环境气象信息标签和时间信息标签相同或近似的历史音视频；Matching historical audio and video that is the same as or similar to the positioning information label, the environmental weather information label, and the time information label from the database;

根据所述录制音视频和历史音视频处理生成目标音视频。The target audio and video is generated according to the recorded audio and video and historical audio and video processing.

为实现上述目的本申请提出一种计算机可读存储介质，其上存储有计算机程序，该程序被处理器执行时实现如下步骤：To achieve the above object, the present application provides a computer readable storage medium having stored thereon a computer program that, when executed by a processor, implements the following steps:

本申请的处理音视频的方法、系统、设备及存储介质的有益效果为：通过为录制音视频附加所在位置的定位信息标签、环境气象信息标签和时间信息标签，使在处理音视频时能通过信息标签相同的历史音视频做为该录制音视频的参考，从而能够更精确的将音视频进行编辑和处理，降低使用者的操作难度以及操作时间，并提高处理后目标音视频的真实性。The beneficial effects of the method, system, device and storage medium for processing audio and video of the present application are as follows: by adding a positioning information label, an environmental weather information label and a time information label of a location for recording audio and video, so that the audio and video can be processed when the audio and video are processed. The historical audio and video with the same information label is used as a reference for the recorded audio and video, so that the audio and video can be edited and processed more accurately, the operation difficulty and operation time of the user are reduced, and the authenticity of the target audio and video after processing is improved.

附图说明DRAWINGS

图1 为本申请一实施例的处理音视频的方法的流程示意图；1 is a schematic flow chart of a method for processing audio and video according to an embodiment of the present application;

图2 为本申请一实施例的处理音视频的方法的流程示意图；2 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;

图3 为本申请一实施例的处理音视频的方法的流程示意图；FIG. 3 is a schematic flowchart diagram of a method for processing audio and video according to an embodiment of the present application; FIG.

图4 为本申请一实施例的处理音视频的方法的流程示意图；FIG. 4 is a schematic flowchart diagram of a method for processing audio and video according to an embodiment of the present application; FIG.

图5 为本申请一实施例的处理音视频的方法的流程示意图；FIG. 5 is a schematic flowchart diagram of a method for processing audio and video according to an embodiment of the present application;

图6 为本申请一实施例的处理音视频的方法的流程示意图；6 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;

图7 为本申请一实施例的处理音视频的方法的流程示意图；FIG. 7 is a schematic flowchart diagram of a method for processing audio and video according to an embodiment of the present application;

图8 为本申请一实施例的处理音视频的系统的模块结构示意图；FIG. 8 is a schematic structural diagram of a module for processing an audio and video according to an embodiment of the present application; FIG.

图9 为本申请一实施例的处理音视频的系统的模块结构示意图；FIG. 9 is a schematic structural diagram of a module for processing an audio and video according to an embodiment of the present application; FIG.

图10 为本申请一实施例的处理音视频的系统的模块结构示意图；FIG. 10 is a schematic structural diagram of a module for processing an audio and video according to an embodiment of the present application; FIG.

图11 为本申请一实施例的处理音视频的系统的模块结构示意图；FIG. 11 is a schematic structural diagram of a module for processing an audio and video according to an embodiment of the present application; FIG.

图12 为本申请一实施例的处理音视频的系统的模块结构示意图；FIG. 12 is a schematic structural diagram of a module for processing an audio and video according to an embodiment of the present application; FIG.

图13 为本申请一实施例的处理音视频的系统的模块结构示意图；FIG. 13 is a schematic structural diagram of a module for processing an audio and video according to an embodiment of the present application; FIG.

图14 为本申请一实施例的处理音视频的系统的模块结构示意图；FIG. 14 is a schematic structural diagram of a module for processing an audio and video according to an embodiment of the present application; FIG.

图15 为本申请一实施例的一种计算机设备的结构示意图。FIG. 15 is a schematic structural diagram of a computer device according to an embodiment of the present application.

本申请目的实现、功能特点及优点将结合实施例，参照附图做进一步说明。The object of the present application, the features and advantages of the present application will be further described with reference to the accompanying drawings.

具体实施方式Detailed ways

下面将结合本申请实施例中的附图，对本申请实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本申请的一部分实施例，而不是全部的实施例。基于本申请中的实施例，本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例，都属于本申请保护的范围。The technical solutions in the embodiments of the present application are clearly and completely described in the following with reference to the drawings in the embodiments of the present application. It is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without departing from the inventive scope are the scope of the present application.

另外，在本申请中涉及“第一”、“第二”等的描述仅用于描述目的，而不能理解为指示或暗示其相对重要性或者隐含指明所指示的技术特征的数量。由此，限定有“第一”、“第二”的特征可以明示或者隐含地包括至少一个该特征。另外，各个实施例之间的技术方案可以相互结合，但是必须是以本领域普通技术人员能够实现为基础，当技术方案的结合出现相互矛盾或无法实现时应当认为这种技术方案的结合不存在，也不在本申请要求的保护范围之内。In addition, the descriptions of "first", "second", and the like in this application are used for the purpose of description only, and are not to be construed as indicating or implying their relative importance or implicitly indicating the number of technical features indicated. Thus, features defining "first" or "second" may include at least one of the features, either explicitly or implicitly. In addition, the technical solutions between the various embodiments may be combined with each other, but must be based on the realization of those skilled in the art, and when the combination of the technical solutions is contradictory or impossible to implement, it should be considered that the combination of the technical solutions does not exist. Nor is it within the scope of protection required by this application.

参照图1，在本申请实施例中，本申请提供一种处理音视频的方法，包括步骤：Referring to FIG. 1, in the embodiment of the present application, the application provides a method for processing audio and video, including the steps of:

S101、生成录制音视频所在位置的定位信息标签、环境气象信息标签和时间信息标签；S101. Generate a positioning information label, an environmental weather information label, and a time information label at a location where the recorded audio and video are located;

S102、从数据库中匹配出与上述定位信息标签、环境气象信息标签和时间信息标签相同或近似的历史音视频；S102. Matching, from the database, a historical audio and video that is the same as or similar to the positioning information label, the environmental weather information label, and the time information label;

S103、根据上述录制音视频和历史音视频处理生成目标音视频。S103. Generate target audio and video according to the foregoing recording audio and video and historical audio and video processing.

如上述步骤S101，生成录制音视频所在位置的定位信息标签、环境气象信息标签和时间信息标签，在上述录制音视频录制完成后，对该录制设备进行所在位置的定位信息获取，并将获取到的定位信息整合于上述录制音视频的数据中，形成上述录制音视频的定位信息标签，上述定位信息标签一般以经纬度作为表现数值，在确认上述定位信息标签后，根据上述定位信息标签的定位信息查询该地区的环境气象参数和实时时间，该参数一般包括——但不限于——温度、气压、湿度、风向、风速、光照强度、紫外线强度和天气，将上述环境气象参数和实时时间整合于上述录制音视频的数据中，得出上述环境气象信息标签和时间信息标签，In the above step S101, the positioning information label, the environmental meteorological information label, and the time information label of the location where the recorded audio and video are located are generated. After the recording of the recorded audio and video is completed, the positioning information of the location of the recording device is obtained, and the information is obtained. The positioning information is integrated into the data of the recorded audio and video to form a positioning information label of the recorded audio and video. The positioning information label generally uses latitude and longitude as the performance value. After confirming the positioning information label, according to the positioning information of the positioning information label. Query the environmental meteorological parameters and real-time time of the area, which generally includes – but is not limited to – temperature, pressure, humidity, wind direction, wind speed, light intensity, UV intensity and weather, and integrates the above environmental meteorological parameters and real-time time In the data of the above recorded audio and video, the environmental weather information label and the time information label are obtained.

如上述步骤S102，从数据库中匹配出与上述定位信息标签、环境气象信息标签和时间信息标签相同或近似的历史音视频，在数据库中对上述定位信息标签、环境气象信息标签和时间信息标签进行匹配搜索以得出与上述录制音视频标记有相同上述定位信息标签、环境气象信息标签和时间信息标签与的历史音视频，In the above step S102, the historical audio and video that is the same as or similar to the positioning information label, the environmental weather information label, and the time information label is matched from the database, and the positioning information label, the environmental weather information label, and the time information label are performed in the database. Matching the search to obtain a historical audio and video with the same positioning information label, environmental weather information label, and time information label as the above recorded audio and video mark,

如上述步骤S103，根据上述录制音视频和历史音视频处理生成目标音视频，通过上述步骤S102获取到足够的历史音视频后对上述录制音视频进行编辑处理以获得处理后的上述目标音视频，其中，上述目标音视频一般包括——但不限于——3D音视频或特定场景音视频等，上述处理一般包括——但不限于——视频的修剪和删除或增加视频中指定图像，其中在通过上述步骤S102获取到足够的历史音视频后，还可以对上述录制音视频进行真实性的检测，该检测过程仅需要通过对比上述历史音视频与上述录制音视频的差异比例，即可得出视频是否被进行修改，该检测过程比现今的通过检查视频数据是否有修改痕迹更简便、耗时更短。The target audio and video is generated according to the above-mentioned recorded audio and video and historical audio and video processing, and the recorded audio and video is edited after the sufficient historical audio and video is obtained in the above step S102 to obtain the processed target audio and video. Wherein, the above target audio and video generally includes, but is not limited to, 3D audio and video or specific scene audio and video, etc., and the above processing generally includes, but is not limited to, trimming and deleting of video or adding a specified image in the video, wherein After the sufficient historical audio and video is obtained through the foregoing step S102, the recorded audio and video may be authentically detected. The detection process only needs to compare the difference between the historical audio and video and the recorded audio and video. Whether the video is modified or not, this detection process is easier and less time consuming than checking whether the video data has a modified trace.

参照图2，在本实施例中，在上述的处理音视频的方法中，上述处理生成目标音视频的步骤，包括步骤：Referring to FIG. 2, in the embodiment, in the method for processing audio and video, the foregoing step of generating a target audio and video includes the following steps:

S131、选取上述录制音视频中的目标区域；S131. Select a target area in the recorded audio and video.

S132、获取上述历史音视频中对应目标区域的图像数据；S132. Acquire image data of a corresponding target area in the historical audio and video.

S133、将上述图像数据替换上述目标区域的图像数据生成上述目标音视频。S133. The image data is replaced by the image data of the target area to generate the target audio and video.

如上述步骤S131，选取上述录制音视频中的目标区域，在选取目标区域是一般从上述录制音视频中指定一帧画面作为选取基础，其中，上述目标区域一般为非固定区域，当使用者选取上述目标区域后，对上述目标区域中的目标图像进行选择，在选择上述目标图像后，对上述录制音视频中的其余帧数画面进行目标图像的识别以获取与上述目标图像相同或相近的图像并将该图像所在位置重新定义为上述目标区域，从而获得目标区域的运动轨迹，In the above step S131, the target area in the recorded audio and video is selected, and the selected target area is generally selected from the recorded audio and video as a selection basis, wherein the target area is generally a non-fixed area, and is selected by the user. After the target area is selected, the target image in the target area is selected, and after the target image is selected, the target image is identified on the remaining number of frames in the recorded audio and video to obtain an image that is the same as or similar to the target image. And re-defining the position of the image as the above target area, thereby obtaining the motion track of the target area.

如上述步骤S132，获取上述历史音视频中对应目标区域的图像数据，在上述步骤S131执行后，将上述目标区域的运动轨迹设置在上述历史音视频中，并获取上述目标区域的运动轨迹说覆盖的图像数据，Obtaining the image data of the corresponding target area in the historical audio and video as described above, after the step S131 is performed, setting the motion track of the target area in the historical audio and video, and acquiring the motion track coverage of the target area. Image data,

如上述步骤S133，将上述图像数据替换上述目标区域的图像数据生成上述目标音视频。In step S133 described above, the image data is replaced with the image data of the target area to generate the target audio and video.

参照图3，在本实施例中，在上述的处理音视频的方法中，上述处理生成目标音视频的步骤，包括步骤：Referring to FIG. 3, in the foregoing method, in the method for processing audio and video, the foregoing step of generating a target audio and video includes the following steps:

S134、选取上述录制音视频和上述历史音视频中的目标图像；S134. Select the recorded audio and video and the target image in the historical audio and video;

S135、将上述目标图像进行整合得到上述目标图像的三维数据；S135. Integrate the target image to obtain three-dimensional data of the target image.

S136、根据上述目标图像的三维数据生成上述目标音视频。S136. Generate the target audio and video according to the three-dimensional data of the target image.

如上述步骤S134，选取上述录制音视频和上述历史音视频中的目标图像，使用者从上述录制音视频和历史音视频中分别选取对应的目标图像，每个从上述录制音视频或上述历史音视频中选取的目标图像一般为——但不限于——指定物件或场景在该音视频视角下的图像，其中，上述图像一般包括——但不限于——该指定物件或场景的正视图、后视图、左视图、右视图、俯视图和仰视图中的多张，还可以根据实际上述历史音视频的数量以及录制视角的不同或该指定物件或场景的复杂程度增加其他视角的图像或减少对应视角的图像；In step S134, the target audio image and the target image in the historical audio and video are selected, and the user selects a corresponding target image from the recorded audio and video and the historical audio and video, respectively, each of the recorded audio and video or the historical sound. The target image selected in the video is generally - but not limited to - specifying an image of the object or scene at the perspective of the audio and video, wherein the image generally includes - but is not limited to - a front view of the specified object or scene, Multiple sheets in the rear view, left view, right view, top view, and bottom view. It is also possible to increase the image of other views or reduce the corresponding image according to the actual number of the above-mentioned historical audio and video and the difference of the recorded viewing angle or the complexity of the specified object or scene. An image of a perspective;

如上述步骤S135，将上述目标图像进行整合得到上述目标图像的三维数据，将通过上述步骤S134获得的上述目标图像进行对应的三维合并，整合形成上述目标图像的三维数据，以获得指定物件或场景的三维图像，In step S135, the target image is integrated to obtain the three-dimensional data of the target image, and the target image obtained by the above step S134 is subjected to corresponding three-dimensional merging, and the three-dimensional data of the target image is integrated to obtain a specified object or scene. Three-dimensional image,

如上述步骤S136，根据上述目标图像的三维数据生成上述目标音视频，根据上述目标图像的三维数据，即指定物件或场景的三维图像配合指定的音频或视频图像生成上述目标音视频。The step S136 is configured to generate the target audio and video according to the three-dimensional data of the target image, and generate the target audio and video according to the three-dimensional data of the target image, that is, the three-dimensional image of the specified object or the scene, with the specified audio or video image.

参照图4，在本实施例中，在上述的处理音视频的方法中，上述生成录制音视频的对应定位信息标签、环境气象信息标签和时间信息标签的步骤，包括步骤：Referring to FIG. 4, in the foregoing method, in the method for processing audio and video, the step of generating a corresponding positioning information label, an environmental weather information label, and a time information label for recording audio and video includes the following steps:

S111、获取摄录设备所在位置的定位信息；S111. Obtain positioning information of a location where the video recording device is located.

S112、根据上述定位信息获取上述定位信息对应位置的实时环境气象信息和当前时间信息；S112. Acquire real-time environmental weather information and current time information of the location corresponding to the positioning information according to the foregoing positioning information.

S113、根据上述定位信息、实时环境气象信息和当前时间信息生成上述定位信息标签、环境气象信息标签和时间信息标签并附加在上述录制音视频中。S113. The positioning information label, the environmental weather information label, and the time information label are generated according to the positioning information, the real-time environmental weather information, and the current time information, and are added to the recorded audio and video.

如上述步骤S111，获取摄录设备所在位置的定位信息，在获取录像设备所在位置的定位信息时获取的定位信息一般为——但不限于——经纬度，还包括海拔高度，但当海拔高度对音视频的内容影响较小甚至无时，使用者可手动取消对海拔高度的获取，仅保存经纬度，If the location information of the location of the video recording device is obtained, the location information obtained when acquiring the location information of the location of the video recording device is generally - but not limited to - latitude and longitude, and includes altitude, but when the altitude is The content of audio and video has little or no effect. Users can manually cancel the acquisition of altitude and save only the latitude and longitude.

如上述步骤S112，根据上述定位信息获取上述定位信息对应位置的实时环境气象信息和当前时间信息，在通过上述步骤S111获得上述定位信息后通过连接该地区的当地气象信息共享平台或气象网站获取对应位置的环境气象信息，在通过该地区的当地气象信息共享平台或气象网站获取对应位置的环境气象信息的同时，进行实时时间的同步，以获取对应位置的上述当前时间信息，上述当前时间信息一般为该位置对应的时区实时时间，获取上述当前时间信息还可以通过获取上述录制音视频的录制起始时间或录制结束时间，再根据录制设备中的计时模块将上述录制音视频的各帧时间进行换算，Obtaining the real-time environmental meteorological information and the current time information of the location corresponding to the location information according to the foregoing location information, and obtaining the corresponding location information by using the local weather information sharing platform or the weather website connected to the region after obtaining the location information in step S111. The environmental meteorological information of the location is obtained by synchronizing the real-time time while obtaining the environmental meteorological information of the corresponding location through the local meteorological information sharing platform or the meteorological website in the area, to obtain the current time information of the corresponding location, and the current time information is generally For the time zone real time corresponding to the location, obtaining the current time information may also obtain the recording start time or the recording end time of the recorded audio and video, and then perform the frame time of the recorded audio and video according to the timing module in the recording device. Conversion,

如上述步骤S113，根据上述定位信息、实时环境气象信息和当前时间信息生成上述定位信息标签、环境气象信息标签和时间信息标签并附加在上述录制音视频中。In the above step S113, the positioning information label, the environmental weather information label and the time information label are generated according to the positioning information, the real-time environmental weather information and the current time information, and are added to the recorded audio and video.

参照图5，在本实施例中，在上述的处理音视频的方法中，在上述根据上述录制音视频和历史音视频处理生成目标音视频的步骤之后，一般还包括步骤：Referring to FIG. 5, in the foregoing method, in the foregoing method for processing audio and video, after the step of generating the target audio and video according to the foregoing recording audio and video and historical audio and video processing, the method further includes the following steps:

S401、将上述录制音视频存储至上述数据库对应的音视频列表中。S401. Store the recorded audio and video into the audio and video list corresponding to the database.

如上述步骤S401，将上述录制音视频存储至上述数据库对应的音视频列表中，在上述步骤S103执行后，将上述录制音视频存储至对应的上述音视频列表以达到更新上述音视频列表的目的，已达到对位置上景物变化的最优化同步和更新。In the above step S401, the recorded audio and video is stored in the audio and video list corresponding to the database, and after the step S103 is performed, the recorded audio and video is stored in the corresponding audio and video list to achieve the purpose of updating the audio and video list. , optimized synchronization and update of changes in location.

参照图5，在本实施例中，在上述的处理音视频的方法中，在上述生成录制音视频所在位置的定位信息标签、环境气象信息标签和时间信息标签的步骤之前，还包括步骤：Referring to FIG. 5, in the foregoing method, in the foregoing method for processing audio and video, before the step of generating the positioning information label, the environmental weather information label, and the time information label at the position where the recorded audio and video is located, the method further includes the following steps:

S501、生成对应不同定位信息标签、环境气象信息标签和时间信息标签的内容组合的音视频列表；S501. Generate an audio and video list corresponding to a combination of different positioning information labels, environmental weather information labels, and time information labels.

S601、获取与上述音视频列表对应的历史音视频，并将上述历史音视频存储至对应的上述音视频列表中；S601. Acquire a historical audio and video corresponding to the audio and video list, and store the historical audio and video into the corresponding audio and video list.

S701、将上述音视频列表及内容归集形成上述数据库。S701. Collect the audio and video list and the content to form the database.

如上述步骤S501，生成对应不同定位信息标签、环境气象信息标签和时间信息标签的内容组合的音视频列表，根据不同的上述定位信息标签、环境气象信息标签和时间信息标签的内容组合生成不同的上述音视频列表，且不同的上述音视频列表中的上述定位信息标签、环境气象信息标签和时间信息标签的上述内容组合互不相同，Step S501, generating an audio and video list corresponding to the content combination of the different positioning information label, the environmental weather information label, and the time information label, and generating different according to different content combinations of the positioning information label, the environmental weather information label, and the time information label. The audio and video list, and the content combinations of the positioning information label, the environmental weather information label, and the time information label in the different audio and video lists are different from each other.

如上述步骤S601，获取与上述音视频列表对应的历史音视频，并将上述历史音视频存储至对应的上述音视频列表中，在上述步骤S501执行后，根据生成的上述音视频列表将对应的历史音视频存储至上述音视频列表，其中，获取上述历史音视频的方法一般包括——但不限于——进行预录制或从云共享数据库进行获取，Obtaining the historical audio and video corresponding to the audio and video list, and storing the historical audio and video into the corresponding audio and video list, and performing the above-mentioned audio and video list according to the generated audio and video list after the step S501 is performed. The historical audio and video is stored in the above audio and video list, wherein the method for obtaining the historical audio and video generally includes, but is not limited to, performing pre-recording or acquiring from a cloud sharing database.

如上述步骤S701，将上述音视频列表及内容归集形成上述数据库，当上述步骤S601执行后，将上述生成的音视频列表以及储存于列表的上述历史音视频整合成上述数据库。In step S701, the audio and video list and the content are collected to form the database. After the step S601 is performed, the generated audio and video list and the historical audio and video stored in the list are integrated into the database.

参照图6，在本实施例中，在上述的处理音视频的方法中，上述获取与上述音视频列表对应的历史音视频的步骤，包括步骤：Referring to FIG. 6, in the foregoing method, in the method for processing audio and video, the step of acquiring the historical audio and video corresponding to the audio and video list includes the following steps:

S610、通过将云共享数据库中现有的音视频进行定位信息计算、环境气象信息查询和录制对应时间判定生成上述历史音视频。S610. The historical audio and video is generated by performing positioning information calculation, environment meteorological information query, and recording corresponding time determination in the existing audio and video in the cloud sharing database.

如上述步骤S610，通过将云共享数据库中现有的音视频进行定位信息计算、环境气象信息查询和录制对应时间判定生成上述历史音视频。As described above, in step S610, the historical audio and video is generated by performing positioning information calculation, environmental weather information query, and recording corresponding time determination in the existing audio and video in the cloud sharing database.

参照图6，在一具体本实施例中，在上述的处理音视频的方法中，上述获取与上述音视频列表对应的历史音视频的步骤还可以通过一下步骤对上述步骤S610进行替换，该步骤包括：Referring to FIG. 6 , in a specific embodiment, in the foregoing method for processing audio and video, the step of acquiring the historical audio and video corresponding to the audio and video list may further replace the foregoing step S610 by using the following steps. include:

S620、进行音视频的预录制并对其添加对应的定位信息标签、环境气象信息标签和时间信息标签形成上述历史音视频。S620: Perform pre-recording of audio and video and add a corresponding positioning information label, an environmental weather information label, and a time information label to form the historical audio and video.

如上述步骤S620，进行音视频的预录制并对其添加对应的定位信息标签、环境气象信息标签和时间信息标签形成上述历史音视频，通过预录制大量的音视频并将对应的定位信息标签、环境气象信息标签和时间信息标签附加在该音视频上后形成上述历史音视频。As in the above step S620, the pre-recording of the audio and video is performed, and the corresponding positioning information label, the environmental weather information label and the time information label are added to form the historical audio and video, and a plurality of audio and video are pre-recorded and the corresponding positioning information label is The environmental weather information tag and the time information tag are attached to the audio and video to form the above-mentioned historical audio and video.

参照图7，在本实施例中，在上述的处理音视频的方法中，上述定位信息计算包括步骤：Referring to FIG. 7, in the embodiment, in the foregoing method for processing audio and video, the foregoing positioning information calculation includes the following steps:

S611、抽取上述音视频中指定帧数的画面；S611. Extract a picture of a specified number of frames in the audio and video.

S612、将上述画面中的标志性图像进行图像识别和搜索获取上述标志性图像的所属地理位置；S612, performing image recognition and searching on the iconic image in the above-mentioned screen to obtain a geographic location of the iconic image;

S613、根据上述画面中两个或两个以上指定图像的尺寸比例以及对应实物之间的尺寸比例计算出上述音视频的录制距离；S613, calculating a recording distance of the audio and video according to a size ratio of two or more specified images in the above picture and a size ratio between the corresponding objects;

S614、根据上述标志性图像的内容、地理位置和录制距离计算得出上述定位信息。S614. Calculate the positioning information according to the content, the geographical location, and the recording distance of the iconic image.

如上述步骤S611，抽取上述音视频中指定帧数的画面，其中上述指定帧数的画面一般为包括有突出识别特征或标志性的换面，如地标建筑、物品、人物等，In the above step S611, a picture of the specified number of frames in the audio and video is extracted, wherein the picture of the specified number of frames is generally a face change including a prominent recognition feature or a landmark, such as a landmark building, an item, a character, and the like.

如上述步骤S612，将上述画面中的标志性图像进行图像识别和搜索获取上述标志性图像的所属地理位置，其中，该标志性图像一般又使用者进行选取，当使用者放弃选取时选择画面中图像颜色差异性较大的区域或历史选择次数较多的区域定义为标志性的图像，上述标志性的图像包括至少一个，一般优选包括——但不限于——两个，In the above step S612, the iconic image in the above-mentioned screen is image-recognized and searched to obtain the geographic location of the iconic image, wherein the iconic image is generally selected by the user, and the user selects the image when the user gives up the selection. An area where the image color difference is large or a region with a large number of historical selections is defined as an iconic image, and the above-mentioned iconic image includes at least one, and generally preferably includes, but is not limited to, two.

如上述步骤S613，根据上述画面中两个或两个以上指定图像的尺寸比例以及对应实物之间的尺寸比例计算出上述音视频的录制距离，获取上述画面中的多个指定图像的尺寸比例或两个图像之间的距离与实际场景的比值，从而得出上述录制设备与指定图像中的实际场景的录制距离，In step S613, the recording distance of the audio and video is calculated according to the size ratio of the two or more specified images in the screen and the size ratio between the corresponding objects, and the size ratio of the plurality of specified images in the screen is obtained or The ratio of the distance between the two images to the actual scene, thereby obtaining the recording distance of the above-mentioned recording device and the actual scene in the specified image,

如上述步骤S614，根据上述标志性图像的内容、地理位置和录制距离计算得出上述定位信息。According to the above step S614, the positioning information is calculated according to the content, the geographical location and the recording distance of the iconic image.

参照图8，在本申请实施例中，本申请提供一种处理音视频的系统，包括：Referring to FIG. 8, in the embodiment of the present application, the application provides a system for processing audio and video, including:

生成模块101，设置为生成录制音视频所在位置的定位信息标签、环境气象信息标签和时间信息标签；The generating module 101 is configured to generate a positioning information label, an environmental weather information label, and a time information label of the location where the recorded audio and video are located;

匹配模块102，设置为从数据库中匹配出与上述定位信息标签、环境气象信息标签和时间信息标签相同或近似的历史音视频；The matching module 102 is configured to match, from the database, historical audio and video that is the same as or similar to the positioning information label, the environmental weather information label, and the time information label;

处理模块103，设置为根据上述录制音视频和历史音视频处理生成目标音视频。The processing module 103 is configured to generate a target audio and video according to the recorded audio and video and historical audio and video processing.

上述生成模块101，一般用于生成录制音视频所在位置的定位信息标签、环境气象信息标签和时间信息标签，在上述录制音视频录制完成后，对该录制设备进行所在位置的定位信息获取，并将获取到的定位信息整合于上述录制音视频的数据中，形成上述录制音视频的定位信息标签，上述定位信息标签一般以经纬度作为表现数值，在确认上述定位信息标签后，根据上述定位信息标签的定位信息查询该地区的环境气象参数和实时时间，该参数一般包括——但不限于——温度、气压、湿度、风向、风速、光照强度、紫外线强度和天气，将上述环境气象参数和实时时间整合于上述录制音视频的数据中，得出上述环境气象信息标签和时间信息标签，The generating module 101 is generally configured to generate a positioning information label, an environmental weather information label, and a time information label at a position where the recorded audio and video is located, and obtain the positioning information of the location of the recording device after the recording of the recorded audio and video is completed, and The obtained positioning information is integrated into the data of the recorded audio and video to form a positioning information label of the recorded audio and video. The positioning information label generally has a latitude and longitude as a performance value, and after confirming the positioning information label, according to the positioning information label. The location information queries the environmental meteorological parameters and real-time time of the area, which generally includes, but is not limited to, temperature, air pressure, humidity, wind direction, wind speed, light intensity, ultraviolet intensity and weather, and the above-mentioned environmental meteorological parameters and real-time The time is integrated into the data of the recorded audio and video, and the environmental weather information label and the time information label are obtained.

上述匹配模块102，一般用于从从数据库中匹配出与上述定位信息标签、环境气象信息标签和时间信息标签相同或近似的历史音视频，在数据库中对上述定位信息标签、环境气象信息标签和时间信息标签进行匹配搜索以得出与上述录制音视频标记有相同上述定位信息标签、环境气象信息标签和时间信息标签与的历史音视频，The matching module 102 is generally configured to match historical audio and video that is the same as or similar to the positioning information label, the environmental weather information label, and the time information label from the database, and the positioning information label, the environmental meteorological information label, and the database in the database. The time information tag performs a matching search to obtain a historical audio and video with the same positioning information tag, environmental weather information tag, and time information tag as the above recorded audio and video tag.

上述处理模块103，一般用于根据上述录制音视频和历史音视频处理生成目标音视频，通过上述上述匹配模块102获取到足够的历史音视频后对上述录制音视频进行编辑处理以获得处理后的上述目标音视频，其中，上述目标音视频一般包括——但不限于——3D音视频或特定场景音视频等，上述处理一般包括——但不限于——视频的修剪和删除或增加视频中指定图像，其中在通过上述匹配模块102获取到足够的历史音视频后，还可以对上述录制音视频进行真实性的检测，该检测过程仅需要通过对比上述历史音视频与上述录制音视频的差异比例，即可得出视频是否被进行修改，该检测过程比现今的通过检查视频数据是否有修改痕迹更简便、耗时更短。The processing module 103 is generally configured to generate a target audio and video according to the recorded audio and video and the historical audio and video processing, and after the foregoing matching module 102 obtains sufficient historical audio and video, edit the recorded audio and video to obtain the processed audio and video. The above-mentioned target audio and video, wherein the above-mentioned target audio and video generally includes, but is not limited to, 3D audio and video or specific scene audio and video, etc., and the above processing generally includes, but is not limited to, trimming and deleting of video or adding video. An image is specified, wherein after the sufficient historical audio and video is obtained by the matching module 102, the recorded audio and video may be authentically detected, and the detection process only needs to compare the difference between the historical audio and video and the recorded audio and video. The ratio can be used to determine whether the video has been modified. This detection process is easier and less time-consuming than checking whether the video data has a modified trace.

参照图9，在本实施例中，在上述的处理音视频的系统中，上述处理模块103包括：Referring to FIG. 9, in the embodiment, in the above system for processing audio and video, the processing module 103 includes:

第一选取模块131，设置为选取上述录制音视频中的目标区域；The first selecting module 131 is configured to select a target area in the recorded audio and video;

图像数据模块132，设置为获取上述历史音视频中对应目标区域的图像数据；The image data module 132 is configured to acquire image data of a corresponding target area in the historical audio and video;

替换模块133，设置为将上述图像数据替换上述目标区域的图像数据生成上述目标音视频。The replacement module 133 is configured to generate the target audio and video by replacing the image data with the image data of the target area.

上述第一选取模块131，一般用于选取上述录制音视频中的目标区域，在选取目标区域是一般从上述录制音视频中指定一帧画面作为选取基础，其中，上述目标区域一般为非固定区域，当使用者选取上述目标区域后，对上述目标区域中的目标图像进行选择，在选择上述目标图像后，对上述录制音视频中的其余帧数画面进行目标图像的识别以获取与上述目标图像相同或相近的图像并将该图像所在位置重新定义为上述目标区域，从而获得目标区域的运动轨迹，The first selection module 131 is generally configured to select a target area in the recorded audio and video. In the selection of the target area, a frame frame is generally selected from the recorded audio and video as a selection basis, where the target area is generally a non-fixed area. After the user selects the target area, selecting a target image in the target area, and after selecting the target image, performing target image recognition on the remaining number of frames in the recorded audio and video to obtain the target image. The same or similar image and the position of the image is redefined as the above target area, thereby obtaining the motion track of the target area,

上述图像数据模块132，一般用于获取上述历史音视频中对应目标区域的图像数据，在上述第一选取模块131执行后，将上述目标区域的运动轨迹设置在上述历史音视频中，并获取上述目标区域的运动轨迹说覆盖的图像数据，The image data module 132 is configured to acquire image data of a corresponding target area in the historical audio and video, and after the first selection module 131 is executed, set a motion track of the target area in the historical audio and video, and obtain the foregoing The trajectory of the target area says the image data covered,

上述替换模块133，一般用于将上述图像数据替换上述目标区域的图像数据生成上述目标音视频。The replacement module 133 is generally configured to generate the target audio and video by replacing the image data with the image data of the target area.

参照图10，在本实施例中，在上述的处理音视频的系统中，上述处理模块103包括：Referring to FIG. 10, in the embodiment, in the above system for processing audio and video, the processing module 103 includes:

第二选取模块134，设置为选取上述录制音视频和上述历史音视频中的目标图像；The second selecting module 134 is configured to select the target audio image and the target image in the historical audio and video;

整合模块135，设置为将上述目标图像进行整合得到上述目标图像的三维数据；The integration module 135 is configured to integrate the target image to obtain three-dimensional data of the target image.

音视频生成模块136，设置为根据上述目标图像的三维数据生成上述目标音视频。The audio and video generation module 136 is configured to generate the target audio and video according to the three-dimensional data of the target image.

上述第二选取模块134，一般用于选取上述录制音视频和上述历史音视频中的目标图像，使用者从上述录制音视频和历史音视频中分别选取对应的目标图像，每个从上述录制音视频或上述历史音视频中选取的目标图像一般为——但不限于——指定物件或场景在该音视频视角下的图像，其中，上述图像一般包括——但不限于——该指定物件或场景的正视图、后视图、左视图、右视图、俯视图和仰视图中的多张，还可以根据实际上述历史音视频的数量以及录制视角的不同或该指定物件或场景的复杂程度增加其他视角的图像或减少对应视角的图像；The second selection module 134 is generally configured to select the target audio image and the target image in the historical audio and video, and the user selects a corresponding target image from the recorded audio and video and the historical audio and video, respectively. The target image selected in the video or the above-mentioned historical audio and video is generally - but not limited to - an image specifying the object or scene in the audio and video perspective, wherein the image generally includes - but is not limited to - the specified object or Multiple views in the front view, rear view, left view, right view, top view, and bottom view of the scene, and other views can be added according to the actual number of historical audio and video and the difference of the recorded viewing angle or the complexity of the specified object or scene. Image or reduce the image of the corresponding perspective;

上述整合模块135，一般用于将上述目标图像进行整合得到上述目标图像的三维数据，将通过上述第二选取模块134获得的上述目标图像进行对应的三维合并，整合形成上述目标图像的三维数据，以获得指定物件或场景的三维图像，The integration module 135 is generally configured to integrate the target image to obtain three-dimensional data of the target image, and perform corresponding three-dimensional merging of the target image obtained by the second selection module 134 to integrate three-dimensional data of the target image. To get a three-dimensional image of the specified object or scene,

上述音视频生成模块136，一般用于根据上述目标图像的三维数据生成上述目标音视频，根据上述目标图像的三维数据，即指定物件或场景的三维图像配合指定的音频或视频图像生成上述目标音视频。The audio and video generating module 136 is configured to generate the target audio and video according to the three-dimensional data of the target image, and generate the target sound according to the three-dimensional data of the target image, that is, the three-dimensional image of the specified object or the scene and the specified audio or video image. video.

参照图11，在本实施例中，在上述的处理音视频的系统中，上述生成模块101包括：Referring to FIG. 11, in the embodiment, in the above system for processing audio and video, the generating module 101 includes:

第一获取子模块111，设置为获取摄录设备所在位置的定位信息；The first obtaining sub-module 111 is configured to obtain positioning information of a location where the recording device is located;

第二获取子模块112，设置为根据上述定位信息获取上述定位信息对应位置的实时环境气象信息和当前时间信息；The second obtaining sub-module 112 is configured to acquire real-time environmental meteorological information and current time information of the location corresponding to the positioning information according to the foregoing positioning information;

附加子模块113，设置为根据上述定位信息、实时环境气象信息和当前时间信息生成上述定位信息标签、环境气象信息标签和时间信息标签并附加在上述录制音视频中。The additional sub-module 113 is configured to generate the positioning information label, the environmental weather information label and the time information label according to the positioning information, the real-time environmental weather information and the current time information, and add the sound information to the recorded audio and video.

上述第一获取子模块111，一般用于获取摄录设备所在位置的定位信息，在获取录像设备所在位置的定位信息时获取的定位信息一般为——但不限于——经纬度，还包括海拔高度，但当海拔高度对音视频的内容影响较小甚至无时，使用者可手动取消对海拔高度的获取，仅保存经纬度，The first obtaining sub-module 111 is generally configured to obtain positioning information of a location where the recording device is located, and the positioning information obtained when acquiring the positioning information of the location where the recording device is located is generally, but not limited to, latitude and longitude, and also includes an altitude. However, when the altitude has little or no influence on the content of the audio and video, the user can manually cancel the acquisition of the altitude and save only the latitude and longitude.

上述第二获取子模块112，一般用于根据上述定位信息获取上述定位信息对应位置的实时环境气象信息和当前时间信息，在通过上述第一获取子模块111获得上述定位信息后通过连接该地区的当地气象信息共享平台或气象网站获取对应位置的环境气象信息，在通过该地区的当地气象信息共享平台或气象网站获取对应位置的环境气象信息的同时，进行实时时间的同步，以获取对应位置的上述当前时间信息，上述当前时间信息一般为该位置对应的时区实时时间，获取上述当前时间信息还可以通过获取上述录制音视频的录制起始时间或录制结束时间，再根据录制设备中的计时模块将上述录制音视频的各帧时间进行换算，The second obtaining sub-module 112 is configured to obtain the real-time environmental weather information and the current time information of the location corresponding to the positioning information according to the positioning information, and obtain the positioning information by using the first acquiring sub-module 111 to connect the area. The local meteorological information sharing platform or the meteorological website obtains the environmental meteorological information of the corresponding location, and obtains the environmental meteorological information of the corresponding location through the local meteorological information sharing platform or the weather website of the area, and simultaneously synchronizes the real-time time to obtain the corresponding position. The current time information, the current time information is generally the time zone real time corresponding to the location, and the obtaining the current time information may also obtain the recording start time or the recording end time of the recorded audio and video, and then according to the timing module in the recording device. Convert each frame time of the above recorded audio and video,

上述附加子模块113，一般用于根据上述定位信息、实时环境气象信息和当前时间信息生成上述定位信息标签、环境气象信息标签和时间信息标签并附加在上述录制音视频中。The additional sub-module 113 is generally configured to generate the positioning information label, the environmental weather information label, and the time information label according to the positioning information, the real-time environmental weather information, and the current time information, and add the sound information to the recorded audio and video.

参照图12，在本实施例中，在上述的处理音视频的系统中，一般还包括：Referring to FIG. 12, in the embodiment, in the above system for processing audio and video, it is generally further included:

存储模块401，设置为将上述录制音视频存储至上述数据库对应的音视频列表中。The storage module 401 is configured to store the recorded audio and video into the audio and video list corresponding to the database.

上述存储模块401，一般用于将上述录制音视频存储至上述数据库对应的音视频列表中，在上述处理模块103执行后，将上述录制音视频存储至对应的上述音视频列表以达到更新上述音视频列表的目的，已达到对位置上景物变化的最优化同步和更新。The storage module 401 is generally configured to store the recorded audio and video in the audio and video list corresponding to the database, and after the processing module 103 executes, store the recorded audio and video to the corresponding audio and video list to update the sound. The purpose of the video list has been to achieve optimal synchronization and update of changes in location.

参照图12，在本实施例中，在上述的处理音视频的系统中，还包括：Referring to FIG. 12, in the foregoing embodiment, in the system for processing audio and video, the method further includes:

列表生成模块501，设置为生成对应不同定位信息标签、环境气象信息标签和时间信息标签的内容组合的音视频列表；The list generating module 501 is configured to generate an audio and video list corresponding to a combination of content of different positioning information labels, environmental weather information labels, and time information labels;

列表存储模块601，设置为获取与上述音视频列表对应的历史音视频，并将上述历史音视频存储至对应的上述音视频列表中；The list storage module 601 is configured to acquire a historical audio and video corresponding to the audio and video list, and store the historical audio and video into the corresponding audio and video list;

列表归集模块701，设置为将上述音视频列表及内容归集形成上述数据库。The list collection module 701 is configured to group the audio and video lists and contents to form the database.

上述列表生成模块501，一般用于生成对应不同定位信息标签、环境气象信息标签和时间信息标签的内容组合的音视频列表，根据不同的上述定位信息标签、环境气象信息标签和时间信息标签的内容组合生成不同的上述音视频列表，且不同的上述音视频列表中的上述定位信息标签、环境气象信息标签和时间信息标签的上述内容组合互不相同，The above-mentioned list generating module 501 is generally configured to generate an audio and video list corresponding to a combination of different positioning information tags, environmental weather information tags, and time information tags, according to different content of the positioning information tags, environmental weather information tags, and time information tags. Combining different audio and video lists, and the content combinations of the positioning information label, the environmental weather information label, and the time information label in the different audio and video lists are different from each other.

上述列表存储模块601，一般用于获取与上述音视频列表对应的历史音视频，并将上述历史音视频存储至对应的上述音视频列表中，在上述列表生成模块501执行后，根据生成的上述音视频列表将对应的历史音视频存储至上述音视频列表，其中，获取上述历史音视频的方法一般包括——但不限于——进行预录制或从云共享数据库进行获取，The above-mentioned list storage module 601 is generally configured to acquire a historical audio and video corresponding to the audio and video list, and store the historical audio and video in the corresponding audio and video list, after the list generating module 501 executes, according to the generated The audio and video list stores the corresponding historical audio and video to the audio and video list, wherein the method for obtaining the historical audio and video generally includes, but is not limited to, performing pre-recording or acquiring from a cloud sharing database.

上述列表归集模块701，一般用于将上述音视频列表及内容归集形成上述数据库，当上述列表存储模块601执行后，将上述生成的音视频列表以及储存于列表的上述历史音视频整合成上述数据库。The list aggregation module 701 is generally configured to collect the audio and video list and the content to form the database. After the list storage module 601 is executed, the generated audio and video list and the historical audio and video stored in the list are integrated into The above database.

参照图13，在本实施例中，在上述的处理音视频的系统中，上述列表存储模块601包括：Referring to FIG. 13, in the foregoing embodiment, in the system for processing audio and video, the list storage module 601 includes:

历史生成子模块610，设置为通过将云共享数据库中现有的音视频进行定位信息计算、环境气象信息查询和录制对应时间判定生成上述历史音视频。The history generation sub-module 610 is configured to generate the historical audio and video by performing positioning information calculation, environment meteorological information query, and recording corresponding time determination in the existing audio and video in the cloud sharing database.

上述历史生成子模块610，一般用于通过将云共享数据库中现有的音视频进行定位信息计算、环境气象信息查询和录制对应时间判定生成上述历史音视频。The history generation sub-module 610 is generally configured to generate the historical audio and video by performing positioning information calculation, environmental meteorological information query, and recording corresponding time determination in the existing audio and video in the cloud sharing database.

参照图13，在一具体本实施例中，在上述的处理音视频的系统中，上述历史生成子模块610可以通过以下模块进行替换，包括：Referring to FIG. 13, in a specific embodiment, in the foregoing system for processing audio and video, the history generation sub-module 610 can be replaced by the following modules, including:

预录制子模块620，设置为进行音视频的预录制并对其添加对应的定位信息标签、环境气象信息标签和时间信息标签形成上述历史音视频。The pre-recording sub-module 620 is configured to perform pre-recording of audio and video and add corresponding positioning information tags, environmental weather information tags and time information tags to form the historical audio and video.

上述预录制子模块620，一般用于进行音视频的预录制并对其添加对应的定位信息标签、环境气象信息标签和时间信息标签形成上述历史音视频，通过预录制大量的音视频并将对应的定位信息标签、环境气象信息标签和时间信息标签附加在该音视频上后形成上述历史音视频。The pre-recording sub-module 620 is generally configured to perform pre-recording of audio and video, and add a corresponding positioning information label, an environmental weather information label, and a time information label to form the historical audio and video, and pre-record a large number of audio and video and corresponding The positioning information label, the environmental weather information label, and the time information label are attached to the audio and video to form the historical audio and video.

参照图14，在本实施例中，在上述的处理音视频的系统中，上述定位信息计算包括步骤：Referring to FIG. 14, in the embodiment, in the above system for processing audio and video, the positioning information calculation includes the following steps:

抽取子模块611，设置为抽取上述音视频中指定帧数的画面；The extraction submodule 611 is configured to extract a picture of the specified number of frames in the audio and video;

识别子模块612，设置为将上述画面中的标志性图像进行图像识别和搜索获取上述标志性图像的所属地理位置；The identification sub-module 612 is configured to perform image recognition and search for the iconic image in the above-mentioned screen to obtain the geographic location of the iconic image;

第一计算子模块613，设置为根据上述画面中两个或两个以上指定图像的尺寸比例以及对应实物之间的尺寸比例计算出上述音视频的录制距离；The first calculation sub-module 613 is configured to calculate the recording distance of the audio and video according to the size ratio of two or more specified images in the above-mentioned picture and the size ratio between the corresponding objects;

第二计算子模块614，设置为根据上述标志性图像的内容、地理位置和录制距离计算得出上述定位信息。The second calculating sub-module 614 is configured to calculate the positioning information according to the content, the geographical location, and the recording distance of the icon image.

上述抽取子模块611，一般用于抽取上述音视频中指定帧数的画面，其中上述指定帧数的画面一般为包括有突出识别特征或标志性的换面，如地标建筑、物品、人物等，The extraction sub-module 611 is generally configured to extract a picture of the specified number of frames in the audio and video, wherein the picture of the specified number of frames is generally a face-changing feature including a prominent recognition feature or a landmark, such as a landmark building, an item, a character, and the like.

上述识别子模块612，一般用于将上述画面中的标志性图像进行图像识别和搜索获取上述标志性图像的所属地理位置，其中，该标志性图像一般又使用者进行选取，当使用者放弃选取时选择画面中图像颜色差异性较大的区域或历史选择次数较多的区域定义为标志性的图像，上述标志性的图像包括至少一个，一般优选包括——但不限于——两个，The identification sub-module 612 is generally configured to perform image recognition and search for the landmark image in the image to obtain the geographic location of the icon image, wherein the icon image is generally selected by the user, and the user abandons the selection. The area in which the image color difference is large in the selection screen or the area in which the number of historical selections is large is defined as an iconic image, and the above-mentioned iconic image includes at least one, and generally includes, but is not limited to, two.

上述第一计算子模块613，一般用于根据上述画面中两个或两个以上指定图像的尺寸比例以及对应实物之间的尺寸比例计算出上述音视频的录制距离，获取上述画面中的多个指定图像的尺寸比例或两个图像之间的距离与实际场景的比值，从而得出上述录制设备与指定图像中的实际场景的录制距离，The first calculating sub-module 613 is generally configured to calculate a recording distance of the audio and video according to a size ratio of two or more specified images in the above-mentioned picture and a size ratio between the corresponding objects, and acquire multiples in the foregoing picture. Specifying the size ratio of the image or the ratio of the distance between the two images to the actual scene, thereby obtaining the recording distance of the above-mentioned recording device and the actual scene in the specified image,

上述第二计算子模块614，一般用于根据上述标志性图像的内容、地理位置和录制距离计算得出上述定位信息。The second calculating sub-module 614 is generally configured to calculate the positioning information according to the content, the geographical location, and the recording distance of the iconic image.

参照图11，在本申请实施例中，本申请还提供一种计算机设备，上述计算机设备12以通用计算设备的形式表现，计算机设备12的组件可以包括但不限于：一个或者多个处理器或者处理单元16，系统存储器28，连接不同系统组件（包括系统存储器28和处理单元16）的总线18。Referring to FIG. 11 , in the embodiment of the present application, the present application further provides a computer device. The computer device 12 is represented in the form of a general-purpose computing device. The components of the computer device 12 may include, but are not limited to: one or more processors or Processing unit 16, system memory 28, connects bus 18 of various system components, including system memory 28 and processing unit 16.

总线18表示几类总线18结构中的一种或多种，包括存储器总线18或者存储器控制器，外围总线18，图形加速端口，处理器或者使用多种总线18结构中的任意总线18结构的局域总线18。举例来说，这些体系结构包括但不限于工业标准体系结构（ISA）总线18，微通道体系结构（MAC）总线18，增强型ISA总线18、音视频电子标准协会（VESA）局域总线18以及外围组件互连（PCI）总线18。Bus 18 represents one or more of several types of bus 18 architectures, including memory bus 18 or memory controller, peripheral bus 18, graphics acceleration port, processor or office using any of the various bus 18 architectures. Domain bus 18. For example, these architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus 18, a Micro Channel Architecture (MAC) bus 18, an Enhanced ISA Bus 18, a Video and Video Electronics Standards Association (VESA) local bus 18, and Peripheral Component Interconnect (PCI) bus 18.

计算机设备12典型地包括多种计算机系统可读介质。这些介质可以是任何能够被计算机设备12访问的可用介质，包括易失性和非易失性介质，可移动的和不可移动的介质。Computer device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by computer device 12, including both volatile and nonvolatile media, removable and non-removable media.

系统存储器28可以包括易失性存储器形式的计算机系统可读介质，例如随机存取存储器（RAM）30和/或高速缓存存储器32。计算机设备12可以进一步包括其他移动/不可移动的、易失性/非易失性计算机体统存储介质。仅作为举例，存储系统34可以用于读写不可移动的、非易失性磁介质（通常称为“硬盘驱动器”）。尽管图11中未示出，可以提供用于对可移动非易失性磁盘（如“软盘”）读写的磁盘驱动器，以及对可移动非易失性光盘（例如CD~ROM，DVD~ROM或者其他光介质）读写的光盘驱动器。在这些情况下，每个驱动器可以通过一个或者多个数据介质接口与总线18相连。存储器可以包括至少一个程序产品，该程序产品具有一组（例如至少一个）程序模块42，这些程序模块42被配置以执行本申请各实施例的功能。System memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other mobile/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 may be used to read and write non-removable, non-volatile magnetic media (commonly referred to as "hard disk drives"). Although not shown in FIG. 11, a disk drive for reading and writing to a removable non-volatile disk such as a "floppy disk", and a removable non-volatile disk (for example, CD~ROM, DVD~ROM) may be provided. Or other optical media) read and write optical drive. In these cases, each drive can be coupled to bus 18 via one or more data medium interfaces. The memory can include at least one program product having a set (e.g., at least one) of program modules 42 that are configured to perform the functions of the various embodiments of the present application.

具有一组（至少一个）程序模块42的程序/实用工具40，可以存储在例如存储器中，这样的程序模块42包括——但不限于——操作系统、一个或者多个应用程序、其他程序模块42以及程序数据，这些示例中的每一个或某种组合中可能包括网络环境的实现。程序模块42通常执行本申请所描述的实施例中的功能和/或方法。A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in a memory, such program modules 42 including, but not limited to, an operating system, one or more applications, other program modules 42 and program data, each of these examples or some combination may include an implementation of a network environment. Program module 42 typically performs the functions and/or methods of the embodiments described herein.

计算机设备12也可以与一个或多个外部设备14（例如键盘、指向设备、显示器24、摄像头等）通信，还可与一个或者多个使得用户能与该计算机设备12交互的设备通信，和/或与使得该计算机设备12能与一个或多个其它计算设备进行通信的任何设备（例如网卡，调制解调器等等）通信。这种通信可以通过输入/输出（I/O）接口22进行。并且，计算机设备12还可以通过网络适配器20与一个或者多个网络（例如局域网（LAN）），广域网（WAN）和/或公共网络（例如因特网）通信。如图所示，网络适配器20通过总线18与计算机设备12的其他模块通信。应当明白，尽管图11中未示出，可以结合计算机设备12使用其他硬件和/或软件模块，包括但不限于：微代码、设备驱动器、冗余处理单元16、外部磁盘驱动阵列、RAID系统、磁带驱动器以及数据备份存储系统34等。Computer device 12 may also be in communication with one or more external devices 14 (eg, a keyboard, pointing device, display 24, camera, etc.), and may also be in communication with one or more devices that enable a user to interact with the computer device 12, and/ Or communicating with any device (eg, a network card, modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. This communication can take place via an input/output (I/O) interface 22. Also, computer device 12 can communicate with one or more networks (e.g., a local area network (LAN)), a wide area network (WAN), and/or a public network (e.g., the Internet) via network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18. It should be understood that although not shown in FIG. 11, other hardware and/or software modules may be utilized in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing unit 16, external disk drive array, RAID system, A tape drive and a data backup storage system 34 and the like.

处理单元16通过运行存储在系统存储器28中的程序，从而执行各种功能应用以及数据处理，例如实现本申请实施例所提供的处理音视频的方法。The processing unit 16 performs various function applications and data processing by running a program stored in the system memory 28, for example, a method for processing audio and video provided by the embodiments of the present application.

也即，上述处理单元16执行上述程序时实现：生成录制音视频所在位置的定位信息标签、环境气象信息标签和时间信息标签；从数据库中匹配出与上述定位信息标签、环境气象信息标签和时间信息标签相同或近似的历史音视频；根据上述录制音视频和历史音视频处理生成目标音视频。That is, when the processing unit 16 executes the foregoing procedure, the method provides: generating a positioning information label, an environmental weather information label, and a time information label at a position where the recorded audio and video is located; and matching the positioning information label, the environmental weather information label, and the time from the database. The historical audio and video with the same or similar information labels; the target audio and video is generated according to the above-mentioned recorded audio and video and historical audio and video processing.

在本申请实施例中，本申请还提供一种计算机可读存储介质，其上存储有计算机程序，该程序被处理器执行时实现如本申请所有实施例提供的处理音视频的方法。In the embodiment of the present application, the present application further provides a computer readable storage medium, where a computer program is stored, and when the program is executed by the processor, the method for processing audio and video provided by all embodiments of the present application is implemented.

其中，在处理器上所实现的方法可以参照本发明处理音视频的方法各个实施例，此处不作赘述。The method implemented on the processor can refer to various embodiments of the method for processing audio and video according to the present invention, and details are not described herein.

本发明的处理音视频的方法、系统、设备及存储介质的有益效果为：通过为录制音视频附加所在位置的定位信息标签、环境气象信息标签和时间信息标签，使在处理音视频时能通过信息标签相同的历史音视频做为该录制音视频的参考，从而能够更精确的将音视频进行编辑和处理，降低使用者的操作难度以及操作时间，并提高处理后目标音视频的真实性。The method, system, device and storage medium for processing audio and video of the present invention have the beneficial effects of: by adding a positioning information label, an environmental weather information label and a time information label of a location for recording audio and video, so that the audio and video can be processed when the audio and video are processed. The historical audio and video with the same information label is used as a reference for the recorded audio and video, so that the audio and video can be edited and processed more accurately, the operation difficulty and operation time of the user are reduced, and the authenticity of the target audio and video after processing is improved.

以上所述仅为本申请的优选实施例，并非因此限制本申请的专利范围，凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换，或直接或间接运用在其他相关的技术领域，均同理包括在本申请的专利保护范围内。The above description is only the preferred embodiment of the present application, and is not intended to limit the scope of the patent application, and the equivalent structure or equivalent process transformations made by the specification and the drawings of the present application, or directly or indirectly applied to other related The technical field is equally included in the scope of patent protection of the present application.

Claims

A method of processing audio and video, wherein the method comprises the steps of:

Generating a positioning information label, an environmental weather information label, and a time information label at a position where the recorded audio and video are located;

Matching historical audio and video that is the same as or similar to the positioning information label, the environmental weather information label, and the time information label from the database;

The target audio and video is generated according to the recorded audio and video and historical audio and video processing.

The method of processing audio and video according to claim 1, wherein said processing comprises the step of generating a target audio and video, comprising the steps of:

Selecting a target area in the recorded audio and video;

Obtaining image data of a corresponding target area in the historical audio and video;

The image data is replaced with the image data of the target area to generate the target audio and video.

Selecting the recorded audio and video and the target image in the historical audio and video;

Integrating the target image to obtain three-dimensional data of the target image;

Generating the target audio and video according to the three-dimensional data of the target image.

The method of processing audio and video according to claim 1, wherein the step of generating a corresponding positioning information label, an environmental weather information label, and a time information label for recording audio and video comprises the steps of:

Obtaining location information of the location of the recording device;

Acquiring real-time environmental meteorological information and current time information of the location corresponding to the positioning information according to the positioning information;

The positioning information label, the environmental weather information label, and the time information label are generated according to the positioning information, the real-time environmental weather information, and the current time information, and are added to the recorded audio and video.

The method of processing audio and video according to claim 1, wherein before the step of generating the positioning information label, the environmental weather information label and the time information label at the position where the recorded audio and video is located, the method further comprises the steps of:

Generating an audio and video list corresponding to a combination of different positioning information tags, environmental weather information tags, and time information tags;

Obtaining a historical audio and video corresponding to the audio and video list, and storing the historical audio and video into the corresponding audio and video list;

The audio and video lists and content are grouped together to form the database.

The method of processing audio and video according to claim 5, wherein the step of acquiring historical audio and video corresponding to the audio and video list comprises the steps of:

The historical audio and video is generated by performing positioning information calculation, environmental meteorological information query, and recording corresponding time determination in the existing audio and video in the cloud sharing database.

The method of processing audio and video according to claim 6, wherein said positioning information calculation comprises the steps of:

Extracting a picture of a specified number of frames in the audio and video;

Performing image recognition and searching on the iconic image in the picture to obtain a geographic location of the iconic image;

Calculating a recording distance of the audio and video according to a size ratio of two or more specified images in the picture and a size ratio between the corresponding objects;

The positioning information is calculated according to the content, the geographical location and the recording distance of the iconic image.

A system for processing audio and video, wherein the system comprises:

The generating module is configured to generate a positioning information label, an environmental weather information label, and a time information label at a position where the recorded audio and video are located;

a matching module, configured to match the historical audio and video of the same information label from the database;

The processing module is configured to generate a target audio and video according to the recorded audio and video and historical audio and video processing.

A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor performs the following steps when executing the program:

The computer apparatus according to claim 9, wherein said processor further implements the following steps when said program is executed:

Selecting a target area in the recorded audio and video;

Obtaining location information of the location of the recording device;

A computer readable storage medium having stored thereon a computer program, wherein the program is executed by the processor to implement the following steps:

The computer readable storage medium of claim 13 wherein the program, when executed by the processor, further implements the following steps:

Selecting a target area in the recorded audio and video;

Obtaining location information of the location of the recording device;