
WO2019165723A1 - Method and system for processing audio and video, and device and storage medium - Google Patents

Method and system for processing audio and video, and device and storage medium

Info

Publication number
WO2019165723A1
WO2019165723A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
audio
information label
target
positioning information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2018/091115
Other languages
English (en)
Chinese (zh)
Inventor
袁晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ikmak Tech Co Ltd
Original Assignee
Shenzhen Ikmak Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ikmak Tech Co Ltd filed Critical Shenzhen Ikmak Tech Co Ltd
Publication of WO2019165723A1
Anticipated expiration
Current legal status: Ceased

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40: Information retrieval of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/487: Retrieval characterised by using metadata using geographical or spatial information, e.g. location
    • G06F16/489: Retrieval characterised by using metadata using time information
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/76: Television signal recording
    • H04N5/91: Television signal processing therefor

Definitions

  • The present application relates to the field of wireless communications, and in particular to a method, system, device, and storage medium for processing audio and video.
  • Existing audio and video editing software generally performs intelligent computation on the edited segment and nearby content according to an embedded algorithm to obtain the edited audio and video content.
  • This approach depends heavily on the software's embedded algorithm, and the authenticity of the edited audio and video is low, depending mainly on the accuracy of the algorithm and the skill of the user.
  • The requirements on the user and the editing machine are therefore high, which greatly degrades the user experience.
  • The main purpose of the present application is to provide a method, system, device, and storage medium for processing audio and video that improve the authenticity of scenes in modified audio and video.
  • To this end, the present application provides a method for processing audio and video, including the steps of:
  • generating a positioning information label, an environmental weather information label, and a time information label for the location where the audio and video are recorded; matching, from a database, historical audio and video having the same information labels; and generating a target audio and video by processing the recorded audio and video together with the historical audio and video.
  • Correspondingly, the present application proposes a system for processing audio and video, comprising:
  • a generating module, configured to generate a positioning information label, an environmental weather information label, and a time information label for the location where the audio and video are recorded;
  • a matching module, configured to match historical audio and video having the same information labels from the database; and
  • a processing module, configured to generate a target audio and video by processing the recorded audio and video together with the historical audio and video.
  • The present application further provides a computer device including a memory, a processor, and a computer program stored in the memory and operable on the processor, where the processor performs the following steps when executing the program:
  • generating a positioning information label, an environmental weather information label, and a time information label for the location where the audio and video are recorded; matching, from the database, historical audio and video having the same information labels; and generating the target audio and video by processing the recorded audio and video together with the historical audio and video.
  • The present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the same steps.
  • The beneficial effects of the method, system, device, and storage medium of the present application are as follows: positioning information labels, environmental weather information labels, and time information labels of the recording location are added to the recorded audio and video, so that when the audio and video are processed, historical audio and video with the same information labels can be used as a reference for the recorded audio and video.
  • The audio and video can thus be edited and processed more accurately, the operation difficulty and operation time for the user are reduced, and the authenticity of the processed target audio and video is improved.
  • FIG. 1 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;
  • FIG. 5 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;
  • FIG. 6 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;
  • FIG. 7 is a schematic flowchart of a method for processing audio and video according to an embodiment of the present application;
  • FIG. 8 is a schematic structural diagram of modules for processing audio and video according to an embodiment of the present application;
  • FIG. 9 is a schematic structural diagram of modules for processing audio and video according to an embodiment of the present application;
  • FIG. 10 is a schematic structural diagram of modules for processing audio and video according to an embodiment of the present application;
  • FIG. 11 is a schematic structural diagram of modules for processing audio and video according to an embodiment of the present application;
  • FIG. 12 is a schematic structural diagram of modules for processing audio and video according to an embodiment of the present application;
  • FIG. 13 is a schematic structural diagram of modules for processing audio and video according to an embodiment of the present application;
  • FIG. 14 is a schematic structural diagram of modules for processing audio and video according to an embodiment of the present application;
  • FIG. 15 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • The terms “first”, “second”, and the like in this application are used for description only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
  • A feature defined with “first” or “second” may explicitly or implicitly include at least one such feature.
  • The technical solutions of the various embodiments may be combined with each other, but only where such a combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or impossible to implement, the combination should be considered not to exist and falls outside the scope of protection claimed by this application.
  • The application provides a method for processing audio and video, including the steps of:
  • S101: generating a positioning information label, an environmental weather information label, and a time information label for the location where the audio and video are recorded;
  • S102: matching, from a database, historical audio and video that are the same as or similar to the positioning information label, the environmental weather information label, and the time information label;
  • S103: generating a target audio and video by processing the recorded audio and video together with the historical audio and video.
  • In step S101, the positioning information label, the environmental meteorological information label, and the time information label of the location where the audio and video are recorded are generated.
  • After recording is completed, the positioning information of the location of the recording device is obtained and integrated into the data of the recorded audio and video to form the positioning information label.
  • The positioning information label generally uses latitude and longitude as its value. After the positioning information label is confirmed, the environmental meteorological parameters and real-time time of the area are queried according to the positioning information of the label.
  • These parameters generally include, but are not limited to, temperature, air pressure, humidity, wind direction, wind speed, light intensity, ultraviolet intensity, and weather conditions; integrating the environmental meteorological parameters and the real-time time into the data of the recorded audio and video yields the environmental weather information label and the time information label, as sketched below.
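For illustration only, the following is a minimal sketch of what these three labels might look like as structured metadata attached to a recording. The field list follows the parameters named above; the class, the sidecar-file approach, and all names are hypothetical and not part of the application:

```python
from dataclasses import dataclass, asdict
import json, time

@dataclass
class AVLabels:
    latitude: float          # positioning information label
    longitude: float
    temperature_c: float     # environmental weather information label
    pressure_hpa: float
    humidity_pct: float
    wind_dir_deg: float
    wind_speed_ms: float
    light_lux: float
    uv_index: float
    weather: str             # e.g. "sunny", "light rain"
    recorded_at: float       # time information label (Unix timestamp)

def attach_labels(video_path: str, labels: AVLabels) -> None:
    # Hypothetical sidecar approach: store the labels next to the media file.
    with open(video_path + ".labels.json", "w") as f:
        json.dump(asdict(labels), f)

labels = AVLabels(22.54, 114.06, 28.5, 1006.0, 78.0, 135.0, 3.2,
                  54000.0, 7.0, "sunny", time.time())
attach_labels("clip0001.mp4", labels)
```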
  • In step S102, historical audio and video that are the same as or similar to the positioning information label, the environmental weather information label, and the time information label are matched from the database: a matching search is performed in the database on the three labels to obtain historical audio and video whose positioning information label, environmental weather information label, and time information label are the same as those of the recorded audio and video (a toy matching sketch follows).
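The application does not define when labels count as "similar", so any concrete matcher has to pick thresholds. A toy sketch reusing the label fields from the earlier snippet, with the distance, time-of-day, and weather tolerances chosen purely for illustration:

```python
from math import radians, cos, sin, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two lat/lon points in kilometres.
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def match_history(query, database, max_km=0.5, max_hours=1.0):
    """Return historical entries whose labels are the same or similar.

    query and database entries are dicts with the AVLabels fields above.
    """
    hits = []
    for item in database:
        close = haversine_km(query["latitude"], query["longitude"],
                             item["latitude"], item["longitude"]) <= max_km
        # Compare time of day so footage from the same hour on other days matches.
        same_hour = abs(query["recorded_at"] % 86400 -
                        item["recorded_at"] % 86400) / 3600 <= max_hours
        same_weather = query["weather"] == item["weather"]
        if close and same_hour and same_weather:
            hits.append(item)
    return hits
```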
  • In step S103, the target audio and video are generated by processing the recorded audio and video together with the historical audio and video: after sufficient historical audio and video have been obtained in step S102, the recorded audio and video are edited to obtain the processed target audio and video.
  • The target audio and video generally include, but are not limited to, 3D audio and video, specific-scene audio and video, and the like.
  • The processing generally includes, but is not limited to, trimming and deleting video segments or adding a specified image to the video.
  • In addition, the authenticity of the recorded audio and video may be verified.
  • The verification only needs to compare the differences between the historical audio and video and the recorded audio and video to determine whether the video has been modified; this is easier and less time-consuming than checking the video data for traces of modification, as the sketch below illustrates.
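One plausible reading of that comparison, again as a toy sketch: align a recorded frame with a historical frame of the same scene and flag a large pixel-level difference. The alignment assumption and the threshold are illustrative, not from the application:

```python
import numpy as np

def frames_differ(recorded: np.ndarray, historical: np.ndarray,
                  threshold: float = 25.0) -> bool:
    """Return True if the mean absolute pixel difference suggests modification.

    Both frames are assumed to be aligned uint8 arrays of the same shape
    (e.g. two H x W x 3 RGB frames of the same scene).
    """
    diff = np.abs(recorded.astype(np.int16) - historical.astype(np.int16))
    return float(diff.mean()) > threshold
```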
  • In one embodiment, the step of generating the target audio and video includes the following steps:
  • S131: selecting a target area in the recorded audio and video;
  • S132: acquiring image data of the corresponding target area in the historical audio and video;
  • S133: replacing the image data of the target area with the acquired image data to generate the target audio and video.
  • In step S131, the target area in the recorded audio and video is selected: a frame is generally taken from the recorded audio and video as the selection basis, where the target area is generally a non-fixed area selected by the user.
  • In step S132, the image data of the corresponding target area in the historical audio and video is acquired: after the target image in the target area is selected, the same or similar image is identified in the remaining frames of the recorded audio and video, and its position is redefined as the target area, thereby obtaining the motion track of the target area and the image data it covers in the historical audio and video.
  • In step S133, the image data of the target area is replaced with the acquired image data to generate the target audio and video (sketched below).
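A toy sketch of steps S131 to S133 using OpenCV: track the user-selected patch across frames by template matching, then paste in the pixels that the track covers in the historical footage. Real footage would also need geometric alignment and blending; none of this is specified by the application:

```python
import cv2
import numpy as np

def replace_region(rec_frames, hist_frames, x, y, w, h):
    """Replace a tracked region of recorded frames with historical pixels.

    rec_frames/hist_frames: lists of same-sized BGR frames (uint8 arrays);
    (x, y, w, h): the user-selected target area in the first recorded frame.
    """
    template = rec_frames[0][y:y + h, x:x + w].copy()
    out = []
    for rec, hist in zip(rec_frames, hist_frames):
        # Re-locate the target area in this frame (its position is non-fixed).
        res = cv2.matchTemplate(rec, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, (tx, ty) = cv2.minMaxLoc(res)
        frame = rec.copy()
        # Paste the pixels the track covers in the historical footage.
        frame[ty:ty + h, tx:tx + w] = hist[ty:ty + h, tx:tx + w]
        out.append(frame)
    return out
```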
  • In another embodiment, the step of generating the target audio and video includes the following steps:
  • S134: selecting the target image in the recorded audio and video and the target image in the historical audio and video;
  • S135: integrating the target images to obtain three-dimensional data of the target image;
  • S136: generating the target audio and video according to the three-dimensional data of the target image.
  • In step S134, the user selects a corresponding target image from the recorded audio and video and from each historical audio and video. The target image selected in each recording is generally, but not limited to, an image of the specified object or scene at that recording's viewing angle, and generally includes, but is not limited to, several of the front, rear, left, right, top, and bottom views of the specified object or scene. Images of further views may be added, or views removed, according to the actual number of historical audio and video, differences in recording angle, and the complexity of the specified object or scene.
  • In step S135, the target images are integrated to obtain the three-dimensional data of the target image: the target images obtained in step S134 are merged in three dimensions to yield a three-dimensional image of the specified object or scene.
  • In step S136, the target audio and video are generated according to the three-dimensional data of the target image, that is, the three-dimensional image of the specified object or scene is combined with the specified audio or video image; a toy reconstruction sketch follows.
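The application does not say how the views are merged into three-dimensional data. As a stand-in, here is a classic toy technique consistent with intersecting front, side, and top views: carving a voxel grid from binary silhouettes (a visual hull). The orthographic-projection and equal-resolution assumptions are mine:

```python
import numpy as np

def visual_hull(front, side, top):
    """Carve an n*n*n boolean voxel grid from three binary silhouettes.

    front is indexed [y, x] (camera looking along z), side is [y, z]
    (looking along x), and top is [z, x] (looking along y). A voxel at
    [y, x, z] survives only if it projects inside all three silhouettes.
    """
    return front[:, :, None] & side[:, None, :] & top.T[None, :, :]

# Tiny demo: a 4x4x4 grid carved down to a 2x2x2 block in one corner.
n = 4
sil = np.zeros((n, n), dtype=bool)
sil[:2, :2] = True
voxels = visual_hull(sil, sil, sil)
print(int(voxels.sum()))  # -> 8 occupied voxels
```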
  • The step of generating the positioning information label, the environmental weather information label, and the time information label for the recorded audio and video includes the following steps:
  • S201: obtaining positioning information of the location where the recording device is located;
  • S202: acquiring real-time environmental weather information and current time information of the location corresponding to the positioning information;
  • S203: generating the positioning information label, the environmental weather information label, and the time information label according to the positioning information, the real-time environmental weather information, and the current time information, and adding them to the recorded audio and video.
  • In step S201, the positioning information obtained is generally, but not limited to, latitude and longitude, and may also include altitude; when the altitude has little or no effect on the audio and video content, the user can manually cancel altitude acquisition and save only the latitude and longitude.
  • In step S202, the environmental meteorological information of the corresponding location is obtained through the area's local meteorological information sharing platform or a weather website, and the real-time time is synchronized at the same moment to obtain the current time information, generally the real time of the time zone of the location.
  • Alternatively, the recording start time or end time of the recorded audio and video may be obtained, and each frame's time then converted according to the timing module in the recording device (see the sketch below).
  • In step S203, the three labels are generated from the positioning information, the real-time environmental weather information, and the current time information, and added to the recorded audio and video.
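The frame-time conversion mentioned above is plain arithmetic: with a recording start time and a constant frame rate, frame i falls at start + i / fps. A sketch, where the fixed frame rate is an assumption:

```python
from datetime import datetime, timedelta

def frame_timestamps(start: datetime, n_frames: int, fps: float = 30.0):
    """Convert a recording start time into a per-frame time information label."""
    return [start + timedelta(seconds=i / fps) for i in range(n_frames)]

times = frame_timestamps(datetime(2018, 2, 28, 15, 0, 0), n_frames=3)
print(times[2])  # 2018-02-28 15:00:00.066667 (frame 2 at 30 fps)
```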
  • After step S103, the method further includes the following step:
  • S401: storing the recorded audio and video in the corresponding audio and video list in the database, so as to update the audio and video list and achieve optimized synchronization and updating of changes at the location.
  • The method further includes the following steps:
  • S501: generating audio and video lists corresponding to different content combinations of the positioning information label, the environmental weather information label, and the time information label;
  • S601: acquiring the historical audio and video corresponding to each audio and video list and storing them in the corresponding list;
  • S701: aggregating the audio and video lists and their contents to form the database.
  • In step S501, different audio and video lists are generated according to the different content combinations of the three labels, and the label combinations of the different lists differ from one another.
  • In step S601, the historical audio and video are stored in the corresponding audio and video lists, where the methods for obtaining the historical audio and video generally include, but are not limited to, pre-recording or acquisition from a cloud sharing database.
  • In step S701, the generated audio and video lists and the historical audio and video stored in them are integrated into the database, as sketched below.
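A sketch of such a database: one list per label combination, keyed by coarse buckets of the three labels so that recordings with the same combination land in the same list. It reuses the label fields from the earlier snippets, and the bucket sizes are assumptions made for illustration:

```python
from collections import defaultdict

def label_key(entry, grid_deg=0.01, hour_bucket=1):
    """Bucket the three labels into a hashable combination key."""
    return (round(entry["latitude"] / grid_deg),
            round(entry["longitude"] / grid_deg),
            entry["weather"],
            int(entry["recorded_at"] % 86400 // (3600 * hour_bucket)))

database = defaultdict(list)          # one audio/video list per combination

def store(entry):
    database[label_key(entry)].append(entry)
```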
  • The step of acquiring the historical audio and video corresponding to an audio and video list includes the following step:
  • S610: generating the historical audio and video by performing positioning information calculation, environmental meteorological information query, and determination of the corresponding recording time on the existing audio and video in the cloud sharing database.
  • Alternatively, step S610 may be replaced by the following step:
  • S620: pre-recording audio and video and adding the corresponding positioning information labels, environmental weather information labels, and time information labels to form the historical audio and video.
  • In step S620, a number of audio and video are pre-recorded, and the corresponding positioning information labels, environmental weather information labels, and time information labels are attached to them to form the historical audio and video.
  • The positioning information calculation includes the following steps:
  • S611: extracting a picture of a specified frame from the audio and video, where the picture generally contains a prominently recognizable feature or landmark, such as a landmark building, an item, or a person.
  • S612: performing image recognition and search on the iconic image in the picture to obtain the geographic location of the iconic image. The iconic image is generally selected by the user; when the user declines to select, an area of the picture with large color differences or an area frequently chosen in historical selections is taken as the iconic image by default. At least one iconic image is used, and generally, but not limited to, two.
  • S613: calculating the recording distance of the audio and video according to the size ratio between two or more specified images in the picture and the size ratio between the corresponding real objects: the size ratios of the specified images in the picture, or the ratio of the distance between two images in the picture to the distance in the actual scene, are obtained, from which the recording distance between the recording device and the actual scene of the specified images is derived (standard pinhole arithmetic, sketched after this list).
  • S614: calculating the positioning information according to the content, geographic location, and recording distance of the iconic images.
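The distance calculation in step S613 is textbook pinhole-camera arithmetic: an object of real width W that appears w pixels wide under a focal length of f pixels lies at distance d = f * W / w. A sketch, with the landmark's real size and the focal length as assumed inputs:

```python
def recording_distance(real_width_m: float, pixel_width: float,
                       focal_length_px: float) -> float:
    """Estimate camera-to-object distance from apparent size (pinhole model)."""
    return focal_length_px * real_width_m / pixel_width

# A 30 m wide landmark spanning 600 px with a 1200 px focal length
# puts the camera about 60 m away.
print(recording_distance(30.0, 600.0, 1200.0))  # -> 60.0
```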
  • Correspondingly, the application provides a system for processing audio and video, including:
  • a generating module 101, configured to generate the positioning information label, the environmental weather information label, and the time information label of the location where the audio and video are recorded;
  • a matching module 102, configured to match, from the database, historical audio and video that are the same as or similar to the positioning information label, the environmental weather information label, and the time information label;
  • a processing module 103, configured to generate the target audio and video by processing the recorded audio and video together with the historical audio and video.
  • The generating module 101 obtains the positioning information of the location of the recording device after recording is completed and integrates it into the data of the recorded audio and video to form the positioning information label.
  • The positioning information label generally uses latitude and longitude as its value; after the label is confirmed, the environmental meteorological parameters and real-time time of the area are queried according to its positioning information.
  • These parameters generally include, but are not limited to, temperature, air pressure, humidity, wind direction, wind speed, light intensity, ultraviolet intensity, and weather conditions, and together with the real-time time they are integrated into the data of the recorded audio and video to obtain the environmental weather information label and the time information label.
  • The matching module 102 performs a matching search in the database on the positioning information label, the environmental meteorological information label, and the time information label to obtain historical audio and video whose labels are the same as those of the recorded audio and video.
  • The processing module 103 edits the recorded audio and video after the matching module 102 has obtained sufficient historical audio and video, so as to obtain the processed target audio and video; the target audio and video generally include, but are not limited to, 3D audio and video, specific-scene audio and video, and the like, and the processing generally includes, but is not limited to, trimming and deleting video segments or adding a specified image to the video.
  • After sufficient historical audio and video have been obtained by the matching module 102, the authenticity of the recorded audio and video may also be verified; the verification only needs to compare the differences between the historical audio and video and the recorded audio and video to determine whether the video has been modified, which is easier and less time-consuming than checking the video data for traces of modification.
  • The processing module 103 includes:
  • a first selection module 131, configured to select a target area in the recorded audio and video;
  • an image data module 132, configured to acquire image data of the corresponding target area in the historical audio and video;
  • a replacement module 133, configured to replace the image data of the target area with the acquired image data to generate the target audio and video.
  • The first selection module 131 generally takes a frame from the recorded audio and video as the selection basis, where the target area is generally a non-fixed area.
  • The image data module 132, after the first selection module 131 has executed, determines the motion track of the target area in the historical audio and video and obtains the image data covered by that trajectory.
  • The replacement module 133 replaces the image data of the target area with the acquired image data to generate the target audio and video.
  • Alternatively, the processing module 103 includes:
  • a second selection module 134, configured to select the target image in the recorded audio and video and the target image in the historical audio and video;
  • an integration module 135, configured to integrate the target images to obtain three-dimensional data of the target image;
  • an audio and video generation module 136, configured to generate the target audio and video according to the three-dimensional data of the target image.
  • With the second selection module 134, the user selects a corresponding target image from the recorded audio and video and from each historical audio and video; the selected target image is generally, but not limited to, an image of the specified object or scene at the recording's viewing angle, generally comprising several of the front, rear, left, right, top, and bottom views of the specified object or scene, with images of further views added or removed according to the actual number of historical audio and video, differences in recording angle, and the complexity of the specified object or scene.
  • The integration module 135 merges the target images obtained by the second selection module 134 in three dimensions, integrating the three-dimensional data of the target image to obtain a three-dimensional image of the specified object or scene.
  • The audio and video generation module 136 generates the target audio and video according to the three-dimensional data of the target image, that is, by combining the three-dimensional image of the specified object or scene with the specified audio or video image.
  • The generating module 101 includes:
  • a first obtaining sub-module 111, configured to obtain positioning information of the location where the recording device is located;
  • a second obtaining sub-module 112, configured to acquire real-time environmental meteorological information and current time information of the location corresponding to the positioning information;
  • an attaching sub-module 113, configured to generate the positioning information label, the environmental weather information label, and the time information label according to the positioning information, the real-time environmental weather information, and the current time information, and to add them to the recorded audio and video.
  • The first obtaining sub-module 111 obtains positioning information that is generally, but not limited to, latitude and longitude, and may also include altitude; when the altitude has little or no influence on the audio and video content, the user can manually cancel altitude acquisition and save only the latitude and longitude.
  • The second obtaining sub-module 112, using the positioning information obtained by the first obtaining sub-module 111, connects to the area's local meteorological information sharing platform or a weather website to obtain the environmental meteorological information of the corresponding location, and synchronizes the real-time time at the same moment to obtain the current time information of the location, generally the real time of its time zone.
  • Obtaining the current time information may also mean obtaining the recording start time or end time of the recorded audio and video and then converting each frame's time according to the timing module in the recording device.
  • The attaching sub-module 113 generates the three labels according to the positioning information, the real-time environmental weather information, and the current time information, and adds them to the recorded audio and video.
  • The system may further include a storage module 401, configured to store the recorded audio and video in the corresponding audio and video list in the database.
  • After the processing module 103 executes, the storage module 401 stores the recorded audio and video in the corresponding audio and video list so as to update the list, achieving optimized synchronization and updating of changes at the location.
  • The system further includes:
  • a list generating module 501, configured to generate audio and video lists corresponding to different content combinations of the positioning information label, the environmental weather information label, and the time information label;
  • a list storage module 601, configured to acquire the historical audio and video corresponding to each audio and video list and store them in the corresponding list;
  • a list aggregation module 701, configured to aggregate the audio and video lists and their contents to form the database.
  • The list generating module 501 generates different audio and video lists according to the different content combinations of the three labels, and the label combinations of the different lists differ from one another.
  • After the list generating module 501 executes, the list storage module 601 stores the corresponding historical audio and video in each generated list; the methods for obtaining the historical audio and video generally include, but are not limited to, pre-recording or acquisition from a cloud sharing database.
  • After the list storage module 601 executes, the list aggregation module 701 integrates the generated audio and video lists and the historical audio and video stored in them into the database.
  • The list storage module 601 includes:
  • a history generation sub-module 610, configured to generate the historical audio and video by performing positioning information calculation, environmental meteorological information query, and determination of the corresponding recording time on the existing audio and video in the cloud sharing database.
  • The history generation sub-module 610 can be replaced by:
  • a pre-recording sub-module 620, configured to pre-record audio and video and add the corresponding positioning information labels, environmental weather information labels, and time information labels to form the historical audio and video; a large number of audio and video are pre-recorded, and the corresponding labels are attached to them to form the historical audio and video.
  • The positioning information calculation is performed by the following sub-modules:
  • an extraction sub-module 611, configured to extract a picture of a specified frame from the audio and video, where the picture generally contains a prominently recognizable feature or landmark, such as a landmark building, an item, or a person;
  • an identification sub-module 612, configured to perform image recognition and search on the iconic image in the picture to obtain its geographic location; the iconic image is generally selected by the user, and when the user declines to select, an area of the picture with large color differences or an area frequently chosen in historical selections is taken as the iconic image by default, with at least one iconic image used and generally, but not limited to, two;
  • a first calculation sub-module 613, configured to calculate the recording distance of the audio and video according to the size ratio between two or more specified images in the picture and the size ratio between the corresponding real objects: the size ratios of the specified images in the picture, or the ratio of the distance between two images in the picture to the distance in the actual scene, are obtained, from which the recording distance between the recording device and the actual scene is derived;
  • a second calculation sub-module 614, configured to calculate the positioning information according to the content, geographic location, and recording distance of the iconic images.
  • The present application further provides a computer device.
  • The computer device 12 is shown in the form of a general-purpose computing device.
  • The components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the various system components, including the system memory 28 and the processing unit 16.
  • Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Computer device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by computer device 12, including both volatile and nonvolatile media, removable and non-removable media.
  • System memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32.
  • Computer device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media.
  • By way of example only, storage system 34 may be used to read and write non-removable, non-volatile magnetic media (commonly referred to as a "hard disk drive"); a magnetic disk drive may be provided for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk"), and an optical drive for reading from and writing to a removable non-volatile optical disk (such as a CD-ROM or DVD-ROM) or other optical media.
  • Each drive can be coupled to bus 18 via one or more data medium interfaces.
  • the memory can include at least one program product having a set (e.g., at least one) of program modules 42 that are configured to perform the functions of the various embodiments of the present application.
  • A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in the memory. Such program modules 42 include, but are not limited to, an operating system, one or more applications, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • Program module 42 typically performs the functions and/or methods of the embodiments described herein.
  • Computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, pointing device, display 24, camera, etc.), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (e.g., a network card, modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication can take place via an input/output (I/O) interface 22. Also, computer device 12 can communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18.
  • The processing unit 16 performs various function applications and data processing by running programs stored in the system memory 28, for example the method for processing audio and video provided by the embodiments of the present application.
  • When the processing unit 16 executes the program, the method performs: generating a positioning information label, an environmental weather information label, and a time information label for the location where the audio and video are recorded; matching, from the database, historical audio and video with the same or similar information labels; and generating the target audio and video by processing the recorded audio and video together with the historical audio and video.
  • The present application further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the method for processing audio and video provided by the embodiments of the present application.
  • For the method implemented on the processor, reference may be made to the various embodiments of the method for processing audio and video described above, and details are not repeated here.
  • In summary, the beneficial effects of the method, system, device, and storage medium for processing audio and video of the present application are: by adding positioning information labels, environmental weather information labels, and time information labels of the recording location to the recorded audio and video, historical audio and video with the same information labels can be used as a reference when the audio and video are processed, so that the audio and video can be edited and processed more accurately, the operation difficulty and operation time for the user are reduced, and the authenticity of the processed target audio and video is improved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention relates to a method and system for processing audio and video, and to a device and a storage medium. The method comprises: generating a positioning information label, an environmental weather information label, and a time information label of a location where audio and video are recorded; matching, from a database, historical audio and video having the same information labels as the labels above; and processing and generating a target audio and video according to the recorded audio and video and the historical audio and video. The method and system for processing audio and video, the device, and the storage medium are used to add the positioning information label, the environmental weather information label, and the time information label of the location where the audio and video are recorded, so that historical audio and video with the same information labels can be used as a reference for the recorded audio and video during processing, the audio and video can be edited and processed more accurately, the operation difficulty and operation time for the user are reduced, and the authenticity of the target audio and video after processing is increased.
PCT/CN2018/091115 2018-02-28 2018-06-13 Method and system for processing audio and video, and device and storage medium (Ceased) WO2019165723A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810167141.0A CN108388649B (zh) 2018-02-28 2018-02-28 Method, system, device and storage medium for processing audio and video
CN201810167141.0 2018-02-28

Publications (1)

Publication Number Publication Date
WO2019165723A1 true WO2019165723A1 (fr) 2019-09-06

Family

ID=63069030

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/091115 Ceased WO2019165723A1 (fr) 2018-02-28 2018-06-13 Method and system for processing audio and video, and device and storage medium

Country Status (2)

Country Link
CN (1) CN108388649B (fr)
WO (1) WO2019165723A1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110675922A (zh) * 2019-09-23 2020-01-10 北京阳光欣晴健康科技有限责任公司 Multi-modality-based intelligent follow-up method and system
CN110851625A (zh) * 2019-10-16 2020-02-28 联想(北京)有限公司 Video creation method and apparatus, electronic device, and storage medium
CN112866604B (zh) 2019-11-27 2022-06-14 深圳市万普拉斯科技有限公司 Video file generation method and apparatus, computer device, and storage medium
CN114025116B (zh) * 2021-11-25 2023-08-04 北京字节跳动网络技术有限公司 Video generation method and apparatus, readable medium, and electronic device
CN116456057B (zh) * 2023-04-26 2023-11-14 河南铭视科技股份有限公司 Video processing system and method based on the Internet of Things

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6545192B2 (ja) * 2014-05-12 2019-07-17 シグニファイ ホールディング ビー ヴィ Verification of captured images using a timestamp decoded from illumination from a modulated light source
CN105827959B (zh) * 2016-03-21 2019-06-25 深圳市至壹科技开发有限公司 Video processing method based on geographic location
CN105933651B (zh) * 2016-05-04 2019-04-30 深圳市至壹科技开发有限公司 Method and apparatus for jump-splicing video based on a target route
GB2552316A (en) * 2016-07-15 2018-01-24 Sony Corp Information processing apparatus, method and computer program product
CN106251271A (zh) * 2016-07-29 2016-12-21 北京云海寰宇信息技术有限责任公司 Urban intelligent management platform
CN106596888B (zh) * 2016-12-12 2019-01-15 刘邦楠 Networked water quality detection system using a terminal and mobile phones

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130294746A1 (en) * 2012-05-01 2013-11-07 Wochit, Inc. System and method of generating multimedia content
CN104584618A (zh) * 2012-07-20 2015-04-29 谷歌公司 Mob-sourced phone video collaboration
CN106416281A (zh) * 2013-12-30 2017-02-15 理芙麦资公司 Video metadata
CN107147959A (zh) * 2017-05-05 2017-09-08 中广热点云科技有限公司 Broadcast video clip acquisition method and system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023051293A1 (fr) * 2021-09-28 2023-04-06 Audio processing method and apparatus, electronic device and storage medium
US12142296B2 (en) 2021-09-28 2024-11-12 Beijing Zitiao Network Technology Co., Ltd. Audio processing method and apparatus, and electronic device and storage medium

Also Published As

Publication number Publication date
CN108388649B (zh) 2021-06-22
CN108388649A (zh) 2018-08-10

Similar Documents

Publication Publication Date Title
WO2019165723A1 (fr) Procédé et système de traitement d'une audio-vidéo, et dispositif et support de stockage
WO2021261830A1 (fr) Procédé et appareil d'évaluation de qualité de vidéo
WO2017164716A1 (fr) Procédé et dispositif de traitement d'informations multimédia
WO2013170662A1 (fr) Procédé et dispositif d'ajout d'informations d'amis, et support de stockage informatique
WO2017143692A1 (fr) Téléviseur intelligent et son procédé de commande vocale
WO2019041851A1 (fr) Procédé de conseil après-vente d'appareil ménager, dispositif électronique et support de stockage lisible par ordinateur
WO2016082267A1 (fr) Procédé et système de reconnaissance vocale
WO2018166224A1 (fr) Procédé et appareil d'affichage de suivi de cible pour une vidéo panoramique et support d'informations
WO2018139884A1 (fr) Procédé de traitement audio vr et équipement correspondant
WO2019000801A1 (fr) Procédé, appareil et dispositif de synchronisation de données, et support d'informations lisible par ordinateur
WO2019114269A1 (fr) Procédé de reprise de la visualisation d'un programme, téléviseur et support d'informations lisible par ordinateur
WO2017084302A1 (fr) Procédé destiné à la lecture de vidéo de démarrage d'un terminal d'affichage et terminal d'affichage
WO2019051905A1 (fr) Procédé de commande de climatiseur, climatiseur, et support d'informations lisible par ordinateur
WO2019051866A1 (fr) Procédé, dispositif et appareil de gestion d'informations de droits et d'intérêts, et support d'informations lisible par ordinateur
WO2018032680A1 (fr) Procédé et système de lecture audio et vidéo
WO2017054488A1 (fr) Procédé de commande de lecture de télévision, serveur et système de commande de lecture de télévision
WO2020017827A1 (fr) Dispositif électronique et procédé de commande pour dispositif électronique
WO2018233221A1 (fr) Procédé de sortie sonore multi-fenêtre, télévision et support de stockage lisible par ordinateur
WO2016101702A1 (fr) Procédé et dispositif d'enregistrement de programme
WO2021137671A1 (fr) Appareil de génération de vidéo et procédé de génération de vidéo exécuté par l'appareil de génération de vidéo
WO2019051903A1 (fr) Procédé et appareil de commande de terminal, et support d'informations lisible par un ordinateur
WO2019071771A1 (fr) Procédé et système de calibrage d'informations d'empreinte de signal sans fil, serveur, et support
WO2017181504A1 (fr) Procédé et téléviseur pour le réglage intelligent de la taille de sous-titres
WO2018018680A1 (fr) Procédé et appareil d'affichage des informations d'invite d'application
WO2018023925A1 (fr) Procédé et système de photographie

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18907911

Country of ref document: EP

Kind code of ref document: A1