CN116847168A - Video editor, video editing method and related device - Google Patents
- Publication number
- CN116847168A (application number CN202310564949.3A)
- Authority
- CN
- China
- Prior art keywords
- video
- information
- language
- button
- target video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g. 3D video
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23424—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44016—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/47205—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
Abstract
The application provides a video editor, a video editing method and a related device for producing a target video of a virtual object interactive application. The video editor comprises: a video preset module for configuring configuration information of the target video, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information and pronunciation tone information; a video editing module for configuring configuration materials of the target video; and a video synthesis module for previewing, synthesizing and downloading the target video. A high-quality, realistic virtual object video for virtual object interactive application can thus be produced while simplifying the user's video production operations and improving production efficiency.
Description
Technical Field
The present application relates to the technical fields of virtual humans, interaction design, and artificial intelligence, and in particular to a video editor, a video editing method, an electronic device, and a computer-readable storage medium.
Background
Virtual objects include virtual humans, virtual animals, virtual cartoon figures, and the like. A virtual human is a personified image constructed with computer graphics (CG) technology and operated in code form, supporting various interaction modes such as language communication, facial expressions, and action display. Virtual human technology has developed rapidly in the field of artificial intelligence and has been applied in many technical fields such as video, media, games, finance, travel, education, and medicine.
Traditional video editors generally cannot generate virtual objects, or cannot process the actions and expressions of virtual objects; they lack support from intelligent technology, have complex interfaces and functions, and present a high barrier to use for non-professionals.
Based on the above, the present application provides a video editor, a video editing method and related devices to improve the related art.
Disclosure of Invention
The application aims to provide a video editor, a video editing method, an electronic device and a computer-readable storage medium, which can produce a high-quality, realistic virtual object video for virtual object interactive application while simplifying the user's video production operations and improving production efficiency.
The application adopts the following technical scheme:
in a first aspect, the present application provides a video editor for producing a target video of a virtual object interactive application, the video editor comprising:
the video presetting module is used for configuring configuration information of the target video, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information and pronunciation tone information;
the video editing module is used for configuring configuration materials of the target video, wherein the configuration materials comprise one or more of role materials, scene materials, display materials, subtitle materials and score materials;
and the video synthesis module is used for previewing, synthesizing and downloading the target video.
The beneficial effect of this technical scheme lies in: it better meets the user requirements of virtual object interactive applications and makes it convenient for users to produce high-quality, realistic virtual object videos. Through the video preset module, the user can rapidly configure the required parameters and virtual object materials according to the specific requirements of the target video, conveniently realizing highly customized virtual object video production; the video editing module provides a visual interface and operation mode so that the user can rapidly complete the editing and combination of configuration materials, improving video production efficiency; each module provides a real-time preview function, so that the user can view and adjust the video editing result at any time; the video editor supports the configuration of various kinds of virtual object information, can generate high-quality, realistic virtual object interactive videos, is suitable for both professional and non-professional users, expands the application range of virtual object video production, and improves usability and the user experience.
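As a purely illustrative sketch (not part of the claimed scheme), the "one or more of" configuration information handled by the video preset module can be modeled as an optional-field record; all names below are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical model of the target-video configuration information
# described above; field names are illustrative, not from the patent.
@dataclass
class VideoConfig:
    name: Optional[str] = None                    # video name information
    size: Optional[Tuple[int, int]] = None        # video size information
    virtual_object: Optional[str] = None          # virtual object information
    gesture: Optional[str] = None                 # gesture information
    language: Optional[str] = None                # pronunciation language information
    timbre: Optional[str] = None                  # pronunciation tone information

    def configured_fields(self):
        """Names of the 'one or more' fields the user has actually set."""
        return [k for k, v in self.__dict__.items() if v is not None]

cfg = VideoConfig(name="demo", language="zh-CN")
print(cfg.configured_fields())  # ['name', 'language']
```

The optional fields mirror the "one or more of" wording: any subset may be configured and the rest left to defaults.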
In some optional embodiments, the video editing module includes a material library unit, a first display area, a material editing unit, a language editing unit, a mode switching button, and a save button;
the material library unit is used for providing the configuration materials;
the first display area is used for previewing the configuration content of the target video;
the material editing unit is used for editing the configuration material;
the language editing unit is used for providing language materials of the target video;
the mode switching button is used for switching to a code mode configuration interface, and the code mode configuration interface comprises a speech synthesis markup language configuration interface;
the save button is used for confirming the configuration content of the target video.
The beneficial effect of this technical scheme lies in: the user can select the needed configuration materials in the material library unit and flexibly configure the roles, scenes, exhibits, subtitles, music and the like in the virtual object video as needed; the user can also use the material editing unit to edit and clip the selected configuration materials, and the video editing module provides a visual, intuitive editing interface, so that the user can more easily understand and master the video editing tool and can combine and edit different types of materials (such as audio, virtual object actions, expressions, and shots) through simple drag-and-drop operations, making the system more flexible; the language editing unit provides the language materials of the target video, from which the audio content of the target video is generated; in addition, the video editor also retains a code editing mode, and the user can switch to the code mode configuration interface through the mode switching button and configure the video content using the Speech Synthesis Markup Language (SSML) configuration interface; the user can confirm the configuration content of the target video through the save button. On the one hand, the material library helps the user quickly and simply synthesize various combinations of materials into the target video through the visual, intuitive editing interface, enabling more personalized and professional video works; on the other hand, the language editing unit can automatically generate voice material from simple text input, and the voice content can be instantly modified, deleted and adjusted, reducing tedious operations such as manually recording and editing audio. The generated voice material is highly consistent, avoiding the variability and errors that easily occur in manual recording, and is generated quickly, which greatly improves production efficiency and reduces video production cost.
In some optional embodiments, the material library unit includes a first navigation bar, a second navigation bar, and a material display area;
the first navigation bar is used for selecting the configuration materials;
the second navigation bar is used for selecting sub-materials of the configuration materials;
the material display area is used to display the sub-material, which may be added to the material editing unit.
The beneficial effect of this technical scheme lies in: different types of materials are placed in different navigation bars according to classification, so that a user can find and configure required materials more quickly, and the efficiency of configuring the materials is improved; the material display area can intuitively display the selected sub-materials, so that the selection and adjustment of a user are facilitated, and the visibility of the materials is improved; for users needing to combine different sub-materials, the interface provides greater freedom and flexibility, various combination modes can be rapidly realized, and the freedom of the users is enhanced; the interface is simple and easy to understand in design form, simple to operate and capable of reducing the learning cost of a user.
In some optional embodiments, the language editing unit includes a language configuration mode drop-down box for selecting to configure the language material with a speech synthesis mode or a real person audio mode;
When the language material is configured by using the voice synthesis mode, the language editing unit comprises a language text input box and a language material editing column, wherein the language text input box is used for inputting the language material, the language material editing column is used for editing configuration information of the language material, and the configuration information of the language material comprises phonetic notations, pauses, speech speeds, volume and tones; the language material editing column further comprises a trial listening button, and the trial listening button is used for trial listening of the language material;
when the real person audio mode is selected to be used for configuring the language materials, the language editing unit comprises a language material uploading button and a language material adjusting subunit, wherein the language material uploading button is used for uploading the language materials, and the material adjusting subunit is used for adjusting the language materials.
The beneficial effect of this technical scheme lies in: through the language editing unit, the user can automatically synthesize the language material (audio) of the target video from the input text, and can also upload real-person audio as the language material of the target video; providing these two editing modes allows different types of voice material to be edited and configured more efficiently, improving editing efficiency; in the speech synthesis mode, the user can input the required language material through the language text input box and edit it flexibly; by configuring the phonetic notation, pauses, speech speed, volume and tone of the language material, the generated voice material becomes more natural, smooth and accurate, improving its quality; in the real-person audio mode, existing real-person audio material can be uploaded through the language material uploading button and edited in that mode; the interface is simple, the operation is easy to understand, and the learning cost and use difficulty for the user are reduced.
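The configuration items named above (pauses, speech speed, volume, tone) map directly onto standard SSML elements. As an illustrative sketch only, a helper might compose such a fragment; the element and attribute names below follow the W3C SSML specification, while the function itself is hypothetical:

```python
# Illustrative only: builds a Speech Synthesis Markup Language (SSML)
# fragment covering the items configured in the language material editing
# column -- pauses (<break>) and speech speed / volume / tone (<prosody>).
def build_ssml(text, rate="medium", volume="medium", pitch="medium",
               pause_ms=None):
    body = text
    if pause_ms is not None:
        # Insert a pause after the text, in milliseconds.
        body += f'<break time="{pause_ms}ms"/>'
    return (f'<speak>'
            f'<prosody rate="{rate}" volume="{volume}" pitch="{pitch}">'
            f'{body}'
            f'</prosody>'
            f'</speak>')

ssml = build_ssml("Hello", rate="slow", pause_ms=300)
print(ssml)
```

The resulting string is what the code mode configuration interface would let advanced users edit by hand, while the visual editing column generates it automatically.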
In some optional embodiments, the material editing unit includes at least one material track, the material track including a track identification and a configuration track; the material track is used for editing sub-materials of the configuration materials executed by the virtual object.
The beneficial effect of this technical scheme lies in: the track identifier identifies the type or name of the material edited on the current track, such as actions, expressions, shots, or scenes; the configuration track configures information such as the specific position and duration of the sub-materials on the track, and the user can drag and drop different materials onto the corresponding material track and edit and adjust them in the configuration track, thereby configuring the sub-materials to be executed by the virtual object. By placing different types of materials on different tracks, the materials are managed by category, the required material can be quickly found and edited, and editing efficiency is improved; multi-level material editing is supported, as each track can hold multiple sub-materials, multiple tracks can be superimposed, and each track can be edited and adjusted, meeting the requirements of complex scenes; the precision and flexibility of material editing are improved, since attributes such as the position, duration and volume of the materials can be controlled more finely through the parameters in the configuration track; and the simple, intuitive interface design makes the editor easier to learn and use, reducing learning cost and operation difficulty.
In some optional embodiments, the video preset module includes a second display area, a first video name text input box, a video size single selection box, a virtual object single selection box, a gesture single selection box, a pronunciation language drop-down box, a pronunciation tone drop-down box, and an update button;
the second display area is used for previewing the configuration content of the target video;
the first video name text input box is used for inputting video name information of the target video;
the video size single selection box is used for configuring video size information of the target video;
the virtual object single selection box is used for configuring virtual object information of the target video;
the gesture single selection box is used for configuring gesture information of the virtual object;
the pronunciation language drop-down box is used for configuring the pronunciation language of the virtual object;
the pronunciation tone drop-down box is used for configuring pronunciation tone information of the virtual object;
the update button is used for confirming the configuration information of the target video.
The beneficial effect of this technical scheme lies in: by placing different configuration information in different single selection boxes by category, the user can quickly find and edit it, conveniently meeting the requirements of various scenes and improving editing efficiency; through the preview function of the second display area, the user can intuitively see the configured video content and better adjust the video preset parameters, improving preview accuracy; the simple, clear interface design and easy-to-understand single selection box layout reduce the user's learning cost and use difficulty and improve the usability of the product.
In some optional embodiments, the video composition module includes a third display area, a first play button, a second video name text input box, a video format drop-down box, a video quality drop-down box, a video preset drop-down box, and a video composition button;
the third display area is used for previewing the target video;
the first playing button is used for playing the target video in the third display area after being clicked;
the second video name text input box is used for modifying the video name information of the target video;
the video format drop-down box is used for configuring the video format of the target video;
the video quality drop-down frame is used for configuring frame rate information and code rate information of the target video;
the video preset drop-down frame is used for configuring preset information of the target video, and the preset information comprises a complete video, 1/2 video and 1/4 video;
the video composition button is used for composing the target video after being clicked, and the video composition button is updated to be a re-composition button after being clicked for the first time.
The beneficial effect of this technical scheme lies in: through the multiple input boxes and drop-down boxes provided on the interface, the user can rapidly configure the information of the target video to be synthesized, improving the efficiency of video synthesis; the video format and quality can be set according to different scene requirements, meeting video synthesis demands in different scenarios; through the preview function of the third display area, the user can intuitively see the content and effect of the synthesized video and better adjust the synthesis parameters; the operation difficulty for the user is reduced and the usability of the product is improved.
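As an illustrative sketch only: one plausible reading of the "complete video, 1/2 video and 1/4 video" preset information is a scaled rendition of the complete video. That interpretation, and all names below, are assumptions for illustration:

```python
# Hypothetical mapping from the video preset drop-down options to a
# scale factor, assuming the presets denote a scaled output of the
# complete video (an assumption; the patent does not define them).
PRESETS = {"complete": 1.0, "1/2": 0.5, "1/4": 0.25}

def output_size(width, height, preset):
    scale = PRESETS[preset]
    # Round each dimension down to an even number, which most
    # video codecs require for chroma subsampling.
    return (int(width * scale) // 2 * 2, int(height * scale) // 2 * 2)

print(output_size(1920, 1080, "1/2"))  # (960, 540)
```

Frame rate and bit rate from the video quality drop-down box would be applied alongside this size when the video composition button triggers synthesis.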
In some alternative embodiments, after the video composition button is clicked, a composed information unit is displayed, the composed information unit including an information display area, a second play button, a download video button, a download audio button, and a delete button;
the information display area is used for displaying notification information of the target video composition;
the second playing button is used for playing the target video;
the video downloading button is used for downloading video information of the target video;
the downloading audio button is used for downloading the audio information of the target video;
the delete button is used for deleting the target video.
The beneficial effect of this technical scheme lies in: by providing a plurality of function buttons and an information display area, the operation difficulty of a user is reduced, and the use experience of the user is enhanced; the synthesized information unit is provided after the synthesis is completed, and a user is informed of the synthesis result, so that the reliability of synthesizing the target video is improved; supporting multiple file output operations, and enabling a user to select to download video information or audio information so as to meet requirements in different scenes; by deleting the button, the user can conveniently delete the unnecessary target video, thereby reducing the occupation of the storage space.
In a second aspect, the present application provides a video editing method, which uses the video editor to make a target video of a virtual object interactive application, where the video editor includes a video preset module, a video editing module, and a video synthesis module, and the video editing method includes:
configuring configuration information of a target video by utilizing the video presetting module, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information and pronunciation tone information;
configuring configuration materials of the target video by utilizing the video editing module, wherein the configuration materials comprise one or more of character materials, scene materials, display materials, subtitle materials and score materials;
and previewing, synthesizing and downloading the target video by utilizing the video synthesizing module.
In some alternative embodiments, the video editor further comprises a content detection module;
when the video synthesis button is clicked, detecting whether the target video contains bad content or not by using the content detection module;
when the target video contains bad content, canceling the synthesis of the target video, acquiring the user information corresponding to the target video, and applying a bad mark to the user information; when the number of times the user information has been marked is greater than a preset number of bad marks, refusing to respond to access requests from that user information;
When the target video does not contain bad content, no operation is performed.
The beneficial effect of this technical scheme lies in: by detecting whether the target video contains bad content, the spread and influence of bad content are avoided and the safety of the platform is improved; by acquiring the user information, applying bad marks, and setting a preset number of bad marks, malicious users can be restricted, illegal behavior can be effectively curbed, and the negative effects of spreading bad content are reduced; the management of virtual interactive applications is strengthened, reducing the probability that bad content is produced.
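The detection-and-marking flow above can be sketched as follows. This is a hypothetical outline, not the claimed implementation; the detector is passed in as a stand-in for the content detection module, and the mark limit is an assumed example value:

```python
from collections import defaultdict

BAD_MARK_LIMIT = 3            # the "preset number of bad marks" (example value)
bad_marks = defaultdict(int)  # per-user bad mark counts

def on_compose(user_id, video, contains_bad_content):
    """Handle a click on the video synthesis button.

    `contains_bad_content` stands in for the content detection module.
    """
    # Users marked more times than the preset limit get no response.
    if bad_marks[user_id] > BAD_MARK_LIMIT:
        return "access refused"
    if contains_bad_content(video):
        bad_marks[user_id] += 1          # bad mark on the user information
        return "composition cancelled"   # synthesis is not performed
    return "composed"                    # clean content: proceed normally

assert on_compose("u1", "v", lambda v: False) == "composed"
```

A clean video synthesizes normally; each flagged attempt increments the user's count until requests are refused outright.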
In some optional embodiments, the video editing module includes a material library unit, a first display area, a material editing unit, a language editing unit, a mode switching button, and a save button, and the video editing method further includes:
providing the configuration materials by utilizing the material library unit;
previewing configuration content of the target video by utilizing the first display area;
editing the configuration material by using the material editing unit;
providing language materials of the target video by using the language editing unit;
switching to a code mode configuration interface by using the mode switching button, wherein the code mode configuration interface comprises a speech synthesis markup language configuration interface;
And confirming the configuration content of the target video by using the save button.
In some optional embodiments, the material library unit includes a first navigation bar, a second navigation bar, and a material display area, and the video editing method further includes:
selecting the configuration material by using the first navigation bar;
selecting sub-materials of the configuration materials by using the second navigation bar;
the sub-material is displayed using the material display area, and the sub-material may be added to the material editing unit.
In some optional embodiments, the language editing unit includes a language configuration mode drop-down box, the language configuration mode drop-down box is used to select to configure the language material using a speech synthesis mode or a real person audio mode, and the video editing method further includes:
when the language material is configured by using the voice synthesis mode, the language editing unit comprises a language text input box and a language material editing column, the language material is input by using the language text input box, and configuration information of the language material is edited by using the language material editing column, wherein the configuration information of the language material comprises phonetic notations, pauses, speech speeds, volume and tones; the language material editing column further comprises a trial listening button, and the language material is listened to by the trial listening button;
When the language material is configured by using the real person audio mode, the language editing unit comprises a language material uploading button and a language material adjusting subunit, the language material is uploaded by using the language material uploading button, and the language material is adjusted by using the material adjusting subunit.
In some optional embodiments, the material editing unit includes at least one material track, the material track including a track identifier and a configuration track, and the video editing method further includes: editing the sub-materials of the configuration materials executed by the virtual object by using the material track.
In some optional embodiments, the video preset module includes a second display area, a first video name text input box, a video size single selection box, a virtual object single selection box, a gesture single selection box, a pronunciation language drop-down box, a pronunciation tone drop-down box, and an update button, and the video editing method further includes:
previewing the configuration content of the target video by utilizing the second display area;
inputting video name information of the target video by using the first video name text input box;
configuring video size information of the target video by utilizing the video size single selection frame;
Configuring virtual object information of the target video by utilizing the virtual object single selection frame;
configuring gesture information of the virtual object by utilizing the gesture single selection frame;
configuring the pronunciation language of the virtual object by utilizing the pronunciation language drop-down box;
configuring pronunciation tone information of the virtual object by using the pronunciation tone drop-down frame;
and confirming the configuration information of the target video by using the update button.
In some optional embodiments, the video composition module includes a third display area, a first play button, a second video name text input box, a video format drop-down box, a video quality drop-down box, a video preset drop-down box, and a video composition button, and the video editing method further includes:
previewing the target video by using the third display area;
playing the target video in the third display area after the first playing button is clicked;
modifying video name information of the target video by using the second video name text input box;
configuring a video format of the target video by utilizing the video format drop-down box;
configuring frame rate information and code rate information of the target video by utilizing the video quality drop-down frame;
Configuring preset information of the target video by utilizing the video preset drop-down frame, wherein the preset information comprises a complete video, a 1/2 video and a 1/4 video;
and synthesizing the target video after clicking the video synthesis button, wherein the video synthesis button is updated to be a re-synthesis button after clicking for the first time.
In some alternative embodiments, after the video composition button is clicked, a composed information unit is displayed, the composed information unit including an information display area, a second play button, a download video button, a download audio button, and a delete button, and the video editing method further comprises:
displaying notification information of the target video composition by using the information display area;
playing the target video by using the second play button;
downloading video information of the target video by using the video downloading button;
downloading the audio information of the target video by utilizing the downloading audio button;
and deleting the target video by using the deleting button.
In a third aspect, the present application provides an electronic device, configured to make a target video of a virtual object interactive application by using the video editor, where the video editor includes a video preset module, a video editing module, and a video synthesis module;
The electronic device comprises a memory storing a computer program and at least one processor configured to implement the following steps when executing the computer program:
configuring configuration information of a target video by utilizing the video presetting module, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information and pronunciation tone information;
configuring configuration materials of the target video by utilizing the video editing module, wherein the configuration materials comprise one or more of character materials, scene materials, display materials, subtitle materials and score materials;
and previewing, synthesizing and downloading the target video by utilizing the video synthesizing module.
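The three-module division above can be illustrated with a minimal sketch; the class, method and field names below are assumptions for illustration only, not the patent's implementation:

```python
# Illustrative sketch of the preset -> edit -> synthesize pipeline.
class VideoEditor:
    def __init__(self):
        self.config = {}      # output of the video preset module
        self.materials = []   # output of the video editing module

    def preset(self, **config):
        # configuration information: name, size, virtual object, gesture,
        # pronunciation language, pronunciation timbre
        allowed = {"name", "size", "virtual_object", "gesture",
                   "language", "timbre"}
        unknown = set(config) - allowed
        if unknown:
            raise ValueError(f"unknown configuration keys: {unknown}")
        self.config.update(config)

    def add_material(self, kind, payload):
        # character, scene, display, subtitle, or score material
        self.materials.append((kind, payload))

    def synthesize(self):
        # stand-in for rendering; returns a summary instead of frames
        return {"config": dict(self.config),
                "material_count": len(self.materials)}

editor = VideoEditor()
editor.preset(name="demo", language="zh-CN", timbre="female-1")
editor.add_material("subtitle", "Hello")
print(editor.synthesize()["material_count"])
```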
In some alternative embodiments, the video editor further comprises a content detection module, the at least one processor being further configured to implement the following steps when executing the computer program:
when the video synthesis button is clicked, detecting whether the target video contains objectionable content by using the content detection module;
when the target video contains objectionable content, canceling synthesis of the target video, acquiring user information corresponding to the target video and applying an objectionable-content mark to the user information; when the number of marks on the user information exceeds a preset mark threshold, refusing to respond to access requests from the user information;
when the target video does not contain objectionable content, performing no operation.
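A minimal sketch of this moderation flow, with a placeholder detector result and an assumed flag threshold, might look like:

```python
# Sketch only: the objectionable-content detector is represented by a
# boolean argument; a real system would call a content classifier.
class ModerationPolicy:
    def __init__(self, max_flags=3):
        self.max_flags = max_flags  # assumed preset mark threshold
        self.flags = {}             # user id -> number of marks

    def on_synthesize(self, user_id, video_is_objectionable):
        if video_is_objectionable:
            # cancel synthesis and mark the submitting user
            self.flags[user_id] = self.flags.get(user_id, 0) + 1
            return "cancelled"
        return "synthesized"

    def allow_access(self, user_id):
        # refuse requests once the mark count exceeds the threshold
        return self.flags.get(user_id, 0) <= self.max_flags

policy = ModerationPolicy(max_flags=2)
for _ in range(3):
    policy.on_synthesize("u1", video_is_objectionable=True)
print(policy.allow_access("u1"))  # three marks exceed the limit of 2
```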
In some alternative embodiments, the video editing module includes a material library unit, a first display area, a material editing unit, a language editing unit, a mode switch button, and a save button, the at least one processor being further configured to implement the following steps when executing the computer program:
providing the configuration materials by utilizing the material library unit;
previewing configuration content of the target video by utilizing the first display area;
editing the configuration material by using the material editing unit;
providing language materials of the target video by using the language editing unit;
switching to a code mode configuration interface by using the mode switching button, wherein the code mode configuration interface includes a Speech Synthesis Markup Language (SSML) configuration interface;
and confirming the configuration content of the target video by using the save button.
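The code mode mentioned above exposes SSML. A small helper, assuming generic SSML tags (`prosody`, `break`) whose exact support varies by speech engine, could generate such markup from the same rate/volume/pitch/pause settings:

```python
from xml.sax.saxutils import escape

# Illustrative SSML generator; tag/attribute support differs between
# text-to-speech engines, so this is a sketch, not a reference encoding.
def to_ssml(text, rate="medium", volume="medium", pitch="medium",
            pause_after_s=None):
    body = (f'<prosody rate="{rate}" volume="{volume}" pitch="{pitch}">'
            f"{escape(text)}</prosody>")
    if pause_after_s is not None:
        # a pause is expressed as an SSML <break> in milliseconds
        body += f'<break time="{int(pause_after_s * 1000)}ms"/>'
    return f"<speak>{body}</speak>"

print(to_ssml("Hello", rate="slow", pause_after_s=0.5))
```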
In some alternative embodiments, the material gallery unit includes a first navigation bar, a second navigation bar, and a material display area, the at least one processor being further configured to implement the following steps when executing the computer program:
selecting the configuration material by using the first navigation bar;
selecting sub-materials of the configuration materials by using the second navigation bar;
and displaying the sub-materials by using the material display area, wherein the sub-materials may be added to the material editing unit.
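The two-level navigation can be pictured as a category-to-sub-material mapping; the catalogue below is an assumed example populated with material names mentioned elsewhere in this text:

```python
# Assumed two-level catalogue behind the first and second navigation bars.
MATERIAL_LIBRARY = {
    "character": ["pronunciation", "action", "expression", "character effect"],
    "scene": ["shot", "space"],
    "display": ["uploaded material", "text", "image", "special effect"],
}

def sub_materials(category):
    """Sub-materials shown in the display area for the category selected
    in the first navigation bar."""
    return MATERIAL_LIBRARY.get(category, [])

print(sub_materials("scene"))
```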
In some alternative embodiments, the language editing unit includes a language configuration mode drop-down box for selecting whether to configure the language material in a speech synthesis mode or a real-person audio mode, the at least one processor being further configured to implement the following steps when executing the computer program:
when the language material is configured in the speech synthesis mode, the language editing unit includes a language text input box and a language material editing column; the language material is input by using the language text input box, and configuration information of the language material is edited by using the language material editing column, wherein the configuration information of the language material includes phonetic notation, pauses, speech speed, volume and tone; the language material editing column further includes an audition button, and the language material is auditioned by using the audition button;
when the language material is configured in the real-person audio mode, the language editing unit includes a language material upload button and a language material adjusting subunit; the language material is uploaded by using the language material upload button, and the language material is adjusted by using the language material adjusting subunit.
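The two configuration modes can be sketched as a simple dispatch; the return values are placeholders standing in for a text-to-speech request and a file upload respectively:

```python
# Illustrative dispatch between the two language-configuration modes.
def configure_language_material(mode, *, text=None, audio_path=None):
    if mode == "speech_synthesis":
        if not text:
            raise ValueError("speech synthesis mode needs input text")
        # phonetic notation, pauses, speech speed, volume and tone would
        # be attached to this request in a real implementation
        return {"source": "tts", "text": text}
    if mode == "real_audio":
        if not audio_path:
            raise ValueError("real-person audio mode needs an uploaded file")
        return {"source": "upload", "path": audio_path}
    raise ValueError(f"unknown mode: {mode}")

print(configure_language_material("speech_synthesis", text="Welcome")["source"])
```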
In some alternative embodiments, the material editing unit comprises at least one material track comprising a track identification and a configuration track, the at least one processor being further configured to implement the following steps when executing the computer program:
editing the sub-materials of the configuration materials executed by the virtual object by using the material track.
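One possible (assumed) shape for such a material track is an identifier plus a list of timed clip entries kept in playback order:

```python
from dataclasses import dataclass, field

# Assumed data shape: a track identifier (e.g. "action", "expression",
# "shot", "scene") and timed entries on its configuration track.
@dataclass
class Clip:
    name: str
    start_s: float
    duration_s: float

@dataclass
class MaterialTrack:
    track_id: str
    clips: list = field(default_factory=list)

    def add(self, clip):
        # keep the configuration track ordered by start time
        self.clips.append(clip)
        self.clips.sort(key=lambda c: c.start_s)

track = MaterialTrack("action")
track.add(Clip("wave", 2.0, 1.5))
track.add(Clip("bow", 0.0, 1.0))
print([c.name for c in track.clips])  # ordered by start time
```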
In some optional embodiments, the video preset module includes a second display area, a first video name text input box, a video size radio box, a virtual object radio box, a gesture radio box, a pronunciation language drop-down box, a pronunciation timbre drop-down box, and an update button, the at least one processor being further configured to implement the following steps when executing the computer program:
previewing the configuration content of the target video by utilizing the second display area;
inputting video name information of the target video by using the first video name text input box;
configuring video size information of the target video by utilizing the video size radio box;
configuring virtual object information of the target video by utilizing the virtual object radio box;
configuring gesture information of the virtual object by utilizing the gesture radio box;
configuring the pronunciation language of the virtual object by utilizing the pronunciation language drop-down box;
configuring pronunciation timbre information of the virtual object by using the pronunciation timbre drop-down box;
and confirming the configuration information of the target video by using the update button.
In some alternative embodiments, the video composition module includes a third display area, a first play button, a second video name text input box, a video format drop-down box, a video quality drop-down box, a video preset drop-down box, and a video composition button, the at least one processor being further configured to implement the following steps when executing the computer program:
previewing the target video by using the third display area;
playing the target video in the third display area after the first playing button is clicked;
modifying video name information of the target video by using the second video name text input box;
configuring a video format of the target video by utilizing the video format drop-down box;
configuring frame rate information and bit rate information of the target video by utilizing the video quality drop-down box;
configuring preset information of the target video by utilizing the video preset drop-down box, wherein the preset information includes a complete video, a 1/2 video and a 1/4 video;
and synthesizing the target video after clicking the video synthesis button, wherein the video synthesis button is updated to a re-synthesis button after being clicked for the first time.
In some alternative embodiments, clicking the video composition button displays a composed information unit, the composed information unit including an information display area, a second play button, a download video button, a download audio button, and a delete button, the at least one processor being further configured to implement the following steps when executing the computer program:
displaying notification information of the target video composition by using the information display area;
playing the target video by using the second play button;
downloading video information of the target video by using the download video button;
downloading the audio information of the target video by utilizing the download audio button;
and deleting the target video by using the delete button.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by at least one processor, performs the steps of the above method or performs the functions of the above electronic device.
Drawings
The application will be further described with reference to the drawings and examples.
FIG. 1 is a schematic flow chart of a video editor according to an embodiment of the present application;
fig. 2 is a schematic flow chart of a video editing method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a program product for implementing a video editor according to an embodiment of the present application.
Detailed Description
The present application will be further described below with reference to the accompanying drawings and specific embodiments. It should be understood that, provided no conflict arises, the following embodiments or technical features may be combined arbitrarily to form new embodiments.
In embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of" the following items means any combination of these items, including any combination of single items or plural items. For example, "at least one of a, b or c" may represent: a; b; c; a and b; a and c; b and c; or a, b and c, where a, b and c may each be singular or plural. It is noted that "at least one" may also be interpreted as "one or more".
It is also noted that, in embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any implementation or design described as "exemplary" or "e.g." in the examples of this application should not be construed as preferred or advantageous over other implementations or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
The virtual objects include virtual humans, virtual animals, virtual cartoon figures, and the like. A virtual human is a personified image constructed by CG technology and operated in code form, with various interaction modes such as language communication, expression and action display. Virtual human technology has developed rapidly in the field of artificial intelligence and has been applied in many technical fields such as video, media, games, finance, travel, education and medical treatment; not only can virtual hosts, virtual anchors, virtual idols, virtual customer service agents, virtual lawyers, virtual financial advisors, virtual teachers, virtual doctors, virtual instructors, virtual assistants and the like be customized, but a video can also be generated from text or audio with one click. Among virtual humans, service-type virtual humans mainly replace real people in providing services and daily companionship; they are the virtualization of service roles in reality, and their industrial value lies mainly in reducing costs in existing service industries and improving efficiency in existing markets.
Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning and decision-making. Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, intelligent transportation, and other directions.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. Machine learning specializes in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and reorganizes existing knowledge structures to continually improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to endow computers with intelligence; it is applied throughout all areas of artificial intelligence.
Deep learning is a special kind of machine learning that represents the world using a hierarchy of nested concepts, with each concept defined in relation to simpler concepts and more abstract representations computed in terms of less abstract ones, thereby achieving great power and flexibility. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
Natural language processing (NLP) is a technology in the field of artificial intelligence concerned with human language. It aims to enable computers to understand, analyze, process and generate natural language, realizing natural language interaction between computers and humans. NLP technology is widely used in information retrieval, text classification, speech recognition, machine translation, chatbots, and other fields.
Traditional video editors have several drawbacks: the process of generating a virtual human interactive video is complicated and time-consuming and requires a great deal of time to learn and master; hardware such as a high-performance computer and graphics card is needed to smoothly process high-resolution, large-scale video files; video files and editing projects generally need to be stored on a local computer, which is inconvenient for multi-user collaboration and real-time sharing; and for non-professional users the usage threshold is high, making it difficult to get started quickly. In generating virtual object interactive videos, a vivid virtual object image cannot be generated directly; complex actions and expressions of the virtual object cannot be processed, making detailed and natural actions and expressions difficult to present; and support for intelligent technologies such as speech recognition and natural language processing is lacking, making intelligent virtual objects difficult to realize.
The virtual object interaction application is used for providing virtual object interaction functions. The virtual human interactive application may simulate human communication and behavior and interact with the user. Such software (referred to as virtual human interactive applications) is typically driven by artificial intelligence and natural language processing techniques and is capable of interacting with a user by means of text, speech or images, etc. In an embodiment of the present application, the virtual object includes one or more of a virtual person, a virtual animal, and a virtual cartoon character. As one example, the virtual object is a virtual person "JING" (chinese name: mirror). As one example, the name of the virtual object interactive application is "company a_jing".
The user of the video editor refers to staff of clients of the virtual object interactive application, such as enterprises, institutions, banks, schools and hospitals (for example, configuration staff responsible for editing virtual interactive object videos); a "user" elsewhere in this context generally refers to a person interacting with a virtual object in the virtual object interaction application, rather than a user of the configuration tool itself.
(video editor)
Referring to fig. 1, fig. 1 is a schematic flow chart of a video editor according to an embodiment of the present application.
The embodiment of the application provides a video editor for making a target video of a virtual object interactive application, comprising:
The video presetting module is used for configuring configuration information of the target video, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information and pronunciation tone information;
the video editing module is used for configuring configuration materials of the target video, wherein the configuration materials comprise one or more of character materials, scene materials, display materials, subtitle materials and score materials;
and the video synthesis module is used for previewing, synthesizing and downloading the target video.
Therefore, the user requirements of the virtual object interactive application can be better met, and users can conveniently produce high-quality, vivid virtual object videos for the virtual object interactive application. Through the video preset module, the user can rapidly configure the required parameters and virtual object materials according to the specific requirements of the target video, conveniently realizing highly customized virtual object video production; the video editing module provides a visual interface and operation mode, so that the user can rapidly complete the editing and combination of configuration materials, improving video production efficiency; each module provides a real-time preview function, so that the user can view the video editing result at any time and adjust and modify it at any time; the video editor supports the configuration of various virtual object information, can generate high-quality and vivid virtual object interactive videos, is suitable for both professionals and non-professional users, expands the application range of virtual object video production, and improves user experience and usability.
In the embodiment of the present application, the target video may be a synthesized video of a virtual object. The target video can be used for technical demonstration and training, product marketing and promotion, educational training, games and entertainment, and film and television special effects. For example, the operation of some high-precision instruments is complex and requires certain expertise and skills, and making an avatar video can intuitively show the operation flow to the user; on an e-commerce platform, making an avatar video to display the characteristics and functions of a product attracts users' attention and desire to purchase; on an online education platform, presenting teaching content through avatar videos enhances students' interest and participation in learning and improves teaching quality; in some online games, creating avatar videos to customize the appearance and actions of game characters enhances the gaming experience and fun; in movies and animation, avatar videos can be used to create various fantasy worlds and character images, enhancing the visual impact and emotional expression of the work.
In the embodiment of the present application, synthesizing the target video may involve using text to drive the mouth shape and actions of the virtual object and rendering each frame of image corresponding to the target video; alternatively, text may drive the mouth shape of the virtual object while one or more actions, expressions, scenes, display contents, background music or subtitles configured by the user for the virtual object are combined to render each frame of image and synthesize the target video.
In some optional embodiments, the video editing module includes a material library unit, a first display area, a material editing unit, a language editing unit, a mode switching button, and a save button;
the material library unit is used for providing the configuration materials;
the first display area is used for previewing the configuration content of the target video;
the material editing unit is used for editing the configuration material;
the language editing unit is used for providing language materials of the target video;
the mode switching button is used for switching to a code mode configuration interface, and the code mode configuration interface includes a Speech Synthesis Markup Language (SSML) configuration interface;
the save button is used for confirming the configuration content of the target video.
Therefore, a user can select the required configuration materials in the material library unit and flexibly configure the characters, scenes, exhibits, text, music and the like in the virtual object video as needed. The user can also use the material editing unit to edit and clip the selected configuration materials; the video editing module provides a visual and intuitive editing interface, so that the user can more easily understand and master the video editing tool and can combine and edit different types of materials through simple drag-and-drop operations, such as editing audio, virtual object actions, expressions and shots, offering greater flexibility. The language editing unit provides the language materials of the target video so as to generate its audio content. In addition, the video editor retains a code editing mode: the user can switch to the code mode configuration interface through the mode switching button and configure the video content using the Speech Synthesis Markup Language (SSML) configuration interface; the user can confirm the configuration content of the target video through the save button. On the one hand, a virtual object video editing material library is provided, helping the user quickly and simply synthesize various material combinations into a target video through a visual and intuitive editing interface, enabling more personalized and professional video works; on the other hand, the technical solution can also generate voice material using the language editing unit: voice material can be generated automatically from simple text input, and the voice content can be modified, deleted and adjusted instantly, reducing tedious operations such as manually recording and editing audio. The generated voice material is highly consistent, avoiding the variability and errors that easily occur in manual recording, and automatic generation is much faster, which can greatly improve production efficiency and reduce video production costs.
In some optional embodiments, the material library unit includes a first navigation bar, a second navigation bar, and a material display area;
the first navigation bar is used for selecting the configuration materials;
the second navigation bar is used for selecting sub-materials of the configuration materials;
the material display area is used to display the sub-material, which may be added to the material editing unit.
Therefore, the materials of different types are placed in different navigation columns according to the classification, so that a user can find and configure the required materials more quickly, and the efficiency of configuring the materials is improved; the material display area can intuitively display the selected sub-materials, so that the selection and adjustment of a user are facilitated, and the visibility of the materials is improved; for users needing to combine different sub-materials, the interface provides greater freedom and flexibility, various combination modes can be rapidly realized, and the freedom of the users is enhanced; the interface is simple and easy to understand in design form, simple to operate and capable of reducing the learning cost of a user.
In some optional embodiments, the language editing unit includes a language configuration mode drop-down box for selecting to configure the language material with a speech synthesis mode or a real person audio mode;
When the speech synthesis mode is selected for configuring the language material, the language editing unit includes a language text input box and a language material editing column, wherein the language text input box is used for inputting the language material, and the language material editing column is used for editing configuration information of the language material, the configuration information of the language material including phonetic notation, pauses, speech speed, volume and tone; the language material editing column further includes an audition button, and the audition button is used for auditioning the language material;
when the real-person audio mode is selected for configuring the language material, the language editing unit includes a language material upload button and a language material adjusting subunit, wherein the language material upload button is used for uploading the language material, and the language material adjusting subunit is used for adjusting the language material.
Therefore, through the language editing unit, a user can automatically synthesize the language material (audio) of the target video from the input text, or upload real-person audio as the language material of the target video; two editing modes are thus provided, so that different types of voice material can be edited and configured more efficiently, improving editing efficiency. In the speech synthesis mode, the user can input the required language material through the language text input box and edit it, supporting flexible voice material editing; by configuring the phonetic notation, pauses, speech speed, volume and tone of the language material, the generated voice material is more natural, smooth and accurate, improving its quality. In the real-person audio mode, existing real-person audio material can be uploaded through the language material upload button and edited. The interface is simple and the operations are easy to understand, reducing the user's learning cost and difficulty of use.
In some optional embodiments, the material editing unit includes at least one material track, the material track including a track identification and a configuration track; the material track is used for editing sub-materials of the configuration materials executed by the virtual object.
Thus, the track identifier identifies the type or name of the material edited on the current track, such as actions, expressions, shots or scenes; the configuration track is used for configuring information such as the specific position and duration of the sub-materials on the track. The user can drag and drop different materials onto the corresponding material tracks and edit and adjust them in the configuration track, thereby configuring the sub-materials to be executed by the virtual object. On the one hand, placing different types of materials on different tracks realizes classified management, so that the required materials can be quickly found and edited, improving editing efficiency; on the other hand, multi-level material editing is supported: multiple sub-materials can be added to each track, multiple tracks can be superimposed, and tracks can be edited and adjusted, meeting the requirements of complex scenarios. Editing precision and flexibility are also improved, since attributes such as the position, duration and volume of materials can be controlled more finely through specific parameter adjustments in the configuration track. Finally, the simple and intuitive interface design makes the tool easier to learn and use, reducing learning cost and operation difficulty.
In some alternative embodiments, the process of editing and clipping the selected configuration materials using the material editing unit may be, for example: dragging and dropping one or more materials selected from the material library unit onto one or more clipping tracks of the material editing unit (the materials selected in the material library unit are, for example, the sub-materials "pronunciation", "action", "expression" and "character effect" corresponding to the configuration material "character"; the sub-materials "shot" and "space" corresponding to the configuration material "scene"; the sub-materials "uploaded material", "text", "image", "special effect", "image-text", "chart", "interactive group" and "programming module" corresponding to the configuration material "display"; and sub-materials such as "centered with background color", "white text left-aligned", "live-stream bullet comments" and "custom subtitles" corresponding to the configuration material "subtitle"). On the clipping track, the user can adjust the order and duration of the materials to switch and combine different materials, for example arranging multiple scene materials in chronological order or matching audio materials with video materials; the user can clip and adjust each material to finely control its content, for example adjusting the gesture or moving the position of the virtual object, or applying fade-in and fade-out processing to music material; the user can add special effects and transition effects to the materials, for example adding a filter effect to scene materials or a flickering effect to text materials. After editing is completed, the user can check the final effect through the preview function and enter the video synthesis module to synthesize the target video as required, so as to share or save it.
In a specific application, the language material editing column of the language editing unit is provided with a phonetic annotation button, a pause button, a speech speed input box, a volume button, a tone button and a listening test button.
The embodiment of the application does not limit the number of words that can be input in the language text input box. The phonetic annotation button is used for annotating the pronunciation of words in the pronunciation text; the pause button is used for inserting pause marks into the pronunciation text; the speech speed input box is used for setting the speech speed of the virtual object; the volume button is used for setting the volume of the virtual object; the tone button is used for setting the tone of the virtual object; and the listening test button is used for previewing the effect of the virtual object reading the whole pronunciation text.
When a target phrase in the pronunciation text is selected (box-selected), the language editing unit can present a pronunciation editing interface in response to the phonetic annotation button being clicked. The pronunciation editing interface is provided with a pronunciation input box and a pronunciation listening test button, and the user can input the pronunciation corresponding to the target phrase in the pronunciation input box.
After the cursor is placed at the position where a pause is needed, in response to the pause button being clicked, the language editing unit generates a single pause marker and a pause duration input box, and the user can edit the pause duration as needed. The unit of the pause duration is seconds.
In addition, when the mouse hovers over the speech speed input box, corresponding speed increase and decrease buttons appear on one side of the box, so that the user can quickly set the speech speed of the virtual object.
In response to the volume button being clicked, the language editing unit displays a volume progress bar and a volume value input box in a floating layer form; similarly, in response to the tone button being clicked, the language editing unit displays a tone progress bar and a tone value input box in a floating layer form.
For example, the user inputs the pronunciation text "Take a rest, let us listen to a piece of wonderful music" in the language text input box, annotates the phrase "music" with the pronunciation "yin1yue" using the phonetic annotation button, places the cursor after "Take a rest", clicks the pause button and sets the pause duration to 1 second, sets the speech speed to 1.25 using the speech speed input box, sets the volume to 1 using the volume button, and sets the tone to 1 using the tone button.
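The edits in this example map naturally onto the Speech Synthesis Markup Language interface mentioned later for the code mode: a phonetic annotation becomes a `<phoneme>` element, the pause becomes `<break>`, and speed/volume/tone become `<prosody>` attributes. The sketch below assembles such a snippet in W3C SSML style; the `x-pinyin` alphabet value and exact attribute choices are assumptions, since vendors differ and the patent does not specify its markup dialect.

```python
# Assemble an SSML-style snippet mirroring the example edits above:
# a phonetic annotation (<phoneme>), a 1-second pause (<break>),
# and speech rate / volume / pitch set via <prosody>.
phrase_a = "Take a rest,"
phrase_b = (
    'let us listen to a piece of wonderful '
    '<phoneme alphabet="x-pinyin" ph="yin1yue">music</phoneme>!'
)

ssml = (
    '<speak version="1.0">'
    '<prosody rate="1.25" volume="+0dB" pitch="+0st">'
    f'{phrase_a}<break time="1s"/>{phrase_b}'
    "</prosody>"
    "</speak>"
)
print(ssml)
```

A graphical language editing column like the one described can emit markup of this shape, so switching to the code mode configuration interface simply exposes the same settings as text.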
In some optional embodiments, language emotion information of the language material is identified through a language material identification module (a software program with a language material identification model), and based on the emotion information, corresponding expression material information is selected in a material library unit to be configured at a position corresponding to the language material in an expression material track of the virtual object, so that emotion expression of the virtual person is realized.
In some alternative embodiments, the input text is preprocessed by text cleaning, word segmentation and stop-word removal, or voice materials are first converted into text through speech recognition. Features are extracted from the preprocessed text using natural language processing, for example a bag-of-words model or word embeddings, and vectorized into vector form. An emotion classification model is built using a machine learning or deep learning algorithm to classify the emotion of the input text material and output an emotion label, identifying the emotion information it contains. Based on the emotion label, the corresponding virtual object expression material is matched in the material library, and the expression material configuration is applied to the video production of the virtual person. In a specific embodiment, when the identified emotion is sadness, dull colors, matching background music, and crying actions and sounds of the virtual person can be added to the video. The language material recognition model is trained and optimized for factors such as different languages and cultural differences, improving the accuracy and stability of emotion recognition and thereby realizing more accurate and effective emotion-driven video production.
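The classify-then-match step above can be illustrated with a toy stand-in: a keyword lexicon plays the role of the trained emotion classifier, and the resulting label keys a lookup of an expression material. Every name here (the lexicon, the material identifiers, `classify_emotion`) is illustrative, not part of the patent.

```python
# Toy stand-in for the emotion classification step: a keyword lexicon
# replaces the trained model, and the returned label selects an
# expression material for the virtual object. All names are illustrative.
EMOTION_LEXICON = {
    "sad": ["sad", "cry", "lonely", "miss"],
    "happy": ["happy", "wonderful", "great", "celebrate"],
}
EXPRESSION_MATERIALS = {
    "sad": "expression/teary_eyes",
    "happy": "expression/smile",
    "neutral": "expression/idle",
}

def classify_emotion(text: str) -> str:
    tokens = text.lower().split()  # trivial word segmentation
    scores = {
        label: sum(t.strip(".,!?") in words for t in tokens)
        for label, words in EMOTION_LEXICON.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"

label = classify_emotion("What a wonderful day, let us celebrate!")
print(label, EXPRESSION_MATERIALS[label])  # happy expression/smile
```

A real deployment would replace `classify_emotion` with the trained bag-of-words or embedding classifier; the surrounding plumbing — label in, expression material out, configured onto the expression material track — stays the same.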
Through the language material identification module, user requirements are automatically identified and content is generated to order, improving the efficiency and quality of video production. At the same time, the emotion information contained in the language material can be accurately identified and applied to the video production of the virtual person, and various expressions, actions, sounds and other capabilities can be automatically configured for the virtual person, creating rich and varied video content, improving the emotional expressiveness of the video, enhancing the viewing experience, and strengthening user participation and immersion.
The first display area is used for previewing the configuration content of the target video and is provided with a virtual object display frame, an action play button, a definition drop-down option box and a large screen window button. When the action play button is clicked, the virtual object display frame can display the effect of the virtual object reading the pronunciation text while performing standby actions and the corresponding expressions. The definition drop-down option box provides four options: fluent, standard definition, high definition and ultra high definition. After the large screen window button is clicked, the first display area can be displayed in the initial window or in a large screen window.
The material editing unit is provided with at least one material track, which may be a text material track, an action material track, an expression material track, a lens material track, a show material track, a scene material track or a score material track. A corresponding clear button is provided before each material track, so that the user can conveniently clear the corresponding effects one by one.
Taking the sub-material "action" of the configuration material "role" as an example, the sub-material "action" can be divided by type into "lecture", "guide", "etiquette" and "others", and an action preview button is arranged at the lower left corner of the preview image of each action effect; after the action preview button is clicked, a video of the action can be played. The etiquette actions may include "goodbye - waving both hands", "greeting - nodding", "bow" and the like. Taking the action material track as an example, after a sub-material of the action material is dragged onto the action material track, it fills a rectangular frame on the track; one end of the rectangular frame is provided with an action editing button, and when the action editing button is clicked, an action editing interface is displayed as a floating layer. The action editing interface is provided with an action replacement button, a speed adjustment progress bar, a speed value input box and a gaze lock slide switch; the user can replace the selected action by clicking the action replacement button, set the speed of the action through the speed adjustment progress bar and the speed value input box, and turn the gaze lock effect of the standby action on or off through the gaze lock slide switch.
For another example, the sub-material "scene" is, for example, the background culture wall of company X. After the sub-material "scene" is dragged onto the scene material track, it fills a rectangular frame on the track; one end of the rectangular frame is provided with a scene editing button, and when the scene editing button is clicked, a scene editing interface is displayed as a floating layer. The scene editing interface is provided with a scene replacement button, an animation effect drop-down option box, an animation duration adjustment progress bar and an animation duration value input box, so that the user can conveniently edit the animation effect and animation duration corresponding to the scene. Specifically, the animation effect drop-down option box provides drop-down options such as "fade-in and fade-out", "push", "erase", "split", "clock", "mosaic" and "dissolve".
For another example, the configuration material "show material" is display content attached in front of or behind the virtual object; a show material is, for example, a "schematic diagram of product X". After the schematic diagram of product X is dragged onto the show material track, it fills a rectangular frame on the track; one end of the rectangular frame is provided with a material editing button, and when the material editing button is clicked, a material editing interface is displayed as a floating layer. The material editing interface is provided with a material replacement button, a material position value input box, a material size value input box, a front/behind-character position single selection box, an animation effect drop-down option box, an animation duration adjustment progress bar and an animation duration value input box. Specifically, the animation effect drop-down option box provides drop-down options such as "fade-in and fade-out", "push", "erase", "split", "clock", "mosaic" and "dissolve". It should be noted that the size and position of the material can also be set directly in the first display area by dragging the material.
In some optional embodiments, the video preset module includes a second display area, a first video name text input box, a video size single selection box, a virtual object single selection box, a gesture single selection box, a pronunciation language drop-down box, a pronunciation timbre drop-down box, and an update button;
the second display area is used for previewing the configuration content of the target video;
the first video name text input box is used for inputting video name information of the target video;
the video size single selection box is used for configuring video size information of the target video;
the virtual object single selection box is used for configuring virtual object information of the target video;
the gesture single selection box is used for configuring gesture information of the virtual object;
the pronunciation language drop-down box is used for configuring the pronunciation language of the virtual object;
the pronunciation timbre drop-down box is used for configuring the pronunciation timbre information of the virtual object;
the update button is used for confirming the configuration information of the target video.
Therefore, by classifying different configuration information into different single selection boxes, the user can quickly find and edit it, conveniently meeting the requirements of various scenes and improving editing efficiency. Through the preview function of the second display area, the user can intuitively check the configured video content and thus better adjust the video preset parameters, improving the precision of the video preview. The simple, clear interface design and easy-to-understand single selection box layout reduce the user's learning cost and difficulty of use, improving the usability of the product.
The second display area is used for previewing the configuration content of the target video, where the configuration content includes the size of the target video and the image and gesture of the virtual object. The video size information is, for example, "720*1280", "1920*1080" or a custom video size; the virtual object information is, for example, a preset avatar such as "Ada", "mushroom sauce", "bear expansion" or "Jane"; the gesture information is, for example, "anchor standing posture", "customer service standing posture", "girl posture", "leisure standing posture", "live sitting posture" or "anchor sitting posture"; the pronunciation languages are, for example, "Chinese", "English", "Chinese-English" or "Russian"; the pronunciation timbre is, for example, "sweet young girl", "lovely young girl", "gentle adult girl" or "spirited girl".
In a specific embodiment, a product project technician of company S uses the video editor to make a video, for a virtual object interactive application, in which a virtual person explains a product. The technician inputs the video name "Product A description" through the video preset module of the video editor, selects the video size "1920*1080", selects the created virtual person "JING" as the virtual object for the video explanation, selects the gesture of the virtual object as "anchor standing posture", selects the pronunciation language of the virtual object as "Chinese-English", selects the pronunciation timbre as "sweet young girl", and clicks the update button to confirm the configuration information of the video.
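The configuration confirmed by the update button in this embodiment amounts to one record of preset values. As a hedged sketch (the field names and types are assumptions, not the patent's data model), it could be held like this:

```python
from dataclasses import dataclass

@dataclass
class VideoPreset:
    """Configuration gathered by the video preset module (field names assumed)."""
    name: str
    size: tuple      # (width, height) in pixels
    avatar: str
    posture: str
    language: str
    timbre: str

# The company-S example from the paragraph above:
preset = VideoPreset(
    name="Product A description",
    size=(1920, 1080),
    avatar="JING",
    posture="anchor standing posture",
    language="Chinese-English",
    timbre="sweet young girl",
)
print(f"{preset.name}: {preset.size[0]}x{preset.size[1]}, avatar {preset.avatar}")
```

Clicking the update button would then correspond to committing such a record so that the video editing and synthesis modules read from it.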
In some optional embodiments, the video composition module includes a third display area, a first play button, a second video title text input box, a video format drop-down box, a video quality drop-down box, a video preset drop-down box, and a video composition button;
the third display area is used for previewing the target video;
the first playing button is used for playing the target video in the third display area after being clicked;
the second video name text input box is used for modifying the video name information of the target video;
the video format drop-down box is used for configuring the video format of the target video;
the video quality drop-down frame is used for configuring frame rate information and code rate information of the target video;
the video preset drop-down frame is used for configuring preset information of the target video, and the preset information comprises a complete video, 1/2 video and 1/4 video;
the video composition button is used for composing the target video after being clicked, and the video composition button is updated to be a re-composition button after being clicked for the first time.
Therefore, through the various input boxes and drop-down boxes provided on the interface, the user can rapidly configure the information of the target video to be synthesized, improving the efficiency of video synthesis. The user can set the video format and quality according to the demands of different scenes, meeting video synthesis requirements in different scenarios. Through the preview function of the third display area, the user can intuitively check the content and effect of the synthesized video and thus better adjust the synthesis parameters. This reduces the user's operational difficulty and improves the usability of the product.
The video format is, for example, AVI, MP4, MOV, WMV, FLV or MKV.
The frame rate information is, for example, 24fps or 30fps; the code rate information is, for example, 1Mbps, 2Mbps or 4Mbps.
The preset information is, for example, a 1920*1920 full video, a 960*960 1/2 video, or a 480*480 1/4 video.
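Two small calculations follow directly from these settings: the code rate times the duration bounds the output file size, and the 1/2 and 1/4 presets scale each dimension of the full video by that factor. A minimal sketch (the function names are ours, and container overhead is ignored):

```python
# size_bytes ≈ bitrate (bits/s) × duration (s) / 8, ignoring container overhead.
def estimate_size_mb(bitrate_mbps: float, duration_s: float) -> float:
    return bitrate_mbps * 1_000_000 * duration_s / 8 / 1_000_000

# "1/2 video" and "1/4 video" scale each dimension of the full video.
def preset_resolution(full: tuple, factor: int) -> tuple:
    return (full[0] // factor, full[1] // factor)

print(round(estimate_size_mb(2, 60), 1))   # a 60 s video at 2 Mbps: 15.0 MB
print(preset_resolution((1920, 1920), 2))  # (960, 960)  — the 1/2 preset
print(preset_resolution((1920, 1920), 4))  # (480, 480)  — the 1/4 preset
```

This matches the example presets above: the 1920*1920 full video yields 960*960 and 480*480 for the 1/2 and 1/4 presets respectively.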
In some alternative embodiments, the video composition button is clicked to display a composed information unit, the composed information unit including an information display area, a second play button, a download video button, a download audio button, and a delete button;
the information display area is used for displaying notification information of the target video composition;
the second playing button is used for playing the target video;
the video downloading button is used for downloading video information of the target video;
the downloading audio button is used for downloading the audio information of the target video;
the delete button is used for deleting the target video.
Therefore, through providing a plurality of function buttons and an information display area, the operation difficulty of a user is reduced, and the use experience of the user is enhanced; the synthesized information unit is provided after the synthesis is completed, and a user is informed of the synthesis result, so that the reliability of synthesizing the target video is improved; supporting multiple file output operations, and enabling a user to select to download video information or audio information so as to meet requirements in different scenes; by deleting the button, the user can conveniently delete the unnecessary target video, thereby reducing the occupation of the storage space.
The notification information is, for example: "Video name: Product A description; resolution: 1920*1920; frame rate: 24fps; code rate: 1Mbps; video preset: 1/4 video 480*480; time: 15 seconds ago".
(video editing method)
Referring to fig. 2, fig. 2 is a flowchart of a video editing method according to an embodiment of the present application.
The embodiment of the application also provides a video editing method, the specific embodiment of which is consistent with the embodiment recorded in the video editor embodiment and the achieved technical effect, and part of the content is not repeated.
The video editing method utilizes the video editor to manufacture target videos of virtual objects of the virtual object interactive application, the video editor comprises a video preset module, a video editing module and a video synthesis module, and the video editing method can comprise the following steps:
step S101: configuring configuration information of a target video by utilizing the video presetting module, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information and pronunciation tone information;
step S102: configuring configuration materials of the target video by utilizing the video editing module, wherein the configuration materials comprise one or more of character materials, scene materials, display materials, subtitle materials and score materials;
Step S103: and previewing, synthesizing and downloading the target video by utilizing the video synthesizing module.
In some alternative embodiments, the video editor further comprises a content detection module;
step S201: when the video synthesis button is clicked, detecting whether the target video contains bad content or not by using the content detection module;
step S202: when the target video contains bad content, canceling synthesizing the target video, acquiring user information corresponding to the target video and carrying out bad mark on the user information; when the marked times of the user information are larger than the preset bad mark times, refusing to respond to the access request of the user information;
step S203: when the target video does not contain bad content, no operation is performed.
Therefore, by detecting whether the target video contains bad content, the propagation and influence of bad content are avoided and the safety of the platform is improved; by acquiring user information, applying bad marks, and setting a preset number of bad marks, malicious users can be restrained, illegal behavior can be effectively curbed, and the negative effects of spreading bad content are reduced; the management of the virtual interactive application is strengthened, and the probability of producing bad content is reduced.
In a specific embodiment, after the user has configured the target video and clicks the video synthesis button, the content detection module (carrying a content detection model) is called to detect whether the target video contains bad content. The content detection model is trained as follows: a large number of video materials labeled as containing bad content are preprocessed, a certain number of video frames are sampled (for example, 10 frames per second), feature extraction is performed on each frame, and the feature vector of the frame is output; taking the feature vector of each frame as input, the content detection model is trained with a multi-label classification algorithm. The trained content detection model is then used to detect the target video and obtain the label probability distribution of each frame; the label distribution probability of the whole video is obtained by averaging the label distributions of all frames. If the label distribution probability of the video reaches a set threshold, the video contains bad content. If the target video contains bad content, synthesizing the target video is cancelled, the user information used to synthesize the target video (such as the IP address and user account) is acquired and given a bad mark, the number of times the user has been marked is recorded, and that number is compared with the preset number of bad marks; if the number of marks reaches the preset value, the system refuses to respond to the user's access requests and adds the user to a blacklist. If the target video does not contain bad content, the user is allowed to continue with the video synthesis operation.
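The two decisions in this embodiment — thresholding the averaged frame probabilities, and counting bad marks against the preset limit — can be sketched as follows. The per-frame probabilities would come from the trained multi-label model; here they are made up, and the threshold and limit values are illustrative.

```python
# Sketch of the detection and strike logic; all constants are assumptions.
THRESHOLD = 0.5    # label distribution probability above which the video is bad
MAX_STRIKES = 3    # the preset number of bad marks

def video_is_objectionable(frame_probs: list) -> bool:
    # frame_probs: one probability per sampled frame for the "bad" label;
    # the whole-video score is the mean over all sampled frames.
    return sum(frame_probs) / len(frame_probs) >= THRESHOLD

strikes = {}  # user identifier (e.g. account or IP) -> number of bad marks

def record_strike(user_id: str) -> bool:
    """Apply a bad mark; return True once further requests should be refused."""
    strikes[user_id] = strikes.get(user_id, 0) + 1
    return strikes[user_id] > MAX_STRIKES

print(video_is_objectionable([0.9, 0.8, 0.7]))  # True: mean 0.8 >= 0.5
print(video_is_objectionable([0.1, 0.2, 0.0]))  # False: mean 0.1
```

On a `True` detection the synthesis would be cancelled and `record_strike` called; only when it returns `True` does the system start refusing the user's access requests and blacklist them.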
In some optional embodiments, the video editing module includes a material library unit, a first display area, a material editing unit, a language editing unit, a mode switching button, and a save button, and the video editing method may further include:
step S301: providing the configuration materials by utilizing the material library unit;
step S302: previewing configuration content of the target video by utilizing the first display area;
step S303: editing the configuration material by using the material editing unit;
step S304: providing language materials of the target video by using the language editing unit;
step S305: switching to a code mode configuration interface by using the mode switching button, wherein the code configuration interface comprises a voice synthesis markup language configuration interface;
step S306: and confirming the configuration content of the target video by using the save button.
In some optional embodiments, the material library unit includes a first navigation bar, a second navigation bar, and a material display area, and the video editing method may further include:
step S401: selecting the configuration material by using the first navigation bar;
step S402: selecting sub-materials of the configuration materials by using the second navigation bar;
Step S403: the sub-material is displayed using the material display area, and the sub-material may be added to the material editing unit.
In some optional embodiments, the language editing unit includes a language configuration mode drop-down box, the language configuration mode drop-down box is used to select to configure the language material using a speech synthesis mode or a real audio mode, and the video editing method may further include:
step S501: when the language material is configured by using the voice synthesis mode, the language editing unit comprises a language text input box and a language material editing column, the language material is input by using the language text input box, and configuration information of the language material is edited by using the language material editing column, wherein the configuration information of the language material comprises phonetic notations, pauses, speech speeds, volume and tones; the language material editing column further comprises a test listening button, and the language material is listened to by the test listening button;
step S502: when the language material is configured by using the real person audio mode, the language editing unit comprises a language material uploading button and a language material adjusting subunit, the language material is uploaded by using the language material uploading button, and the language material is adjusted by using the material adjusting subunit.
In some optional embodiments, the material editing unit includes at least one material track, the material track including a track identifier and a configuration track, and the video editing method may further include:
step S601: editing the sub-materials of the configuration materials executed by the virtual object by using the material track.
In some optional embodiments, the video preset module includes a second display area, a first video name text input box, a video size box, a virtual object box, a gesture box, a pronunciation language drop-down box, a pronunciation tone drop-down box, and an update button, and the video editing method may further include:
step S701: previewing the configuration content of the target video by utilizing the second display area;
step S702: inputting video name information of the target video by using the first video name text input box;
step S703: configuring video size information of the target video by utilizing the video size single selection frame;
step S704: configuring virtual object information of the target video by utilizing the virtual object single selection frame;
step S705: configuring gesture information of the virtual object by utilizing the gesture single selection frame;
Step S706: configuring the pronunciation language of the virtual object by utilizing the pronunciation language drop-down box;
step S707: configuring pronunciation tone information of the virtual object by using the pronunciation tone drop-down frame;
step S708: and confirming the configuration information of the target video by using the update button.
In some optional embodiments, the video composition module includes a third display area, a first play button, a second video name text input box, a video format drop-down box, a video quality drop-down box, a video preset drop-down box, and a video composition button, and the video editing method may further include:
step S801: previewing the target video by using the third display area;
step S802: playing the target video in the third display area after the first playing button is clicked;
step S803: modifying video name information of the target video by using the second video name text input box;
step S804: configuring a video format of the target video by utilizing the video format drop-down box;
step S805: configuring frame rate information and code rate information of the target video by utilizing the video quality drop-down frame;
Step S806: configuring preset information of the target video by utilizing the video preset drop-down frame, wherein the preset information comprises a complete video, a 1/2 video and a 1/4 video;
step S807: and synthesizing the target video after clicking the video synthesis button, wherein the video synthesis button is updated to be a re-synthesis button after clicking for the first time.
In some alternative embodiments, the video composition button is clicked to display a composed information unit, the composed information unit including an information display area, a second play button, a download video button, a download audio button, and a delete button, and the video editing method may further include:
step S901: displaying notification information of the target video composition by using the information display area;
step S902: playing the target video by using the second play button;
step S903: downloading video information of the target video by using the video downloading button;
step S904: downloading the audio information of the target video by utilizing the downloading audio button;
step S905: and deleting the target video by using the deleting button.
(device example)
The embodiment of the application provides an electronic device, the specific embodiment of which is consistent with the embodiment described in the embodiment of the video editor and the achieved technical effect, and part of the contents are not repeated.
The embodiment of the application provides electronic equipment, which is used for manufacturing a target video of a virtual object interactive application by utilizing the video editor, wherein the video editor comprises a video preset module, a video editing module and a video synthesis module;
the electronic device comprises a memory storing a computer program and at least one processor configured to implement the following steps when executing the computer program:
configuring configuration information of a target video by utilizing the video presetting module, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information and pronunciation tone information;
configuring configuration materials of the target video by utilizing the video editing module, wherein the configuration materials comprise one or more of character materials, scene materials, display materials, subtitle materials and score materials;
and previewing, synthesizing and downloading the target video by utilizing the video synthesizing module.
In some alternative embodiments, the video editor further comprises a content detection module, the at least one processor being further configured to implement the following steps when executing the computer program:
When the video synthesis button is clicked, detecting whether the target video contains bad content or not by using the content detection module;
when the target video contains bad content, canceling synthesizing the target video, acquiring user information corresponding to the target video and carrying out bad mark on the user information; when the marked times of the user information are larger than the preset bad mark times, refusing to respond to the access request of the user information;
when the target video does not contain bad content, no operation is performed.
In some alternative embodiments, the video editing module includes a material library unit, a first display area, a material editing unit, a language editing unit, a mode switch button, and a save button, the at least one processor being further configured to implement the following steps when executing the computer program:
providing the configuration materials by utilizing the material library unit;
previewing configuration content of the target video by utilizing the first display area;
editing the configuration material by using the material editing unit;
providing language materials of the target video by using the language editing unit;
Switching to a code mode configuration interface by using the mode switching button, wherein the code configuration interface comprises a voice synthesis markup language configuration interface;
and confirming the configuration content of the target video by using the save button.
In some alternative embodiments, the material gallery unit includes a first navigation bar, a second navigation bar, and a material display area, the at least one processor being further configured to implement the following steps when executing the computer program:
selecting the configuration material by using the first navigation bar;
selecting sub-materials of the configuration materials by using the second navigation bar;
the sub-material is displayed using the material display area, and the sub-material may be added to the material editing unit.
In some alternative embodiments, the language editing unit comprises a language configuration mode drop-down box with which to choose to configure the language material with a speech synthesis mode or a real audio mode, the at least one processor being further configured to implement the following steps when executing the computer program:
when the language material is configured by using the speech synthesis mode, the language editing unit comprises a language text input box and a language material editing column; the language material is input by using the language text input box, and configuration information of the language material is edited by using the language material editing column, wherein the configuration information of the language material comprises phonetic notation, pauses, speech rate, volume, and pitch; the language material editing column further comprises a trial listening button, and the language material is auditioned by using the trial listening button;
when the language material is configured by using the real-person audio mode, the language editing unit comprises a language material uploading button and a language material adjusting subunit; the language material is uploaded by using the language material uploading button and adjusted by using the language material adjusting subunit.
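The code mode exposes a speech synthesis markup language (SSML) configuration interface, and standard SSML covers the items listed above: phonetic notation via `<phoneme>`, pauses via `<break>`, and speech rate, volume, and pitch via `<prosody>`. A sketch of how the editing-column fields might serialize to SSML follows; the function and parameter names are assumptions:

```python
def to_ssml(text: str, rate: str = "medium", volume: str = "medium",
            pitch: str = "medium", pause_ms: int = 0) -> str:
    # Map the editing-column fields onto standard SSML elements:
    # speech rate, volume, and pitch -> <prosody>; a pause -> <break>.
    body = (f'<prosody rate="{rate}" volume="{volume}" pitch="{pitch}">'
            f'{text}</prosody>')
    if pause_ms:
        body += f'<break time="{pause_ms}ms"/>'
    return f'<speak>{body}</speak>'
```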
In some alternative embodiments, the material editing unit comprises at least one material track comprising a track identification and a configuration track, the at least one processor being further configured to implement the following steps when executing the computer program:
editing, by using the material track, the sub-materials of the configuration materials that are performed by the virtual object.
In some optional embodiments, the video preset module includes a second display area, a first video name text input box, a video size radio box, a virtual object radio box, a gesture radio box, a pronunciation language drop-down box, a pronunciation timbre drop-down box, and an update button, the at least one processor being further configured to implement the following steps when executing the computer program:
previewing the configuration content of the target video by utilizing the second display area;
inputting video name information of the target video by using the first video name text input box;
configuring video size information of the target video by using the video size radio box;
configuring virtual object information of the target video by using the virtual object radio box;
configuring gesture information of the virtual object by using the gesture radio box;
configuring the pronunciation language of the virtual object by using the pronunciation language drop-down box;
configuring pronunciation timbre information of the virtual object by using the pronunciation timbre drop-down box;
and confirming the configuration information of the target video by using the update button.
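The configuration information gathered by the video preset module above might be held in a single record that the update button confirms. The dataclass, field names, and allowed size values below are illustrative assumptions:

```python
# Sketch of the video preset module's configuration information;
# the dataclass and the allowed video sizes are assumptions.
from dataclasses import dataclass

VIDEO_SIZES = ("1920x1080", "1080x1920")   # assumed radio-box options

@dataclass
class VideoPreset:
    video_name: str          # first video-name text input box
    video_size: str          # video size radio box
    virtual_object: str      # virtual object radio box
    gesture: str             # gesture radio box
    language: str            # pronunciation-language drop-down box
    timbre: str              # pronunciation-timbre drop-down box

    def update(self) -> dict:
        # The update button confirms (validates and commits) the configuration.
        if self.video_size not in VIDEO_SIZES:
            raise ValueError(f"unsupported video size: {self.video_size}")
        return self.__dict__.copy()
```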
In some alternative embodiments, the video synthesis module includes a third display area, a first play button, a second video name text input box, a video format drop-down box, a video quality drop-down box, a video preset drop-down box, and a video synthesis button, the at least one processor being further configured to implement the following steps when executing the computer program:
previewing the target video by using the third display area;
playing the target video in the third display area after the first playing button is clicked;
modifying video name information of the target video by using the second video name text input box;
configuring the video format of the target video by using the video format drop-down box;
configuring frame rate information and bit rate information of the target video by using the video quality drop-down box;
configuring preset information of the target video by using the video preset drop-down box, wherein the preset information comprises full video, 1/2 video, and 1/4 video;
and synthesizing the target video after the video synthesis button is clicked, wherein the video synthesis button is updated to a re-synthesis button after being clicked for the first time.
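The disclosure does not name a synthesis backend, but the drop-down selections above map naturally onto encoder arguments. Assuming an ffmpeg-style encoder, the quality drop-down (frame rate plus bit rate) and the preset drop-down (full, 1/2, or 1/4 video, taken here to mean rendering only the leading fraction for cheap previews) might translate as follows; the quality table and the interpretation of the presets are assumptions:

```python
# Hypothetical mapping from drop-down selections to encoder arguments;
# the quality table and ffmpeg-style flags are assumptions.

QUALITY = {                      # drop-down value -> (frame rate, bit rate)
    "high":   (30, "8000k"),
    "medium": (25, "4000k"),
}
PRESET_FRACTION = {"full": 1.0, "1/2": 0.5, "1/4": 0.25}

def synthesis_args(fmt: str, quality: str, preset: str, duration_s: float) -> list[str]:
    fps, bitrate = QUALITY[quality]
    # A 1/2 or 1/4 preset synthesizes only the leading fraction of the video.
    clip = duration_s * PRESET_FRACTION[preset]
    return ["-r", str(fps), "-b:v", bitrate, "-t", f"{clip:.1f}", f"out.{fmt}"]
```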
In some alternative embodiments, after the video synthesis button is clicked, a synthesized-information unit is displayed, the synthesized-information unit including an information display area, a second play button, a download video button, a download audio button, and a delete button, the at least one processor being further configured to implement the following steps when executing the computer program:
displaying notification information of the target video synthesis by using the information display area;
playing the target video by using the second play button;
downloading video information of the target video by using the download video button;
downloading audio information of the target video by using the download audio button;
and deleting the target video by using the delete button.
Referring to fig. 3, fig. 3 shows a block diagram of an electronic device according to an embodiment of the present application.
The electronic device 10 may, for example, comprise at least one memory 11, at least one processor 12, and a bus 13 connecting the different platform systems.
Memory 11 may include readable media in the form of volatile memory, such as Random Access Memory (RAM) 111 and/or cache memory 112, and may further include Read Only Memory (ROM) 113.
The memory 11 also stores a computer program executable by the processor 12 to cause the processor 12 to implement the steps of any of the methods described above.
The memory 11 may also include a utility 114 having at least one program module 115. Such program modules 115 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
Accordingly, the processor 12 may execute the computer programs described above, as well as may execute the utility 114.
The processor 12 may employ one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components.
The bus 13 may represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor bus, or a local bus using any of a variety of bus architectures.
The electronic device 10 may also communicate with one or more external devices such as a keyboard, pointing device, bluetooth device, etc., as well as one or more devices capable of interacting with the electronic device 10 and/or with any device (e.g., router, modem, etc.) that enables the electronic device 10 to communicate with one or more other computing devices. Such communication may be via the input-output interface 14. Also, the electronic device 10 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through a network adapter 15. The network adapter 15 may communicate with other modules of the electronic device 10 via the bus 13. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with the electronic device 10 in actual applications, including, but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage platforms, and the like.
(Medium examples)
An embodiment of the present application also provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the steps of any one of the methods above or the functions of any one of the devices above. The specific embodiments are consistent with those described in the method embodiments and achieve the same technical effects; some content is not repeated here.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a program product according to an embodiment of the present application.
The program product is used to implement any of the methods described above. The program product may take the form of a portable compact disc read-only memory (CD-ROM) comprising program code, and may be run on a terminal device such as a personal computer. However, the program product of the present application is not limited thereto; in the embodiments of the present application, the readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, as well as conventional procedural programming languages such as the C programming language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
The present application has been described above in terms of its purpose, performance, advancement, and novelty, and thus satisfies the requirements of functional enhancement and use emphasized by the patent statutes. The foregoing description and drawings are, however, only preferred embodiments of the present application; therefore, all equivalent constructions, devices, and features made in accordance with the substance of the present application shall fall within its scope of protection.
Claims (12)
1. A video editor for producing a target video of a virtual object for a virtual object interactive application, the video editor comprising:
the video preset module is used for configuring configuration information of the target video, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information, and pronunciation timbre information;
the video editing module is used for configuring configuration materials of the target video, wherein the configuration materials comprise one or more of role materials, scene materials, display materials, subtitle materials and score materials;
and the video synthesis module is used for previewing, synthesizing, and downloading the target video.
2. The video editor of claim 1, wherein the video editing module comprises a material library unit, a first display area, a material editing unit, a language editing unit, a mode switch button, and a save button;
the material library unit is used for providing the configuration materials;
the first display area is used for previewing the configuration content of the target video;
the material editing unit is used for editing the configuration material;
the language editing unit is used for providing language materials of the target video;
the mode switching button is used for switching to a code mode configuration interface, wherein the code mode configuration interface comprises a speech synthesis markup language configuration interface;
the save button is used for confirming the configuration content of the target video.
3. The video editor of claim 2, wherein the material library unit comprises a first navigation bar, a second navigation bar, and a material display area;
the first navigation bar is used for selecting the configuration materials;
the second navigation bar is used for selecting sub-materials of the configuration materials;
the material display area is used for displaying the sub-materials, which may be added to the material editing unit.
4. The video editor of claim 2, wherein the language editing unit comprises a language configuration mode drop-down box for choosing whether to configure the language material with a speech synthesis mode or a real-person audio mode;
when the speech synthesis mode is used to configure the language material, the language editing unit comprises a language text input box and a language material editing column, wherein the language text input box is used for inputting the language material, and the language material editing column is used for editing configuration information of the language material, the configuration information of the language material comprising phonetic notation, pauses, speech rate, volume, and pitch; the language material editing column further comprises a trial listening button, and the trial listening button is used for auditioning the language material;
when the real-person audio mode is used to configure the language material, the language editing unit comprises a language material uploading button and a language material adjusting subunit, wherein the language material uploading button is used for uploading the language material, and the language material adjusting subunit is used for adjusting the language material.
5. The video editor of claim 2, wherein the material editing unit comprises at least one material track comprising a track identification and a configuration track; the material track is used for editing the sub-materials of the configuration materials that are performed by the virtual object.
6. The video editor of claim 1, wherein the video preset module comprises a second display area, a first video name text input box, a video size radio box, a virtual object radio box, a gesture radio box, a pronunciation language drop-down box, a pronunciation timbre drop-down box, and an update button;
the second display area is used for previewing the configuration content of the target video;
the first video name text input box is used for inputting video name information of the target video;
the video size radio box is used for configuring video size information of the target video;
the virtual object radio box is used for configuring virtual object information of the target video;
the gesture radio box is used for configuring gesture information of the virtual object;
the pronunciation language drop-down box is used for configuring the pronunciation language of the virtual object;
the pronunciation timbre drop-down box is used for configuring pronunciation timbre information of the virtual object;
the update button is used for confirming the configuration information of the target video.
7. The video editor of claim 1, wherein the video synthesis module comprises a third display area, a first play button, a second video name text input box, a video format drop-down box, a video quality drop-down box, a video preset drop-down box, and a video synthesis button;
the third display area is used for previewing the target video;
the first playing button is used for playing the target video in the third display area after being clicked;
the second video name text input box is used for modifying the video name information of the target video;
the video format drop-down box is used for configuring the video format of the target video;
the video quality drop-down box is used for configuring frame rate information and bit rate information of the target video;
the video preset drop-down box is used for configuring preset information of the target video, the preset information comprising full video, 1/2 video, and 1/4 video;
the video synthesis button is used for synthesizing the target video after being clicked, and is updated to a re-synthesis button after being clicked for the first time.
8. The video editor of claim 7, wherein after the video synthesis button is clicked, a synthesized-information unit is displayed, the synthesized-information unit comprising an information display area, a second play button, a download video button, a download audio button, and a delete button;
the information display area is used for displaying notification information of the target video synthesis;
the second play button is used for playing the target video;
the download video button is used for downloading video information of the target video;
the download audio button is used for downloading audio information of the target video;
the delete button is used for deleting the target video.
9. A video editing method, characterized in that the video editing method uses the video editor according to any one of claims 1-8 to produce a target video of a virtual object interactive application, the video editor comprising a video preset module, a video editing module, and a video synthesis module, the video editing method comprising:
configuring configuration information of the target video by using the video preset module, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information, and pronunciation timbre information;
configuring configuration materials of the target video by using the video editing module;
and previewing, synthesizing, and downloading the target video by using the video synthesis module.
10. The video editing method of claim 9, wherein the video editor further comprises a content detection module;
when the video synthesis button is clicked, detecting, by using the content detection module, whether the target video contains bad content;
when the target video contains bad content, canceling synthesis of the target video, acquiring the user information corresponding to the target video, and applying a bad-content mark to the user information; when the number of times the user information has been marked exceeds a preset bad-mark count, refusing to respond to access requests associated with the user information;
when the target video does not contain bad content, performing no operation.
11. An electronic device, characterized in that the electronic device is configured to produce a target video of a virtual object interactive application by using the video editor of any one of claims 1-8, the video editor comprising a video preset module, a video editing module, and a video synthesis module;
the electronic device comprises a memory storing a computer program and at least one processor configured to implement the following steps when executing the computer program:
configuring configuration information of the target video by using the video preset module, wherein the configuration information comprises one or more of video name information, video size information, virtual object information, gesture information, pronunciation language information, and pronunciation timbre information;
configuring configuration materials of the target video by using the video editing module;
and previewing, synthesizing, and downloading the target video by using the video synthesis module.
12. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by at least one processor, implements the steps of the method of claim 9 or 10 or implements the functions of the electronic device of claim 11.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310564949.3A CN116847168A (en) | 2023-05-18 | 2023-05-18 | Video editor, video editing method and related device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116847168A true CN116847168A (en) | 2023-10-03 |
Family
ID=88162437
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310564949.3A Pending CN116847168A (en) | 2023-05-18 | 2023-05-18 | Video editor, video editing method and related device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116847168A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119172613A (en) * | 2024-09-25 | 2024-12-20 | 北京爱奇艺科技有限公司 | Video generation method, device, electronic device and computer readable medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||