
CN108900886A - Intelligent dubbing generation and synchronization method for hand-drawn videos - Google Patents

Intelligent dubbing generation and synchronization method for hand-drawn videos

Info

Publication number
CN108900886A
CN108900886A (application CN201810788821.4A)
Authority
CN
China
Prior art keywords
data
dubbing
generation
intelligent
hand-drawn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810788821.4A
Other languages
Chinese (zh)
Inventor
魏博
邵猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Qianhai Hand Painted Technology and Culture Co Ltd
Original Assignee
Shenzhen Qianhai Hand Painted Technology and Culture Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Qianhai Hand Painted Technology and Culture Co Ltd
Priority to CN201810788821.4A
Publication of CN108900886A
Legal status: Pending


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)

Abstract

The present invention provides an intelligent dubbing generation and synchronization method for hand-drawn videos, characterized in that the method comprises: S10: acquiring dubbing script data; S20: normalizing the acquired dubbing script data; S30: obtaining the feature data of the dubbing script data; S40: generating intelligent dubbing data; S50: synchronizing the generated intelligent dubbing data into the target video scene; S60: fine-tuning the synchronized intelligent dubbing data and video scene. Based on speech models trained on large amounts of voice data of specified types, the user can select a voice of a given type and use it to dub a target video.

Description

Intelligent dubbing generation and synchronization method for hand-drawn videos
Technical field
The present invention relates to the field of intelligent dubbing for hand-drawn videos, and in particular to an intelligent dubbing generation and synchronization method for hand-drawn videos.
Background technique
In the production of short hand-drawn videos, sound is an important component: sound and hand-drawn animation together make up the short video.
There are currently three main techniques for producing the sound in short hand-drawn videos: 1. simply adding background music; 2. having professional voice actors dub according to the video content; 3. using speech synthesis to generate speech for a specified text. Each of these three techniques has significant problems and shortcomings.
In the first approach, simply adding background music, the music cannot be synchronized with the picture content and cannot narrate or explain what is on screen; it merely gives the hand-drawn video some sound, so neither the fit between sound and picture nor the overall quality of the video can be guaranteed.
In the second approach, professional dubbing, a professional voice actor must produce a dubbing script, record the dub with professional equipment according to the video content, and then composite the dub into the video using professional editing tools. The drawbacks of this approach are the cost of communicating with the voice actor, the monetary cost, and the time cost of the entire dubbing and compositing process; moreover, whenever the video content changes, the whole process must be repeated. This is difficult for ordinary users, and even for professional users.
In the third approach, speech synthesis, the synthesized voice sounds machine-generated: its speech rate, intonation, tone handling, and fluency are all far from those of a real human voice, so high-quality dubbing for hand-drawn videos cannot be achieved.
In summary, no existing technique can adequately produce the sound in hand-drawn videos.
Summary of the invention
In view of this, the present invention provides an intelligent dubbing generation and synchronization method for hand-drawn videos. Speech models are trained on large amounts of voice data of specified types, and the user can select a voice of a given type to dub a target video. Each segment of the dubbing data is fine-tuned and synchronized according to the duration of each scene in the target short video; crossfade processing is then applied at the pauses of each segment, and where multiple voices overlap at the same time point a suitable volume adjustment is made, finally achieving intelligent dubbing generation and synchronization.
The present invention provides an intelligent dubbing generation and synchronization method for hand-drawn videos, characterized in that the method comprises:
S10: acquiring dubbing script data and storing it in a dubbing script database;
S20: normalizing the acquired dubbing script data;
S30: obtaining the feature data of the dubbing script data and storing it in a dubbing script feature database;
S40: generating intelligent dubbing data;
S50: synchronizing the generated intelligent dubbing data into the target video scene;
S60: fine-tuning the synchronized intelligent dubbing data and video scene.
Preferably, step S40 comprises:
S401: acquiring different types of voice data;
S402: performing per-type voice feature extraction on the acquired voice data;
S403: training per-type voice models on the extracted features to generate a speech algorithm model;
S404: synthesizing the speech for the corresponding dubbing script with a speech regeneration method, according to the voice data of each type in the speech algorithm model and that type's voice feature data, adjusting the speech rate and the corresponding pause points to produce the dubbing data.
Preferably, after step S40 there is also a step S41:
when a special dubbing script is involved, performing special processing on the special dubbing script data.
Preferably, the dubbing script data includes audio text information, pause point information, speech rate information, and voice data type information.
Preferably, step S60 comprises:
S601: when the video scene switches, applying crossfade processing to the generated dubbing data;
S602: further fine-tuning the dubbing data and the video scene to keep them synchronized.
The advantages and positive effects of the invention are as follows. Features such as voiceprint, frequency, intonation, and tone are extracted from large amounts of voice data of specified types, and speech models are trained on them; the same feature extraction and model training can also be applied to voice data recorded by the user. Thus, from a dubbing script entered by the user, synthesis in any of the specified voice types, or even in the user's own voice, can be completed. The user can select a voice of a given type to dub a target video. Each segment of the dubbing data is fine-tuned and synchronized according to the duration of each scene in the target short video; crossfade processing is then applied at the pauses of each segment, and where multiple voices overlap at the same time point a suitable volume adjustment is made, finally achieving intelligent dubbing generation and synchronization.
Detailed description of the invention
Fig. 1 is a flowchart of the intelligent dubbing generation and synchronization method of the first embodiment of the invention;
Fig. 2 is a flowchart of the intelligent dubbing generation and synchronization method of the second embodiment of the invention;
Fig. 3 is a flowchart of generating intelligent dubbing data according to an embodiment of the invention;
Fig. 4 is a flowchart of fine-tuning the synchronized intelligent dubbing data and video scene according to an embodiment of the invention;
Fig. 5 is a functional diagram of the intelligent dubbing generation and synchronization engine of an embodiment of the invention.
Specific embodiment
For a better understanding of the present invention, the invention is further described below with specific embodiments in conjunction with the drawings.
An intelligent dubbing generation and synchronization method for hand-drawn videos according to the present invention comprises:
S10: acquiring dubbing script data;
S20: normalizing the acquired dubbing script data;
S30: obtaining the feature data of the dubbing script data;
S40: generating intelligent dubbing data;
S50: synchronizing the generated intelligent dubbing data into the target video scene;
S60: fine-tuning the synchronized intelligent dubbing data and video scene.
In one embodiment of the invention, a dubbing script database VoiceScriptData is provided, in which a large amount of dubbing script data is stored; each dubbing script entry includes audio text information, pause point information, speech rate information, voice data type information, and the like. The user can extract the required dubbing script data from the database VoiceScriptData as needed, after which the acquired script data is normalized with the normalization tool normalizeVoiceScriptData(). Specifically, this mainly means filtering forbidden and unpronounceable characters out of the script text and, according to the speech rate, the scene distribution of the target video, and the video length, adjusting the text length so that the dubbing script data can serve as normalized input for the subsequent algorithms.
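The normalization step can be sketched as follows. This is a minimal illustration only: the patent names normalizeVoiceScriptData() but gives no implementation, so the character whitelist and the characters-per-second figure used here are assumptions.

```python
import re

def normalize_voice_script_data(text, scene_duration_s, chars_per_second=4.0):
    """Minimal sketch of the normalizeVoiceScriptData() tool described in the
    patent: filter forbidden/unpronounceable characters, then trim the text so
    it fits the target scene duration at the given speech rate. The character
    whitelist and the chars-per-second figure are illustrative assumptions."""
    # Keep only word characters (incl. CJK) and basic pause punctuation.
    cleaned = re.sub(r"[^\w\u4e00-\u9fff,.!?;: ]", "", text)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Trim to the number of characters that fit the scene at this speech rate.
    max_chars = int(scene_duration_s * chars_per_second)
    return cleaned[:max_chars]
```

Symbols and markup are stripped, and overlong text is cut to the scene's capacity, so downstream synthesis receives only pronounceable, scene-sized input.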
Further, the feature data of the dubbing script, including voice type, speech rate, and pause point data, is obtained with the dubbing script feature extraction tool getVoiceScriptFeatureData().
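The extraction step can be sketched as follows; the dict layout of a script entry is an assumption, since the patent only lists the fields (audio text, pause points, speech rate, voice type) without a concrete schema.

```python
def get_voice_script_feature_data(script):
    """Minimal sketch of getVoiceScriptFeatureData(): pull the voice type,
    speech rate, and pause points out of a dubbing-script entry. The entry is
    assumed to be a dict; default values are illustrative assumptions."""
    return {
        "voice_type": script.get("voice_type", "default"),
        "speech_rate": script.get("speech_rate", 1.0),
        # Pause points: character offsets in the audio text where the
        # synthesized voice should pause.
        "pause_points": sorted(script.get("pause_points", [])),
    }
```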
Further, intelligent dubbing data is generated. In one embodiment of the invention, the tool geneAIVoiceData() for generating intelligent dubbing data comprises:
S401: acquiring different types of voice data;
S402: performing per-type voice feature extraction on the acquired voice data;
S403: training per-type voice models on the extracted features to generate a speech algorithm model AIVoiceModel;
S404: synthesizing the speech for the corresponding dubbing script with a speech regeneration method, according to the voice data of each type in the speech algorithm model AIVoiceModel and that type's voice feature data, adjusting the speech rate and the corresponding pause points to produce the dubbing data.
The present invention extracts features such as voiceprint, frequency, intonation, and tone from large amounts of voice data of specified types and trains speech models on them; the same feature extraction and model training can also be applied to voice data recorded by the user. Thus, from a dubbing script entered by the user, synthesis in any of the specified voice types, or even in the user's own voice, can be completed. The user can select a voice of a given type to dub a target video.
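The pause-point and speech-rate handling of step S404 can be sketched as follows. The actual AIVoiceModel synthesis backend is not shown (the patent does not specify one); the per-character duration figure and the pause length are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    text: str
    start_s: float   # where this piece of speech begins on the dub timeline
    rate: float      # speech-rate multiplier applied during synthesis

def gene_ai_voice_data(text, pause_points, speech_rate, pause_s=0.5,
                       seconds_per_char=0.25):
    """Sketch of step S404 of geneAIVoiceData(): split the script text at its
    pause points and lay the resulting segments out on a timeline, inserting
    a pause between them. Segment durations here come from a fixed
    seconds-per-char figure divided by the speech rate (an assumption)."""
    # Split the text at the pause-point character offsets.
    cuts = [0] + sorted(pause_points) + [len(text)]
    pieces = [text[a:b] for a, b in zip(cuts, cuts[1:]) if text[a:b]]
    segments, clock = [], 0.0
    for piece in pieces:
        segments.append(Segment(piece, clock, speech_rate))
        clock += len(piece) * seconds_per_char / speech_rate + pause_s
    return segments
```

A real implementation would hand each Segment to the trained voice model for synthesis; the timeline produced here is what the later synchronization step adjusts against the video scenes.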
Further, when a special dubbing script is involved, the special dubbing script data receives special processing; specifically, special dubbing audio is generated from the special dubbing script.
Further, after the dubbing data has been produced from the dubbing script data, the generated dubbing data is combined with the target short-video scene by the dubbing/video synchronization tool syncVoiceWithVideo(), achieving audio-picture synchronization. The concrete procedure first compares the durations of the dubbing data and of the target short video, then aligns the two.
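The duration comparison and alignment can be sketched as follows; uniform time-stretching is an assumption, as the patent only states that the two durations are compared and aligned.

```python
def sync_voice_with_video(dub_duration_s, scene_durations_s):
    """Sketch of syncVoiceWithVideo(): compare the dub duration with the total
    video duration and allocate the dub across the scenes proportionally.
    Returns, per scene, how many seconds of dub audio that scene receives
    after the dub is uniformly stretched to fit the video (an assumption)."""
    video_duration_s = sum(scene_durations_s)
    if dub_duration_s <= 0:
        raise ValueError("dub duration must be positive")
    stretch = video_duration_s / dub_duration_s
    # Each scene's share of the dub, after stretching the dub to fit.
    return [d / stretch for d in scene_durations_s]
```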
Further, the synchronized intelligent dubbing data and video scene are fine-tuned, which specifically includes:
S601: when the video scene switches, applying crossfade processing to the generated dubbing data;
S602: further fine-tuning the dubbing data and the video scene to keep them synchronized.
Each segment of the dubbing data is fine-tuned and synchronized according to the duration of each scene in the target short video; crossfade processing is then applied at the pauses of each segment, and where multiple voices overlap at the same time point a suitable volume adjustment is made, finally achieving intelligent dubbing generation and synchronization.
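The fade processing at segment boundaries and the volume adjustment for overlapping voices can be sketched as follows. The linear fade curve and the equal-weight mixing rule are assumptions; the patent prescribes neither.

```python
def apply_crossfade(samples, fade_len):
    """Sketch of step S601's gradual-change (fade) processing at segment
    boundaries: linearly ramp the first and last fade_len samples of a dub
    segment. samples is a list of float amplitudes; the linear ramp is an
    assumption, as the patent does not specify the fade curve."""
    out = list(samples)
    n = min(fade_len, len(out))
    for i in range(n):
        gain = (i + 1) / n
        out[i] *= gain               # fade in at the segment start
        out[-(i + 1)] *= gain        # fade out at the segment end
    return out

def mix_overlapping_voices(tracks):
    """Where several voices overlap at the same time point, scale each track
    by 1/number-of-tracks before summing -- one simple form of the patent's
    'reasonable volume adjustment' (the exact rule is not specified)."""
    if not tracks:
        return []
    n = len(tracks)
    return [sum(vals) / n for vals in zip(*tracks)]
```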
It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. In the absence of further limitation, an element qualified by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises it.
Finally, it should be noted that the above embodiments are merely examples for clearly illustrating the present invention and are not a limitation on its implementation. Those of ordinary skill in the art can make various other changes or variations on the basis of the above description; there is neither need nor possibility to exhaust all implementations here. Any obvious changes or variations derived therefrom remain within the protection scope of the present invention.

Claims (5)

1. An intelligent dubbing generation and synchronization method for hand-drawn videos, characterized in that the method comprises:
S10: acquiring dubbing script data and storing it in a dubbing script database;
S20: normalizing the acquired dubbing script data;
S30: obtaining the feature data of the dubbing script data and storing it in a dubbing script feature database;
S40: generating intelligent dubbing data;
S50: synchronizing the generated intelligent dubbing data into the target video scene;
S60: fine-tuning the synchronized intelligent dubbing data and video scene.
2. The intelligent dubbing generation and synchronization method for hand-drawn videos according to claim 1, characterized in that step S40 comprises:
S401: acquiring different types of voice data;
S402: performing per-type voice feature extraction on the acquired voice data;
S403: training per-type voice models on the extracted features to generate a speech algorithm model;
S404: synthesizing the speech for the corresponding dubbing script with a speech regeneration method, according to the voice data of each type in the speech algorithm model and that type's voice feature data, adjusting the speech rate and the corresponding pause points to produce the dubbing data.
3. The intelligent dubbing generation and synchronization method for hand-drawn videos according to claim 1 or 2, characterized in that after step S40 there is also a step S41:
when a special dubbing script is involved, performing special processing on the special dubbing script data.
4. The intelligent dubbing generation and synchronization method for hand-drawn videos according to claim 3, characterized in that the dubbing script data includes audio text information, pause point information, speech rate information, and voice data type information.
5. The intelligent dubbing generation and synchronization method for hand-drawn videos according to claim 4, characterized in that step S60 comprises:
S601: when the video scene switches, applying crossfade processing to the generated dubbing data;
S602: further fine-tuning the dubbing data and the video scene to keep them synchronized.
CN201810788821.4A 2018-07-18 2018-07-18 Intelligent dubbing generation and synchronization method for hand-drawn videos Pending CN108900886A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810788821.4A CN108900886A (en) 2018-07-18 2018-07-18 Intelligent dubbing generation and synchronization method for hand-drawn videos


Publications (1)

Publication Number Publication Date
CN108900886A (en) 2018-11-27

Family

ID=64350849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810788821.4A Pending CN108900886A (en) 2018-07-18 2018-07-18 Intelligent dubbing generation and synchronization method for hand-drawn videos

Country Status (1)

Country Link
CN (1) CN108900886A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111031386A (en) * 2019-12-17 2020-04-17 腾讯科技(深圳)有限公司 Video dubbing method and device based on voice synthesis, computer equipment and medium
CN111866582A (en) * 2019-04-26 2020-10-30 广州声活圈信息科技有限公司 Deduction for user matching opponent game and deduction synthesis method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020193994A1 (en) * 2001-03-30 2002-12-19 Nicholas Kibre Text selection and recording by feedback and adaptation for development of personalized text-to-speech systems
EP1701527A1 (en) * 2005-03-10 2006-09-13 Avaya Technology Llc Graphical menu generation in interactive voice response systems
CN101178896A (en) * 2007-12-06 2008-05-14 安徽科大讯飞信息科技股份有限公司 Unit selection voice synthetic method based on acoustics statistical model
CN101359473A (en) * 2007-07-30 2009-02-04 国际商业机器公司 Auto speech conversion method and apparatus
CN105118498A (en) * 2015-09-06 2015-12-02 百度在线网络技术(北京)有限公司 Training method and apparatus of speech synthesis model
CN106531148A (en) * 2016-10-24 2017-03-22 咪咕数字传媒有限公司 Cartoon dubbing method and apparatus based on voice synthesis
CN107172449A (en) * 2017-06-19 2017-09-15 微鲸科技有限公司 Multi-medium play method, device and multimedia storage method



Similar Documents

Publication Publication Date Title
Lacey Listening publics: The politics and experience of listening in the media age
TWI704805B (en) Video editing method and device
CN105244022B (en) Audio-video method for generating captions and device
US5880788A (en) Automated synchronization of video image sequences to new soundtracks
CN110428811B (en) Data processing method and device and electronic equipment
US20080275700A1 (en) Method of and System for Modifying Messages
JP5206553B2 (en) Browsing system, method, and program
CN102280104B (en) File phoneticization processing method and system based on intelligent indexing
CN104252872B (en) Lyric generating method and intelligent terminal
CN105244041B (en) The evaluation method and device of song audition
KR20140133056A (en) Apparatus and method for providing auto lip-synch in animation
CN113676772B (en) Video generation method and device
CN115515002A Intelligent MOOC generation method and device based on a virtual digital human, and storage medium
CN109584859A (en) Phoneme synthesizing method and device
CN111613224A (en) Personalized voice synthesis method and device
CN101615417B (en) Synchronous Chinese lyrics display method which is accurate to words
CN108900886A (en) A kind of Freehandhand-drawing video intelligent dubs generation and synchronous method
CN103544978A (en) Multimedia file manufacturing and playing method and intelligent terminal
CN102447785A (en) Generation method of prompt information of mobile terminal and device
Barbatsis et al. Analyzing meaning in form: Soap opera's compositional construction of “realness”
Koço et al. Applying multiview learning algorithms to human-human conversation classification.
CN118945440A (en) A digital human video generation method and device based on AIGC technology
Thain Anarchival Images: The Labour of Chronic Collage
JP2008084021A (en) Movie scenario generation method, program, and apparatus
CN121078294B (en) Video call subtitle real-time refreshing method based on voice large model analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181127