
CN103165126A - Method for voice playing of mobile phone text short messages - Google Patents


Info

Publication number
CN103165126A
CN103165126A (application number CN2011104243757A / CN201110424375A)
Authority
CN
China
Prior art keywords
word
text
rhythm
mobile phone
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011104243757A
Other languages
Chinese (zh)
Inventor
卢晓鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Vimicro Corp
Original Assignee
Wuxi Vimicro Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Vimicro Corp filed Critical Wuxi Vimicro Corp
Priority to CN2011104243757A priority Critical patent/CN103165126A/en
Publication of CN103165126A publication Critical patent/CN103165126A/en
Pending legal-status Critical Current

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a method for voice playback of mobile phone text messages. After the mobile phone receives a text message, text analysis is performed on the message's text string to obtain the corresponding speech waveform, from which speech is synthesized and played. The method achieves instant speech synthesis and instant text-to-speech conversion, saves time, safeguards users' driving safety, and brings convenience to elderly users with poor eyesight.

Description

Method for voice playback of mobile phone text messages
Technical field
The present invention relates to the field of mobile communications, and in particular to a method for playing text messages as speech in real time.
Background art
With the growing popularity of PDA-style mobile phones, people enjoy many conveniences in life and work and can communicate with family and friends promptly. However, many users receive text messages while driving and cannot read them in time, and reading them anyway can easily cause traffic accidents. Moreover, elderly users often find the screen font ill-matched to their eyesight, which causes difficulties in use. A speech synthesis technique that promptly reads received text messages aloud therefore makes a very useful mobile phone function.
Summary of the invention
To solve the above technical problem, the present invention provides a method for playing text messages as speech in real time: after the mobile phone receives a text message, the message's text string undergoes text analysis to obtain the corresponding speech waveform, from which synthesized speech is formed and played.
The method comprises a text normalization and symbol conversion step, which converts the special symbols, abbreviations, English words and measurement units in the message's text string into recognizable pronunciation unit identifiers.
It comprises a word segmentation step, which divides the input text into words according to preset segmentation rules and determines the prosodic structure of the sentence and the pronunciation of polyphonic characters.
It also comprises a prosody prediction step, a coarticulation step and a word selection step, wherein the prosody prediction step determines the pronunciation of each word, the coarticulation step determines the connections between words, and the word selection step selects the optimal pronunciation from the dictionary according to the prosodic requirements and the words' pronunciations.
When selecting acoustic units to construct the sound bank, a loss function is used to describe the synthesis capability of sound banks of equal size. The loss function can be expressed as:
ζ(f,d,c)=cf/d
where f is the word frequency of the current acoustic unit, d is the predicted duration of the unit, and c is the degree of coarticulation between the phonemes contained in the unit, prosody not being considered. When constructing the sound bank composed of acoustic units, the goal is to minimize the value of the loss function over the bank.
A fundamental frequency parameter model is adopted to control prosody generation.
In the speech playback method for mobile phone text messages provided by the invention, a text-to-speech system is installed on the phone, which automatically completes speech synthesis and plays the result through the loudspeaker. The invention achieves real-time speech synthesis and instant text-to-speech conversion, saves time, safeguards users' driving safety, and is convenient for elderly users with poor eyesight. Through user settings and sound bank customization, male, female or cartoon-character voices can be selected. The phone can also automatically invoke the sound bank matching the sender's preset gender.
Description of drawings
Fig. 1 is a flow chart of the method in an embodiment of the present invention;
Fig. 2 is a flow chart of a specific embodiment of the present invention.
Embodiment
Referring to Fig. 1, the method of the present invention for playing text messages as speech in real time comprises the following steps.
Step 1: message reception
The mobile phone receives one or more text messages from the base station through its radio-frequency module and temporarily stores them in its memory.
Step 2: text-to-speech conversion
A text-to-speech module (TTS model) is adopted, i.e. a speech synthesis module that takes a text string as input. Its input is an ordinary text string. The text analyzer in the module first decomposes the input string, according to a pronouncing dictionary, into words carrying attribute tags and their pronunciation symbols; then, according to semantic and phonetic rules, it determines the stress level, sentence structure, intonation and pauses for each word and syllable. The text string is thereby converted into a symbol string. From the results of this analysis, the prosodic features of the target speech are generated, and the speech units are then combined to synthesize the output speech.
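The front-end analysis of Step 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation; the toy pronouncing dictionary and the field names are invented.

```python
# Toy sketch of the text-analysis front end: decompose an input string into
# words with pronunciation symbols via longest-match dictionary lookup.
# The two-entry dictionary below is a hypothetical stand-in.

PRONOUNCING_DICT = {
    "你好": ["ni3", "hao3"],   # "hello"
    "世界": ["shi4", "jie4"],  # "world"
}

def analyze_text(text):
    """Greedy longest-match segmentation; real systems use trained segmenters."""
    units, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):   # try the longest word first
            word = text[i:j]
            if word in PRONOUNCING_DICT:
                units.append({"word": word,
                              "phones": PRONOUNCING_DICT[word],
                              "pause_after": j == len(text)})
                i = j
                break
        else:  # unknown character: emit it with no pronunciation symbols
            units.append({"word": text[i], "phones": [], "pause_after": False})
            i += 1
    return units
```

A later stage would attach stress and intonation marks to each unit before prosody generation.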
The present invention reads out the message data on the phone instantly in the form of speech, so the user only needs to listen passively. The requirements on the speech synthesis system here are fast response, low computational and storage complexity, good extensibility, and synthesized speech of high clarity and intelligibility, suitable for everyday communication and certain professional domains.
When selecting acoustic units to construct the sound bank, the present invention uses a loss function to describe the synthesis capability of sound banks of equal size. The loss function can be expressed as:
ζ(f,d,c)=cf/d
where f is the word frequency of the current acoustic unit, d is the predicted duration of the unit, and c is the degree of coarticulation between the phonemes contained in the unit. Without considering prosody, when constructing the sound bank composed of acoustic units, the goal is to minimize the value of the loss function over the bank.
The present invention adopts the Fujisaki model, a widely used fundamental frequency parameter model, to control prosody generation; it predicts the variation of fundamental frequency mainly by simulating the human speech production mechanism, thereby controlling the rhythm, intonation and emotion of the synthesized speech.
The invention suits all users in situations where reading a message directly with the eyes is inconvenient.
Through user settings and sound bank customization, male, female or cartoon-character voices can be selected.
The phone can also automatically invoke the sound bank matching the sender's preset gender.
Referring to Fig. 2, the input text is first transformed by normalization and symbol conversion, in which special symbols, abbreviations, English words, measurement units and the like are converted into recognizable pronunciation unit identifiers. The segmentation module then divides the input text into words according to preset segmentation rules; this segmentation essentially determines the prosodic structure of the sentence and the pronunciation of polyphonic characters. Prosody prediction determines the pronunciation of each word, and coarticulation determines the connections between words. The word selection module selects the optimal pronunciation from the dictionary according to the prosodic requirements and the words' pronunciations, and the waveform is recovered through speech reconstruction. Finally, the speech waveforms of the individual words are concatenated under the control of splicing parameters to complete the synthesis of the sentence.
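The normalization and symbol-conversion step above can be sketched as a table-driven rewrite; the replacement table here is illustrative, not from the patent.

```python
import re

# Hedged sketch of text normalization: special symbols, abbreviations and
# measurement units are rewritten as speakable tokens. The table is a toy.
SYMBOL_TABLE = {
    "%": "percent", "&": "and", "$": "dollars",
    "km": "kilometers", "kg": "kilograms",
    "Dr.": "Doctor", "etc.": "et cetera",
}

def normalize(text):
    # Replace longer keys first so e.g. "Dr." wins over any shorter overlap.
    for sym in sorted(SYMBOL_TABLE, key=len, reverse=True):
        text = text.replace(sym, " " + SYMBOL_TABLE[sym] + " ")
    # Isolate digit runs so each number is a single token for later expansion.
    text = re.sub(r"\d+", lambda m: " " + m.group() + " ", text)
    return " ".join(text.split())
```

A full system would also expand the isolated numbers into words and handle context-dependent readings (dates, phone numbers, and so on).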
1. Acoustic unit selection and generation
To give the synthesized speech high clarity, intelligibility and naturalness, waveform-based speech synthesis is usually adopted. The synthesis units used in waveform-concatenation synthesis are cut from original natural speech and retain some of its prosodic features. According to the phonetic and prosodic rules of natural language, suitable speech primitives are stored so that, within a given storage capacity, these units achieve maximum phonetic and prosodic coverage. During synthesis, the output speech is produced through steps such as acoustic unit selection, waveform concatenation and smoothing. With a well-designed corpus, and by choosing optimal acoustic units from the sound bank according to phonetic and prosodic rules, the system outputs high-quality speech.
Common candidate speech units include phrases, syllables, phonemes and diphones. When constructing the corpus needed for waveform concatenation, the strengths and weaknesses of different sample types can be combined. For example, for the strongly coarticulated phoneme and syllable combinations that occur frequently in natural speech, splicing between such combinations should be avoided as far as possible when forming target speech by concatenation; otherwise even a slightly improper unit selection yields results that are hard to accept acoustically. The type and length of the acoustic units adopted when constructing a practical synthesis system are therefore not fixed.
When selecting acoustic units to construct the sound bank, a loss function is usually used to describe the synthesis capability of sound banks of equal size. A typical loss function can be expressed as:
ζ(f,d,c)=cf/d (1)
where f is the word frequency of the current acoustic unit, d is the predicted duration of the unit, and c is the degree of coarticulation between the phonemes contained in the unit. Without considering prosody, when constructing the sound bank composed of acoustic units, the goal is to minimize the value of the loss function (1) over the bank.
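One reading of this selection criterion, sketched with invented candidate data, is to keep, for a fixed bank size, the units whose loss ζ is smallest:

```python
# Loss from equation (1): f = word frequency, d = predicted duration,
# c = coarticulation strength between the phonemes inside the unit.
def loss(f, d, c):
    return c * f / d

def build_bank(candidates, size):
    """Keep the `size` candidate units with the smallest loss.
    Each candidate is a dict with keys "f", "d", "c" (illustrative schema)."""
    return sorted(candidates, key=lambda u: loss(u["f"], u["d"], u["c"]))[:size]
```

How the per-unit losses are aggregated over the whole bank is not specified in the text; the greedy per-unit ranking above is one plausible interpretation.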
The acoustic units used for splicing are usually obtained by cutting continuous speech. Word frequency information on common everyday expressions can be obtained by statistics, and sentences are then selected under the guidance of this information so that the selected sentences cover high-frequency words well; these sentences become the script to be recorded.
A suitable announcer is chosen to read the script aloud properly while being recorded. The recorded speech waveform data are cut according to the script and the division into acoustic units: Chinese is usually cut into words and characters (CV structure), while English usually needs cutting into words plus a small number of phonemes or diphones, thus forming the pronunciation unit bank. Each cut acoustic unit is annotated with its position in the original sentence (initial, medial or final) and the words adjacent to it; these annotations provide the basis for the decisions of the word selection module.
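The annotation of cut units by sentence position and adjacent words might look like the sketch below; the field names are illustrative, not from the patent.

```python
# Annotate each cut unit with its position in the sentence (initial / medial /
# final) and its left and right neighbours, so a word-selection stage can
# later match units against the context they came from.
def label_units(words):
    labeled = []
    for i, w in enumerate(words):
        labeled.append({
            "word": w,
            "position": ("initial" if i == 0
                         else "final" if i == len(words) - 1
                         else "medial"),
            "left": words[i - 1] if i > 0 else None,
            "right": words[i + 1] if i < len(words) - 1 else None,
        })
    return labeled
```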
2. Prosody generation
Prosodic parameters are essential for controlling the rhythm, intonation and emotion of synthesized speech; for Mandarin Chinese, fundamental frequency is the physical parameter directly related to tone. The construction of Chinese syllables can be summarized as follows: phonemes form initials and finals; a final combined with a tone becomes a tonal final; and a syllable is formed by a tonal final alone or by splicing an initial with a tonal final. Chinese has five tones (first, second, third, fourth and neutral) and more than 1,200 tonal syllables. One syllable is the sound of one character; characters form words, and words in turn form sentences.
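The syllable = initial + (final, tone) structure can be illustrated with numbered pinyin. This is a toy decomposition, not the patent's method, and the initial list is abbreviated.

```python
# Decompose a numbered-pinyin syllable into (initial, final, tone).
# Multi-letter initials (zh/ch/sh) are listed first so they match greedily.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

def split_syllable(syl):
    tone = int(syl[-1]) if syl[-1].isdigit() else 5  # 5 marks the neutral tone
    body = syl[:-1] if syl[-1].isdigit() else syl
    for ini in INITIALS:
        if body.startswith(ini):
            return ini, body[len(ini):], tone
    return "", body, tone  # zero-initial syllable such as "ai4"
```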
Prosody generation based on machine learning: although many prosodic rules have been obtained to date, they still fall far short of producing prosody close to natural speech. To capture hidden prosodic rules that are hard to describe, machine learning methods are usually used to realize prosody generation. Commonly used models include hidden Markov models (HMM), artificial neural networks (ANN), support vector machines (SVM) and decision trees.
Prosody generation based on parametric models: machine-learning-based prosody models extract detailed rules that cannot be analyzed manually and greatly reduce the workload of manual analysis, but the approach also has the following problems. First, general learning algorithms require large amounts of data, especially when there are many attribute features. Second, if the available data are unevenly distributed, training is biased overall, which affects the analysis results. Third, expert knowledge is not well exploited, which is a waste of information. Fourth, the trained model is not linked to linguistic features or human perception, so it cannot be transferred or adjusted. Fundamental frequency and duration are the acoustic parameters that directly affect the perceived prosody, and both vary with time and environment. A parametric model exploits prior knowledge: it first analyzes the relations among fundamental frequency, duration, linguistic features and human hearing, models those relations, and extracts the parameters that directly relate fundamental frequency and duration to linguistic features and perception. Such a model makes effective use of expert knowledge, can be trained with little data to capture the relation between textual linguistic features and the parameters, and can change the perceived prosodic features simply by adjusting the model parameters.
The Fujisaki model is a widely used fundamental frequency parameter model that predicts the variation of fundamental frequency mainly by simulating the human speech production mechanism. Fujisaki holds that fundamental frequency changes have two main causes: the influence of prosodic phrase boundaries (phrase commands) and of syllable tones (accent commands). The fundamental frequency contour is generated according to the mechanism of vocal cord vibration, with the phrase and accent commands as the input of the prediction system and the fundamental frequency contour as its output; phrase commands are produced as pulse signals and accent commands as step functions. Under this model the fundamental frequency contour can be expressed as:
ln[F0(t)] = ln[Fmin] + Σi=1..I Api·Gpi(t − T0i) + Σj=1..J Aaj·[Gaj(t − T1j) − Gaj(t − T2j)]    (2)
where
Gpi(t) = αi²·t·exp(−αi·t) for t > 0, and 0 otherwise    (3)
Gaj(t) = min[1 − (1 + βj·t)·exp(−βj·t), θ] for t > 0, and 0 otherwise    (4)
The remaining parameters in the formulas are: Fmin, the minimum fundamental frequency; αi, the control coefficient of the i-th phrase command; I, the number of phrase commands; βj, the control coefficient of the j-th accent command; J, the number of accent commands; θ, the ceiling value of an accent command; T0i, the onset time of the i-th phrase command; Api, the amplitude of the i-th phrase command; T1j, the onset time of the j-th accent command; Aaj, the amplitude of the j-th accent command; T2j, the end time of the j-th accent command.
The mechanism of the Fujisaki model is simple. Each phrase command passes a pulse signal through the phrase filter, so the corresponding fundamental frequency rises to a maximum and then decays gradually; successive phrase commands make the contour fluctuate continuously. Each accent command is initialized by a step function; because the accent filter coefficient β is much larger than the phrase filter coefficient α, the accent component reaches its maximum quickly and then decays rapidly.
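The Fujisaki contour above can be evaluated numerically as in the sketch below; all command amplitudes, times and filter coefficients here are invented for illustration.

```python
import math

def G_p(t, alpha):
    """Phrase-command response: alpha^2 * t * exp(-alpha * t) for t > 0."""
    return alpha ** 2 * t * math.exp(-alpha * t) if t > 0 else 0.0

def G_a(t, beta, theta=0.9):
    """Accent-command response, clipped at the ceiling theta."""
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), theta) if t > 0 else 0.0

def ln_f0(t, fmin, phrases, accents):
    """Log-F0 at time t. phrases: list of (T0, Ap, alpha) phrase commands;
    accents: list of (T1, T2, Aa, beta) accent commands."""
    v = math.log(fmin)
    v += sum(Ap * G_p(t - T0, a) for T0, Ap, a in phrases)
    v += sum(Aa * (G_a(t - T1, b) - G_a(t - T2, b)) for T1, T2, Aa, b in accents)
    return v
```

Before the first phrase command the contour sits at ln(Fmin); a phrase command lifts it and lets it decay slowly, and accent commands add fast local bumps between their onset and end times.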
Beneficial effects of the present invention:
The invention achieves real-time speech synthesis and instant text-to-speech conversion with high system efficiency and good stability;
The proposed text-to-speech conversion module has a clear structure, a well-defined division of labor among its parts, and strong independence;
The invention is convenient for elderly users, who can dispense with their reading glasses;
When used while driving, the invention safeguards the user's traffic safety.
Those skilled in the art should understand that embodiments of the invention may be provided as a method, a system or a computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor produce a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, such that the instructions stored in that memory produce an article of manufacture including an instruction device that realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them.

Claims (6)

1. the method for the speech play of a mobile phone text note after mobile phone receives the note of textual form,, obtains corresponding speech waveform, thereby forms synthetic speech and play through text analyzing the text word string of this note.
2. method as claimed in claim 1, it is characterized in that: comprise that text normalization is processed and the symbol step of converting, the special symbol, abbreviation, English word and the measurement unit that are used for the text-string of the note that will obtain are converted to discernible phonation unit and identify.
3. method as claimed in claim 2, is characterized in that: comprise the participle model treatment step, be used for the text to input is carried out the division of word by the participle rule that presets, determine the pronunciation of rhythm structure and the polyphone of sentence.
4. method as claimed in claim 3, it is characterized in that: also comprise prosody prediction step, coarticulation step and select the word step, wherein the prosody prediction step determines each word pronunciation, coarticulation has determined the annexation between each word, selects the word step to select optimum pronunciation according to the pronunciation of rhythm requirement and word in dictionary.
5. method as claimed in claim 1, it is characterized in that: when selecting acoustic elements structure sound bank, utilize the degree of loss function to describe the synthesis capability with formed objects sound bank, the degree of loss function can be expressed as:
ζ(f,d,c)=cf/d
Wherein f is the word frequency of current acoustic elements, d is the prediction duration of acoustic elements, and c is not considering under rhythm condition for the size of coarticulation between the phoneme that comprises in this unit, during sound bank that structure is comprised of acoustic elements, make that the value of degree of loss letter on this sound bank is minimum is target.
6. method as claimed in claim 1, is characterized in that: adopt the base frequency parameters model to control the generation of the rhythm.
CN2011104243757A 2011-12-15 2011-12-15 Method for voice playing of mobile phone text short messages Pending CN103165126A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011104243757A CN103165126A (en) 2011-12-15 2011-12-15 Method for voice playing of mobile phone text short messages

Publications (1)

Publication Number Publication Date
CN103165126A true CN103165126A (en) 2013-06-19

Family

ID=48588150

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011104243757A Pending CN103165126A (en) 2011-12-15 2011-12-15 Method for voice playing of mobile phone text short messages

Country Status (1)

Country Link
CN (1) CN103165126A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103888611A (en) * 2014-03-20 2014-06-25 联想(北京)有限公司 A kind of output method and communication equipment
CN104200803A (en) * 2014-09-16 2014-12-10 北京开元智信通软件有限公司 Voice broadcasting method, device and system
CN104485100A (en) * 2014-12-18 2015-04-01 天津讯飞信息科技有限公司 Text-to-speech pronunciation person self-adaptive method and system
CN105095180A (en) * 2014-05-14 2015-11-25 中兴通讯股份有限公司 Chinese name broadcasting method and device
CN106294310A (en) * 2015-06-12 2017-01-04 讯飞智元信息科技有限公司 A kind of Tibetan language tone Forecasting Methodology and system
CN106507321A (en) * 2016-11-22 2017-03-15 新疆农业大学 A Uygur and Chinese Bilingual GSM Short Message Speech Conversion Broadcasting System
CN106652995A (en) * 2016-12-31 2017-05-10 深圳市优必选科技有限公司 Text voice broadcast method and system
CN107909993A (en) * 2017-11-27 2018-04-13 安徽经邦软件技术有限公司 A kind of intelligent sound report preparing system
CN109031474A (en) * 2018-08-31 2018-12-18 成都润联科技开发有限公司 A kind of weather information hiding Chinese phonetic broadcasting terminals and its working method based on Beidou satellite communication
CN111128116A (en) * 2019-12-20 2020-05-08 珠海格力电器股份有限公司 Voice processing method and device, computing equipment and storage medium
CN111261139A (en) * 2018-11-30 2020-06-09 上海擎感智能科技有限公司 Character personification broadcasting method and system
CN112966476A (en) * 2021-04-19 2021-06-15 马上消费金融股份有限公司 Text processing method and device, electronic equipment and storage medium
CN113382123A (en) * 2020-03-10 2021-09-10 精工爱普生株式会社 Scanning system, storage medium, and scanning data generation method for scanning system
CN113903324A (en) * 2020-06-18 2022-01-07 新加坡依图有限责任公司(私有) Method, device, equipment and machine readable medium for text-to-speech
CN113936638A (en) * 2020-06-29 2022-01-14 华为技术有限公司 Text audio playing method and device and terminal equipment
CN114360494A (en) * 2021-12-29 2022-04-15 广州酷狗计算机科技有限公司 Rhythm labeling method and device, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1271216A (en) * 1999-04-16 2000-10-25 松下电器产业株式会社 Speech voice communication system
DE20102259U1 (en) * 2001-02-09 2002-02-21 Materna Gmbh Information & Com SMS short message system
GB2378875A (en) * 2001-05-04 2003-02-19 Andrew James Marsh Annunciator for converting text messages to speech
CN1731509A (en) * 2005-09-02 2006-02-08 清华大学 Mobile speech synthesis method
CN1972478A (en) * 2005-11-24 2007-05-30 展讯通信(上海)有限公司 A novel method for mobile phone reading short message
CN101605307A (en) * 2008-06-12 2009-12-16 深圳富泰宏精密工业有限公司 Test short message service (SMS) voice play system and method

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105357397A (en) * 2014-03-20 2016-02-24 联想(北京)有限公司 Output method and communication devices
CN103888611A (en) * 2014-03-20 2014-06-25 联想(北京)有限公司 A kind of output method and communication equipment
CN103888611B (en) * 2014-03-20 2016-01-27 联想(北京)有限公司 A kind of output method and communication equipment
CN105095180A (en) * 2014-05-14 2015-11-25 中兴通讯股份有限公司 Chinese name broadcasting method and device
CN104200803A (en) * 2014-09-16 2014-12-10 北京开元智信通软件有限公司 Voice broadcasting method, device and system
CN104485100B (en) * 2014-12-18 2018-06-15 天津讯飞信息科技有限公司 Phonetic synthesis speaker adaptive approach and system
CN104485100A (en) * 2014-12-18 2015-04-01 天津讯飞信息科技有限公司 Text-to-speech pronunciation person self-adaptive method and system
CN106294310B (en) * 2015-06-12 2019-05-03 讯飞智元信息科技有限公司 A kind of Tibetan language tone prediction technique and system
CN106294310A (en) * 2015-06-12 2017-01-04 讯飞智元信息科技有限公司 A kind of Tibetan language tone Forecasting Methodology and system
CN106507321A (en) * 2016-11-22 2017-03-15 新疆农业大学 A Uygur and Chinese Bilingual GSM Short Message Speech Conversion Broadcasting System
CN106652995A (en) * 2016-12-31 2017-05-10 深圳市优必选科技有限公司 Text voice broadcast method and system
WO2018121757A1 (en) * 2016-12-31 2018-07-05 深圳市优必选科技有限公司 Method and system for speech broadcast of text
CN107909993A (en) * 2017-11-27 2018-04-13 安徽经邦软件技术有限公司 A kind of intelligent sound report preparing system
CN109031474A (en) * 2018-08-31 2018-12-18 成都润联科技开发有限公司 A kind of weather information hiding Chinese phonetic broadcasting terminals and its working method based on Beidou satellite communication
CN111261139B (en) * 2018-11-30 2023-12-26 上海擎感智能科技有限公司 Literal personification broadcasting method and system
CN111261139A (en) * 2018-11-30 2020-06-09 上海擎感智能科技有限公司 Character personification broadcasting method and system
CN111128116A (en) * 2019-12-20 2020-05-08 珠海格力电器股份有限公司 Voice processing method and device, computing equipment and storage medium
CN113382123A (en) * 2020-03-10 2021-09-10 精工爱普生株式会社 Scanning system, storage medium, and scanning data generation method for scanning system
CN113903324A (en) * 2020-06-18 2022-01-07 新加坡依图有限责任公司(私有) Method, device, equipment and machine readable medium for text-to-speech
CN113936638A (en) * 2020-06-29 2022-01-14 华为技术有限公司 Text audio playing method and device and terminal equipment
CN112966476B (en) * 2021-04-19 2022-03-25 马上消费金融股份有限公司 Text processing method and device, electronic equipment and storage medium
CN112966476A (en) * 2021-04-19 2021-06-15 马上消费金融股份有限公司 Text processing method and device, electronic equipment and storage medium
CN114360494A (en) * 2021-12-29 2022-04-15 广州酷狗计算机科技有限公司 Rhythm labeling method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN103165126A (en) Method for voice playing of mobile phone text short messages
US8219398B2 (en) Computerized speech synthesizer for synthesizing speech from text
Kuligowska et al. Speech synthesis systems: disadvantages and limitations
Panda et al. A survey on speech synthesis techniques in Indian languages
Indumathi et al. Survey on speech synthesis
Panda et al. Text-to-speech synthesis with an Indian language perspective
Mukherjee et al. A bengali hmm based speech synthesis system
Chomphan et al. Tone correctness improvement in speaker dependent HMM-based Thai speech synthesis
Koriyama et al. Prosody generation using frame-based Gaussian process regression and classification for statistical parametric speech synthesis
Trouvain et al. Speech synthesis: text-to-speech conversion and artificial voices
Sun et al. A method for generation of Mandarin F0 contours based on tone nucleus model and superpositional model
Kishore et al. Building Hindi and Telugu voices using festvox
CN102122505A (en) Modeling method for enhancing expressive force of text-to-speech (TTS) system
Chen et al. A Mandarin Text-to-Speech System
Sečujski et al. Learning prosodic stress from data in neural network based text-to-speech synthesis
Pitrelli et al. Expressive speech synthesis using American English ToBI: questions and contrastive emphasis
Bruce et al. On the analysis of prosody in interaction
Gerazov et al. A novel quasi-diphone inventory approach to Text-To-Speech synthesis
Anberbir et al. Development of an Amharic text-to-speech system using cepstral method
Waghmare et al. Analysis of pitch and duration in speech synthesis using PSOLA
Bunnell Speech synthesis: Toward a “Voice” for all
Azeem Designing a model for speech synthesis using HMM
Ng Survey of data-driven approaches to Speech Synthesis
IMRAN ADMAS UNIVERSITY SCHOOL OF POST GRADUATE STUDIES DEPARTMENT OF COMPUTER SCIENCE
Tuckova et al. Prosody optimisation of a Czech language synthesizer

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C05 Deemed withdrawal (patent law before 1993)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130619