GB2591245B - An expressive text-to-speech system - Google Patents

An expressive text-to-speech system

Info

Publication number
GB2591245B
Authority
GB
United Kingdom
Prior art keywords
speech system
expressive text
expressive
text
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
GB2000883.5A
Other versions
GB2591245A (en)
GB202000883D0 (en)
Inventor
Monge Alvarez Jesus
Francois Holly
Sung Hosang
Choi Seungdo
Choo Kihyun
Park Sangjun
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-01-21
Filing date
2020-01-21
Publication date
2022-06-15
Application filed by Samsung Electronics Co Ltd
Priority to GB2000883.5A (patent GB2591245B)
Publication of GB202000883D0
Priority to KR1020200062637A (patent KR102775245B1)
Priority to US17/037,023 (patent US11830473B2)
Publication of GB2591245A
Application granted
Publication of GB2591245B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10: Prosody rules derived from text; Stress or intonation
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/02: Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033: Voice editing, e.g. manipulating the voice of the synthesiser
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/02: Methods for producing synthetic speech; Speech synthesisers
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063: Training
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025: Phonemes, fenemes or fenones being the recognition units

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)
GB2000883.5A 2020-01-21 2020-01-21 An expressive text-to-speech system Active GB2591245B (en)

Priority Applications (3)

Application Number | Publication | Priority Date | Filing Date | Title
GB2000883.5A | GB2591245B (en) | 2020-01-21 | 2020-01-21 | An expressive text-to-speech system
KR1020200062637A | KR102775245B1 (en) | 2020-01-21 | 2020-05-25 | Expressive text-to-speech system and method
US17/037,023 | US11830473B2 (en) | 2020-01-21 | 2020-09-29 | Expressive text-to-speech system and method

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
GB2000883.5A | GB2591245B (en) | 2020-01-21 | 2020-01-21 | An expressive text-to-speech system

Publications (3)

Publication Number | Publication Date
GB202000883D0 (en) | 2020-03-04
GB2591245A (en) | 2021-07-28
GB2591245B (en) | 2022-06-15

Family

Family ID: 69636811

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
GB2000883.5A | Active | GB2591245B (en) | 2020-01-21 | 2020-01-21 | An expressive text-to-speech system

Country Status (2)

Country | Link
KR (1) | KR102775245B1 (en)
GB (1) | GB2591245B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11475874B2 * | 2021-01-29 | 2022-10-18 | Google Llc | Generating diverse and natural text-to-speech samples
CN112951202B * | 2021-03-11 | 2022-11-08 | 北京嘀嘀无限科技发展有限公司 | Speech synthesis method, apparatus, electronic device and program product
EP4293660A4 * | 2021-06-22 | 2024-07-17 | Samsung Electronics Co., Ltd. | ELECTRONIC DEVICE AND ITS CONTROL METHOD
CN113611309B * | 2021-07-13 | 2024-05-10 | 北京捷通华声科技股份有限公司 | Tone conversion method and device, electronic equipment and readable storage medium
CN113838452B * | 2021-08-17 | 2022-08-23 | 北京百度网讯科技有限公司 | Speech synthesis method, apparatus, device and computer storage medium
US11978475B1 * | 2021-09-03 | 2024-05-07 | Wells Fargo Bank, N.A. | Systems and methods for determining a next action based on a predicted emotion by weighting each portion of the action's reply
CN115985282B * | 2021-10-14 | 2025-11-04 | 北京字跳网络技术有限公司 | Speech rate adjustment methods, devices, electronic equipment, and readable storage media
US12183344B1 | 2021-11-24 | 2024-12-31 | Wells Fargo Bank, N.A. | Systems and methods for determining a next action based on entities and intents
CN114187892B * | 2021-12-08 | 2025-05-02 | 北京百度网讯科技有限公司 | A style transfer synthesis method, device and electronic device
US12190906B1 | 2021-12-17 | 2025-01-07 | Wells Fargo Bank, N.A. | Systems and methods for predicting an emotion based on a multimodal input
CN114255737B * | 2022-02-28 | 2022-05-17 | 北京世纪好未来教育科技有限公司 | Voice generation method and device and electronic equipment
CN115116431B * | 2022-08-29 | 2022-11-18 | 深圳市星范儿文化科技有限公司 | Audio generation method, device, equipment and storage medium based on intelligent reading kiosk
CN116264074A * | 2022-10-25 | 2023-06-16 | 中移(苏州)软件技术有限公司 | Speech synthesis method, device, computing device and computer storage medium
US12354614B2 | 2022-10-28 | 2025-07-08 | Electronics And Telecommunications Research Institute | Speech coding method and apparatus for performing the same
KR20240122146A | 2023-02-03 | 2024-08-12 | 한국기술교육대학교 산학협력단 | Method for providing transfer service of text style transfer system for sound transfer of news text
KR102912837B1 | 2023-02-03 | 2026-01-14 | 한국기술교육대학교 산학협력단 | Text style transfer system for sound transfer of news text

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20150186359A1 * | 2013-12-30 | 2015-07-02 | Google Inc. | Multilingual prosody generation
US20190172443A1 * | 2017-12-06 | 2019-06-06 | International Business Machines Corporation | System and method for generating expressive prosody for speech synthesis
WO2019139428A1 * | 2018-01-11 | 2019-07-18 | 네오사피엔스 주식회사 | Multilingual text-to-speech synthesis method

Also Published As

Publication number | Publication date
KR20210095010A (en) | 2021-07-30
GB2591245A (en) | 2021-07-28
GB202000883D0 (en) | 2020-03-04
KR102775245B1 (en) | 2025-03-06

Similar Documents

Publication | Title
GB2591245B (en) | An expressive text-to-speech system
SG11202009556XA (en) | Text-to-speech synthesis system and method
EP3895159A4 (en) | Multi-speaker neural text-to-speech synthesis
IL254317A0 (en) | System and method for generating accurate speech transcription from natural speech audio signals
GB201701918D0 (en) | A spoken dialogue system, a spoken dialogue method and a method of adapting a spoken dialogue system
SG11202000353QA (en) | Emergency voice service support indications
EP3709249A4 (en) | System for providing user-customized last and method therefor
EP3622885C0 (en) | VOICE CONTROL SYSTEM FOR OPHTHALMIC LASER SYSTEMS
GB201903288D0 (en) | An aerosol provision system
EP4046396A4 (en) | BEAM SHAPING DEVICES FOR HEARING AIDS
DK3833043T3 (en) | HEARING SYSTEM INCLUDING A PERSONAL BEAM SHAPER
GB2607903B (en) | Text-to-speech system
ZA201907037B (en) | Hydraulic support voice control system and method based on vocal cord vibration measurement
GB202109219D0 (en) | An aerosol provision system
EP3614696A4 (en) | BEAM SHAPER, BEAM SHAPING PROCESS AND HEARING AID SYSTEM
EP4032359C0 (en) | MESSAGE CONFIGURATION FOR TWO-STAGE DIRECT ACCESS METHOD
SG11202009311RA (en) | Speech analysis system
GB201811458D0 (en) | An ambisonic microphone apparatus
GB2568902B (en) | System for speech evaluation
GB202102114D0 (en) | Laser system
PL4179320T3 (en) | System
GB202001651D0 (en) | Speaker system
GB202105780D0 (en) | Emotion recognition for artificially-intelligent system
IL313380A (en) | Provision system
HK40047542A (en) | Text-to-speech synthesis system and method