GB2591245B - An expressive text-to-speech system - Google Patents

An expressive text-to-speech system

Info

Publication number: GB2591245B
Authority: GB; United Kingdom
Prior art keywords: speech system; expressive text; expressive; text; speech
Prior art date: 2020-01-21
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Active

Application number

GB2000883.5A

Other versions

GB2591245A (en)

GB202000883D0 (en)

Inventor

Monge Alvarez Jesus

Francois Holly

Sung Hosang

Choi Seungdo

Choo Kihyun

Park Sangjun

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Samsung Electronics Co Ltd

Original Assignee

Samsung Electronics Co Ltd

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2020-01-21

Filing date

2020-01-21

Publication date

2022-06-15

2020-01-21 Application filed by Samsung Electronics Co Ltd

2020-01-21 Priority to GB2000883.5A (GB2591245B)

2020-03-04 Publication of GB202000883D0

2020-05-25 Priority to KR1020200062637A (KR102775245B1)

2020-09-29 Priority to US17/037,023 (US11830473B2)

2021-07-28 Publication of GB2591245A

2022-06-15 Application granted

2022-06-15 Publication of GB2591245B

Status: Active

2040-01-21 Anticipated expiration

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units

Landscapes

Engineering & Computer Science (AREA)
Computational Linguistics (AREA)
Health & Medical Sciences (AREA)
Audiology, Speech & Language Pathology (AREA)
Human Computer Interaction (AREA)
Physics & Mathematics (AREA)
Acoustics & Sound (AREA)
Multimedia (AREA)
Artificial Intelligence (AREA)
Evolutionary Computation (AREA)
Signal Processing (AREA)
Machine Translation (AREA)

GB2000883.5A (priority date 2020-01-21, filing date 2020-01-21): An expressive text-to-speech system. Status: Active. Publication: GB2591245B (en)

Priority Applications (3)

Application Number	Priority Date	Filing Date	Title
GB2000883.5A GB2591245B (en)	2020-01-21	2020-01-21	An expressive text-to-speech system
KR1020200062637A KR102775245B1 (en)	2020-01-21	2020-05-25	Expressive text-to-speech system and method
US17/037,023 US11830473B2 (en)	2020-01-21	2020-09-29	Expressive text-to-speech system and method

Applications Claiming Priority (1)

Application Number	Priority Date	Filing Date	Title
GB2000883.5A GB2591245B (en)	2020-01-21	2020-01-21	An expressive text-to-speech system

Publications (3)

Publication Number	Publication Date
GB202000883D0 (en)	2020-03-04
GB2591245A (en)	2021-07-28
GB2591245B (en)	2022-06-15

Family

ID=69636811

Family Applications (1)

Application Number	Priority Date	Filing Date	Title
GB2000883.5A Active GB2591245B (en)	2020-01-21	2020-01-21	An expressive text-to-speech system

Country Status (2)

Country	Link
KR (1)	KR102775245B1 (en)
GB (1)	GB2591245B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
US11475874B2 (en) *	2021-01-29	2022-10-18	Google Llc	Generating diverse and natural text-to-speech samples
CN112951202B (en) *	2021-03-11	2022-11-08	北京嘀嘀无限科技发展有限公司	Speech synthesis method, apparatus, electronic device and program product
EP4293660A4 (en) *	2021-06-22	2024-07-17	Samsung Electronics Co., Ltd.	ELECTRONIC DEVICE AND ITS CONTROL METHOD
CN113611309B (en) *	2021-07-13	2024-05-10	北京捷通华声科技股份有限公司	Tone conversion method and device, electronic equipment and readable storage medium
CN113838452B (en) *	2021-08-17	2022-08-23	北京百度网讯科技有限公司	Speech synthesis method, apparatus, device and computer storage medium
US11978475B1 (en) *	2021-09-03	2024-05-07	Wells Fargo Bank, N.A.	Systems and methods for determining a next action based on a predicted emotion by weighting each portion of the action's reply
CN115985282B (en) *	2021-10-14	2025-11-04	北京字跳网络技术有限公司	Speech rate adjustment methods, devices, electronic equipment, and readable storage media
US12183344B1 (en)	2021-11-24	2024-12-31	Wells Fargo Bank, N.A.	Systems and methods for determining a next action based on entities and intents
CN114187892B (en) *	2021-12-08	2025-05-02	北京百度网讯科技有限公司	A style transfer synthesis method, device and electronic device
US12190906B1 (en)	2021-12-17	2025-01-07	Wells Fargo Bank, N.A.	Systems and methods for predicting an emotion based on a multimodal input
CN114255737B (en) *	2022-02-28	2022-05-17	北京世纪好未来教育科技有限公司	Voice generation method and device and electronic equipment
CN115116431B (en) *	2022-08-29	2022-11-18	深圳市星范儿文化科技有限公司	Audio generation method, device, equipment and storage medium based on intelligent reading kiosk
CN116264074A (en) *	2022-10-25	2023-06-16	中移(苏州)软件技术有限公司	Speech synthesis method, device, computing device and computer storage medium
US12354614B2 (en)	2022-10-28	2025-07-08	Electronics And Telecommunications Research Institute	Speech coding method and apparatus for performing the same
KR20240122146A (en)	2023-02-03	2024-08-12	한국기술교육대학교 산학협력단	Method for providing transfer service of text style transfer system for sound transfer of news text
KR102912837B1 (en)	2023-02-03	2026-01-14	한국기술교육대학교 산학협력단	Text style transfer system for sound transfer of news text

Citations (3)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
US20150186359A1 (en) *	2013-12-30	2015-07-02	Google Inc.	Multilingual prosody generation
US20190172443A1 (en) *	2017-12-06	2019-06-06	International Business Machines Corporation	System and method for generating expressive prosody for speech synthesis
WO2019139428A1 (en) *	2018-01-11	2019-07-18	네오사피엔스 주식회사	Multilingual text-to-speech synthesis method

2020
- 2020-01-21: GB application GB2000883.5A, patent GB2591245B (en), Active
- 2020-05-25: KR application KR1020200062637A, patent KR102775245B1 (en), Active

Also Published As

Publication number	Publication date
KR20210095010A (en)	2021-07-30
GB2591245A (en)	2021-07-28
GB202000883D0 (en)	2020-03-04
KR102775245B1 (en)	2025-03-06

Publication	Publication Date	Title
GB2591245B (en)	2022-06-15	An expressive text-to-speech system
SG11202009556XA (en)	2020-10-29	Text-to-speech synthesis system and method
EP3895159A4 (en)	2022-06-29	Multi-speaker neural text-to-speech synthesis
IL254317A0 (en)	2017-11-30	System and method for generating accurate speech transcription from natural speech audio signals
GB201701918D0 (en)	2017-03-22	A spoken dialogue system, a spoken dialogue method and a method of adapting a spoken dialogue system
SG11202000353QA (en)	2020-02-27	Emergency voice service support indications
EP3709249A4 (en)	2021-07-21	System for providing user-customized last and method therefor
EP3622885C0 (en)	2024-12-11	VOICE CONTROL SYSTEM FOR OPHTHALMIC LASER SYSTEMS
GB201903288D0 (en)	2019-04-24	An aerosol provision system
EP4046396A4 (en)	2024-01-03	BEAM SHAPING DEVICES FOR HEARING AIDS
DK3833043T3 (en)	2022-12-12	HEARING SYSTEM INCLUDING A PERSONAL BEAM SHAPER
GB2607903B (en)	2024-06-19	Text-to-speech system
ZA201907037B (en)	2020-09-30	Hydraulic support voice control system and method based on vocal cord vibration measurement
GB202109219D0 (en)	2021-08-11	An aerosol provision system
EP3614696A4 (en)	2020-12-09	BEAM SHAPER, BEAM SHAPING PROCESS AND HEARING AID SYSTEM
EP4032359C0 (en)	2023-09-13	MESSAGE CONFIGURATION FOR TWO-STAGE DIRECT ACCESS METHOD
SG11202009311RA (en)	2020-10-29	Speech analysis system
GB201811458D0 (en)	2018-08-29	An ambisonic microphone apparatus
GB2568902B (en)	2020-09-09	System for speech evaluation
GB202102114D0 (en)	2021-03-31	Laser system
PL4179320T3 (en)	2024-12-02	System
GB202001651D0 (en)	2020-03-25	Speaker system
GB202105780D0 (en)	2021-06-09	Emotion recognition for artificially-intelligent system
IL313380A (en)	2024-08-01	Provision system
HK40047542A (en)	2021-11-19	Text-to-speech synthesis system and method