An overview of the AlfaNum text-to-speech synthesis system

Document ID: 7317086725637515801
Author: Sečujski M; Delić V; Pekar D; Obradović R; Knežević D
Publication year: 2007
Publication venue: Proceedings of 12th SPECOM (Speech and Computer)

Snippet

The paper gives a brief review of the development of the first widely applied text-to-speech synthesis system in the Serbian language. The system was developed at the Faculty of Engineering, University of Novi Sad, Serbia. At the signal processing level, the system can …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G10L13/07—Concatenation rules
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch

Similar Documents

Publication	Publication Date	Title
US6233553B1 (en)	2001-05-15	Method and system for automatically determining phonetic transcriptions associated with spelled words
US6029132A (en)	2000-02-22	Method for letter-to-sound in text-to-speech synthesis
CA2545873C (en)	2012-07-24	Text-to-speech method and system, computer program product therefor
US7069216B2 (en)	2006-06-27	Corpus-based prosody translation system
MXPA01006594A (en)	2004-07-30	Method and system for preselection of suitable units for concatenative speech.
Maia et al.	2003	Towards the development of a Brazilian Portuguese text-to-speech system based on HMM.
Van Bael et al.	2007	Automatic phonetic transcription of large speech corpora
Stöber et al.	2000	Speech synthesis using multilevel selection and concatenation of units from large speech corpora
Sečujski et al.	2007	An overview of the AlfaNum text-to-speech synthesis system
RU2386178C2 (en)	2010-04-10	Method for preliminary processing of text
Chen et al.	1996	A Mandarin Text-to-Speech System
Thatphithakkul et al.	2019	LOTUS-BI: A Thai-English code-mixing speech corpus
Janyoi et al.	2017	An Isarn dialect HMM-based text-to-speech system
Roux et al.	2007	Data-driven approach to rapid prototyping Xhosa speech synthesis
Allen	1980	Speech synthesis from text
Narupiyakul et al.	2005	A stochastic knowledge-based Thai text-to-speech system
Kasie et al.	2012	Concatenative speech synthesis for Amharic using unit selection method
Secujski	2005	Obtaining prosodic information from text in Serbian language
Khalifa et al.	2008	SMaTalk: Standard Malay text to speech talk system
Brinckmann	2004	The Kiel corpus of read speech as a resource for speech synthesis
Heggtveit et al.	2001	Intonation modelling with a lexicon of natural F0 contours.
EP1501075B1 (en)	2009-04-15	Speech synthesis using concatenation of speech waveforms
Demenko et al.	2008	The design of Polish speech corpus for unit selection speech synthesis
Toma et al.	2009	Automatic rule-based syllabication for Romanian
Lenzo et al.	1998	Rapid-deployment text-to-speech in the DIPLOMAT system.