Campbell, 2005 - Google Patents

Developments in corpus-based speech synthesis: Approaching natural conversational speech

Document ID
2244993565313410942
Author
Campbell N
Publication year
2005
Publication venue
IEICE transactions on information and systems

Snippet

This paper describes the special demands of conversational speech in the context of corpus-based speech synthesis. The author proposed the CHATR system of prosody-based unit-selection for concatenative waveform synthesis seven years ago, and now extends this work …
Continue reading at search.ieice.org (other versions)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 Voice editing, e.g. manipulating the voice of the synthesiser
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/027 Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10 Prosody rules derived from text; Stress or intonation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/06 Elementary speech units used in speech synthesisers; Concatenation rules
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis

Similar Documents

Publication | Publication Date | Title
Zhang et al. | | SpeechGPT: Empowering large language models with intrinsic cross-modal conversational abilities
Campbell | | Developments in corpus-based speech synthesis: Approaching natural conversational speech
Gibbon et al. | | Handbook of multimodal and spoken dialogue systems: Resources, terminology and product evaluation
Reddy et al. | | Speech-to-text and text-to-speech recognition using deep learning
Tucker et al. | | Why we need to investigate casual speech to truly understand language production, processing and the mental lexicon
Yi | | Natural-sounding speech synthesis using variable-length units
Haselow | | Spontaneous spoken English: An integrated approach to the emergent grammar of speech
Adell et al. | | Production of filled pauses in concatenative speech synthesis based on the underlying fluent sentence
Campbell et al. | | No laughing matter.
Campbell | | Conversational speech synthesis and the need for some laughter
López-Ludeña et al. | | LSESpeak: A spoken language generator for Deaf people
Hamad et al. | | Arabic text-to-speech synthesizer
Ifeanyi et al. | | Text–To–Speech Synthesis (TTS)
Wu et al. | | Modeling the expressivity of input text semantics for Chinese text-to-speech synthesis in a spoken dialog system
JP3706112B2 (en) | | Speech synthesizer and computer program
Trouvain et al. | | Speech synthesis: text-to-speech conversion and artificial voices
Jaiswal et al. | | Concatenative text-to-speech synthesis system for communication recognition
Kamble et al. | | Audio Visual Speech Synthesis and Speech Recognition for Hindi Language
Mamatov et al. | | Formation of a Speech Database in the Karakalpak Language for Speech Synthesis Systems
Pan et al. | | Designing a speech corpus for instance-based spoken language generation
Mihajlik et al. | | Is spoken Hungarian low-resource?: a quantitative survey of Hungarian speech data sets
Campbell | | Extra-semantic protocols; input requirements for the synthesis of dialogue speech
Farrugia | | Text-to-speech technologies for mobile telephony services
Kaveri et al. | | A novel approach for Hindi text description to speech and expressive speech synthesis
Romito | | Chapter Three: A training program for expert forensic transcribers