
Emotion, age, and gender classification in children's speech by humans and machines

Kaya et al., 2017

Document ID: 8811920000706917527
Authors: Kaya H, Salah A, Karpov A, Frolova O, Grigorev A, Lyakso E
Publication year: 2017
Publication venue: Computer Speech & Language

Snippet

In this article, we present the first child emotional speech corpus in Russian, called “EmoChildRu”, collected from children aged 3 to 7 years. The base corpus includes over 20K recordings (approx. 30 h) collected from 120 children. Audio recordings are carried out in …
Full text (PDF) available at www.staff.science.uu.nl.
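
This page does not describe the authors' actual modelling pipeline, but as a rough illustration of the machine-classification task named in the title, the following is a minimal sketch of a generic paralinguistic classifier (emotion, age, or gender from speech) built on MFCC functionals and an SVM. The file names, the comfort/neutral/discomfort label set, and the feature choices are hypothetical placeholders, not the paper's setup.

import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def utterance_features(wav_path, sr=16000, n_mfcc=13):
    # Collapse a variable-length utterance into a fixed-length vector of
    # per-coefficient MFCC means and standard deviations.
    y, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical corpus index of (audio path, emotion label) pairs; a real run
# would enumerate the actual corpus, which is far larger.
train = [("child_001.wav", "comfort"),
         ("child_002.wav", "discomfort"),
         ("child_003.wav", "neutral")]

X = np.stack([utterance_features(path) for path, _ in train])
y = [label for _, label in train]

# Standardise features, then fit an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X, y)

print(clf.predict(utterance_features("child_004.wav")[None, :]))

For credible results on children's speech, evaluation should be speaker-independent (e.g. cross-validation folds grouped by child), since per-speaker traits otherwise leak into the paralinguistic labels.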

Classifications

All classifications fall under section G (Physics). Except for the final entry, which belongs to class G06F (electrical digital data processing), they sit under class G10L (speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding).

    • G10L15/1822: Parsing for meaning understanding (speech classification or search using natural language modelling)
    • G10L17/26: Recognition of special voice characteristics, e.g. for use in lie detectors; recognition of animal voices (speaker identification or verification)
    • G10L25/66: Extracting parameters related to health condition (voice analysis specially adapted for comparison or discrimination)
    • G10L13/033: Voice editing, e.g. manipulating the voice of the synthesiser (speech synthesis)
    • G10L15/07: Adaptation to the speaker (training of speech recognition systems)
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L21/013: Adapting to target pitch (changing voice quality, e.g. pitch or formants)
    • G10L13/06: Elementary speech units used in speech synthesisers; concatenation rules
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L15/02: Feature extraction for speech recognition; selection of recognition unit
    • G06F17/20: Handling natural language data

Similar Documents

Kaya et al. Emotion, age, and gender classification in children’s speech by humans and machines
Fagherazzi et al. Voice for health: the use of vocal biomarkers from research to clinical practice
Swain et al. Databases, features and classifiers for speech emotion recognition: a review
Eni et al. Estimating autism severity in young children from speech signals using a deep neural network
Rao et al. Emotion recognition using speech features
Rohanian et al. Alzheimer's dementia recognition using acoustic, lexical, disfluency and speech pause features robust to noisy inputs
Koolagudi et al. Emotion recognition from speech: a review
Le et al. Automatic quantitative analysis of spontaneous aphasic speech
Anagnostopoulos et al. Features and classifiers for emotion recognition from speech: a survey from 2000 to 2011
Alonso et al. New approach in quantification of emotional intensity from the speech signal: emotional temperature
Lyakso et al. EmoChildRu: emotional child Russian speech corpus
Brahmi et al. Exploring the role of machine learning in diagnosing and treating speech disorders: a systematic literature review
Pérez-Toro et al. Depression assessment in people with Parkinson’s disease: The combination of acoustic features and natural language processing
Leal et al. Speech-based depression assessment: A comprehensive survey
Radha et al. Automatic speaker and age identification of children from raw speech using SincNet over ERB scale
Campbell et al. Alzheimer's Dementia Detection from Audio and Language Modalities in Spontaneous Speech.
Rangra et al. Emotional speech-based personality prediction using NPSO architecture in deep learning
Xin et al. JNV corpus: A corpus of Japanese nonverbal vocalizations with diverse phrases and emotions
Ouyang et al. Individual differences in the prosodic encoding of informativity
Shanmugam et al. Understanding the use of acoustic measurement and Mel Frequency Cepstral Coefficient (MFCC) features for the classification of depression speech
Pravin et al. Acousto-prosodic delineation and classification of speech disfluencies in bilingual children
Themistocleous Computational Language Assessment in patients with speech, language, and communication impairments.
Stasak An investigation of acoustic, linguistic, and affect based methods for speech depression assessment
Feng et al. Acoustic encoding of vocally expressed confidence and doubt in Chinese bidialectics
Pandey Segmental evaluation of Text-to-Speech Synthesizers