Kaya et al., 2017 - Google Patents
Emotion, age, and gender classification in children's speech by humans and machines
- Document ID
- 8811920000706917527
- Authors
- Kaya H
- Salah A
- Karpov A
- Frolova O
- Grigorev A
- Lyakso E
- Publication year
- 2017
- Publication venue
- Computer Speech & Language
Snippet
In this article, we present the first child emotional speech corpus in Russian, called “EmoChildRu”, collected from 3 to 7 years old children. The base corpus includes over 20 K recordings (approx. 30 h), collected from 120 children. Audio recordings are carried out in …
Classifications
All codes below fall under G—Physics; most fall under G10L—Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding.
- G10L15/1822—Speech recognition; speech classification or search using natural language modelling; parsing for meaning understanding
- G10L17/26—Speaker identification or verification; recognition of special voice characteristics, e.g. for use in lie detectors; recognition of animal voices
- G10L25/66—Speech or voice analysis techniques specially adapted for extracting parameters related to health condition
- G10L13/033—Speech synthesis; voice editing, e.g. manipulating the voice of the synthesiser
- G10L15/07—Speech recognition; training and adaptation to the speaker
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L21/013—Changing voice quality, e.g. pitch or formants; adapting to target pitch
- G10L13/06—Elementary speech units used in speech synthesisers; concatenation rules
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme-to-phoneme translation, prosody generation, or stress or intonation determination
- G10L15/02—Feature extraction for speech recognition; selection of recognition unit
- G06F17/20—Digital data processing; handling natural language data
Similar Documents
| Publication | Title |
|---|---|
| Kaya et al. | Emotion, age, and gender classification in children's speech by humans and machines |
| Fagherazzi et al. | Voice for health: the use of vocal biomarkers from research to clinical practice |
| Swain et al. | Databases, features and classifiers for speech emotion recognition: a review |
| Eni et al. | Estimating autism severity in young children from speech signals using a deep neural network |
| Rao et al. | Emotion recognition using speech features |
| Rohanian et al. | Alzheimer's dementia recognition using acoustic, lexical, disfluency and speech pause features robust to noisy inputs |
| Koolagudi et al. | Emotion recognition from speech: a review |
| Le et al. | Automatic quantitative analysis of spontaneous aphasic speech |
| Anagnostopoulos et al. | Features and classifiers for emotion recognition from speech: a survey from 2000 to 2011 |
| Alonso et al. | New approach in quantification of emotional intensity from the speech signal: emotional temperature |
| Lyakso et al. | EmoChildRu: emotional child Russian speech corpus |
| Brahmi et al. | Exploring the role of machine learning in diagnosing and treating speech disorders: a systematic literature review |
| Pérez-Toro et al. | Depression assessment in people with Parkinson's disease: the combination of acoustic features and natural language processing |
| Leal et al. | Speech-based depression assessment: a comprehensive survey |
| Radha et al. | Automatic speaker and age identification of children from raw speech using SincNet over ERB scale |
| Campbell et al. | Alzheimer's dementia detection from audio and language modalities in spontaneous speech |
| Rangra et al. | Emotional speech-based personality prediction using NPSO architecture in deep learning |
| Xin et al. | JNV corpus: a corpus of Japanese nonverbal vocalizations with diverse phrases and emotions |
| Ouyang et al. | Individual differences in the prosodic encoding of informativity |
| Shanmugam et al. | Understanding the use of acoustic measurement and Mel Frequency Cepstral Coefficient (MFCC) features for the classification of depression speech |
| Pravin et al. | Acousto-prosodic delineation and classification of speech disfluencies in bilingual children |
| Themistocleous | Computational language assessment in patients with speech, language, and communication impairments |
| Stasak | An investigation of acoustic, linguistic, and affect based methods for speech depression assessment |
| Feng et al. | Acoustic encoding of vocally expressed confidence and doubt in Chinese bidialectics |
| Pandey | Segmental evaluation of text-to-speech synthesizers |