Flores-Carballo et al., 2023 - Google Patents
Speaker identification in interactions between mothers and children with Down syndrome via audio analysis: A case study in Mexico
- Document ID
- 13982023164021749985
- Authors
- Flores-Carballo C
- Molina-Arenas G
- Macias A
- Caro K
- Beltran J
- Castro L
- Publication year
- 2023
- Publication venue
- International Journal of Human–Computer Interaction
Snippet
In this work, we aim at identifying the speaker in interactions between mothers and children with Down syndrome (DS) using audio. We collected audio from a session in which children with DS solved puzzles, and their mothers were by their side. We generated a dataset by …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/06—Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
- G09B5/065—Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B19/00—Teaching not covered by other main groups of this subclass
- G09B19/06—Foreign languages
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B21/00—Teaching, or communicating with, the blind, deaf or mute
- G09B21/009—Teaching or communicating with deaf persons
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B21/00—Teaching, or communicating with, the blind, deaf or mute
- G09B21/001—Teaching or communicating with blind persons
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B7/00—Electrically-operated teaching apparatus or devices working with questions and answers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
Similar Documents
Publication | Title
---|---
Narayanan et al. | Behavioral signal processing: Deriving human behavioral informatics from speech and language
Schuller et al. | Computational paralinguistics: emotion, affect and personality in speech and language processing
Zou et al. | Semi-structural interview-based Chinese multimodal depression corpus towards automatic preliminary screening of depressive disorders
US9819711B2 (en) | Online social interaction, education, and health care by analysing affect and cognitive features
McKechnie et al. | Automated speech analysis tools for children’s speech production: A systematic literature review
Robillard et al. | Monolingual and bilingual children with and without primary language impairment: Core vocabulary comparison
Arjmandi et al. | Variability in quantity and quality of early linguistic experience in children with cochlear implants: Evidence from analysis of natural auditory environments
Lee et al. | An engineering view on emotions and speech: From analysis and predictive models to responsible human-centered applications
Flores-Carballo et al. | Speaker identification in interactions between mothers and children with Down syndrome via audio analysis: A case study in Mexico
Kalanadhabhatta et al. | Playlogue: Dataset and benchmarks for analyzing adult-child conversations during play
Jaisharma et al. | An automated model for child language impairment prediction using hybrid optimal BiLSTM
Holm et al. | Introducing Jarrod: A child with a phonological impairment
de Seyssel | Unsupervised multilingual models of speech representation, an approach inspired by cognitive science
Vollebregt | A multimodal approach to working alliance detection in therapist-patient psychotherapy using deep learning models
Shulga et al. | Toward explainable automatic classification of children’s speech disorders
Campbell et al. | Comparing language input in homes of young blind and sighted children: Insights from daylong recordings
Laskowski | Predicting, detecting and explaining the occurrence of vocal activity in multi-party conversation
Laycock | The influence of socioindexical information on the speech perception-production link: Evidence from a shadowing task
Abbas | Improving Arabic Sign Language to support communication between vehicle drivers and passengers from deaf people
Marpaung | Context-Centric Affect Recognition From Paralinguistic Features of Speech
Riad | Automatic speech and language processing for precision medicine in Huntington’s disease
Forjó et al. | Contributions towards the possible identification of psychosis through speech processing in Portuguese
Ma | Multimodal Modeling of Collaborative Learning with Adaptive Data Fusion
Kim et al. | Age-related changes in segmental accuracy and error production in Korean-English bilingual children: implications for clinical speech sound assessment procedures
Alencar et al. | English time at Núcleo Saber Down: Study on adaptation of codas filled with occlusive consonants