Speech Master: Natural Language Processing and Deep Learning Approach for Automated Speech Evaluation
Kooragama et al., 2021
- Document ID
- 14700378329183007268
- Author
- Kooragama K
- Jayashanka L
- Munasinghe J
- Jayawardana K
- Tissera M
- Buddhika T
- Publication year
- 2021
- Publication venue
- 2021 IEEE 12th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON)
Snippet
Every English speaker wishes to sharpen his or her public speaking skills. However, doing so is extremely difficult and requires a significant amount of individual practice and experience. This paper introduces a novel online tool “Speech Master” to practice and …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Macary et al. | | On the use of self-supervised pre-trained acoustic and linguistic features for continuous speech emotion recognition |
| CN111833853B (en) | | Voice processing method and device, electronic equipment and computer readable storage medium |
| US8843372B1 (en) | | Natural conversational technology system and method |
| Pan et al. | | Spanish MEACorpus 2023: A multimodal speech–text corpus for emotion analysis in Spanish from natural environments |
| Macary et al. | | AlloSat: A new call center French corpus for satisfaction and frustration analysis |
| US20230298615A1 (en) | | System and method for extracting hidden cues in interactive communications |
| Kopparapu | | Non-linguistic analysis of call center conversations |
| CN110675292A (en) | | Child language ability evaluation method based on artificial intelligence |
| Schmitt et al. | | Towards adaptive spoken dialog systems |
| Shah et al. | | First workshop on speech processing for code-switching in multilingual communities: Shared task on code-switched spoken language identification |
| Scholten et al. | | Learning to recognise words using visually grounded speech |
| Dyriv et al. | | The user's psychological state identification based on Big Data analysis for person's electronic diary |
| Sergidou et al. | | Frequent-words analysis for forensic speaker comparison |
| Pérez-Espinosa et al. | | Using acoustic paralinguistic information to assess the interaction quality in speech-based systems for elderly users |
| Bu et al. | | Roadmap towards superhuman speech understanding using large language models |
| Tomokiyo | | Recognizing non-native speech: characterizing and adapting to non-native usage in LVCSR |
| Shirali-Shahreza et al. | | Better replacement for TTS naturalness evaluation |
| Jiao et al. | | Objective intelligibility assessment by automated segmental and suprasegmental listening error analysis |
| Ward et al. | | A collection of pragmatic-similarity judgments over spoken dialog utterances |
| Wu et al. | | Aligning spoken dialogue models from user interactions |
| Kooragama et al. | | Speech Master: Natural Language Processing and Deep Learning Approach for Automated Speech Evaluation |
| Johnson et al. | | An analysis of large language models for African American English speaking children’s oral language assessment |
| Ollerenshaw et al. | | Empirical interpretation of the relationship between speech acoustic context and emotion recognition |
| Zhang et al. | | Multi‐feature intelligent oral English error correction based on few‐shot learning technology |
| Rohanian | | Multimodal Assessment of Cognitive Decline: Applications in Alzheimer’s Disease and Depression |