Matassoni et al., 1999 - Google Patents

Some results on the development of a hands-free speech recognizer for carenvironment

Matassoni et al., 1999

Document ID: 12610537013204429208
Author: Matassoni M; Omologo M; Cristoforetti L; Giuliani D; Svaizer P; Trentin E; Zovato E; et al.
Publication year: 1999
Publication venue: Proc. of ASRU

External Links

Cited by

Snippet

This paper describes some activities being conducted at IRST with the aim of developing a technology for hands-free speech recognition in car environment. This technology is based on Hidden Markov Models and is being developed and evaluated by using the car database …

Continue reading at www.researchgate.net (PDF) (other versions)

230000018109 developmental process 0 title description 3

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis

Similar Documents

Publication	Publication Date	Title
US6418411B1 (en)	2002-07-09	Method and system for adaptive speech recognition in a noisy environment
US7630878B2 (en)	2009-12-08	Speech recognition with language-dependent model vectors
US7319960B2 (en)	2008-01-15	Speech recognition method and system
Woo et al.	2006	The MIT mobile device speaker verification corpus: data collection and preliminary experiments
Boril et al.	2009	Unsupervised equalization of Lombard effect for speech recognition in noisy adverse environments
Junqua	2002	Robust speech recognition in embedded systems and PC applications
US20080281593A1 (en)	2008-11-13	Apparatus for Reducing Spurious Insertions in Speech Recognition
US20020178004A1 (en)	2002-11-28	Method and apparatus for voice recognition
Aggarwal et al.	2013	Performance evaluation of sequentially combined heterogeneous feature streams for Hindi speech recognition system
Chen et al.	2004	Recognition of noisy speech using dynamic spectral subband centroids
US6658385B1 (en)	2003-12-02	Method for transforming HMMs for speaker-independent recognition in a noisy environment
Pucher et al.	2007	Phonetic distance measures for speech recognition vocabulary and grammar optimization
Rajnoha et al.	2011	ASR systems in noisy environment: Analysis and solutions for increasing noise robustness
Fischer et al.	1999	Database and online adaptation for improved speech recognition in car environments
Matassoni et al.	1999	Some results on the development of a hands-free speech recognizer for carenvironment
Manjunath et al.	2014	Automatic phonetic transcription for read, extempore and conversation speech for an Indian language: Bengali
EP0782127A2 (en)	1997-07-02	A time-varying feature space preprocessing procedure for telephone based speech recognition
Paliwal et al.	2010	Robust speech recognition under noisy ambient conditions
Matassoni et al.	2000	Hands-free speech recognition using a filtered clean corpus and incremental HMM adaptation
Matassoni et al.	2002	On the joint use of noise reduction and MLLR adaptation for in-car hands-free speech recognition
Sun et al.	2006	Experiments on Chinese speech recognition with tonal models and pitch estimation using the Mandarin speecon data.
Beaufays et al.	2003	Using speech/non-speech detection to bias recognition search on noisy data
US6377924B1 (en)	2002-04-23	Method of enrolling phone-based speaker specific commands
Matassoni et al.	2001	Use of real and contaminated speech for training of a hands-free in-car speech recognizer.
Seltzer et al.	2005	Training wideband acoustic models using mixed-bandwidth training data via feature bandwidth extension