Rate-distortion analysis of discrete-HMM pose estimation via multiaspect scattering data
Dong et al., 2003
- Document ID
- 2197346491601134084
- Author
- Dong Y
- Carin L
- Publication year
- 2003
- Publication venue
- IEEE transactions on pattern analysis and machine intelligence
Snippet
We consider the problem of estimating the pose of a target based on a sequence of scattered waveforms measured at multiple target-sensor orientations. Using a hidden Markov model (HMM) representation of the scattered-waveform sequence, pose estimation …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6296—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US6954745B2 (en) | Signal processing system | |
| Saul et al. | Mixed memory markov models: Decomposing complex stochastic processes as mixtures of simpler ones | |
| EP1488411B1 (en) | System for estimating parameters of a gaussian mixture model (gmm), or a gmm based hidden markov model | |
| AU649029B2 (en) | Method for spectral estimation to improve noise robustness for speech recognition | |
| US6466908B1 (en) | System and method for training a class-specific hidden Markov model using a modified Baum-Welch algorithm | |
| Bietti et al. | An online EM algorithm in hidden (semi-) Markov models for audio segmentation and clustering | |
| Afshan et al. | Improved subject-independent acoustic-to-articulatory inversion | |
| Zweig | Bayesian network structures and inference techniques for automatic speech recognition | |
| Ejbali et al. | Wavelet network for recognition system of Arabic word | |
| JP2004004906A (en) | Speaker and environment adaptation method including maximum likelihood method based on eigenvoice | |
| Dong et al. | Rate-distortion analysis of discrete-HMM pose estimation via multiaspect scattering data | |
| Li et al. | A Convolutional Neural Network with Non-Local Module for Speech Enhancement. | |
| Chung et al. | Mf-pam: Accurate pitch estimation through periodicity analysis and multi-level feature fusion | |
| Amrouche et al. | Efficient system for speech recognition using general regression neural network | |
| Lung | Improved wavelet feature extraction using kernel analysis for text independent speaker recognition | |
| JP2982689B2 (en) | Standard pattern creation method using information criterion | |
| Cipli et al. | Multi-class acoustic event classification of hydrophone data | |
| da Silva et al. | Speaker-independent embedded speech recognition using Hidden Markov Models | |
| Abdelaziz | Turbo Decoders for Audio-Visual Continuous Speech Recognition. | |
| Turrisi et al. | Improving generalization of vocal tract feature reconstruction: from augmented acoustic inversion to articulatory feature reconstruction without articulatory data | |
| Orphanidou et al. | Voice morphing using the generative topographic mapping | |
| Baggenstoss | A multi-resolution hidden markov model using class-specific features | |
| Rojas | Statistics and neural networks | |
| Nix et al. | Maximum-likelihood continuity mapping (MALCOM): An alternative to HMMs | |
| Rufiner et al. | Auditory cortical representations of speech signals for phoneme classification |