Lee et al., 2015 - Google Patents

Unsupervised lexicon discovery from acoustic input

Lee et al., 2015

Document ID: 12688544128695762753
Author: Lee C; O’donnell T; Glass J
Publication year: 2015
Publication venue: Transactions of the Association for Computational Linguistics

External Links

Cited by

Snippet

We present a model of unsupervised phonological lexicon discovery—the problem of simultaneously learning phoneme-like and word-like units from acoustic input. Our model builds on earlier models of unsupervised phone-like unit discovery from acoustic data (Lee …

Continue reading at direct.mit.edu (PDF) (other versions)

239000011521 glass 0 abstract description 18

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification

Similar Documents

Publication	Publication Date	Title
Lee et al.	2015	Unsupervised lexicon discovery from acoustic input
Harwath et al.	2016	Unsupervised learning of spoken language with visual context
Jansen et al.	2013	A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition
Glass	2012	Towards unsupervised speech processing
Taniguchi et al.	2016	Nonparametric bayesian double articulation analyzer for direct language acquisition from continuous speech signals
Cui et al.	2013	Developing speech recognition systems for corpus indexing under the IARPA Babel program
CN112151015A (en)	2020-12-29	Keyword detection method and device, electronic equipment and storage medium
Lal et al.	2013	Cross-lingual automatic speech recognition using tandem features
Ahmed et al.	2019	End-to-end lexicon free Arabic speech recognition using recurrent neural networks
Kumar et al.	2019	A comprehensive view of automatic speech recognition system-a systematic literature review
Oh et al.	2007	Acoustic model adaptation based on pronunciation variability analysis for non-native speech recognition
Kumar et al.	2021	AutoSSR: an efficient approach for automatic spontaneous speech recognition model for the Punjabi Language: Y. Kumar et al.
Prabhavalkar et al.	2013	Discriminative articulatory models for spoken term detection in low-resource conversational settings
Walter et al.	2013	Unsupervised word discovery from phonetic input using nested pitman-yor language modeling
Okuda et al.	2022	Double articulation analyzer with prosody for unsupervised word and phone discovery
Mabokela et al.	2013	An integrated language identification for code-switched speech using decoded-phonemes and support vector machine
Lee	2014	Discovering linguistic structures in speech: Models and applications
Rao et al.	2015	Language identification using excitation source features
Kipyatkova et al.	2012	Analysis of long-distance word dependencies and pronunciation variability at conversational Russian speech recognition
Tombaloğlu et al.	2020	Deep learning based automatic speech recognition for Turkish
Zheng et al.	2021	Speech emotion recognition based on acoustic segment model
Vetter et al.	2016	Unsupervised Phoneme Segmentation of Previously Unseen Languages.
Swingley et al.	2024	Computational Modeling of the Segmentation of Sentence Stimuli From an Infant Word‐Finding Study
Röpke et al.	2019	Training a Speech-to-Text Model for Dutch on the Corpus Gesproken Nederlands.
Bates et al.	2007	Symbolic phonetic features for modeling of pronunciation variation