Can, 2018 - Google Patents

Weighted Factor Automata: A Finite-State Framework for Spoken Content Retrieval

Can, 2018

Document ID: 7294676800436940614
Author: Can D
Publication year: 2018

External Links

Cited by

Snippet

Abstract Spoken Content Retrieval (SCR) integrates Automatic Speech Recognition (ASR) and Information Retrieval (IR) to provide access to large multimedia archives based on their contents. There are several tasks of varying difficulty that fall under the SCR umbrella …

Continue reading at search.proquest.com (other versions)

238000004422 calculation algorithm 0 abstract description 107

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/085—Methods for reducing search complexity, pruning
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification

Similar Documents

Publication	Publication Date	Title
Can et al.	2011	Lattice indexing for spoken term detection
Murveit et al.	1993	Large-vocabulary dictation using SRI's DECIPHER speech recognition system: Progressive search techniques
US8700403B2 (en)	2014-04-15	Unified treatment of data-sparseness and data-overfitting in maximum entropy modeling
EP1949260B1 (en)	2017-06-14	Speech index pruning
US7634407B2 (en)	2009-12-15	Method and apparatus for indexing speech
Siu et al.	2000	Variable n-grams and extensions for conversational speech language modeling
US7831425B2 (en)	2010-11-09	Time-anchored posterior indexing of speech
Cox et al.	2002	High-level approaches to confidence estimation in speech recognition
WO2003010754A1 (en)	2003-02-06	Speech input search system
Velikovich et al.	2018	Semantic Lattice Processing in Contextual Automatic Speech Recognition for Google Assistant.
US7401019B2 (en)	2008-07-15	Phonetic fragment search in speech data
Hakkinen et al.	2001	N-gram and decision tree based language identification for written words
US20070118353A1 (en)	2007-05-24	Device, method, and medium for establishing language model
Moyal et al.	2013	Phonetic search methods for large speech databases
JP2011175046A (en)	2011-09-08	Voice search device and voice search method
Can	2018	Weighted Factor Automata: A Finite-State Framework for Spoken Content Retrieval
Turunen et al.	2008	Speech retrieval from unsegmented Finnish audio using statistical morpheme-like units for segmentation, recognition, and retrieval
Sproat et al.	2014	Applications of lexicographic semirings to problems in speech and language processing
Norouzian et al.	2014	An approach for efficient open vocabulary spoken term detection
Lease et al.	2006	A look at parsing and its applications
Mitchell et al.	1999	Improved spelling recognition using a tree-based fast lexical match
Enzell	2022	Domain Adaptation with N-gram Language Models for Swedish Automatic Speech Recognition: Using text data augmentation to create domain-specific n-gram models for a Swedish open-source wav2vec 2.0 model
JP2000267693A (en)	2000-09-29	Voice processor and index preparation device
Singh et al.	2025	BEST-STD: Bidirectional Mamba-Enhanced Speech Tokenization for Spoken Term Detection
Marin	2015	Effective use of cross-domain parsing in automatic speech recognition and error detection