Kubala et al., 2000

Integrated technologies for indexing spoken language

Document ID: 15955791317035052915
Author: Kubala F; Colbath S; Liu D; Srivastava A; Makhoul J
Publication year: 2000
Publication venue: Communications of the ACM

Snippet

… powerful enough, we believe the original speech recordings will be valued and preserved for their informative nuance and full context. Rough'n'Ready is the name of a prototype system under development at Bolt, Beranek and Newman (BBN) that …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching

Similar Documents

Publication	Publication Date	Title
JP3488174B2 (en)	2004-01-19	Method and apparatus for retrieving speech information using content information and speaker information
Makhoul et al.	2000	Speech and language technologies for audio indexing and retrieval
Chelba et al.	2008	Retrieval and browsing of spoken content
Lee et al.	2015	Spoken content retrieval—beyond cascading speech recognition with text retrieval
Yang et al.	2003	VideoQA: question answering on news video
Byrne et al.	2004	Automatic recognition of spontaneous speech for access to multilingual oral history archives
Foote	1999	An overview of audio information retrieval
Larson et al.	2012	Spoken content retrieval: A survey of techniques and technologies
Parada et al.	2009	Query-by-example spoken term detection for OOV terms
US20040204939A1 (en)	2004-10-14	Systems and methods for speaker change detection
Young et al.	1997	Acoustic indexing for multimedia retrieval and browsing
Canseco et al.	2005	A comparative study using manual and automatic transcriptions for diarization
Gauvain et al.	2000	Transcribing broadcast news for audio and video indexing
Chen et al.	2000	Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics
Parlak et al.	2011	Performance analysis and improvement of Turkish broadcast news retrieval
Moyal et al.	2013	Phonetic search methods for large speech databases
Viswanathan et al.	1999	Retrieval from spoken documents using content and speaker information
Kubala et al.	1999	Rough'n'Ready: a meeting recorder and browser
Yu et al.	2005	Searching the audio notebook: keyword search in recorded conversation
Esteve et al.	2007	Extracting true speaker identities from transcriptions
Nouza et al.	2006	A system for information retrieval from large records of Czech spoken data
Choi et al.	1999	Spoken content-based audio navigation (SCAN)
Maskey	2008	Automatic broadcast news speech summarization
Viswanathan et al.	2000	Multimedia document retrieval using speech and speaker recognition