Integrated technologies for indexing spoken language
Kubala et al., 2000
- Document ID
- 15955791317035052915
- Author
- Kubala F
- Colbath S
- Liu D
- Srivastava A
- Makhoul J
- Publication year
- 2000
- Publication venue
- Communications of the ACM
Snippet
… powerful enough, we believe the original speech recordings will be valued and preserved for their informative nuance and full context. Rough'n'Ready is the name of a prototype system under development at Bolt, Beranek and Newman (BBN) that …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Kubala et al. | Integrated technologies for indexing spoken language | |
JP3488174B2 (en) | Method and apparatus for retrieving speech information using content information and speaker information | |
Makhoul et al. | Speech and language technologies for audio indexing and retrieval | |
Chelba et al. | Retrieval and browsing of spoken content | |
Lee et al. | Spoken content retrieval—beyond cascading speech recognition with text retrieval | |
Yang et al. | VideoQA: question answering on news video | |
Byrne et al. | Automatic recognition of spontaneous speech for access to multilingual oral history archives | |
Foote | An overview of audio information retrieval | |
Larson et al. | Spoken content retrieval: A survey of techniques and technologies | |
Parada et al. | Query-by-example spoken term detection for OOV terms | |
US20040204939A1 (en) | Systems and methods for speaker change detection | |
Young et al. | Acoustic indexing for multimedia retrieval and browsing | |
Canseco et al. | A comparative study using manual and automatic transcriptions for diarization | |
Gauvain et al. | Transcribing broadcast news for audio and video indexing | |
Chen et al. | Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics | |
Parlak et al. | Performance analysis and improvement of Turkish broadcast news retrieval | |
Moyal et al. | Phonetic search methods for large speech databases | |
Viswanathan et al. | Retrieval from spoken documents using content and speaker information | |
Kubala et al. | Rough'n'Ready: a meeting recorder and browser | |
Yu et al. | Searching the audio notebook: keyword search in recorded conversation | |
Esteve et al. | Extracting true speaker identities from transcriptions | |
Nouza et al. | A system for information retrieval from large records of Czech spoken data | |
Choi et al. | Spoken content-based audio navigation (SCAN) | |
Maskey | Automatic broadcast news speech summarization | |
Viswanathan et al. | Multimedia document retrieval using speech and speaker recognition |