Kubala et al., 2000 - Google Patents

Integrated technologies for indexing spoken language

Document ID
15955791317035052915
Author
Kubala F
Colbath S
Liu D
Srivastava A
Makhoul J
Publication year
2000
Publication venue
Communications of the ACM

Snippet

… powerful enough, we believe the original speech recordings will be valued and preserved for their informative nuance and full context. Rough'n'Ready is the name of a prototype system under development at Bolt, Beranek and Newman (BBN) that …

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30: Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634: Querying
    • G06F17/30657: Query processing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30: Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30781: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F17/30784: Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
    • G06F17/30796: Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L2015/088: Word spotting
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/18: Speech classification or search using natural language modelling
    • G10L15/183: Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30: Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30613: Indexing
    • G06F17/30619: Indexing indexing structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20: Handling natural language data
    • G06F17/27: Automatic analysis, e.g. parsing
    • G06F17/2765: Recognition
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065: Adaptation
    • G10L15/07: Adaptation to the speaker
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99931: Database or file accessing
    • Y10S707/99933: Query processing, i.e. searching

Similar Documents

Publication | Title
Kubala et al. | Integrated technologies for indexing spoken language
JP3488174B2 (en) | Method and apparatus for retrieving speech information using content information and speaker information
Makhoul et al. | Speech and language technologies for audio indexing and retrieval
Chelba et al. | Retrieval and browsing of spoken content
Lee et al. | Spoken content retrieval—beyond cascading speech recognition with text retrieval
Yang et al. | VideoQA: question answering on news video
Byrne et al. | Automatic recognition of spontaneous speech for access to multilingual oral history archives
Foote | An overview of audio information retrieval
Larson et al. | Spoken content retrieval: A survey of techniques and technologies
Parada et al. | Query-by-example spoken term detection for OOV terms
US20040204939A1 (en) | Systems and methods for speaker change detection
Young et al. | Acoustic indexing for multimedia retrieval and browsing
Canseco et al. | A comparative study using manual and automatic transcriptions for diarization
Gauvain et al. | Transcribing broadcast news for audio and video indexing
Chen et al. | Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics
Parlak et al. | Performance analysis and improvement of Turkish broadcast news retrieval
Moyal et al. | Phonetic search methods for large speech databases
Viswanathan et al. | Retrieval from spoken documents using content and speaker information
Kubala et al. | Rough'n'Ready: a meeting recorder and browser
Yu et al. | Searching the audio notebook: keyword search in recorded conversation
Esteve et al. | Extracting true speaker identities from transcriptions
Nouza et al. | A system for information retrieval from large records of Czech spoken data
Choi et al. | Spoken content-based audio navigation (SCAN)
Maskey | Automatic broadcast news speech summarization
Viswanathan et al. | Multimedia document retrieval using speech and speaker recognition