Hauptmann et al., 1997 - Google Patents
Informedia: News-on-demand multimedia information acquisition and retrievalHauptmann et al., 1997
View PDF- Document ID
- 1810615531150680491
- Author
- Hauptmann A
- Witbrock M
- Publication year
- Publication venue
- Intelligent multimedia information retrieval
External Links
Snippet
In theory, speech recognition technology can make any spoken words in video or audio media subject to text indexing, search and retrieval. This article describes the News-on- Demand application created within the Informedia™ Digital Video Library project and …
- 230000011218 segmentation 0 abstract description 21
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30817—Information retrieval; Database structures therefor; File system structures therefor of video data using information manually generated or using information not derived from the video content, e.g. time and location information, usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G06F17/30023—Querying
- G06F17/30038—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30837—Query results presentation or summarisation specifically adapted for the retrieval of video data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/102—Programmed access in sequence to addressed parts of tracks of operating record carriers
- G11B27/105—Programmed access in sequence to addressed parts of tracks of operating record carriers of operating discs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hauptmann et al. | Informedia: News-on-demand multimedia information acquisition and retrieval | |
Makhoul et al. | Speech and language technologies for audio indexing and retrieval | |
Foote | An overview of audio information retrieval | |
EP0786114B1 (en) | Method and apparatus for creating a searchable digital video library | |
Hauptmann et al. | Text, speech, and vision for video segmentation: The informedia tm project | |
US7292979B2 (en) | Time ordered indexing of audio data | |
EP1692629B1 (en) | System & method for integrative analysis of intrinsic and extrinsic audio-visual data | |
US20030065655A1 (en) | Method and apparatus for detecting query-driven topical events using textual phrases on foils as indication of topic | |
Wactlar et al. | Informedia tm: News-on-demand experiments in speech recognition | |
Hauptmann | Speech recognition in the informedia digital video library: uses and limitations | |
Witbrock et al. | Speech recognition for a digital video library | |
Wilcox et al. | Annotation and segmentation for multimedia indexing and retrieval | |
Bouamrane et al. | Meeting browsing: State-of-the-art review | |
Wactlar et al. | Informedia News-on Demand: Using speech recognition to create a digital video library | |
Amir et al. | Search the audio, browse the video—a generic paradigm for video collections | |
Moreno et al. | From multimedia retrieval to knowledge management | |
Hauptmann et al. | Informedia news-on-demand: Using speech recognition to create a digital video library | |
Zhu et al. | Video browsing and retrieval based on multimodal integration | |
Nouza et al. | Large-scale processing, indexing and search system for Czech audio-visual cultural heritage archives | |
Nouza et al. | A system for information retrieval from large records of Czech spoken data | |
Viswanathan et al. | Multimedia document retrieval using speech and speaker recognition | |
Foote et al. | Enhanced video browsing using automatically extracted audio excerpts | |
Meng et al. | Spoken document retrieval for the languages of Hong Kong | |
Wang et al. | The SoVideo Mandarin Chinese broadcast news retrieval system | |
Lo et al. | Multi-scale spoken document retrieval for Cantonese broadcast news |