Guinaudeau et al., 2009 - Google Patents
Can automatic speech transcripts be used for large scale TV stream description and structuring?Guinaudeau et al., 2009
View PDF- Document ID
- 1373585849646447391
- Author
- Guinaudeau C
- Gravier G
- Sébillot P
- Publication year
- Publication venue
- 2009 11th IEEE International Symposium on Multimedia
External Links
Snippet
The increasing quantity of TV material requires methods to help users navigate such data streams. Automatically associating a short textual description to each program in a stream, is a first stage to navigating or structuring tasks. Speech contained in TV broadcasts …
- 239000000463 material 0 abstract description 7
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G06F17/30023—Querying
- G06F17/30038—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
- G06F17/30041—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings using location information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/102—Programmed access in sequence to addressed parts of tracks of operating record carriers
- G11B27/105—Programmed access in sequence to addressed parts of tracks of operating record carriers of operating discs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30817—Information retrieval; Database structures therefor; File system structures therefor of video data using information manually generated or using information not derived from the video content, e.g. time and location information, usage information, user ratings
- G06F17/3082—Information retrieval; Database structures therefor; File system structures therefor of video data using information manually generated or using information not derived from the video content, e.g. time and location information, usage information, user ratings using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30749—Audio data retrieval using information manually generated or using information not derived from the audio data, e.g. title and artist information, time and location information, usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30755—Query formulation specially adapted for audio data retrieval
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hauptmann et al. | Informedia: News-on-demand multimedia information acquisition and retrieval | |
US6345252B1 (en) | Methods and apparatus for retrieving audio information using content and speaker information | |
Foote | An overview of audio information retrieval | |
Makhoul et al. | Speech and language technologies for audio indexing and retrieval | |
Chelba et al. | Retrieval and browsing of spoken content | |
Chaudhuri et al. | Unsupervised Learning of Acoustic Unit Descriptors for Audio Content Representation and Classification. | |
US20070106646A1 (en) | User-directed navigation of multimedia search results | |
US7761298B1 (en) | Document expansion in speech retrieval | |
US20070208561A1 (en) | Method and apparatus for searching multimedia data using speech recognition in mobile device | |
US20090043581A1 (en) | Methods and apparatus relating to searching of spoken audio data | |
Wechsler et al. | New techniques for open-vocabulary spoken document retrieval | |
Kubala et al. | Integrated technologies for indexing spoken language | |
Lecouteux et al. | Integrating imperfect transcripts into speech recognition systems for building high-quality corpora | |
Akbacak et al. | Open-vocabulary spoken term detection using graphone-based hybrid recognition systems | |
Witbrock et al. | Speech recognition and information retrieval: Experiments in retrieving spoken documents | |
Turunen et al. | Indexing confusion networks for morph-based spoken document retrieval | |
Witbrock et al. | Speech recognition for a digital video library | |
Ogata et al. | Automatic transcription for a web 2.0 service to search podcasts. | |
Logan et al. | Confusion-based query expansion for OOV words in spoken document retrieval. | |
Wang | Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese | |
Viswanathan et al. | Retrieval from spoken documents using content and speaker information | |
Guinaudeau et al. | Can automatic speech transcripts be used for large scale TV stream description and structuring? | |
Yu et al. | Searching the audio notebook: keyword search in recorded conversation | |
Wechsler et al. | New approaches to spoken document retrieval | |
Mamou et al. | Combination of multiple speech transcription methods for vocabulary independent search |