Guinaudeau et al., 2009

Can automatic speech transcripts be used for large scale TV stream description and structuring?

Document ID: 1373585849646447391
Author: Guinaudeau C; Gravier G; Sébillot P
Publication year: 2009
Publication venue: 2009 11th IEEE International Symposium on Multimedia

Snippet

The increasing quantity of TV material requires methods to help users navigate such data streams. Automatically associating a short textual description with each program in a stream is a first step toward navigation or structuring tasks. Speech contained in TV broadcasts …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G06F17/30023—Querying
- G06F17/30038—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
- G06F17/30041—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings using location information
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/102—Programmed access in sequence to addressed parts of tracks of operating record carriers
- G11B27/105—Programmed access in sequence to addressed parts of tracks of operating record carriers of operating discs
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30817—Information retrieval; Database structures therefor; File system structures therefor of video data using information manually generated or using information not derived from the video content, e.g. time and location information, usage information, user ratings
- G06F17/3082—Information retrieval; Database structures therefor; File system structures therefor of video data using information manually generated or using information not derived from the video content, e.g. time and location information, usage information, user ratings using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30749—Audio data retrieval using information manually generated or using information not derived from the audio data, e.g. title and artist information, time and location information, usage information, user ratings
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30755—Query formulation specially adapted for audio data retrieval
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording

Similar Documents

Publication	Publication Date	Title
Hauptmann et al.	1997	Informedia: News-on-demand multimedia information acquisition and retrieval
US6345252B1 (en)	2002-02-05	Methods and apparatus for retrieving audio information using content and speaker information
Foote	1999	An overview of audio information retrieval
Makhoul et al.	2000	Speech and language technologies for audio indexing and retrieval
Chelba et al.	2008	Retrieval and browsing of spoken content
Chaudhuri et al.	2011	Unsupervised Learning of Acoustic Unit Descriptors for Audio Content Representation and Classification
US20070106646A1 (en)	2007-05-10	User-directed navigation of multimedia search results
US7761298B1 (en)	2010-07-20	Document expansion in speech retrieval
US20070208561A1 (en)	2007-09-06	Method and apparatus for searching multimedia data using speech recognition in mobile device
US20090043581A1 (en)	2009-02-12	Methods and apparatus relating to searching of spoken audio data
Wechsler et al.	1998	New techniques for open-vocabulary spoken document retrieval
Kubala et al.	2000	Integrated technologies for indexing spoken language
Lecouteux et al.	2012	Integrating imperfect transcripts into speech recognition systems for building high-quality corpora
Akbacak et al.	2008	Open-vocabulary spoken term detection using graphone-based hybrid recognition systems
Witbrock et al.	1997	Speech recognition and information retrieval: Experiments in retrieving spoken documents
Turunen et al.	2007	Indexing confusion networks for morph-based spoken document retrieval
Witbrock et al.	1998	Speech recognition for a digital video library
Ogata et al.	2007	Automatic transcription for a web 2.0 service to search podcasts
Logan et al.	2002	Confusion-based query expansion for OOV words in spoken document retrieval
Wang	2000	Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese
Viswanathan et al.	1999	Retrieval from spoken documents using content and speaker information
Yu et al.	2005	Searching the audio notebook: keyword search in recorded conversation
Wechsler et al.	2000	New approaches to spoken document retrieval
Mamou et al.	2008	Combination of multiple speech transcription methods for vocabulary independent search