Kim et al., 2005

An effective news anchorperson shot detection method based on adaptive audio/visual model generation

Document ID: 10913623694278216809
Authors: Kim S; Hwang D; Kim J; Seo Y
Publication year: 2005
Publication venue: International Conference on Image and Video Retrieval

Snippet

This paper proposes a multi-modal method that improves anchorperson shot detection for news story segmentation. Anchorperson voice information is used to verify the anchorperson shot candidates extracted from visual information …
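
The two-stage idea in the snippet (visual candidate extraction followed by voice-based verification) can be illustrated with a minimal sketch. This is not the paper's implementation: the `Shot` fields, the cosine-similarity check, and both thresholds are hypothetical stand-ins chosen for illustration, since the snippet does not describe the actual audio/visual models.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List


@dataclass
class Shot:
    shot_id: int
    visual_score: float          # hypothetical: match against a visual anchor model
    voice_features: List[float]  # hypothetical: averaged audio feature vector


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def detect_anchor_shots(
    shots: List[Shot],
    anchor_voice: List[float],
    visual_threshold: float = 0.5,  # assumed value, not from the paper
    voice_threshold: float = 0.8,   # assumed value, not from the paper
) -> List[int]:
    # Stage 1: keep shots that look like anchorperson shots visually.
    candidates = [s for s in shots if s.visual_score >= visual_threshold]
    # Stage 2: verify each candidate against the anchorperson voice model.
    return [
        s.shot_id
        for s in candidates
        if cosine_similarity(s.voice_features, anchor_voice) >= voice_threshold
    ]


shots = [
    Shot(1, 0.9, [1.0, 0.0]),  # passes both stages
    Shot(2, 0.9, [0.0, 1.0]),  # visual candidate, rejected by voice check
    Shot(3, 0.2, [1.0, 0.0]),  # rejected at the visual stage
]
verified = detect_anchor_shots(shots, anchor_voice=[1.0, 0.1])
```

In this sketch the audio stage only removes false alarms from the visual stage; it never adds shots that the visual stage missed.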

Classifications

- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00711—Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals

Similar Documents

Publication	Publication Date	Title
US7336890B2 (en)	2008-02-26	Automatic detection and segmentation of music videos in an audio/video stream
Chung et al.	2018	Learning to lip read words by watching videos
Li et al.	2013	Video content analysis using multimodal information: For movie content extraction, indexing and representation
US10304458B1 (en)	2019-05-28	Systems and methods for transcribing videos using speaker identification
US7982797B2 (en)	2011-07-19	Detecting blocks of commercial content in video data
US7555149B2 (en)	2009-06-30	Method and system for segmenting videos using face detection
EP1959393B1 (en)	2011-09-21	Computer implemented method for detecting scene boundaries in videos
US20040143434A1 (en)	2004-07-22	Audio-Assisted segmentation and browsing of news videos
KR100763899B1 (en)	2007-10-05	Anchor shot detection method and apparatus
JP2004520756A (en)	2004-07-08	Method for segmenting and indexing TV programs using multimedia cues
Zhang et al.	2001	Detecting sound events in basketball video archive
KR20050087987A (en)	2005-09-01	A apparatus and method for deciding anchor shot.
JP5257356B2 (en)	2013-08-07	Content division position determination device, content viewing control device, and program
Kim et al.	2005	An effective news anchorperson shot detection method based on adaptive audio/visual model generation
Hoashi et al.	2004	Shot Boundary Determination on MPEG Compressed Domain and Story Segmentation Experiments for TRECVID 2004.
CN116916089B (en)	2024-09-20	Intelligent video editing method integrating voice features and face features
Kyperountas et al.	2007	Enhanced eigen-audioframes for audiovisual scene change detection
US11983923B1 (en)	2024-05-14	Systems and methods for active speaker detection
Chaloupka	2019	A prototype of audio-visual broadcast transcription system
Masneri et al.	2014	SVM-based video segmentation and annotation of lectures and conferences
Kim et al.	2004	An effective anchorperson shot extraction method robust to false alarms
WO2007036892A1 (en)	2007-04-05	Method and apparatus for long term memory model in face detection and recognition
Li	2003	Content-based video analysis, indexing and representation using multimodal information
Wang et al.	2001	Automatic segmentation of news items based on video and audio features
Albiol et al.	2002	Video preprocessing for audiovisual indexing