An effective news anchorperson shot detection method based on adaptive audio/visual model generation
Kim et al., 2005
- Document ID
- 10913623694278216809
- Author
- Kim S
- Hwang D
- Kim J
- Seo Y
- Publication year
- 2005
- Publication venue
- International Conference on Image and Video Retrieval
Snippet
This paper proposes a multi-modal method that improves anchorperson shot detection for news story segmentation. Anchorperson voice information is used to verify the anchorperson shot candidates extracted from visual information …
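The two-stage idea in the snippet (visual candidates first, audio verification second) can be illustrated with a minimal sketch. The snippet does not state the paper's features, models, or thresholds, so everything named here is an assumption for illustration only: the `Shot` record, the mean-vector "voice model", the cosine similarity score, and the `threshold` value are all hypothetical stand-ins, not the authors' method.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Shot:
    shot_id: int
    visual_candidate: bool        # hypothetical flag set by some visual detector
    audio_features: List[float]   # hypothetical per-shot voice feature vector


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_voice_model(candidates: List[Shot]) -> List[float]:
    """Assumed voice model: the mean audio feature vector of the visual candidates."""
    dim = len(candidates[0].audio_features)
    return [
        sum(s.audio_features[i] for s in candidates) / len(candidates)
        for i in range(dim)
    ]


def verify_candidates(shots: List[Shot], threshold: float = 0.8) -> List[int]:
    """Keep only visual candidates whose audio resembles the assumed voice model."""
    candidates = [s for s in shots if s.visual_candidate]
    if not candidates:
        return []
    model = build_voice_model(candidates)
    return [
        s.shot_id
        for s in candidates
        if cosine_similarity(s.audio_features, model) >= threshold
    ]


if __name__ == "__main__":
    shots = [
        Shot(0, True, [1.0, 0.0]),
        Shot(1, True, [0.9, 0.1]),
        Shot(2, True, [0.0, 1.0]),   # looks like an anchor shot, sounds different
        Shot(3, False, [1.0, 0.0]),  # never proposed by the visual stage
    ]
    print(verify_candidates(shots))
```

In this toy input, shot 2 passes the visual stage but is rejected by the audio check, which is the kind of false alarm the verification step is meant to remove.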
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00711—Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US7336890B2 (en) | Automatic detection and segmentation of music videos in an audio/video stream | |
Chung et al. | Learning to lip read words by watching videos | |
Li et al. | Video content analysis using multimodal information: For movie content extraction, indexing and representation | |
US10304458B1 (en) | Systems and methods for transcribing videos using speaker identification | |
US7982797B2 (en) | Detecting blocks of commercial content in video data | |
US7555149B2 (en) | Method and system for segmenting videos using face detection | |
EP1959393B1 (en) | Computer implemented method for detecting scene boundaries in videos | |
US20040143434A1 (en) | Audio-Assisted segmentation and browsing of news videos | |
KR100763899B1 (en) | Anchor shot detection method and apparatus | |
JP2004520756A (en) | Method for segmenting and indexing TV programs using multimedia cues | |
Zhang et al. | Detecting sound events in basketball video archive | |
KR20050087987A (en) | A apparatus and method for deciding anchor shot. | |
JP5257356B2 (en) | Content division position determination device, content viewing control device, and program | |
Kim et al. | An effective news anchorperson shot detection method based on adaptive audio/visual model generation | |
Hoashi et al. | Shot Boundary Determination on MPEG Compressed Domain and Story Segmentation Experiments for TRECVID 2004. | |
CN116916089B (en) | Intelligent video editing method integrating voice features and face features | |
Kyperountas et al. | Enhanced eigen-audioframes for audiovisual scene change detection | |
US11983923B1 (en) | Systems and methods for active speaker detection | |
Chaloupka | A prototype of audio-visual broadcast transcription system | |
Masneri et al. | SVM-based video segmentation and annotation of lectures and conferences | |
Kim et al. | An effective anchorperson shot extraction method robust to false alarms | |
WO2007036892A1 (en) | Method and apparatus for long term memory model in face detection and recognition | |
Li | Content-based video analysis, indexing and representation using multimodal information | |
Wang et al. | Automatic segmentation of news items based on video and audio features | |
Albiol et al. | Video preprocessing for audiovisual indexing |