Kim et al., 2005 - Google Patents

An effective news anchorperson shot detection method based on adaptive audio/visual model generation

Document ID: 10913623694278216809
Authors: Kim S, Hwang D, Kim J, Seo Y
Publication year: 2005
Publication venue: International Conference on Image and Video Retrieval

Snippet

This paper proposes a multimodal method that improves anchorperson shot detection for news story segmentation. The anchorperson's voice information is used to verify the anchorperson shot candidates extracted from visual information …
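
The verification idea in the snippet can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Shot` fields, the cosine-similarity score, the fixed `voice_model` vector, and the `threshold` value are all assumptions made for illustration. The sketch only shows the general pattern of keeping a visually detected candidate when its audio also matches an anchorperson voice model.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List


@dataclass
class Shot:
    index: int
    is_visual_candidate: bool    # hypothetical output of a visual detector
    audio_feature: List[float]   # hypothetical per-shot voice feature vector


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_candidates(shots: List[Shot], voice_model: List[float],
                      threshold: float = 0.9) -> List[int]:
    """Return indices of visual candidates whose audio matches the voice model.

    The similarity measure and threshold are illustrative assumptions.
    """
    return [
        s.index
        for s in shots
        if s.is_visual_candidate
        and cosine_similarity(s.audio_feature, voice_model) >= threshold
    ]


# Toy data: shot 1 looks like an anchor shot but has a different voice.
shots = [
    Shot(0, True, [1.0, 0.0]),
    Shot(1, True, [0.0, 1.0]),    # visual false alarm, rejected by audio
    Shot(2, False, [1.0, 0.0]),   # never a visual candidate
    Shot(3, True, [0.9, 0.1]),
]
verified = verify_candidates(shots, voice_model=[1.0, 0.0])
```

In this toy setup the audio check acts purely as a filter on the visual candidates, so it can remove false alarms but cannot recover shots that the visual stage missed.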

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00624 Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
    • G06K9/00711 Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Similar Documents

Publication, Title
US7336890B2 (en) Automatic detection and segmentation of music videos in an audio/video stream
Chung et al. Learning to lip read words by watching videos
Li et al. Video content analysis using multimodal information: For movie content extraction, indexing and representation
US10304458B1 (en) Systems and methods for transcribing videos using speaker identification
US7982797B2 (en) Detecting blocks of commercial content in video data
US7555149B2 (en) Method and system for segmenting videos using face detection
EP1959393B1 (en) Computer implemented method for detecting scene boundaries in videos
US20040143434A1 (en) Audio-Assisted segmentation and browsing of news videos
KR100763899B1 (en) Anchor shot detection method and apparatus
JP2004520756A (en) Method for segmenting and indexing TV programs using multimedia cues
Zhang et al. Detecting sound events in basketball video archive
KR20050087987A (en) A apparatus and method for deciding anchor shot.
JP5257356B2 (en) Content division position determination device, content viewing control device, and program
Kim et al. An effective news anchorperson shot detection method based on adaptive audio/visual model generation
Hoashi et al. Shot Boundary Determination on MPEG Compressed Domain and Story Segmentation Experiments for TRECVID 2004.
CN116916089B (en) Intelligent video editing method integrating voice features and face features
Kyperountas et al. Enhanced eigen-audioframes for audiovisual scene change detection
US11983923B1 (en) Systems and methods for active speaker detection
Chaloupka A prototype of audio-visual broadcast transcription system
Masneri et al. SVM-based video segmentation and annotation of lectures and conferences
Kim et al. An effective anchorperson shot extraction method robust to false alarms
WO2007036892A1 (en) Method and apparatus for long term memory model in face detection and recognition
Li Content-based video analysis, indexing and representation using multimodal information
Wang et al. Automatic segmentation of news items based on video and audio features
Albiol et al. Video preprocessing for audiovisual indexing