Maison et al., 1999 - Google Patents

Audio-visual speaker recognition for video broadcast news: some fusion techniques

Document ID: 9353515045771012327
Authors: Maison B, Neti C, Senior A
Publication year: 1999
Publication venue: 1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No. 99TH8451)

Snippet

Audio-based speaker identification degrades severely when there is a mismatch between training and test conditions either due to channel or noise. In this paper, we explore various techniques to fuse video based speaker identification with audio-based speaker …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6267 Classification techniques
    • G06K9/6268 Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00228 Detection; Localisation; Normalisation
    • G06K9/00248 Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00288 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00597 Acquiring or recognising eyes, e.g. iris verification

Similar Documents

Publication, Title
US6567775B1 (en) Fusion of audio and video based speaker identification for multimedia information access
Ben-Yacoub et al. Fusion of face and speech data for person identity verification
Maison et al. Audio-visual speaker recognition for video broadcast news: some fusion techniques
US9159321B2 (en) Lip-password based speaker verification system
Zhang et al. Character identification in feature-length films using global face-name matching
Horiguchi et al. Face-voice matching using cross-modal embeddings
JP2001092974A (en) Speaker recognizing method, device for executing the same, method and device for confirming audio generation
Ding et al. Learning relations among movie characters: A social network perspective
Liu et al. Learning multi-boosted HMMs for lip-password based speaker verification
JP2006500858A (en) Enhanced commercial detection via synthesized video and audio signatures
Jiang et al. Multimodal biometric human recognition for perceptual human–computer interaction
Senior Recognizing faces in broadcast video
Pouthier et al. Active speaker detection as a multi-objective optimization with uncertainty-based multimodal fusion
CN102509548B (en) Audio indexing method based on multi-distance sound sensor
Nefian et al. A Bayesian approach to audio-visual speaker identification
Neti et al. Audio-visual speaker recognition for video broadcast news
Lathoud et al. Short-term spatio–temporal clustering applied to multiple moving speakers
Luque et al. Audio, video and multimodal person identification in a smart room
Lin et al. Meta-classification: Combining multimodal classifiers
Friedland et al. Speaker recognition and diarization
Maison et al. Audio-visual speaker recognition for video broadcast news
Ou et al. Cascade AdaBoost classifiers with stage optimization for face detection
Haller et al. Audiovisual anchorperson detection for topic-oriented navigation in broadcast news
El Khoury Unsupervised video indexing based on audiovisual characterization of persons
Anantharajah et al. Robust automatic face clustering in news video