Maison et al., 1999
Audio-visual speaker recognition for video broadcast news: some fusion techniques
- Document ID
- 9353515045771012327
- Author
- Maison B
- Neti C
- Senior A
- Publication year
- 1999
- Publication venue
- 1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No. 99TH8451)
Snippet
Audio-based speaker identification degrades severely when there is a mismatch between training and test conditions either due to channel or noise. In this paper, we explore various techniques to fuse video based speaker identification with audio-based speaker …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00228—Detection; Localisation; Normalisation
- G06K9/00248—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00597—Acquiring or recognising eyes, e.g. iris verification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6567775B1 (en) | | Fusion of audio and video based speaker identification for multimedia information access |
Ben-Yacoub et al. | | Fusion of face and speech data for person identity verification |
Maison et al. | | Audio-visual speaker recognition for video broadcast news: some fusion techniques |
US9159321B2 (en) | | Lip-password based speaker verification system |
Zhang et al. | | Character identification in feature-length films using global face-name matching |
Horiguchi et al. | | Face-voice matching using cross-modal embeddings |
JP2001092974A (en) | | Speaker recognizing method, device for executing the same, method and device for confirming audio generation |
Ding et al. | | Learning relations among movie characters: A social network perspective |
Liu et al. | | Learning multi-boosted HMMs for lip-password based speaker verification |
JP2006500858A (en) | | Enhanced commercial detection via synthesized video and audio signatures |
Jiang et al. | | Multimodal biometric human recognition for perceptual human–computer interaction |
Senior | | Recognizing faces in broadcast video |
Pouthier et al. | | Active speaker detection as a multi-objective optimization with uncertainty-based multimodal fusion |
CN102509548B (en) | | Audio indexing method based on multi-distance sound sensor |
Nefian et al. | | A Bayesian approach to audio-visual speaker identification |
Neti et al. | | Audio-visual speaker recognition for video broadcast news |
Lathoud et al. | | Short-term spatio–temporal clustering applied to multiple moving speakers |
Luque et al. | | Audio, video and multimodal person identification in a smart room |
Lin et al. | | Meta-classification: Combining multimodal classifiers |
Friedland et al. | | Speaker recognition and diarization |
Maison et al. | | Audio-visual speaker recognition for video broadcast news |
Ou et al. | | Cascade AdaBoost classifiers with stage optimization for face detection |
Haller et al. | | Audiovisual anchorperson detection for topic-oriented navigation in broadcast news |
El Khoury | | Unsupervised video indexing based on audiovisual characterization of persons |
Anantharajah et al. | | Robust automatic face clustering in news video |