Bregler et al., 1997 - Google Patents

Video rewrite: visual speech synthesis from video.

Document ID
18032396023307736555
Author
Bregler C
Covell M
Slaney M
Publication year
1997
Publication venue
AVSP

Snippet

Video Rewrite uses existing footage to automatically create new video of a person mouthing words that she did not speak in the original footage. Video Rewrite uses computer-vision techniques to track points on the speaker's mouth in the training footage, and morphing …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00288 Classification, e.g. identification
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G11B27/036 Insert-editing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
    • G10L2021/105 Synthesis of the lips movements from speech, e.g. for talking heads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Similar Documents

Publication | Publication Date | Title
Bregler et al. | | Video rewrite: Driving visual speech with audio
US7990384B2 (en) | | Audio-visual selection process for the synthesis of photo-realistic talking-head animations
Wang et al. | | One-shot talking face generation from single-speaker audio-visual correlation learning
CN113192161B (en) | | Virtual human image video generation method, system, device and storage medium
Cosatto et al. | | Sample-based synthesis of photo-realistic talking heads
US5880788A (en) | | Automated synchronization of video image sequences to new soundtracks
Wang et al. | | Audio2head: Audio-driven one-shot talking-head generation with natural head motion
Chen | | Audiovisual speech processing
US6492990B1 (en) | | Method for the automatic computerized audio visual dubbing of movies
US8655152B2 (en) | | Method and system of presenting foreign films in a native language
US7027054B1 (en) | | Do-it-yourself photo realistic talking head creation system and method
US7109993B2 (en) | | Method and system for the automatic computerized audio visual dubbing of movies
US7133535B2 (en) | | System and method for real time lip synchronization
US7168953B1 (en) | | Trainable videorealistic speech animation
AU2006352758A1 (en) | | Talking Head Creation System and Method
CN117237521A (en) | | Speech driving face generation model construction method and target person speaking video generation method
Bregler et al. | | Video rewrite: visual speech synthesis from video.
US20100057455A1 (en) | | Method and System for 3D Lip-Synch Generation with Data-Faithful Machine Learning
Zhou et al. | | An image-based visual speech animation system
US20080221904A1 (en) | | Coarticulation method for audio-visual text-to-speech synthesis
Cosatto et al. | | Audio-visual unit selection for the synthesis of photo-realistic talking-heads
Ostermann et al. | | Talking faces-technologies and applications
Fang et al. | | Audio-to-Deep-Lip: Speaking lip synthesis based on 3D landmarks
Cosker et al. | | Video realistic talking heads using hierarchical non-linear speech-appearance models
Bailly et al. | | Lip-synching using speaker-specific articulation, shape and appearance models