Bregler et al., 1997 - Google Patents

Video rewrite: visual speech synthesis from video.

Document ID: 18032396023307736555
Author: Bregler C; Covell M; Slaney M
Publication year: 1997
Publication venue: AVSP

Snippet

Video Rewrite uses existing footage to automatically create new video of a person mouthing words that she did not speak in the original footage. Video Rewrite uses computer-vision techniques to track points on the speaker's mouth in the training footage, and morphing …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
          - G06K9/00288—Classification, e.g. identification
- G—PHYSICS
  - G11—INFORMATION STORAGE
    - G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
      - G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
        - G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
          - G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
            - G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
          - G06K9/00268—Feature extraction; Face representation
            - G06K9/00281—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
- G—PHYSICS
  - G11—INFORMATION STORAGE
    - G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
      - G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
        - G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
          - G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
            - G11B27/036—Insert-editing
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
          - G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
            - G10L2021/105—Synthesis of the lips movements from speech, e.g. for talking heads
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
      - G06T13/00—Animation
        - G06T13/20—3D [Three Dimensional] animation
          - G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Similar Documents

Publication	Publication Date	Title
Bregler et al.	2023	Video rewrite: Driving visual speech with audio
US7990384B2 (en)	2011-08-02	Audio-visual selection process for the synthesis of photo-realistic talking-head animations
Wang et al.	2022	One-shot talking face generation from single-speaker audio-visual correlation learning
CN113192161B (en)	2022-10-18	Virtual human image video generation method, system, device and storage medium
Cosatto et al.	1998	Sample-based synthesis of photo-realistic talking heads
US5880788A (en)	1999-03-09	Automated synchronization of video image sequences to new soundtracks
Wang et al.	2021	Audio2head: Audio-driven one-shot talking-head generation with natural head motion
Chen	2001	Audiovisual speech processing
US6492990B1 (en)	2002-12-10	Method for the automatic computerized audio visual dubbing of movies
US8655152B2 (en)	2014-02-18	Method and system of presenting foreign films in a native language
US7027054B1 (en)	2006-04-11	Do-it-yourself photo realistic talking head creation system and method
US7109993B2 (en)	2006-09-19	Method and system for the automatic computerized audio visual dubbing of movies
US7133535B2 (en)	2006-11-07	System and method for real time lip synchronization
US7168953B1 (en)	2007-01-30	Trainable videorealistic speech animation
AU2006352758A1 (en)	2008-12-24	Talking Head Creation System and Method
CN117237521A (en)	2023-12-15	Speech driving face generation model construction method and target person speaking video generation method
Bregler et al.	1997	Video rewrite: visual speech synthesis from video.
US20100057455A1 (en)	2010-03-04	Method and System for 3D Lip-Synch Generation with Data-Faithful Machine Learning
Zhou et al.	2012	An image-based visual speech animation system
US20080221904A1 (en)	2008-09-11	Coarticulation method for audio-visual text-to-speech synthesis
Cosatto et al.	2000	Audio-visual unit selection for the synthesis of photo-realistic talking-heads
Ostermann et al.	2004	Talking faces-technologies and applications
Fang et al.	2024	Audio-to-Deep-Lip: Speaking lip synthesis based on 3D landmarks
Cosker et al.	2003	Video realistic talking heads using hierarchical non-linear speech-appearance models
Bailly et al.	2009	Lip-synching using speaker-specific articulation, shape and appearance models