Video rewrite: visual speech synthesis from video. Bregler et al., 1997
- Document ID
- 18032396023307736555
- Author
- Bregler C
- Covell M
- Slaney M
- Publication year
- 1997
- Publication venue
- AVSP
Snippet
Video Rewrite uses existing footage to automatically create new video of a person mouthing words that she did not speak in the original footage. Video Rewrite uses computer-vision techniques to track points on the speaker's mouth in the training footage, and morphing …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
- G11B27/036—Insert-editing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
- G10L2021/105—Synthesis of the lips movements from speech, e.g. for talking heads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bregler et al. | | Video rewrite: Driving visual speech with audio |
US7990384B2 (en) | | Audio-visual selection process for the synthesis of photo-realistic talking-head animations |
Wang et al. | | One-shot talking face generation from single-speaker audio-visual correlation learning |
CN113192161B (en) | | Virtual human image video generation method, system, device and storage medium |
Cosatto et al. | | Sample-based synthesis of photo-realistic talking heads |
US5880788A (en) | | Automated synchronization of video image sequences to new soundtracks |
Wang et al. | | Audio2head: Audio-driven one-shot talking-head generation with natural head motion |
Chen | | Audiovisual speech processing |
US6492990B1 (en) | | Method for the automatic computerized audio visual dubbing of movies |
US8655152B2 (en) | | Method and system of presenting foreign films in a native language |
US7027054B1 (en) | | Do-it-yourself photo realistic talking head creation system and method |
US7109993B2 (en) | | Method and system for the automatic computerized audio visual dubbing of movies |
US7133535B2 (en) | | System and method for real time lip synchronization |
US7168953B1 (en) | | Trainable videorealistic speech animation |
AU2006352758A1 (en) | | Talking Head Creation System and Method |
CN117237521A (en) | | Speech driving face generation model construction method and target person speaking video generation method |
Bregler et al. | | Video rewrite: visual speech synthesis from video. |
US20100057455A1 (en) | | Method and System for 3D Lip-Synch Generation with Data-Faithful Machine Learning |
Zhou et al. | | An image-based visual speech animation system |
US20080221904A1 (en) | | Coarticulation method for audio-visual text-to-speech synthesis |
Cosatto et al. | | Audio-visual unit selection for the synthesis of photo-realistic talking-heads |
Ostermann et al. | | Talking faces-technologies and applications |
Fang et al. | | Audio-to-Deep-Lip: Speaking lip synthesis based on 3D landmarks |
Cosker et al. | | Video realistic talking heads using hierarchical non-linear speech-appearance models |
Bailly et al. | | Lip-synching using speaker-specific articulation, shape and appearance models |