WO2017177363A1 - Methods and apparatuses for face hallucination - Google Patents

Methods and apparatuses for face hallucination

Info

Publication number
WO2017177363A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
hallucination
trained model
dense
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2016/078960
Other languages
English (en)
Inventor
Xiaoou Tang
Shizhan ZHU
Cheng Li
Chen Change Loy
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sensetime Group Ltd
Original Assignee
Sensetime Group Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sensetime Group Ltd filed Critical Sensetime Group Ltd
Priority to PCT/CN2016/078960 priority Critical patent/WO2017177363A1/fr
Priority to CN201680084409.3A priority patent/CN109313795B/zh
Publication of WO2017177363A1 publication Critical patent/WO2017177363A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/73Deblurring; Sharpening
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/20Image enhancement or restoration using local operators
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/169Holistic features and representations, i.e. based on the facial image taken as a whole
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Definitions

  • the disclosure relates to image processing, in particular, to methods and apparatus for face hallucination.
  • Face hallucination is a task that improves the resolution of facial images and provides a viable means for improving low-resolution face processing and analysis, e.g., person identification in surveillance videos and facial image enhancement.
  • a method for face hallucination comprises: estimating a dense correspondence field based on a first image and a trained model; executing face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; and updating the first image with the second image, wherein the steps of estimating, executing and updating are performed repeatedly until the obtained second image has a desired resolution or the steps of estimating, executing and updating have been repeated a predetermined number of times.
  • an apparatus for face hallucination which comprises: an estimation unit configured to estimate a dense correspondence field based on a first image and a trained model; and a hallucination unit configured to execute face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; wherein the first image is iteratively updated with the second image, and the estimation unit and the hallucination unit work for a predetermined number of iterations or until the obtained second image has a desired resolution.
  • a device for face hallucination which comprises a processor and a memory storing computer-readable instructions, wherein, when the instructions are executed by the processor, the processor is operable to: estimate a dense correspondence field based on a first image and a trained model; execute face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; and update the first image with the second image, wherein the first image is iteratively updated with the second image for a predetermined number of iterations or until the obtained second image has a desired resolution.
  • a nonvolatile storage medium containing computer-readable instructions, wherein, when the instructions are executed by a processor, the processor is operable to: estimate a dense correspondence field based on a first image and a trained model; execute face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; and update the first image with the second image, wherein the first image is iteratively updated with the second image for a predetermined number of iterations or until the obtained second image has a desired resolution.
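By way of illustration, the estimate/execute/update loop shared by the method, apparatus, device and storage medium above can be sketched in code. This is a minimal sketch under stated assumptions, not the disclosed implementation: `estimate_dense_field` and `hallucinate` are hypothetical placeholders standing in for the trained estimation unit and the bi-network, and `model`, `desired_size` and `max_iterations` are illustrative names.

```python
import numpy as np
import cv2

def estimate_dense_field(image, model, cascade):
    # Placeholder for the estimation unit (sketched further below);
    # returns a zero displacement field of shape (H, W, 2).
    h, w = image.shape[:2]
    return np.zeros((h, w, 2), dtype=np.float32)

def hallucinate(image, dense_field, model, cascade):
    # Placeholder for the bi-network g_k; here it only upsamples the
    # image bicubically so that the loop is runnable end to end.
    return cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)

def face_hallucination(first_image, model, desired_size, max_iterations=4):
    """Iterative estimate -> hallucinate -> update loop of the method."""
    image = first_image
    for k in range(1, max_iterations + 1):
        # estimate a dense correspondence field from the current image
        field = estimate_dense_field(image, model, cascade=k)
        # execute face hallucination through the bi-network
        second_image = hallucinate(image, field, model, cascade=k)
        # update the first image with the second image
        image = second_image
        # stop once the obtained image has the desired resolution
        if image.shape[0] >= desired_size[0] and image.shape[1] >= desired_size[1]:
            break
    return image
```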
  • Fig. 1 is a flow chart of a method for face hallucination according to an embodiment of the present disclosure.
  • Fig. 2 illustrates an apparatus for face hallucination according to an embodiment of the present disclosure.
  • Fig. 3 illustrates a flow chart of the training process of the estimation unit according to an embodiment of the present application.
  • Fig. 4 illustrates a flow chart of the testing process of the estimation unit according to an embodiment of the present application.
  • Fig. 5 illustrates a flow chart of the training process of the hallucination unit according to an embodiment of the present application.
  • Fig. 6 illustrates a flow chart of the testing process of the hallucination unit according to an embodiment of the present application.
  • Fig. 7 is a structural schematic diagram of an embodiment of computer equipment provided by the present invention.
  • a method for face hallucination is provided.
  • Fig. 1 is a flow chart of a method 100 for face hallucination according to an embodiment of the present disclosure.
  • an apparatus 200 for face hallucination is provided.
  • Fig. 2 illustrates an apparatus 200 for face hallucination according to an embodiment of the present disclosure.
  • the apparatus 200 may comprise an estimation unit 201 and a hallucination unit 202.
  • at step S101, a dense correspondence field is estimated by an estimation unit 201 based on an input first image 10 and parameters from a trained model 20.
  • the first image input into the estimation unit may be a facial image with a low resolution.
  • the dense correspondence field indicates the correspondence or mapping relationship of the first image to a warped image and denotes the warping of each pixel from the first image to the warped image.
  • the trained model contains various parameters that may be used for the estimation of the dense correspondence field.
  • at step S102, face hallucination is executed by the hallucination unit 202 based on the first image 10 and the estimated dense correspondence field to obtain a second image 30.
  • the second image obtained after the face hallucination on the first image usually has a resolution higher than the first image.
  • the hallucination unit 202 is a bi-network which comprises a first branch 2021 being a common branch for face hallucination and a second branch 2022 being a high-frequency branch.
  • the processing in the common branch is similar to the face hallucination in the prior art.
  • the estimated dense correspondence field and parameters from the trained model 20 are further considered in addition to the input image 10.
  • the results obtained from both branches are incorporated through a gate network 2023 to obtain the second image 30.
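The fusion performed by the gate network 2023 can be made concrete with the following sketch. The branch functions are hypothetical placeholders (the disclosure does not specify their layers); only the pixel-wise gating rule, a learned per-pixel soft mask blending the two branch outputs, is illustrated, and the sigmoid gate is an assumption.

```python
import numpy as np

def common_branch(upscaled):
    # placeholder: recovers detail detectable in the low-resolution input
    return upscaled

def high_frequency_branch(upscaled, warped_prior):
    # placeholder: injects detail guided by the warped high-frequency prior
    return upscaled + warped_prior

def gate_network(upscaled):
    # placeholder: predicts a per-pixel logit for fusing the two branches
    return np.zeros_like(upscaled)

def bi_network(upscaled, warped_prior):
    """Fuse the two branch outputs with a pixel-wise gate (a sketch)."""
    common = common_branch(upscaled)
    high_freq = high_frequency_branch(upscaled, warped_prior)
    gate = 1.0 / (1.0 + np.exp(-gate_network(upscaled)))  # sigmoid in [0, 1]
    return gate * high_freq + (1.0 - gate) * common
```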
  • at step S103, the first image is updated with the second image, so that the second image is used as an input to the estimation unit 201 in the next iteration.
  • the steps S101 to S103 are performed repeatedly.
  • the steps may be performed repeatedly until the obtained second image has a desired image resolution.
  • alternatively, the steps may be performed a predefined number of times.
  • the facial image may be denoted as a matrix I, and each pixel in the image may be denoted as x with coordinates (x, y) .
  • a mean face template for the facial image may be denoted as M, which comprises a plurality of pixels z.
  • the warping function W(z) may be determined based on a deformation coefficient p and a deformation base B(z), which may be denoted as
    W(z) = B(z) p   (1)
  • the deformation base B(z) is pre-defined and shared by all samples, and thus the warping function is actually controlled by the deformation coefficient p for each sample.
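A small numeric sketch of equation (1): once the shared bases are stacked into an array, the dense field at every template pixel z is a linear function of the per-sample deformation coefficients p. All shapes below (number of pixels, number of bases) are illustrative assumptions, not values from the disclosure.

```python
import numpy as np

num_pixels = 64 * 64   # pixels z of the mean face template M (assumed size)
num_bases = 10         # number of shared deformation bases (assumed)

# B(z) stacked over all z: one 2-D displacement per pixel and per basis,
# pre-defined and shared by all samples.
B = np.random.randn(num_pixels, 2, num_bases)

# p: the per-sample deformation coefficient vector.
p = np.random.randn(num_bases)

# Equation (1), W(z) = B(z) p, evaluated for every z at once:
W = B @ p              # shape (num_pixels, 2): the warping of each pixel
```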
  • the deformation coefficients are updated at each iteration k via regression:
    p_k ← p_{k-1} + R_k (φ_k − φ̄_k)   (2)
  • f_k is a Gauss-Newton descent regressor learned and stored in the trained model for predicting the dense correspondence field coefficients.
  • the regressor f_k may further be represented by a Gauss-Newton steepest descent regression matrix R_k, which is obtained by training.
  • φ_k is the shape-indexed feature that concatenates the local appearance from all L landmarks, and φ̄_k is its average over all the training samples.
  • in one implementation, the dense correspondence field coefficients are estimated based on each pixel in the image.
  • in another implementation, the dense correspondence field coefficients are estimated based on landmarks in the image, since using a sparse set of facial landmarks is more robust and accurate under low resolution.
  • a landmark base S k (l) is further considered in the estimation.
  • two sets of deformation bases, i.e., the deformation base B_k(z) for the dense field and the landmark base S_k(l) for the landmarks, are obtained, where l is the landmark index.
  • the bases for the dense field and the landmarks are one-to-one related, i.e., both B_k(z) and S_k(l) share the same deformation coefficients p_k.
  • the common branch conservatively recovers texture details that are only detectable from the low-resolution input, which is similar to the general super resolution.
  • the high-frequency branch super-resolves faces with the additional high-frequency prior warped by the estimated face correspondence field in the current cascade. Thanks to the guidance of the prior, this branch is capable of recovering and synthesizing texture details that are not revealed in the overly low-resolution input image.
  • a pixel-wise gate network is learned to fuse the results from the two branches.
  • the first image is upscaled and then input to the hallucination unit.
  • the upscaled image is input to both the common branch and the high-frequency branch.
  • in the common branch, the upscaled image is processed adaptively, for example, by bicubic interpolation.
  • in the high-frequency branch, the estimated dense correspondence field is further input, and the upscaled image is processed based on the estimated dense correspondence field.
  • the results from both branches are combined in a gate network to obtain the second image.
  • the processing in the common branch is not limited to the bicubic interpolation, but may be any suitable process for the face hallucination.
  • the image I_k obtained at the k-th iteration is given by:
    I_k ← I_{k-1} + g_k(↑I_{k-1}; W_k(z))   (4)
    where ↑ denotes upsampling by bicubic interpolation.
  • g k represents a hallucination bi-network learned and stored in the trained model for face hallucination.
  • the coefficients of g_k are obtained by training.
  • both the estimation unit and the hallucination unit may have a testing mode and a training mode.
  • the method 100 as shown in Fig. 1 illustrates the working process of the estimation unit and hallucination unit in the testing mode.
  • the estimation unit and hallucination unit may perform a training process to obtain and store parameters required in the testing mode into the trained model.
  • the estimation unit and the hallucination unit having both a testing mode and a training mode are described as an example.
  • the training process and the testing process may be performed by separate apparatus or separate units.
  • Fig. 3 illustrates a flow chart of the training process 300 of the estimation unit according to an embodiment of the present application. As shown, at step S301, the dense bases B_k(z), the landmark bases S_k(l) and the appearance eigenvectors Φ_k are obtained.
  • the dense bases B k (z) and the landmark bases S k (l) are stored into the trained model for later use.
  • at step S302, the average project-out Jacobian J_k is learned, for example, by minimizing the following least-squares loss over the training samples:
    J_k = argmin_J Σ_i ‖(φ_k^(i) − φ̄_k) − J (p_*^(i) − p_k^(i))‖²   (3)
  • φ_k^(i) is the shape-indexed feature that concatenates the local appearance from all L landmarks of the i-th training sample, φ̄_k is its average over all the training samples, p_*^(i) is the ground-truth deformation coefficient vector, and p_k^(i) is the current estimate.
  • at step S303, the Gauss-Newton steepest descent regression matrix R_k is calculated by:
    R_k = (J_kᵀ J_k)⁻¹ J_kᵀ
  • the process 300 may further include steps S304 and S305.
  • at step S304, the deformation coefficients for both the correspondence training set and the hallucination training set are updated.
  • at step S305, the dense correspondence field for each location z is calculated for the hallucination training set. The deformation coefficients and the dense correspondence field obtained at steps S304 and S305 may be used in the later training process.
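To make steps S302 and S303 concrete, the sketch below learns the average Jacobian and the regression matrix by ordinary least squares, matching equations (2)-(3) above. The ridge term and the array layout are assumptions added for numerical stability and clarity; they are not part of the disclosure.

```python
import numpy as np

def train_cascade_regressor(phi, phi_bar, p_star, p_cur, ridge=1e-3):
    """Learn J_k (step S302) and R_k (step S303) for one cascade (sketch).

    phi:     (N, D) shape-indexed features of the N training samples
    phi_bar: (D,)   their average over all training samples
    p_star:  (N, P) ground-truth deformation coefficients
    p_cur:   (N, P) current coefficient estimates
    """
    d_phi = phi - phi_bar          # feature residuals
    d_p = p_star - p_cur           # coefficient residuals
    eye_p = np.eye(d_p.shape[1])
    # step S302: average Jacobian from the least-squares problem (3),
    # d_phi ≈ d_p @ J.T (ridge added for stability; an assumption)
    J = np.linalg.solve(d_p.T @ d_p + ridge * eye_p, d_p.T @ d_phi).T
    # step S303: Gauss-Newton steepest descent matrix R_k = (J^T J)^-1 J^T
    R = np.linalg.solve(J.T @ J + ridge * eye_p, J.T)
    return J, R
```

At test time, R is applied exactly as in equation (2): `p_new = p + R @ (phi - phi_bar)`.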
  • Fig. 4 illustrates a flow chart of the testing process 400 of the estimation unit according to an embodiment of the present application.
  • the location of each landmark is obtained from the facial image input to the estimation unit.
  • the input image is the original low-resolution image in the first iteration.
  • in the k-th iteration (k > 1), the input image is the image obtained in the (k-1)-th iteration, and the deformation coefficients obtained in the (k-1)-th iteration are also input.
  • the location of each landmark in the input image is obtained.
  • the SIFT feature around the location of each landmark is extracted.
  • the SIFT feature is the shape-indexed feature described above.
  • the features from all the landmarks are concatenated into the shape-indexed feature vector φ_k described above.
  • the deformation coefficients are updated via regression according to the equation (2) .
  • the dense correspondence field for each location z is computed.
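The steps of the testing process 400 can be summarized in code. The sketch below assumes a `model` dictionary holding the trained quantities (R_k, φ̄_k and the stacked dense bases B_k); the dictionary layout and the SIFT patch size are illustrative assumptions, and OpenCV's SIFT is used only as a stand-in for the shape-indexed feature extractor.

```python
import numpy as np
import cv2

def estimation_step(image, landmarks, p, model, k):
    """One testing pass of the estimation unit (process 400, sketch).

    image:     8-bit grayscale face image of the current iteration
    landmarks: (L, 2) array of landmark locations in the image
    p:         deformation coefficients from the previous iteration
    model:     assumed dict with keys "R", "phi_bar", "B" indexed by k
    """
    # obtain the SIFT feature around the location of each landmark
    sift = cv2.SIFT_create()
    keypoints = [cv2.KeyPoint(float(x), float(y), 8.0) for x, y in landmarks]
    _, descriptors = sift.compute(image, keypoints)
    # concatenate the features from all landmarks into phi_k
    phi = descriptors.reshape(-1)
    # update the deformation coefficients via regression, equation (2)
    p = p + model["R"][k] @ (phi - model["phi_bar"][k])
    # compute the dense correspondence field for each z, W(z) = B(z) p
    W = model["B"][k] @ p
    return p, W
```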
  • Fig. 5 illustrates a flow chart of the training process 500 of the hallucination unit according to an embodiment of the present application.
  • images from the training sets are upsampled by bicubic interpolation.
  • the warped high-frequency prior is obtained according to the dense correspondence field.
  • the deep bi-network is trained with three steps: pre-training the common sub-network, pre-training the high-frequency sub-network, and tuning the whole bi-network end-to-end.
  • the bi-network coefficient may be stored in the trained model.
  • a forward pass of the bi-network may then be performed to compute the predicted image for both the hallucination training set and the estimation training set.
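The three training steps can be written down as a schedule. Everything here is a placeholder sketch: `pretrain` and `finetune` stand for ordinary supervised training against the high-resolution targets, whose details the disclosure does not spell out, and `net` is assumed to expose its two sub-networks as attributes.

```python
def pretrain(sub_network, *training_data):
    # placeholder: supervised pre-training of one sub-network
    pass

def finetune(network, *training_data):
    # placeholder: joint end-to-end tuning of both branches and the gate
    pass

def train_bi_network(net, hallucination_set, warped_priors):
    """Three-step training schedule for the deep bi-network (sketch)."""
    pretrain(net.common, hallucination_set)                         # step 1
    pretrain(net.high_frequency, hallucination_set, warped_priors)  # step 2
    finetune(net, hallucination_set, warped_priors)                 # step 3
    return net
```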
  • Fig. 6 illustrates a flow chart of the testing process 600 of the hallucination unit according to an embodiment of the present application.
  • an input image I_{k-1} is upsampled by bicubic interpolation to obtain an upsampled image ↑I_{k-1}.
  • the warped high-frequency prior is obtained according to the dense correspondence field.
  • the learned bi-network coefficients g_k are used in a forward pass of the deep bi-network with the two inputs ↑I_{k-1} and the warped high-frequency prior, so that the image I_k is obtained.
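The testing pass of the hallucination unit, including equation (4), can be sketched as follows. Here `g_k` is treated as a callable bi-network (for example the `bi_network` sketch shown earlier), the convention that the dense field stores absolute (x, y) sampling coordinates for `cv2.remap` is an assumption, and the residual is added at the upsampled resolution so that the array shapes match.

```python
import numpy as np
import cv2

def hallucination_step(I_prev, prior, W, g_k):
    """One testing pass of the hallucination unit (process 600, sketch).

    I_prev: image I_{k-1} from the previous iteration
    prior:  high-frequency prior image
    W:      dense correspondence field as absolute (x, y) coordinates
    g_k:    learned bi-network for cascade k, a callable
    """
    # upsample the input by bicubic interpolation: up = ↑I_{k-1}
    up = cv2.resize(I_prev, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
    # warp the high-frequency prior according to the dense field
    map_x = W[..., 0].astype(np.float32)
    map_y = W[..., 1].astype(np.float32)
    warped_prior = cv2.remap(prior, map_x, map_y, cv2.INTER_LINEAR)
    # forward pass with the two inputs, then add the residual, equation (4)
    return up + g_k(up, warped_prior)
```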
  • Algorithm 1 is an exemplary training algorithm for learning the parameters by the apparatus according to an embodiment of the present application.
  • Algorithm 2 is an exemplary testing algorithm for hallucinating a low-resolution face according to an embodiment of the present application.
  • Fig. 7 is a structural schematic diagram of an embodiment of computer equipment provided by the present invention.
  • the computer equipment can be used for implementing the face hallucination method provided in the above embodiments.
  • the computer equipment may vary greatly in configuration and performance, and may include one or more processors (e.g., central processing units, CPUs) 710 and a memory 720.
  • the memory 720 may be a volatile memory or a nonvolatile memory.
  • One or more programs can be stored in the memory 720, and each program may include a series of instruction operations in the computer equipment.
  • the processor 710 can communicate with the memory 720, and execute the series of instruction operations in the memory 720 on the computer equipment.
  • data of one or more operating systems may also be stored in the memory 720.
  • the computer equipment may further include one or more power supplies 730, one or more wired or wireless network interfaces 740, one or more input/output interfaces 750, etc.
  • the method and the device according to the present invention described above may be implemented in hardware or firmware, or implemented as software or computer codes which can be stored in a recording medium (e.g., CD, ROM, RAM, floppy disk, hard disk or magneto-optical disk), or implemented as computer codes which are originally stored in a remote recording medium or a non-transitory machine-readable medium and can be downloaded through a network to be stored in a local recording medium, so that the method described herein can be processed by such software stored in the recording medium in a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (e.g., an ASIC or FPGA).
  • the computer, the processor, the microprocessor controller or the programmable hardware include a storage assembly (e.g., RAM (random access memory), ROM (read-only memory), flash memory, etc.).
  • when the general-purpose computer accesses the codes for implementing the processing shown herein, the execution of the codes converts the general-purpose computer into a dedicated computer for executing the processing illustrated herein.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Disclosed are methods and an apparatus for face hallucination. According to one embodiment, a face hallucination method comprises: estimating a dense correspondence field based on a first image and a trained model; executing face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; and updating the first image with the second image, the estimating, executing and updating steps being performed repeatedly until the obtained second image has a desired resolution or the estimating, executing and updating steps have been repeated a predetermined number of times.
PCT/CN2016/078960 2016-04-11 2016-04-11 Methods and apparatuses for face hallucination Ceased WO2017177363A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2016/078960 WO2017177363A1 (fr) 2016-04-11 2016-04-11 Methods and apparatuses for face hallucination
CN201680084409.3A CN109313795B (zh) 2016-04-11 2016-04-11 Method and device for super-resolution processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/078960 WO2017177363A1 (fr) 2016-04-11 2016-04-11 Methods and apparatuses for face hallucination

Publications (1)

Publication Number Publication Date
WO2017177363A1 (fr) 2017-10-19

Family

ID=60041336

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/078960 Ceased WO2017177363A1 (fr) 2016-04-11 2016-04-11 Methods and apparatuses for face hallucination

Country Status (2)

Country Link
CN (1) CN109313795B (fr)
WO (1) WO2017177363A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112001861B (zh) * 2020-08-18 2024-04-02 The Chinese University of Hong Kong, Shenzhen Image processing method and apparatus, computer device and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530863B (zh) * 2013-10-30 2017-01-11 Guangdong Vtron Technologies Co., Ltd. Multi-level reconstruction image super-resolution method
CN104091320B (zh) * 2014-07-16 2017-03-29 Wuhan University Noisy face super-resolution reconstruction method based on data-driven local feature transformation
CN105405113A (zh) * 2015-10-23 2016-03-16 广州高清视信数码科技股份有限公司 Image super-resolution reconstruction method based on multi-task Gaussian process regression

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070103595A1 (en) * 2005-10-27 2007-05-10 Yihong Gong Video super-resolution using personalized dictionary
US20110305404A1 (en) * 2010-06-14 2011-12-15 Chia-Wen Lin Method And System For Example-Based Face Hallucination
CN103208109A (zh) * 2013-04-25 2013-07-17 Wuhan University Face hallucination method based on locality-constrained iterative neighbor embedding
US20150363634A1 (en) * 2014-06-17 2015-12-17 Beijing Kuangshi Technology Co.,Ltd. Face Hallucination Using Convolutional Neural Networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DONG, CHAO ET AL.: "Image Super-Resolution Using Deep Convolutional Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, 1 June 2015 (2015-06-01), pages 1 - 6, XP011591233, ISSN: 0162-8828 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008817A (zh) * 2019-01-29 2019-07-12 Beijing QIYI Century Science & Technology Co., Ltd. Model training and image processing methods, apparatus, electronic device and computer-readable storage medium
CN110008817B (zh) * 2019-01-29 2021-12-28 Beijing QIYI Century Science & Technology Co., Ltd. Model training and image processing methods, apparatus, electronic device and computer-readable storage medium

Also Published As

Publication number Publication date
CN109313795B (zh) 2022-03-29
CN109313795A (zh) 2019-02-05

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16898182

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16898182

Country of ref document: EP

Kind code of ref document: A1