WO2017177363A1 - Methods and apparatuses for face hallucination - Google Patents
Methods and apparatuses for face hallucination
- Publication number
- WO2017177363A1 (PCT/CN2016/078960)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- hallucination
- trained model
- dense
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/73—Deblurring; Sharpening
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration using local operators
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/169—Holistic features and representations, i.e. based on the facial image taken as a whole
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- the disclosure relates to image processing, in particular, to methods and apparatus for face hallucination.
- Face hallucination is a task that improves the resolution of facial images and provides a viable means for improving low-resolution face processing and analysis, e.g., person identification in surveillance videos and facial image enhancement.
- a method for face hallucination comprises: estimating a dense correspondence field based on a first image and a trained model; executing face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; and updating the first image with the second image, wherein the steps of estimating, executing and updating are performed repeatedly until the obtained second image has a desired resolution or the steps of estimating, executing and updating have been repeated a predetermined number of times.
- an apparatus for face hallucination which comprises: an estimation unit configured to estimate a dense correspondence field based on a first image and a trained model; and a hallucination unit configured to execute face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; wherein the first image is iteratively updated with the second image, and the estimation unit and the hallucination unit work for a predetermined number of iterations or until the obtained second image has a desired resolution.
- a device for face hallucination which comprises a processor and a memory storing computer-readable instructions, wherein, when the instructions are executed by the processor, the processor is operable to: estimate a dense correspondence field based on a first image and a trained model; execute face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; and update the first image with the second image, wherein the first image is iteratively updated with the second image for a predetermined number of iterations or until the obtained second image has a desired resolution.
- a nonvolatile storage medium containing computer-readable instructions, wherein, when the instructions are executed by a processor, the processor is operable to: estimate a dense correspondence field based on a first image and a trained model; execute face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; and update the first image with the second image, wherein the first image is iteratively updated with the second image for a predetermined number of iterations or until the obtained second image has a desired resolution.
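The estimate → hallucinate → update loop that the embodiments above share can be sketched as a simple cascade. This is a minimal illustration only; `DummyModel`, `estimate_field` and `hallucinate` are hypothetical stand-ins for the trained components, not names from the disclosure:

```python
class DummyModel:
    """Stand-in for the trained model: each hallucination pass doubles
    the resolution of the input image."""

    def estimate_field(self, image):
        # A real model would return a dense correspondence field here.
        return None

    def hallucinate(self, image, field):
        return {"resolution": image["resolution"] * 2}


def face_hallucination(image, model, desired_resolution, max_iterations=4):
    """Repeat estimation, hallucination and update until the desired
    resolution is reached or the iteration budget is exhausted."""
    for _ in range(max_iterations):
        field = model.estimate_field(image)             # estimate
        second_image = model.hallucinate(image, field)  # hallucinate
        image = second_image                            # update
        if image["resolution"] >= desired_resolution:
            break
    return image
```

Starting from a 16-pixel-wide face and asking for 128, this loop stops early after three of the four allowed iterations.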
- Fig. 1 is a flow chart of a method for face hallucination according to an embodiment of the present disclosure.
- Fig. 2 illustrates an apparatus for face hallucination according to an embodiment of the present disclosure.
- Fig. 3 illustrates a flow chart of the training process of the estimation unit according to an embodiment of the present application.
- Fig. 4 illustrates a flow chart of the testing process of the estimation unit according to an embodiment of the present application.
- Fig. 5 illustrates a flow chart of the training process of the hallucination unit according to an embodiment of the present application.
- Fig. 6 illustrates a flow chart of the testing process of the hallucination unit according to an embodiment of the present application.
- Fig. 7 is a structural schematic diagram of an embodiment of computer equipment provided by the present invention.
- a method for face hallucination is provided.
- Fig. 1 is a flow chart of a method 100 for face hallucination according to an embodiment of the present disclosure.
- an apparatus 200 for face hallucination is provided.
- Fig. 2 illustrates an apparatus 200 for face hallucination according to an embodiment of the present disclosure.
- the apparatus 200 may comprise an estimation unit 201 and a hallucination unit 202.
- a dense correspondence field is estimated by an estimation unit 201 based on an input first image 10 and parameters from a trained model 20.
- the first image input into the estimation unit may be a facial image with a low resolution.
- the dense correspondence field indicates the correspondence or mapping relationship of the first image to a warped image and denotes the warping of each pixel from the first image to the warped image.
- the trained model contains various parameters that may be used for the estimation of the dense correspondence field.
- at step S102, face hallucination is executed by the hallucination unit 202 based on the first image 10 and the estimated dense correspondence field to obtain a second image 30.
- the second image obtained after the face hallucination on the first image usually has a resolution higher than the first image.
- the hallucination unit 202 is a bi-network which comprises a first branch 2021 being a common branch for face hallucination and a second branch 2022 being a high-frequency branch.
- the processing in the common branch is similar to the face hallucination in the prior art.
- the estimated dense correspondence field and parameters from the trained model 20 are further considered in addition to the input image 10.
- the results obtained from both branches are incorporated through a gate network 2023 to obtain the second image 30.
- the first image is updated with the second image so that the second image is used as an input to the estimation unit 201.
- the steps S101 to S103 are performed repeatedly.
- the steps may be performed repeatedly until the obtained second image has a desired image resolution.
- the steps may be performed for pre-defined times.
- the facial image may be denoted as a matrix I, and each pixel in the image may be denoted as x with coordinates (x, y) .
- a mean face template for the facial image may be denoted as M, which comprises a plurality of pixels z.
- the warping function W (z) may be determined based on a deformation coefficient p and a deformation base B (z) , which may be denoted as W (z) = B (z) p
- the bases are pre-defined and shared by all samples.
- the deformation base B (z) is predefined and shared by all samples, and thus the warping function is actually controlled by the deformation coefficient p for each sample.
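Since the deformation base is predefined and shared while only the coefficient p varies per sample, the dense field is a linear combination of the bases. A minimal numpy sketch; the array shapes are assumptions chosen for illustration:

```python
import numpy as np


def dense_correspondence_field(bases, p):
    """Evaluate W(z) = B(z) p at every pixel z.

    bases : (H, W, 2, K) array -- K predefined deformation bases shared
            by all samples; each gives a 2-D displacement per pixel.
    p     : (K,) array -- per-sample deformation coefficients.
    Returns an (H, W, 2) dense displacement field.
    """
    return bases @ p  # matmul contracts the trailing K axis
```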
- f k is a Gauss-Newton descent regressor learned and stored in the trained model for predicting the dense correspondence field coefficients.
- the coefficients f k may further be represented by a Gauss-Newton steepest descent regression matrix R k , which is obtained by training.
- φ is the shape-indexed feature that concatenates the local appearance from all L landmarks, and φ̄ is its average over all the training samples.
- the dense correspondence field coefficients are estimated based on each pixel in the image.
- the dense correspondence field coefficients are estimated based on landmarks in the image since using a sparse set of facial landmarks is more robust and accurate under low resolution.
- a landmark base S k (l) is further considered in the estimation.
- two sets of deformation bases i.e., the deformation base for the dense field and the landmark base for the landmarks are obtained, where l is the landmark index.
- the bases for the dense field and landmarks are one-to-one related, i.e., both B k (z) and S k (l) share the same deformation coefficients
- the common branch conservatively recovers texture details that are only detectable from the low-resolution input, which is similar to the general super resolution.
- the high-frequency branch super-resolves faces with the additional high-frequency prior warped by the estimated face correspondence field in the current cascade. Thanks to the guidance of the prior, this branch is capable of recovering and synthesizing texture details that are not revealed in the overly low-resolution input image.
- a pixel-wise gate network is learned to fuse the results from the two branches.
- the first image is upscaled and then input to the hallucination unit.
- the upscaled image is input to both the common branch and the high-frequency branch.
- the upscaled image is processed adaptively, for example, using bicubic interpolation.
- the estimated dense correspondence field is further input, and the upscaled image is processed based on the estimated dense correspondence field.
- the results from both branches are combined in a gate network to obtain the second image.
- the processing in the common branch is not limited to the bicubic interpolation, but may be any suitable process for the face hallucination.
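The fusion step at the end of the bi-network can be pictured as a per-pixel soft blend. In this simplified sketch the gate map is passed in directly, whereas the disclosure learns it as a pixel-wise gate network:

```python
import numpy as np


def gate_fuse(common_out, high_freq_out, gate):
    """Pixel-wise fusion of the two branch outputs.

    gate is an array of per-pixel weights in [0, 1]: 0 keeps the
    conservative common-branch result, 1 keeps the high-frequency
    branch, and intermediate values blend the two."""
    gate = np.clip(gate, 0.0, 1.0)
    return (1.0 - gate) * common_out + gate * high_freq_out
```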
- the image I k is obtained by:
- I k ← I k-1 + g k ( ↑I k-1 ; W k (z) ) (4)
- g k represents a hallucination bi-network learned and stored in the trained model for face hallucination.
- the coefficients g k are obtained by training.
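Equation (4) describes one cascade step: upsample the previous estimate and add a residual predicted by the bi-network. In this sketch, nearest-neighbour repetition stands in for the bicubic upsampling used in the disclosure, and the residual predictor passed in stands in for the learned g k:

```python
import numpy as np


def cascade_step(I_prev, g_k, W_k, scale=2):
    """One step of equation (4): I_k = up(I_{k-1}) + g_k(up(I_{k-1}); W_k).

    up() is nearest-neighbour repetition here (the disclosure uses
    bicubic interpolation); g_k is the learned hallucination bi-network,
    conditioned on the dense correspondence field W_k."""
    up = I_prev.repeat(scale, axis=0).repeat(scale, axis=1)
    return up + g_k(up, W_k)
```

With a zero residual the step reduces to plain upsampling, which makes the structure of the update easy to check.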
- both the estimation unit and the hallucination unit may have a testing mode and a training mode.
- the method 100 as shown in Fig. 1 illustrates the working process of the estimation unit and hallucination unit in the testing mode.
- the estimation unit and hallucination unit may perform a training process to obtain and store parameters required in the testing mode into the trained model.
- the estimation unit and the hallucination unit having both a testing mode and a training mode are described as an example.
- the training process and the testing process may be performed by separate apparatus or separate units.
- Fig. 3 illustrates a flow chart of the training process 300 of the estimation unit according to an embodiment of the present application. As shown, at step S301, the dense bases B k (z) , the landmark bases S k (l) and the appearance eigenvectors for the k-th cascade are obtained.
- the dense bases B k (z) and the landmark bases S k (l) are stored into the trained model for later use.
- the average project-out Jacobian J k is learned, for example, by minimizing the following loss:
- φ is the shape-indexed feature that concatenates the local appearance from all L landmarks, and φ̄ is its average over all the training samples.
- the Gauss-Newton steepest descent regression matrix R k is calculated by:
- the process 300 may further include steps S304 and S305.
- steps S304 and S305 the deformation coefficients for both the correspondence training set and the hallucination training set are updated.
- step S305 the dense correspondence field for each location z for the hallucination training set is calculated. The deformation coefficients and the dense correspondence field obtained at steps S304 and S305 may be used in the later training process.
- Fig. 4 illustrates a flow chart of the testing process 400 of the estimation unit according to an embodiment of the present application.
- location for each landmark is obtained from the facial image input to the estimation unit.
- in the first iteration, the input image is the original low-resolution image.
- in the k-th iteration (k > 1), the inputs are the image obtained in the (k-1)-th iteration and the deformation coefficients obtained in the (k-1)-th iteration.
- the location of each landmark in the input image is obtained.
- the SIFT feature from around the location of the landmark is obtained.
- the SIFT feature is the shape-indexed feature described above.
- the features from all the landmarks are combined into an appearance eigenvector.
- the deformation coefficients are updated via regression according to the equation (2) .
- the dense correspondence field for each location z is computed.
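The feature-extraction and regression steps above reduce to one additive update of the deformation coefficients per cascade. The additive form p ← p + R k (φ̄ − φ) is an assumption consistent with Gauss-Newton steepest-descent regression; the concrete equation (2) is given in the disclosure:

```python
import numpy as np


def update_deformation_coefficients(p, R_k, phi, phi_mean):
    """One cascaded regression update of the deformation coefficients.

    p        : (K,) current deformation coefficients.
    R_k      : (K, D) Gauss-Newton steepest-descent regression matrix.
    phi      : (D,) shape-indexed feature concatenated over L landmarks.
    phi_mean : (D,) average of phi over the training samples.
    """
    return p + R_k @ (phi_mean - phi)
```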
- Fig. 5 illustrates a flow chart of the training process 500 of the hallucination unit according to an embodiment of the present application.
- images from the training sets are upsampled by bicubic interpolation.
- the warped high-frequency prior is obtained according to the dense correspondence field.
- the deep bi-network is trained with three steps: pre-training the common sub-network, pre-training the high-frequency sub-network, and tuning the whole bi-network end-to-end.
- the bi-network coefficient may be stored in the trained model.
- a forward pass of the bi-network may be performed to compute the predicted image for both the hallucination training set and the estimation training set.
- Fig. 6 illustrates a flow chart of the testing process 600 of the hallucination unit according to an embodiment of the present application.
- an input image I k-1 is upsampled by bicubic interpolation to obtain an upsampled image ↑I k-1 .
- the warped high-frequency prior is obtained according to the dense correspondence field.
- the learned bi-network coefficient g k is used to forward pass the deep bi-network with the two inputs ↑I k-1 and the warped high-frequency prior, so that the image I k is obtained.
- Algorithm 1 is an exemplary training algorithm for learning the parameters by the apparatus according to an embodiment of the present application.
- Algorithm 2 is an exemplary testing algorithm for hallucinating a low-resolution face according to an embodiment of the present application.
- Fig. 7 is a structural schematic diagram of an embodiment of computer equipment provided by the present invention.
- the computer equipment can be used for implementing the face hallucination method provided in the above embodiments.
- the computer equipment may vary greatly in configuration and performance, and may include one or more processors (e.g., central processing units, CPUs) 710 and a memory 720.
- the memory 720 may be a volatile memory or a nonvolatile memory.
- One or more programs can be stored in the memory 720, and each program may include a series of instruction operations in the computer equipment.
- the processor 710 can communicate with the memory 720, and execute the series of instruction operations in the memory 720 on the computer equipment.
- data of one or more operating systems may also be stored in the memory 720.
- the computer equipment may further include one or more power supplies 730, one or more wired or wireless network interfaces 740, one or more input/output interfaces 750, etc.
- the method and the device according to the present invention described above may be implemented in hardware or firmware, or implemented as software or computer codes which can be stored in a recording medium (e.g., CD, ROM, RAM, floppy disk, hard disk or magneto-optical disk), or implemented as computer codes which are originally stored in a remote recording medium or a non-transient machine-readable medium and can be downloaded through a network to be stored in a local recording medium, so that the method described herein can be processed by such software stored in the recording medium in a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (e.g., an ASIC or FPGA).
- the computer, the processor, the microprocessor controller or the programmable hardware include a storage assembly, e.g., RAM (random access memory), ROM (read-only memory), flash memory, etc.
- when the general-purpose computer accesses the codes for implementing the processing shown herein, the execution of the codes converts the general-purpose computer into a dedicated computer for executing the processing illustrated herein.
Abstract
The invention relates to methods and an apparatus for face hallucination. According to one embodiment, a hallucination method comprises estimating a dense correspondence field based on a first image and a trained model; executing face hallucination based on the first image, the estimated dense correspondence field and the trained model through a bi-network to obtain a second image; and updating the first image with the second image, the estimating, executing and updating steps being performed repeatedly until the obtained second image has a desired resolution or the estimating, executing and updating steps have been repeated a predetermined number of times.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2016/078960 WO2017177363A1 (fr) | 2016-04-11 | 2016-04-11 | Procédés et appareils d'hallucination de visage |
| CN201680084409.3A CN109313795B (zh) | 2016-04-11 | 2016-04-11 | 用于超分辨率处理的方法和设备 |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2016/078960 WO2017177363A1 (fr) | 2016-04-11 | 2016-04-11 | Procédés et appareils d'hallucination de visage |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2017177363A1 true WO2017177363A1 (fr) | 2017-10-19 |
Family
ID=60041336
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2016/078960 Ceased WO2017177363A1 (fr) | 2016-04-11 | 2016-04-11 | Procédés et appareils d'hallucination de visage |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN109313795B (fr) |
| WO (1) | WO2017177363A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110008817A (zh) * | 2019-01-29 | 2019-07-12 | 北京奇艺世纪科技有限公司 | 模型训练、图像处理方法、装置、电子设备及计算机可读存储介质 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112001861B (zh) * | 2020-08-18 | 2024-04-02 | 香港中文大学(深圳) | 图像处理方法和装置、计算机设备及存储介质 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070103595A1 (en) * | 2005-10-27 | 2007-05-10 | Yihong Gong | Video super-resolution using personalized dictionary |
| US20110305404A1 (en) * | 2010-06-14 | 2011-12-15 | Chia-Wen Lin | Method And System For Example-Based Face Hallucination |
| CN103208109A (zh) * | 2013-04-25 | 2013-07-17 | 武汉大学 | 一种基于局部约束迭代邻域嵌入的人脸幻构方法 |
| US20150363634A1 (en) * | 2014-06-17 | 2015-12-17 | Beijing Kuangshi Technology Co.,Ltd. | Face Hallucination Using Convolutional Neural Networks |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103530863B (zh) * | 2013-10-30 | 2017-01-11 | 广东威创视讯科技股份有限公司 | 一种多级重构的图像超分辨率方法 |
| CN104091320B (zh) * | 2014-07-16 | 2017-03-29 | 武汉大学 | 基于数据驱动局部特征转换的噪声人脸超分辨率重建方法 |
| CN105405113A (zh) * | 2015-10-23 | 2016-03-16 | 广州高清视信数码科技股份有限公司 | 一种基于多任务高斯过程回归的图像超分辨率重建方法 |
- 2016-04-11: CN application CN201680084409.3A, granted as patent CN109313795B (active)
- 2016-04-11: WO application PCT/CN2016/078960, published as WO2017177363A1 (ceased)
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070103595A1 (en) * | 2005-10-27 | 2007-05-10 | Yihong Gong | Video super-resolution using personalized dictionary |
| US20110305404A1 (en) * | 2010-06-14 | 2011-12-15 | Chia-Wen Lin | Method And System For Example-Based Face Hallucination |
| CN103208109A (zh) * | 2013-04-25 | 2013-07-17 | 武汉大学 | 一种基于局部约束迭代邻域嵌入的人脸幻构方法 |
| US20150363634A1 (en) * | 2014-06-17 | 2015-12-17 | Beijing Kuangshi Technology Co.,Ltd. | Face Hallucination Using Convolutional Neural Networks |
Non-Patent Citations (1)
| Title |
|---|
| DONG, CHAO ET AL., IMAGE SUPER-RESOLUTION USING DEEP CONVOLUTIONAL NETWORKS, vol. 38, no. 2, 1 June 2015 (2015-06-01), pages 1 - 6, XP011591233, ISSN: 0162-8828 * |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110008817A (zh) * | 2019-01-29 | 2019-07-12 | 北京奇艺世纪科技有限公司 | 模型训练、图像处理方法、装置、电子设备及计算机可读存储介质 |
| CN110008817B (zh) * | 2019-01-29 | 2021-12-28 | 北京奇艺世纪科技有限公司 | 模型训练、图像处理方法、装置、电子设备及计算机可读存储介质 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN109313795B (zh) | 2022-03-29 |
| CN109313795A (zh) | 2019-02-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11200696B2 (en) | Method and apparatus for training 6D pose estimation network based on deep learning iterative matching | |
| US11393092B2 (en) | Motion tracking and strain determination | |
| US12340575B2 (en) | Vehicle detecting system, vehicle detecting method, and program storage medium | |
| JP7030493B2 (ja) | 画像処理装置、画像処理方法およびプログラム | |
| US11314989B2 (en) | Training a generative model and a discriminative model | |
| JP6539901B2 (ja) | 植物病診断システム、植物病診断方法、及びプログラム | |
| US20190279014A1 (en) | Method and apparatus for detecting object keypoint, and electronic device | |
| US9886746B2 (en) | System and method for image inpainting | |
| US10817984B2 (en) | Image preprocessing method and device for JPEG compressed file | |
| US11734837B2 (en) | Systems and methods for motion estimation | |
| CN105981041A (zh) | 使用粗到细级联神经网络的面部关键点定位 | |
| CN111507906A (zh) | 用用于容错及波动鲁棒性的神经网络除抖动的方法及装置 | |
| US11449975B2 (en) | Object count estimation apparatus, object count estimation method, and computer program product | |
| CN109685805B (zh) | 一种图像分割方法及装置 | |
| CN113167568B (zh) | 坐标计算装置、坐标计算方法和计算机可读记录介质 | |
| KR101700030B1 (ko) | 사전 정보를 이용한 영상 물체 탐색 방법 및 이를 수행하는 장치 | |
| WO2017177363A1 (fr) | Procédés et appareils d'hallucination de visage | |
| CN109784353B (zh) | 一种处理器实现的方法、设备和存储介质 | |
| CN114170087A (zh) | 一种基于跨尺度低秩约束的图像盲超分辨率方法 | |
| US20230343438A1 (en) | Systems and methods for automatic image annotation | |
| WO2021075465A1 (fr) | Dispositif, procédé et programme pour la reconstruction tridimensionnelle d'un sujet à analyser | |
| JPWO2011033744A1 (ja) | 画像処理装置、画像処理方法および画像処理用プログラム | |
| US20240338834A1 (en) | Method and system for estimating temporally consistent 3d human shape and motion from monocular video | |
| EP4343680A1 (fr) | Débruitage de données | |
| Khosravi et al. | A new statistical technique for interpolation of landsat images |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16898182 Country of ref document: EP Kind code of ref document: A1 |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 16898182 Country of ref document: EP Kind code of ref document: A1 |