CN109166176B

CN109166176B - Three-dimensional face image generation method and device

Info

Publication number: CN109166176B
Application number: CN201810968117.7A
Authority: CN
Inventors: 庞文杰
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2018-08-23
Filing date: 2018-08-23
Publication date: 2020-07-07
Anticipated expiration: 2038-08-23
Also published as: CN109166176A

Abstract

The embodiment of the invention discloses a method and a device for generating a three-dimensional face image, wherein the method comprises the following steps: acquiring N two-dimensional face images with different visual angles, wherein N is an integer greater than 1; performing grid division on the N two-dimensional face images to obtain N grid images corresponding to each feature on the face; sampling the N grid images corresponding to each feature, so that the pixels of the grid images in the N grid images corresponding to each feature are the same, and the visual angle intervals are equal; decomposing and compressing the N sampled grid images corresponding to all the features to determine surface light field data of the face; and obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images. That is, in this embodiment, surface light field data of a human face is obtained from N two-dimensional human face images, and the N two-dimensional human face images are rendered using the surface light field data, so as to generate a high-precision three-dimensional human face image.

Description

Three-dimensional face image generation method and device

Technical Field

The embodiment of the invention relates to the technical field of image processing, in particular to a method and a device for generating a three-dimensional face image.

Background

With the development of image processing technology, the application scenes of three-dimensional face images are more and more. Nowadays, users have put forward a demand for generating a three-dimensional image face image based on a two-dimensional face image actually photographed for more realistic display.

In the prior art, the conversion of a two-dimensional face image into a three-dimensional face image is generally performed based on the following procedures: and combining various relations between the two-dimensional picture and the three-dimensional grid, constructing a constraint equation, deforming the three-dimensional grid based on the constraint equation, and mapping the image information of the two-dimensional picture to the deformed three-dimensional grid to generate a self-adaptive three-dimensional face image.

However, the three-dimensional face image generated by the prior art is not clear enough.

Disclosure of Invention

The embodiment of the invention provides a method and a device for generating a three-dimensional face image.

In a first aspect, an embodiment of the present invention provides a method for generating a three-dimensional face image, including:

acquiring N two-dimensional face images with different visual angles, wherein N is an integer greater than 1;

performing grid division on the N two-dimensional face images to obtain N grid images corresponding to each feature on the face;

sampling the N grid images corresponding to each feature, so that the pixels of the grid images in the N grid images corresponding to each feature are the same, and the visual angle intervals are equal;

decomposing and compressing the N sampled grid images corresponding to all the features to determine surface light field data of the face;

and obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images.

In a possible implementation manner of the first aspect, before the sampling the N mesh images corresponding to each feature, the method further includes:

and aligning the N grid images corresponding to the same characteristic.

In another possible implementation manner of the first aspect, if the same pixel point on the N two-dimensional face images is a feature of the face, the mesh division is performed on the N two-dimensional face images to obtain N mesh images corresponding to each feature of the face, including:

and according to the pixel points, carrying out grid division on the N two-dimensional images to obtain N grid images corresponding to each pixel point.

In another possible implementation manner of the first aspect, if the face includes M preset features, the performing mesh division on the N two-dimensional face images to obtain N mesh images corresponding to each feature on the face includes:

and according to the M characteristics of the human face, carrying out grid division on the N two-dimensional images to obtain N grid images corresponding to each characteristic.

In another possible implementation manner of the first aspect, the obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images includes:

obtaining a three-dimensional model of the human face according to the N two-dimensional human face images;

and rendering the three-dimensional model by using the surface light field data to obtain a rendered three-dimensional face image.

In another possible implementation manner of the first aspect, the aligning N mesh images corresponding to the same feature includes:

acquiring a grid image from N grid images corresponding to the same characteristic as a target grid image, and aligning N-1 grid images except the target grid image in the N grid images with the target grid image.

In another possible implementation manner of the first aspect, the acquiring one mesh image from N mesh images corresponding to the same feature as the target mesh image includes:

and taking the grid image with the maximum diffusion color value in the N grid images corresponding to the same feature as the target grid image of the same feature.

determining the sum of the energies of all the N grid images corresponding to the same feature;

and taking the grid image corresponding to each feature when the sum of the energies is minimum as a target grid image of each feature.

In another possible implementation manner of the first aspect, the determining a sum of energies of all N mesh images corresponding to the same feature includes:

according to the formula

Determining the sum E (P) of the energies of the N grid images corresponding to the same characteristic;

wherein, the

Color values of the ith grid image of feature f, said

Is that it is

Corresponding luminance value, said

Is that it is

Corresponding sample masses, said f' being a neighboring grid of said feature f, said

Color values of the jth grid image of the feature f

The color difference on the edge is shared for the feature f and the feature f'.

In another possible implementation manner of the first aspect, the aligning N-1 mesh images of the N mesh images other than the target mesh image with the target mesh image includes:

determining a similar energy value of each of the N-1 mesh images to the target mesh image;

determining that each mesh image of the N-1 mesh images is aligned with the target mesh image when the similar energy value is maximum.

In another possible implementation manner of the first aspect, the determining a similar energy value of each of the N-1 mesh images to the target mesh image includes:

according to the formula

Determining a similar energy value E for each of the N-1 mesh images to the target mesh image_f(D^f,t)；

Wherein, D is^fA target mesh image of the feature f, the

Color value of ith grid image of feature f, t_iAnd the translation amount of the ith grid image of the feature f.

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

Color value of ith grid image of the feature f, t_iThe translation amount of the ith grid image of the feature f, t₀Is a preset value.

In another possible implementation manner of the first aspect, the mesh is a triangular mesh.

In a second aspect, an embodiment of the present application provides an apparatus for generating a three-dimensional face image, including:

the system comprises an acquisition module, a display module and a display module, wherein the acquisition module is used for acquiring N two-dimensional face images with different visual angles, and N is an integer greater than 1;

the dividing module is used for carrying out grid division on the N two-dimensional face images to obtain N grid images corresponding to each feature on the face;

the sampling module is used for sampling the N grid images corresponding to each feature, so that the pixels of the grid images in the N grid images corresponding to each feature are the same, and the visual angle intervals are equal;

the determining module is used for decomposing and compressing the sampled N grid images corresponding to all the features and determining surface light field data of the human face;

and the model acquisition module is used for acquiring a three-dimensional face image according to the surface light field data and the N two-dimensional face images.

In a possible implementation manner of the second aspect, the apparatus further includes: and the alignment module is used for aligning the N grid images corresponding to the same characteristic.

In another possible implementation manner of the second aspect, if the same pixel point on the N two-dimensional face images is a feature of the face, the dividing module is specifically configured to perform mesh division on the N two-dimensional images according to the pixel point to obtain N mesh images corresponding to each pixel point.

In another possible implementation manner of the second aspect, if the face includes M preset features, the dividing module is specifically configured to perform mesh division on the N two-dimensional images according to the M preset features of the face, and obtain N mesh images corresponding to each feature.

In another possible implementation manner of the second aspect, the model obtaining module is specifically configured to obtain a three-dimensional model of the face according to the N two-dimensional face images; and rendering the three-dimensional model by using the surface light field data to obtain a rendered three-dimensional face image.

In another possible implementation manner of the second aspect, the alignment module includes:

the acquiring unit is used for acquiring a grid image from N grid images corresponding to the same characteristic as a target grid image;

an alignment unit configured to align N-1 mesh images of the N mesh images, excluding the target mesh image, with the target mesh image.

In another possible implementation manner of the second aspect, the acquiring one mesh image from N mesh images corresponding to the same feature as the target mesh image includes:

the acquiring unit is configured to use a grid image with the largest diffusion color value in the N grid images corresponding to the same feature as a target grid image of the same feature.

the acquiring unit is specifically configured to determine the sum of energies of all N mesh images corresponding to the same feature; and taking the grid image corresponding to each feature when the sum of the energies is minimum as a target grid image of each feature.

In another possible implementation manner of the second aspect, the obtaining unit is specifically configured to:

according to the formula

wherein, the

Color values of the ith grid image of feature f, said

Is that it is

Corresponding luminance value, said

Is that it is

Color values of the jth grid image of the feature f

In another possible implementation manner of the second aspect, the alignment unit is specifically configured to:

In another possible implementation manner of the second aspect, the aligning unit specifically includes:

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

In another possible implementation manner of the second aspect, the aligning unit further specifically includes:

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

In another possible implementation manner of the second aspect, the mesh is a triangular mesh.

In a third aspect, an embodiment of the present application provides an apparatus for generating a three-dimensional face image, including:

a memory for storing a computer program;

a processor for executing the computer program to implement the method for generating a three-dimensional face image according to any one of the first aspect.

In a fourth aspect, an embodiment of the present application provides an apparatus for generating a three-dimensional face image, including: a camera and a processor in communication with each other,

the camera is used for acquiring N two-dimensional face images with different visual angles, wherein N is an integer greater than 1;

the processor is used for carrying out grid division on the N two-dimensional face images to obtain N grid images corresponding to each feature on the face, and sampling the N grid images corresponding to each feature, so that the pixels of the grid images in the N grid images corresponding to each feature are the same, and the visual angle intervals are equal; decomposing and compressing the N sampled grid images corresponding to all the features to determine surface light field data of the face; and obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images.

In a possible implementation manner of the fourth aspect, the processor is further configured to align the N mesh images corresponding to the same feature.

In another possible implementation manner of the fourth aspect, if the same pixel point on the N two-dimensional face images is a feature of the face, the processor is configured to:

In another possible implementation manner of the fourth aspect, if the face includes M preset features, the processor is configured to:

In another possible implementation manner of the fourth aspect, the processor is specifically configured to:

according to the formula

wherein, the

Color values of the ith grid image of feature f, said

Is that it is

Corresponding luminance value, said

Is that it is

Color values of the jth grid image of the feature f

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

In another possible implementation manner of the fourth aspect, the processor is specifically configured to determine that the mesh is a triangular mesh.

In a fifth aspect, an embodiment of the present application provides a computer storage medium, where a computer program is stored in the storage medium, and the computer program, when executed, implements the method for generating a three-dimensional face image according to any one of the first aspect.

According to the method and the device for generating the three-dimensional face image, N two-dimensional face images with different visual angles are obtained, wherein N is an integer larger than 1; performing grid division on the N two-dimensional face images to obtain N grid images corresponding to each feature on the face; sampling the N grid images corresponding to each feature, so that the pixels of the grid images in the N grid images corresponding to each feature are the same, and the visual angle intervals are equal; decomposing and compressing the N sampled grid images corresponding to all the features to determine surface light field data of the face; and obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images. That is, in this embodiment, surface light field data of a human face is obtained from N two-dimensional human face images, and the N two-dimensional human face images are rendered using the surface light field data, so as to generate a high-precision three-dimensional human face image.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

Fig. 1 is a schematic flow chart of a method for generating a three-dimensional face image according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of partial surface light field data according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of selecting a target image from unaligned images according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of images after alignment according to an embodiment of the present invention;

FIG. 5 is a schematic illustration of surface light field data optimized for the surface light field data shown in FIG. 2;

fig. 6 is a flowchart illustrating a process of determining a target mesh image according to the present embodiment;

fig. 7 is a diagram illustrating an example of a flow of performing image alignment according to the present embodiment;

fig. 8 is a schematic structural diagram of a device for generating a three-dimensional face image according to an embodiment of the present invention;

fig. 9 is a schematic structural diagram of a three-dimensional face image generation apparatus according to a second embodiment of the present invention;

fig. 10 is a schematic structural diagram of a three-dimensional face image generation apparatus according to a third embodiment of the present invention;

fig. 11 is a schematic structural diagram of a device for generating a three-dimensional face image according to an embodiment of the present invention;

fig. 12 is a schematic structural diagram of a device for generating a three-dimensional face image according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The surface light field is a branch of the light field, the traditional light field rendering technology is further promoted by utilizing the geometric information of the target object, and the light field can be recorded in a distortion-free mode within a larger visual angle (360 degrees). Thus, the surface light field is used to recover some objects with more complex appearance and geometric models.

According to the method for generating the three-dimensional face image, the N two-dimensional face images are processed to obtain the surface light field data of the face, the three-dimensional face model approximate to the real environment is generated according to the surface light field data and the N two-dimensional face images, and the generation precision of the three-dimensional face model is improved.

The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.

Fig. 1 is a schematic flow chart of a method for generating a three-dimensional face image according to an embodiment of the present invention. As shown in fig. 1, the method of this embodiment may include:

s101, acquiring N two-dimensional face images with different visual angles, wherein N is an integer larger than 1.

The execution subject of the embodiment may be a three-dimensional face image generation device having a function of generating a three-dimensional face image, which is simply referred to as a generation device. The generating means of the present embodiment may be a part of the electronic device, for example, a processor of the electronic device.

The generating means of this embodiment may alternatively be a separate electronic device.

The electronic device of this embodiment may be an electronic device such as a smart phone, a desktop computer, a notebook computer, an intelligent bracelet, an AR device, a VA device, and the like.

The embodiment takes the execution subject as an electronic device as an example for explanation.

Optionally, the N two-dimensional face images with different viewing angles of the present embodiment may be acquired from other devices, for example, a camera or a server.

Optionally, the electronic device of this embodiment may further have a shooting function, that is, the electronic device has a camera, and the camera on the electronic device may be used to shoot N two-dimensional face images of a face at different viewing angles.

And S102, carrying out grid division on the N two-dimensional face images to obtain N grid images corresponding to each feature on the face.

Specifically, after N two-dimensional face images are obtained according to the steps, the N two-dimensional face images are subjected to grid division, and N grid images corresponding to each feature on the face are generated. For example, N grid images of a characteristic nose are acquired.

The division of the features on the face may be determined as needed, for example, it may be determined that the face includes 8 features as needed: left eye, right eye, nose, mouth, left cheekbone, right cheekbone, left cheek, and right cheek.

Optionally, each pixel point in the two-dimensional face image may also be used as a feature of the face.

Optionally, in this embodiment, the two-dimensional face image may be divided by using a mesh in any shape, for example, the two-dimensional face image is divided by using a multi-deformation mesh such as a quadrangle, a pentagon, a hexagon, and the like.

In an example, in order to reduce the difficulty of the division, the present embodiment may perform mesh division on the two-dimensional face image using a triangular mesh. Thus, the two-dimensional face image is divided into a plurality of triangular mesh images.

The size of each grid can be the same or different, the two-dimensional image of the same face can be divided by using the grids in the same shape, and the grids in different shapes can also be divided.

Optionally, the size and shape of the mesh used by the same feature on the face are the same.

And carrying out grid division on the N two-dimensional face images according to the method to obtain N grid images corresponding to each feature on the face.

For example, the human face includes 8 features of a left eye, a right eye, a nose, a mouth, a left cheekbone, a right cheekbone, a left cheek, and a right cheek, and N mesh images corresponding to each feature of the 8 features may be obtained by dividing N two-dimensional human face images according to the 8 features.

In practical application, it is assumed that a two-dimensional face image is subjected to mesh division by using a surface mesh M, and each two-dimensional face image is divided into M mesh images, where at this time, a mesh image set F ═ F of each two-dimensional face image_i，…f_M}。

In this embodiment, when acquiring N two-dimensional face images, external parameters of a camera that captures the N two-dimensional face images may be extracted.

S103, sampling the N grid images corresponding to each feature, so that the pixels of the grid images in the N grid images corresponding to each feature are the same, and the visual angle intervals are equal.

Since the pixels (i.e., sizes) of the N mesh images corresponding to the same feature may be different, and the angles of the camera are not equally spaced when the two-dimensional face image is captured, in order to improve the accuracy of image processing, it is necessary to sample each mesh image corresponding to the same feature. Specifically, the pixels of the N mesh images corresponding to the same feature are the same, and the viewing angle intervals are equal.

Therefore, the pixels of all grid images in the grid images corresponding to the same characteristic are the same, and the visual angle intervals are equal, so that the decomposition and compression accuracy can be improved when subsequent decomposition and compression are carried out, and the finally generated surface light field data are more accurate.

And S104, decomposing and compressing the sampled N grid images corresponding to all the features, and determining the surface light field data of the human face.

And S105, obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face grid images.

Specifically, the grid images corresponding to the same feature are sampled according to the steps, so that the pixels of the grid images corresponding to the same feature are the same, and the visual angle intervals are equal. And then, decomposing the sampled grid images corresponding to all the features, and then compressing the decomposed grid images to generate high-precision surface light field data of the human face. And then, obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face grid images.

In one example, obtaining a three-dimensional face image from the surface light field data and the N two-dimensional face mesh images may include: and obtaining a gray level image difference value of each pixel point on the two-dimensional face image according to the surface light field data, and generating corresponding depth information according to the gray level image difference value so as to pre-estimate the three-dimensional face image.

In one example, obtaining a three-dimensional face image from the surface light field data and the N two-dimensional face mesh images may include: obtaining a three-dimensional model of the human face according to the N two-dimensional human face images; and rendering the three-dimensional model by using the surface light field data to obtain a rendered three-dimensional face image. Wherein, according to the N two-dimensional face images, the three-dimensional model of the face can be obtained by using the prior art.

In VR (Virtual Reality) or AR (Augmented Reality) technology, the above determined surface light field data may be used to render an object, so as to present a light field effect of the object in a real environment in a Virtual environment, so as to improve user experience.

The method for decomposing and compressing the grid image is not limited in this embodiment, and any existing method may be used specifically.

In one example, the surface light field may be represented as a four-dimensional function L (u, v, s, t), where (u, v) is the position on the surface and (s, t) is the view direction.

The light field function can be further decomposed into a sum of a small number of products of the low dimensional functions:

L(u,v,s,t)≈∑S(u,v)V(s,t) (1)

wherein S (u, V) is a curved surface function and V (S, t) is a view function.

This decomposition attempts to separate the changes in surface texture from the changes in illumination. These functions can be constructed by using PCA (Principal Component Analysis) or nonlinear optimization. The functional parameters may be stored in a texture map and rendered in real-time.

To make the curved lightfield easy to implement in the rendering pipeline, L (u, v, s, t) is made to span small surface primitives and an approximation is established for each part independently. Specifically, a set of vertex light fields L is sampled and constructed within an annular mesh region of vertices x^x(u,v,s,t)。

In an implementation, the vertex light field function L^x(u, v, s, t) can be represented as a matrix L^x[u,v,s,t]∈R^m×nDiscretization in curved patches and stereoscopic viewing angles. The columns n of the matrix represent camera views and the rows m represent surface locations. Memory matrix L^xIs impractical and requires decomposition and compression of the light field data. Furthermore, according to the theory of the dichromatic reflection model, a secondary matrix L is also required^xMiddle separation of the diffuse component D^x[s,t]And L is^xIs the residual component G^x[s,t,u,v]. The remaining portion is conventionally compressed as follows:

L^x[u,v,s,t]＝D^x[u,v]+G^x[u,v,s,t]＝D^x[u,v]+∑S^x[u,v]V^x[s,t](2)

wherein S is^x[u,v]Is the surface mapping matrix of vertex x, V^x[s,t]Is the view mapping matrix for vertex x, which can be discretized from the surface and view functions in equation (1). (u, v) are the spatial coordinates in the vertex-by-vertex patch synthesized by its one circular triangular neighbor. [ s, t ]]Are the view coordinates in the hemisphere harmonic.

In one example, G of SVD (Singular Value Decomposition) is used^x＝S^x.V^xResidual color G to be resampled^x(i.e., the remaining components of Lx above) into k-term surface maps and views. Wherein S^xM × k matrix which is left singular vector, multiplying diagonal matrix of singular values ordered in decreasing order, V^xIs a k × n matrix of right singular vectors, k<n。

It follows that this embodiment does not perform SVD completely, but iteratively computes the first k terms using a power iteration method.

According to the method for generating the three-dimensional face image, N two-dimensional face images with different visual angles are obtained, wherein N is an integer larger than 1; performing grid division on the N two-dimensional face images to obtain N grid images corresponding to each feature on the face; sampling the N grid images corresponding to each feature, so that the pixels of the grid images in the N grid images corresponding to each feature are the same, and the visual angle intervals are equal; decomposing and compressing the N sampled grid images corresponding to all the features to determine surface light field data of the face; and obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images. That is, in this embodiment, surface light field data of a human face is obtained from N two-dimensional human face images, and the N two-dimensional human face images are rendered using the surface light field data, so as to generate a high-precision three-dimensional human face image.

In some implementation manners of this embodiment, if the same pixel point on the N two-dimensional face images is a feature of the face, in step S102, performing mesh division on the N two-dimensional face images to obtain N mesh images corresponding to each feature of the face may include: and according to the pixel points, carrying out grid division on the N two-dimensional images to obtain N grid images corresponding to each pixel point.

Specifically, in N two-dimensional face images, each two-dimensional face image includes N pixel points, so that each two-dimensional face image can be divided into N grids, and each grid corresponds to one grid image.

For a pixel point a, obtaining a grid image corresponding to the pixel point a on each two-dimensional image in N two-dimensional face images, and further obtaining N grid images of the pixel point a. With reference to the method, N mesh images corresponding to each pixel point of N pixel points can be obtained.

In this embodiment, a pixel point in the two-dimensional face image is used as a feature of the face, so that the feature is refined, and the refined feature can express the face in more detail.

In other implementation manners of this embodiment, if the face includes M preset features, the step S102 of performing mesh division on the N two-dimensional face images to obtain N mesh images corresponding to each feature on the face may include: and according to the M characteristics of the human face, carrying out grid division on the N two-dimensional images to obtain N grid images corresponding to each characteristic.

Specifically, in this embodiment, the M features of the face may be preset, for example, the face may be divided by a user according to actual needs, or the face may be automatically divided by a computer, for example, the user inputs M two-dimensional face images, and the computer automatically divides the two-dimensional face images according to an existing rule.

For example, M is 8, and 8 features of a human face may include: left eye, right eye, nose, mouth, left cheekbone, right cheekbone, left cheek, and right cheek.

In a possible implementation manner of this embodiment, before the step S103, the method of this embodiment further includes:

and S100, aligning the N grid images corresponding to the same characteristic.

As shown in fig. 2, conventional direct compression may cause artifacts when the geometry is significantly inaccurate.

To solve this technical problem, the present embodiment optimizes the mesh prior to the light field decomposition to eliminate the negative effects of inaccurate geometry.

Specifically, according to the above steps, after N mesh images of each feature are obtained, the N mesh images corresponding to each feature are aligned, for example, 3 mesh images of the feature a are aligned.

Before the surface light field is compressed, the sampled grid images are aligned, so that the influence of inaccurate geometric shapes is eliminated, and the dependence of surface light field sampling and processing on accurate geometric models is reduced.

Optionally, the aligning, in S100, the N mesh images corresponding to the same feature may include:

s1001, acquiring a grid image from N grid images corresponding to the same characteristic as a target grid image;

s1002, aligning N-1 grid images except the target grid image in the N grid images with the target grid image.

Specifically, in the alignment process, taking the feature a as an example, one target grid image of the grid a is selected from the N grid images corresponding to the feature a. For example, as shown in fig. 3, the dark mesh image is a target mesh image of feature a.

Next, the remaining mesh images of the feature A, except the target mesh image, are aligned with the target mesh image, e.g. introducing a 2D translation t in space_iCalculating the translation t between the rest grid images and the target grid image in the characteristic A_iTo align the remaining mesh images with the target mesh image. For example, as shown in fig. 4, the remaining mesh image of feature a is aligned with the target mesh image of the dark mesh image.

In this way, a set of the mesh images with the aligned features a is defined as a feature a mesh image set.

With reference to the above steps, a set of mesh images for each feature may be obtained.

In an example, in S1001, acquiring one mesh image from N mesh images corresponding to the same feature as the target mesh image may be:

and regarding each feature, taking the grid image with the maximum diffusion color value in each grid image corresponding to the feature as a target grid image of the feature.

For example, for feature a, the diffusion color value of each grid image is determined, and the grid image with the largest diffusion color value is taken as the target grid image of feature a.

In this way, a target grid image corresponding to each feature grid can be obtained.

Fig. 5 is a schematic diagram of compressing after aligning each grid image corresponding to fig. 2, and as shown in fig. 5, after aligning and then compressing, ghost images can be effectively reduced, and the precision of surface light field data can be improved.

In another example, as shown in fig. 6, the step S1001 of obtaining one mesh image from N mesh images corresponding to the same feature as the target mesh image may specifically include:

s201, determining the sum of the energies of the N grid images corresponding to the same feature.

In practical application, the target grid image may not be found by using the existing method due to factors such as large brightness variation. This embodiment solves this problem by finding the best mesh image among all the respective mesh images. Specifically, color values of all grid images corresponding to all the same features (namely all the features) on the face are used

As an energy function to determine the sum of the energies of all the mesh images corresponding to all the features.

In one example, the sum of the intensities of all the mesh images corresponding to all the features is calculated, and the sum of the intensities is taken as the sum of the energies of all the mesh images corresponding to all the features.

In another example, the sum of the quality of all the mesh images corresponding to all the meshes is calculated, and the sum of the quality is taken as the sum of the energy of all the mesh images corresponding to all the meshes.

In another example, the sum of the quality of all the mesh images corresponding to all the features is calculated, and the sum of the quality is taken as the sum of the energy of all the mesh images corresponding to all the features.

In another example, the sum of the luminances of all the mesh images corresponding to all the features and the sum of the qualities are calculated, and the sum of the luminances and the quality is taken as the sum of the energies of the mesh images.

In a possible implementation manner of this embodiment, the sum e (p) of energies of all N mesh images corresponding to the same feature may be further determined according to formula (3):

wherein, the

Color values of the ith grid image of feature f, said

Is that it is

Corresponding luminance value, said

Is that it is

Color value of jth grid image of the feature fSaid

The above-mentioned,

The above-mentioned

The determination may be performed according to an existing manner, and the description of the embodiment is omitted here.

Alternatively to this, the first and second parts may,

to this end, the mean luminance value of each grid is calculated, assuming that diffusion should be captured under good luminance conditions without specular highlights

Sum variance

The 5% samples with the lowest average brightness are discarded, which may be captured without enough light.

Then use

And

the most probable luminance mean and variance are extracted to define the luminance

The energy of (a).

Due to specular highlight

And

compared with the obvious difference, therefore E_lAn illuminated grid without highlights is favored.

Alternatively to this, the first and second parts may,

to represent

The original projection size of (a). In this embodiment, the projected size of the quality item can be chosen because it is a combined indicator of the camera position distance and the angular distance between the camera view and the triangle normal.

Alternatively to this, the first and second parts may,

in this embodiment, P is a shared edge of the feature f and the feature f'. While

Is that

RGB information of the middle p point. In short, Es calculates the color difference on the shared side of two adjacent triangles.

And S202, taking the grid image corresponding to each feature when the sum of the energies is minimum as a target grid image of each feature.

Specifically, according to the above manner, the sum of the energies of all the mesh images corresponding to all the features can be determined. Then, the sum of the energies is minimized, a set of grid images can be determined, and the set of grid images are used as target grid images of each feature in a one-to-one correspondence mode.

For example, the above equation (3) is minimized, and the mesh image corresponding to each feature in the equation (3) at this time is taken as the target mesh image for each feature.

According to the method, when the target grid image is determined, the influence of the color difference of the adjacent patches of the grid is considered, so that the determined target grid image is more in line with the actual requirement.

In some embodiments, as shown in fig. 7, the aligning, by the S1002, N-1 mesh images of the N mesh images except for the target mesh image with the target mesh image may specifically include:

s301, determining the similar energy values of the rest grid images of the grid and the target grid image.

Specifically, in this embodiment, when aligning the remaining N-1 mesh images of the same feature with the target mesh image of the feature, it is possible to ensure that the remaining N-1 mesh images are aligned with the target mesh image by determining the similar energy values of the remaining mesh images of the feature and the target mesh image and maximizing the similar energy values.

The present embodiment does not limit the specific way of determining the similar energy values of the remaining N-1 mesh image and the target mesh image of the feature.

In one example, the above S301 may be to determine the similar energy value E of each of the N-1 mesh images to the target mesh image according to formula (4)_f(D^f,t)

Wherein, D is^fA target mesh image of the feature f, the

Is in the original grid imageWith a 2D shift in space of t ═ (t)_x，t_y) The resampling model of (1). Since the similarity comparison needs to be performed bypassing the mirror information, the present embodiment can use the mutual information metric as MI. An appropriate D can be calculated by alternately searching for t according to the above equation (4)^f。

In another example, the above S301 may be to determine the similar energy value E of each of the N-1 mesh images to the target mesh image according to equation (5)_f(D^f,t)

Wherein, D is^fA target mesh image of the feature f, the

Color value of ith grid image of feature f, t_iThe translation amount of the ith grid image of the feature f, t₀Is a preset value.

t₀For avoiding zero offset and adjusting t_iBy a weight at a distance limit t_maxSearch t greedily inside_iTo solve this problem.

Optionally, t₀May be (15, 15), t_maxSet to 3 pixels.

S302, when the similar energy value is maximum, determining that each grid image in the N-1 grid images is aligned with the target grid image.

According to the steps, the similar energy value E of the rest N-1 grid images with the same characteristic and the target grid image is determined_f(D^fT). When the similar energy value is maximum, it may be determined that the remaining N-1 mesh images of the feature are aligned with the target mesh image.

In the embodiment, the rest grid images of the grid are aligned with the target grid image by determining the similar energy values of the rest N-1 grid images of the features and the target grid image, so that the alignment reliability and efficiency are improved.

Fig. 8 is a schematic structural diagram of a device for generating a three-dimensional face image according to an embodiment of the present invention. On the basis of the above embodiment, as shown in fig. 8, the apparatus 100 for generating a three-dimensional face image of the present embodiment may include:

an obtaining module 110, configured to obtain N two-dimensional face images at different viewing angles, where N is an integer greater than 1;

a dividing module 120, configured to perform mesh division on the N two-dimensional face images to obtain N mesh images corresponding to each feature on the face;

the sampling module 130 is configured to sample the N mesh images corresponding to each feature, so that pixels of the mesh images in the N mesh images corresponding to each feature are the same, and viewing angle intervals are equal;

a determining module 140, configured to decompose and compress the sampled N grid images corresponding to all the features, and determine surface light field data of the face;

and the model obtaining module 150 is configured to obtain a three-dimensional face image according to the surface light field data and the N two-dimensional face images.

The device for generating a three-dimensional face image according to the embodiment of the present invention may be used to implement the technical solution of the above-described method embodiment, and the implementation principle and technical effect are similar, which are not described herein again.

Fig. 9 is a schematic structural diagram of a three-dimensional face image generation apparatus according to a second embodiment of the present invention. On the basis of the foregoing embodiment, as shown in fig. 9, the apparatus 100 for generating a three-dimensional face image according to this embodiment may further include an alignment module 130:

an alignment module 160, configured to align the N grid images corresponding to the same feature.

In a possible implementation manner of this embodiment, if the same pixel point on the N two-dimensional face images is a feature of the face, the dividing module 120 is specifically configured to perform mesh division on the N two-dimensional images according to the pixel point to obtain N mesh images corresponding to each pixel point.

In another possible implementation manner of this embodiment, if the face includes M preset features, the dividing module 120 is specifically configured to perform mesh division on the N two-dimensional images according to the M preset features of the face, and obtain N mesh images corresponding to each feature.

In another possible implementation manner of this embodiment, the model obtaining module 150 is specifically configured to obtain a three-dimensional model of the human face according to the N two-dimensional human face images; and rendering the three-dimensional model by using the surface light field data to obtain a rendered three-dimensional face image.

Fig. 10 is a schematic structural diagram of a three-dimensional face image generation apparatus according to a third embodiment of the present invention. On the basis of the above embodiment, as shown in fig. 10, the alignment module 160 of this embodiment may include:

an obtaining unit 161, configured to obtain one grid image from N grid images corresponding to the same feature as a target grid image;

an alignment unit 162 configured to align N-1 mesh images of the N mesh images, excluding the target mesh image, with the target mesh image.

In a possible implementation manner of this embodiment, the obtaining unit 161 is configured to use a grid image with the largest diffusion color value in N grid images corresponding to the same feature as a target grid image of the same feature.

In another possible implementation manner of this embodiment, the obtaining unit 161 is specifically configured to determine a sum of energies of all N mesh images corresponding to the same feature; and taking the grid image corresponding to each feature when the sum of the energies is minimum as a target grid image of each feature.

In another possible implementation manner of this embodiment, the obtaining unit 161 is specifically configured to:

according to the formula

wherein, the

Color values of the ith grid image of feature f, said

Is that it is

Corresponding luminance value, said

Is that it is

Color values of the jth grid image of the feature f

In another possible implementation manner of this embodiment, the alignment unit 162 is specifically configured to:

In another possible implementation manner of this embodiment, the alignment unit 162 specifically includes:

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

In another possible implementation manner of this embodiment, the aligning unit 162 further specifically includes:

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

Optionally, the mesh is a triangular mesh.

Fig. 11 is a schematic structural diagram of a device for generating a three-dimensional face image according to an embodiment of the present invention, and as shown in fig. 11, the device 200 for generating a three-dimensional face image according to the embodiment includes:

a memory 220 for storing a computer program;

the processor 230 is configured to execute the computer program to implement the method for generating a three-dimensional face image, which has similar implementation principles and technical effects and is not described herein again.

Fig. 12 is a schematic structural diagram of a device for generating a three-dimensional face image according to an embodiment of the present invention, and as shown in fig. 12, the device 300 for generating a three-dimensional face image includes: a communicatively coupled camera 310 and a processor 320,

the camera 310 is configured to acquire N two-dimensional face images at different viewing angles, where N is an integer greater than 1;

the processor 320 is configured to perform mesh division on the N two-dimensional face images to obtain N mesh images corresponding to each feature on the face, and sample the N mesh images corresponding to each feature, so that pixels of the mesh images in the N mesh images corresponding to each feature are the same, and viewing angle intervals are equal; decomposing and compressing the N sampled grid images corresponding to all the features to determine surface light field data of the face; and obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images.

Optionally, the apparatus 300 for generating a three-dimensional face image further includes a memory 330, and the memory 330 is used for storing a computer program. The processor 320 reads the computer program from the memory 330 and executes the computer program.

In one possible implementation, before sampling the N mesh images corresponding to each feature, the processor 320 is further configured to:

and aligning the N grid images corresponding to the same characteristic.

In another possible implementation manner, if the same pixel point on the N two-dimensional face images is a feature of the face, the processor 320 is configured to:

In another possible implementation manner, if the face includes preset M features, the processor 320 is configured to:

In another possible implementation manner, the processor 320 is specifically configured to:

according to the formula

wherein, the

Color values of the ith grid image of feature f, said

Is that it is

Corresponding luminance value, said

Is that it is

Color values of the jth grid image of the feature f

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

according to the formula

Wherein, D is^fA target mesh image of the feature f, the

Optionally, the mesh is a triangular mesh.

Further, when at least a part of the functions of the method for generating a three-dimensional face image according to the embodiment of the present invention are implemented by software, an embodiment of the present invention further provides a computer storage medium, which is used for storing computer software instructions for generating the three-dimensional face image, and when the computer storage medium runs on a computer, the computer storage medium enables the computer to execute various possible methods for generating a three-dimensional face image according to the above-mentioned method embodiments. The processes or functions described in accordance with the embodiments of the present invention may be generated in whole or in part when the computer-executable instructions are loaded and executed on a computer. The computer instructions may be stored on a computer storage medium or transmitted from one computer storage medium to another via wireless (e.g., cellular, infrared, short-range wireless, microwave, etc.) to another website site, computer, server, or data center. The computer storage media may be any available media that can be accessed by a computer or a data storage device, such as a server, data center, etc., that incorporates one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., SSD), among others.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for generating a three-dimensional face image is characterized by comprising the following steps:

obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images;

before sampling the N mesh images corresponding to each feature, the method further includes:

and aligning the N grid images corresponding to the same characteristic.

2. The method according to claim 1, wherein if the same pixel point on the N two-dimensional face images is a feature of the face, the mesh division is performed on the N two-dimensional face images to obtain N mesh images corresponding to each feature on the face, and the method includes:

3. The method according to claim 1, wherein if the face includes M preset features, the mesh-dividing the N two-dimensional face images to obtain N mesh images corresponding to each feature on the face comprises:

4. The method of claim 1, wherein obtaining a three-dimensional face image from the surface light field data and the N two-dimensional face images comprises:

5. The method of claim 1, wherein aligning the N mesh images corresponding to the same feature comprises:

6. The method according to claim 5, wherein the obtaining one grid image from the N grid images corresponding to the same feature as the target grid image comprises:

7. The method according to claim 5, wherein the obtaining one grid image from the N grid images corresponding to the same feature as the target grid image comprises:

8. The method of claim 7, wherein determining the sum of the energies of all of the N mesh images corresponding to the same feature comprises:

according to the formula

wherein, the P_i ^fColor values of the ith grid image of feature f, E_l(P_i ^f) Is said P_i ^fCorresponding brightness value, said E_q(P_i ^f) Is said P_i ^fCorresponding sample masses, said f' being a neighboring grid of said feature f, said

Color values of the jth grid image of the feature f

9. The method according to claim 5, wherein said aligning N-1 mesh images of the N mesh images other than the target mesh image with the target mesh image comprises:

10. The method of claim 9, wherein the determining the similar energy value of each of the N-1 mesh images to the target mesh image comprises:

according to the formula

Wherein, D is^fIs a target mesh image of feature f, said P_i ^fColor value of ith grid image of feature f, t_iAnd f, the translation amount of the ith grid image of the feature f, t is shift, phi () is a resampling model, and MI () is mutual information measurement.

11. The method of claim 9, wherein the determining the similar energy value of each of the N-1 mesh images to the target mesh image comprises:

according to the formula

Wherein, D is^fIs a target mesh image of feature f, said P_i ^fColor value of ith grid image of the feature f, t_iThe translation amount of the ith grid image of the feature f, t₀For a preset value, t is shift, Φ () is the resampling model, and MI () is the mutual information metric.

12. The method of claim 1, wherein the mesh is a triangular mesh.

13. An apparatus for generating a three-dimensional face image, comprising:

the model acquisition module is used for acquiring a three-dimensional face image according to the surface light field data and the N two-dimensional face images;

and the alignment module is used for aligning the N grid images corresponding to the same characteristic.

14. An apparatus for generating a three-dimensional face image, comprising:

a memory for storing a computer program;

a processor for executing the computer program to implement the method of generating a three-dimensional face image according to any one of claims 1 to 12.

15. An apparatus for generating a three-dimensional face image, comprising: a camera and a processor in communication with each other,

the processor is used for carrying out grid division on the N two-dimensional face images to obtain N grid images corresponding to each feature on the face, and sampling the N grid images corresponding to each feature, so that the pixels of the grid images in the N grid images corresponding to each feature are the same, and the visual angle intervals are equal; decomposing and compressing the N sampled grid images corresponding to all the features to determine surface light field data of the face; obtaining a three-dimensional face image according to the surface light field data and the N two-dimensional face images;

the processor is further configured to align the N mesh images corresponding to the same feature.

16. The apparatus of claim 15, wherein if the same pixel point on the N two-dimensional face images is a feature of the face, the processor is configured to:

17. The apparatus of claim 15, wherein if the face comprises M predetermined features, the processor is configured to:

18. A computer storage medium, characterized in that the storage medium has stored therein a computer program which, when executed, implements the method of generating a three-dimensional face image according to any one of claims 1-12.