
CN104408405B - Face representation and similarity calculating method - Google Patents


Info

Publication number
CN104408405B
CN104408405B (application CN201410604043.0A)
Authority
CN
China
Prior art keywords
image
face
face block
specially
facial image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410604043.0A
Other languages
Chinese (zh)
Other versions
CN104408405A (en)
Inventor
Inventor not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING CKLEADER SOFTWARE TECHNOLOGY Co Ltd
Original Assignee
BEIJING CKLEADER SOFTWARE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING CKLEADER SOFTWARE TECHNOLOGY Co Ltd filed Critical BEIJING CKLEADER SOFTWARE TECHNOLOGY Co Ltd
Priority to CN201410604043.0A priority Critical patent/CN104408405B/en
Publication of CN104408405A publication Critical patent/CN104408405A/en
Application granted granted Critical
Publication of CN104408405B publication Critical patent/CN104408405B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face representation and similarity calculation method. In an offline training stage, a labelled face training set is collected, with 50-100 facial images per person. After normalization, each image in the set is divided into 2*2 face blocks of the same size. From each face block, patches of k*k pixels are densely extracted and preprocessed to zero mean and unit variance, and convolution kernels are then learned by K-means clustering. Each face block yields K convolved images, so the whole input face yields K*2*2 convolved images; after a mean Pooling operation and a ReLU nonlinear operation on each, all response maps are stretched into a feature vector. On the stretched features, a principal component analysis (PCA) projection is learned; after PCA dimensionality reduction, a discriminant projection is learned by linear discriminant analysis (LDA), yielding a compact, robust face representation. The similarity of two images is obtained as the inner product of their face representations.

Description

Face representation and similarity calculating method
Technical field
The invention belongs to the fields of computer vision and image processing, and in particular relates to a face representation and similarity calculation method.
Background technology
Computer face recognition refers to identifying or verifying one or more faces in static or dynamic scenes, based on a known face sample library, using computer image analysis and pattern recognition techniques; feature extraction techniques are used to obtain multiple possible feature representations of a face. The technology is widely used in important applications such as public safety and identity verification, and the key to effective face recognition is fast, accurate face feature representation. Some existing face feature representation algorithms convolve the input facial image with convolution kernels to obtain a multidimensional feature vector representation of the face. However, the convolution kernels involved do not make full use of the abundant face training data, and the dimension of the feature vector hurts precision if set too small and speed if set too large, which constrains the overall accuracy of face feature representation and similarity calculation.
To overcome the above shortcomings of the prior art, the present invention proposes a face representation algorithm based on convolution kernel learning and a similarity calculation method. Convolution kernels obtained by offline learning on a training set participate in the convolution operations on the facial image, and dimensionality reduction and discriminant analysis matrices learned from the training set are used to extract discriminative face features. After a compact, robust face representation is obtained, the similarity between images can be computed quickly, achieving accurate, fast, and robust face representation and similarity calculation.
Invention content
The present invention proposes a face representation and similarity calculating method, characterized by:
(1) collecting a network face training set;
(2) training convolution kernels using the facial images in the training set;
(3) using the convolution kernels to compute feature representations of the facial images in the training set;
(4) using the convolution kernels to compute a feature representation of an input facial image;
(5) extracting discriminant features from the feature representations of the facial images in the training set;
(6) extracting discriminant features from the feature representation of the input facial image;
(7) calculating image similarity using the discriminant features.
Preferably, step (1) is specifically: C celebrities are selected, c=1,2,……,C, each celebrity c corresponding to a set of images, with between 50 and 100 images per celebrity; the union of all celebrities' images, N in total, is denoted as the training set. The images in the training set, i=1,2,……,N, embody variations in pose, illumination, and expression.
Preferably, step (2) is specifically: the facial images in the training set are aligned, each aligned facial image having a size of 128*80 pixels; each image is divided into 2*2 face blocks of identical size, each face block being an image of 64*40 pixels; all face blocks of all images are processed to obtain convolution kernels.
Preferably, the step of processing all face blocks of all images to obtain convolution kernels is specifically: the face block of image i at position j is denoted X_j^i, where i=1,2,……,N and j=1,2,3,4. On each face block, patches of k*k pixels are extracted with stride 1, so the face block is expressed as:
X_j^i = [x_1, x_2, ……, x_{m*n}], where 1<k<40, m=64-k+1, n=40-k+1. The mean of each patch is subtracted, and the variance of each feature dimension is normalized, giving the representation of the face block of image i at position j. The face blocks of all N images at position j are processed in turn, and the N resulting representations are concatenated, so the face blocks at position j of the N training images are expressed as X_j = [X_j^1, ……, X_j^N]. K-means clustering is applied to X_j, yielding K cluster centres, denoted D_j = [d_{j,1}, ……, d_{j,K}], where d_{j,t} denotes the t-th cluster centre, t=1,2,……,K; each cluster centre serves as an independent convolution kernel. The face blocks at every position j of all images are processed according to the above steps to learn convolution kernels, giving {D_j, j=1,2,3,4}.
Preferably, step (3) is specifically: after the convolution operation on the facial image, a Pooling operation and a ReLU operation are performed to obtain the feature representation.
Preferably, the convolution operation therein is specifically: each cluster centre d_{j,t} is used as a convolution kernel, inner product operations are performed, and the resulting convolution vectors are restored to two-dimensional images. The above inner product operation is carried out for all face blocks of the facial images in the training set, j=1,2,3,4, the face block at each position corresponding to the convolution kernels trained at the same position. In total 2*2*K convolved images are obtained, where the size of each convolved image is m*n.
Preferably, the Pooling operation therein is specifically: mean Pooling is used to extract invariant features. Each m*n convolved image is divided into non-overlapping grids of 4*4 pixels, and the mean of each grid is computed, giving the response map after mean Pooling. After the Pooling operation, the size of each response map is (m/4)*(n/4).
Preferably, the ReLU operation therein is specifically: negative response values in the response maps output by mean Pooling are set to 0 and positive responses are retained; the ReLU operation adds nonlinearity to the features. All response maps after the ReLU operation are stretched to give the face block representation, with feature dimension K*(m/4)*(n/4). The representations obtained for the 2*2 positions are concatenated in turn, giving the feature representation of the training-set facial image, with feature dimension A = 2*2*K*(m/4)*(n/4).
Preferably, step (4) is specifically: after the convolution operation on the input facial image, Pooling and ReLU operations are performed to obtain the feature representation.
Preferably, the convolution operation therein is specifically: the input facial image is aligned to a 128*80 pixel facial image and divided into 2*2 face blocks of identical size, each face block being an image of 64*40 pixels, the face block at position j being denoted Y_j, j=1,2,3,4. On each face block, patches of k*k pixels are extracted with stride 1, so the face block is expressed as a matrix of stretched patches, where 1<k<40, m=64-k+1, n=40-k+1. The mean of each patch is subtracted and the variance of each feature dimension is normalized, giving the representation of the j-th face block of the image. Each cluster centre d_{j,t} is used as a convolution kernel, inner product operations are performed, and the resulting convolution vectors are restored to two-dimensional images. The above inner product operation is carried out for all face blocks of the input facial image Y, j=1,2,3,4, the face block at each position corresponding to the convolution kernels trained at the same position. In total 2*2*K convolved images are obtained, where the size of each convolved image is m*n.
Preferably, the Pooling operation therein is specifically: mean Pooling is used to extract invariant features. Each m*n convolved image is divided into non-overlapping grids of 4*4 pixels, and the mean of each grid is computed, giving the response map after mean Pooling. After the Pooling operation, the size of each response map is (m/4)*(n/4).
Preferably, the ReLU operation therein is specifically: negative response values in the response maps output by mean Pooling are set to 0 and positive responses are retained; the ReLU operation adds nonlinearity to the features. All response maps after the ReLU operation are stretched to give the face block representation, with feature dimension K*(m/4)*(n/4). The representations obtained for the 2*2 positions are concatenated in turn, giving the representation of the input facial image, with feature dimension A = 2*2*K*(m/4)*(n/4).
Preferably, step (5) is specifically: principal component analysis dimensionality reduction and linear discriminant analysis are applied to the facial image feature representations to obtain discriminant features.
Preferably, the step of principal component analysis dimensionality reduction is specifically: the covariance matrix of the training-set feature representations is computed, where the mean is taken over the feature representations of all N images in the training set; the eigenvectors corresponding to the r largest eigenvalues form the projection matrix. An input feature is projected by this matrix after mean subtraction, giving its representation after principal component analysis dimensionality reduction; the feature representations of all facial images in the training set after dimensionality reduction are denoted L.
Preferably, the linear discriminant analysis is specifically: the between-class scatter S_B and within-class scatter S_W of the dimensionality-reduced feature representations are computed respectively, where the class means are the means of the dimensionality-reduced feature vectors of each person's images and the global mean is the mean of the dimensionality-reduced feature representations L of all images in the training set, with PCA_Dim = r. The projection matrix consists of the eigenvectors corresponding to the q largest eigenvalues of the generalized eigenvalue problem S_B v = λ S_W v. Discriminant feature extraction is performed with this projection, giving the discriminant features.
Preferably, step (6) is specifically: principal component analysis dimensionality reduction and linear discriminant analysis are applied to the feature representation of the input facial image to obtain its discriminant features.
Preferably, the principal component analysis dimensionality reduction therein is specifically: the feature representation of the input image is projected after mean subtraction to obtain its dimensionality-reduced representation, after which its discriminant features are calculated.
Preferably, step (7) is specifically: the discriminant features of the input image and of a training-set image are each two-norm normalized, and their inner product is computed, giving the similarity of the input image Y to the training-set facial image.
Description of the drawings
Fig. 1 is the face representation learning flowchart.
Fig. 2 shows a facial image after alignment.
Fig. 3 shows example convolution kernels learned by K-means clustering.
Fig. 4 is a schematic diagram of the fast convolution process.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with the accompanying drawings and specific embodiments.
Fig. 1 is the face representation learning flowchart disclosed in the present invention; the method is described with reference to Fig. 1.
Step 1: Training set construction and normalization
A network face data set is collected: C celebrities are selected, c=1,2,……,C, each corresponding to a set of images, with between 50 and 100 images per celebrity; the union of all these images, N in total, is denoted as the training set. The images in the training set should embody variations in pose, illumination, and expression.
Based on the eye positions, the faces are aligned; see Figure 2 for a facial image after alignment. Preferably, the aligned face size is 128*80 pixels.
Step 2:Face representation learns
(1) The aligned facial image is divided into 2*2 face blocks of the same size, corresponding to the 128*80 face representation; the size of each face block is 64*40 pixels.
The training set contains N images; the face block of image i at position j is denoted X_j^i, where i=1,2,……,N and j=1,2,3,4.
(2) Each face block is processed as follows:
On each face block, patches of k*k pixels are extracted with stride 1, so the face block can be expressed as a matrix whose columns are the stretched patches:
X_j^i = [x_1, x_2, ……, x_{m*n}]    Formula 1
where m=64-k+1, n=40-k+1; preferably k=9, so m=56, n=32.
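The dense patch extraction of Formula 1 can be sketched in a few lines of NumPy. This is an illustrative reading only, assuming row-major stretching of each window; the helper name `extract_patches` is not from the patent.

```python
import numpy as np

def extract_patches(block, k=9):
    """Densely extract k*k patches with stride 1 from one face block and
    stretch each into a column, matching Formula 1: m=H-k+1, n=W-k+1."""
    H, W = block.shape
    m, n = H - k + 1, W - k + 1
    cols = np.empty((k * k, m * n))
    idx = 0
    for r in range(m):
        for c in range(n):
            cols[:, idx] = block[r:r + k, c:c + k].ravel()
            idx += 1
    return cols

# A 64*40 face block with the preferred k=9 gives m=56, n=32 -> 1792 patches.
block = np.arange(64 * 40, dtype=float).reshape(64, 40)
X = extract_patches(block)
assert X.shape == (81, 56 * 32)
assert np.allclose(X[:, 0], block[:9, :9].ravel())
```

With k=9 each column has k*k = 81 dimensions, one column per patch position.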
The mean of each patch is subtracted, and the variance of each feature dimension is normalized, giving the representation of the j-th face block of the i-th image:
X̄_j^i    Formula 2
(3) Each image in the training set is processed in turn, and the feature matrices of the corresponding position obtained in step (2) are concatenated, giving the representation of the face blocks at position j of all N images:
X_j = [X̄_j^1, ……, X̄_j^N]    Formula 3
(4) K-means clustering is applied to X_j to learn K cluster centres, denoted:
D_j = [d_{j,1}, ……, d_{j,K}]    Formula 4
where d_{j,t} denotes the t-th cluster centre, t=1,2,……,K. Each cluster centre can be regarded as an independent convolution kernel; preferably K=32, i.e. K convolution kernels are obtained for each face block. Fig. 3 shows examples of the convolution kernels.
(5) The face blocks at all positions of all images are processed according to the above steps to learn convolution kernels, giving D1 to D4.
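The preprocessing and kernel-learning steps above can be sketched as follows. This is a minimal NumPy illustration under the assumption that "subtract the mean" is per patch and the variance normalization is per feature dimension; the tiny Lloyd's-iteration K-means and all function names are illustrative, not the patented code.

```python
import numpy as np

def normalize_patches(X):
    """Zero-mean each patch (column) and unit-variance each feature (row);
    one reading of the preprocessing in step 2 (2)."""
    X = X - X.mean(axis=0, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    return X / np.maximum(std, 1e-8)

def kmeans_kernels(X, K=32, iters=10, seed=0):
    """Learn K convolution kernels as K-means centroids (plain Lloyd's
    iterations). X is (k*k, P), one column per preprocessed patch."""
    rng = np.random.default_rng(seed)
    D = X[:, rng.choice(X.shape[1], K, replace=False)].copy()
    for _ in range(iters):
        # Squared distance of every patch to every centroid -> (K, P).
        d2 = ((X[:, None, :] - D[:, :, None]) ** 2).sum(axis=0)
        labels = d2.argmin(axis=0)
        for t in range(K):
            if (labels == t).any():          # leave empty clusters unchanged
                D[:, t] = X[:, labels == t].mean(axis=1)
    return D

# With k=9, each patch stretches to 81 dims; K=32 kernels per face block.
patches = np.random.default_rng(1).standard_normal((81, 500))
D = kmeans_kernels(normalize_patches(patches), K=32)
assert D.shape == (81, 32)
```

In the method, this is repeated once per face-block position j, giving the four kernel sets D1 to D4.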
Step 3:Face characteristic extracts
The feature extraction of a face is divided into three steps: convolution, Pooling, and ReLU.
(1)Convolution operation
For an input facial image Y, the two-step preprocessing of step 2 (1) and (2) yields the representation of the j-th face block of the image Y:
Ȳ_j    Formula 5
Each cluster centre d_{j,t} is regarded as a convolution kernel, and the convolution operation is carried out; this convolution can be regarded as an inner product operation, giving
C_{j,t} = d_{j,t}^T Ȳ_j    Formula 6
After the inner product, the stretched vector is "restored" to a two-dimensional image; Fig. 4 is a schematic diagram of this convolution operation. The input facial image is divided into 2*2 face blocks, and the above convolution is carried out on all face blocks, j=1,2,3,4. The face block at each position performs inner products with the convolution kernels of the same position, i.e. position 1 corresponds to D1 and position 4 to D4. Thus for the input image Y, 2*2*K = 2*2*32 = 128 convolved images are obtained in total, where the size of each convolved image is m*n = 56*32.
(2) Pooling operation
Invariant features are extracted by a mean Pooling operation. First, each 56*32 convolved image is divided into non-overlapping grids of 4*4 pixels, and the mean of each grid is computed, giving the response map after mean Pooling. After the Pooling operation, the size of each response map is 14*8.
(3) ReLU operation
The ReLU operation sets the negative response values in the response maps output by mean Pooling to 0 and retains the positive responses; the ReLU operation adds nonlinearity to the features.
For a face block of the input image, the face representation obtained by stretching all response maps after the ReLU operation has feature dimension K*(m/4)*(n/4) = 32*14*8. The representations obtained at all positions of the input facial image Y are concatenated, giving the representation of the input facial image Y, with feature dimension A = 2*2*K*(m/4)*(n/4) = 14336.
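The convolution-as-inner-product, mean Pooling, and ReLU steps above can be sketched for one face block as follows (illustrative NumPy only; random arrays stand in for real data, and the shapes follow the preferred settings k=9, K=32, so one face block contributes 32*14*8 = 3584 features):

```python
import numpy as np

def conv_as_inner_product(patches, D):
    """Inner products of each kernel with every stretched patch; each row of
    the result is one convolved image, still stretched to length m*n."""
    return D.T @ patches                      # (K, m*n)

def mean_pool(resp, m, n, cell=4):
    """Average non-overlapping cell*cell grids of an m*n response image."""
    return resp.reshape(m // cell, cell, n // cell, cell).mean(axis=(1, 3))

def relu(x):
    """Set negative responses to 0, keep positive ones."""
    return np.maximum(x, 0.0)

# Preferred settings: k=9 -> m=56, n=32; K=32 kernels for this face block.
m, n, K = 56, 32, 32
patches = np.random.default_rng(2).standard_normal((81, m * n))
D = np.random.default_rng(3).standard_normal((81, K))
feats = np.concatenate(
    [relu(mean_pool(r, m, n)).ravel()
     for r in conv_as_inner_product(patches, D)])
assert feats.shape == (K * (m // 4) * (n // 4),)   # 32*14*8 = 3584
assert (feats >= 0).all()                          # ReLU leaves no negatives
```

Concatenating the four block features gives the 2*2*3584 = 14336-dimensional representation of the text.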
(4) All images in the training set are processed by the two-step preprocessing of step 2 (1) and (2), giving the representation X̄_j^i of the j-th face block of each image.
Each cluster centre d_{j,t} is regarded as a convolution kernel, and the convolution operation is carried out; this convolution can be regarded as an inner product operation, giving
C_{j,t}^i = d_{j,t}^T X̄_j^i    Formula 7
After the inner product, the stretched vector is "restored" to a two-dimensional image; Fig. 4 is a schematic diagram of this convolution operation. Each facial image is divided into 2*2 face blocks, and the above convolution is carried out on all face blocks, j=1,2,3,4. The face block at each position performs inner products with the convolution kernels of the same position, i.e. position 1 corresponds to D1 and position 4 to D4. Thus for each image, 2*2*K = 2*2*32 = 128 convolved images are obtained in total, where the size of each convolved image is m*n = 56*32.
(5) Pooling operation
Invariant features are extracted by a mean Pooling operation. First, each 56*32 convolved image is divided into non-overlapping grids of 4*4 pixels, and the mean of each grid is computed, giving the response map after mean Pooling. After the Pooling operation, the size of each response map is 14*8.
(6) ReLU operation
The ReLU operation sets the negative response values in the response maps output by mean Pooling to 0 and retains the positive responses, adding nonlinearity to the features.
For a face block of a training-set image, the face block representation obtained by stretching all response maps after the ReLU operation has feature dimension K*(m/4)*(n/4) = 32*14*8. The representations obtained at all positions of each training-set facial image are concatenated, giving the representation of the training-set facial image, with feature dimension A = 2*2*K*(m/4)*(n/4) = 14336.
Step 4:Differentiate feature extraction
Discriminant feature extraction is performed on the face representation f obtained in step 3, in two steps: principal component analysis dimensionality reduction and linear discriminant analysis.
(1) Principal component analysis dimensionality reduction
The covariance matrix of the training-set feature representations is computed:
Cov = (1/N) Σ_{i=1}^{N} (f_i - f̄)(f_i - f̄)^T    Formula 8
where f̄ is the mean of the feature vectors of all N images in the training set.
Let W_PCA = [w_1, ……, w_r]    Formula 9
where w_1, ……, w_r are the eigenvectors corresponding to the r largest eigenvalues of Cov; preferably PCA_Dim = r = 2000. An input feature f is reduced by principal component analysis as
g = W_PCA^T (f - f̄)    Formula 10
The features of all facial images in the training set after dimensionality reduction are denoted L.
(2) Linear discriminant analysis
The between-class scatter and within-class scatter of the dimensionality-reduced features are computed respectively:
S_B = Σ_c N_c (μ_c - μ)(μ_c - μ)^T    Formula 11
S_W = Σ_c Σ_{i∈c} (g_i - μ_c)(g_i - μ_c)^T    Formula 12
where μ_c denotes the mean of the dimensionality-reduced feature vectors of the images belonging to person c, and μ is the mean of the dimensionality-reduced feature vectors L of all N images in the training set:
μ = (1/N) Σ_{i=1}^{N} g_i    Formula 13
The projection matrix W_LDA consists of the eigenvectors corresponding to the q largest eigenvalues of the following generalized eigenvalue problem:
S_B v = λ S_W v    Formula 14
Preferably, q = 800 is set; discriminant feature extraction is performed on g:
h = W_LDA^T g    Formula 15
Two-norm normalization is applied to h:
ĥ = h / ||h||_2    Formula 16
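Step 4 can be sketched as follows. This is an illustrative NumPy version with toy dimensions in place of A=14336, r=2000, q=800: PCA via an eigendecomposition of the covariance matrix, and the generalized eigenproblem of Formula 14 solved via a regularized standard eigenproblem, which is an implementation choice, not from the patent.

```python
import numpy as np

def pca_fit(F, r):
    """Learn the top-r PCA projection from training features F of shape
    (A, N), as in Formulas 8-9; returns the projection and the mean."""
    mu = F.mean(axis=1, keepdims=True)
    C = (F - mu) @ (F - mu).T / F.shape[1]          # covariance, Formula 8
    vals, vecs = np.linalg.eigh(C)
    W = vecs[:, np.argsort(vals)[::-1][:r]]         # r largest eigenvalues
    return W, mu

def lda_fit(G, labels, q, reg=1e-6):
    """Build S_B and S_W (Formulas 11-12) and keep the top-q eigenvectors of
    the generalized eigenproblem S_B v = lambda S_W v (Formula 14)."""
    d = G.shape[0]
    mu = G.mean(axis=1, keepdims=True)
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(labels):
        Gc = G[:, labels == c]
        mc = Gc.mean(axis=1, keepdims=True)
        Sb += Gc.shape[1] * (mc - mu) @ (mc - mu).T  # Formula 11
        Sw += (Gc - mc) @ (Gc - mc).T                # Formula 12
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + reg * np.eye(d), Sb))
    order = np.argsort(vals.real)[::-1][:q]
    return vecs[:, order].real

# Toy data: 10-dim features, 40 images of 4 people (10 each).
rng = np.random.default_rng(0)
F = rng.standard_normal((10, 40))
W_pca, mu = pca_fit(F, 3)
G = W_pca.T @ (F - mu)                               # Formula 10
labels = np.repeat(np.arange(4), 10)
W_lda = lda_fit(G, labels, 2)
h = W_lda.T @ G                                      # Formula 15
assert W_pca.shape == (10, 3) and h.shape == (2, 40)
```

At full scale the covariance and scatter matrices are large (14336*14336 before PCA), which is why the PCA step precedes LDA.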
Step 5: Similarity calculation
For an input image Y, its feature representation is obtained by step 3 and substituted into Formula 10 to obtain g_Y; g_Y is substituted into Formula 15 to obtain h_Y, which is substituted into Formula 16 for two-norm normalization:
ĥ_Y = h_Y / ||h_Y||_2    Formula 17
The similarity is calculated based on the cosine distance; since the features have been two-norm normalized, only the inner product of the features is needed:
s = ĥ_Y^T ĥ_i    Formula 18
giving the similarity of the input image Y to training-set facial image i.
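Because both discriminant features are two-norm normalized, the cosine similarity of step 5 reduces to a plain inner product, as a short sketch shows (names and toy vectors are illustrative):

```python
import numpy as np

def l2_normalize(h):
    """Two-norm normalization, Formulas 16-17."""
    return h / np.linalg.norm(h)

def similarity(h_query, H_gallery):
    """Cosine similarity reduces to an inner product once both sides are
    two-norm normalized (Formula 18). H_gallery has one column per image."""
    return H_gallery.T @ h_query

qvec = l2_normalize(np.array([3.0, 4.0]))
gallery = np.stack([l2_normalize(np.array([3.0, 4.0])),
                    l2_normalize(np.array([-4.0, 3.0]))], axis=1)
scores = similarity(qvec, gallery)
assert np.allclose(scores, [1.0, 0.0])   # identical vs orthogonal feature
```

Scores lie in [-1, 1], with 1 meaning identical normalized representations.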
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope; if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them.

Claims (15)

1. A face representation and similarity calculating method, characterized by:
(1) collecting a network face training set;
(2) training convolution kernels using the facial images in the training set;
(3) using the convolution kernels to compute feature representations of the facial images in the training set;
(4) using the convolution kernels to compute a feature representation of an input facial image;
(5) extracting discriminant features from the feature representations of the facial images in the training set;
(6) extracting discriminant features from the feature representation of the input facial image;
(7) calculating image similarity using the discriminant features;
wherein step (1) is specifically: selecting C celebrities, c=1,2,……,C, each celebrity c corresponding to a set of images, with between 50 and 100 images per celebrity; the union of all celebrities' images, N in total, is denoted as the training set; the images in the training set, i=1,2,……,N, embody variations in pose, illumination, and expression;
wherein step (2) is specifically: aligning the facial images in the training set, each aligned facial image having a size of 128*80 pixels; dividing each image into 2*2 face blocks of identical size, each face block being an image of 64*40 pixels; and processing all face blocks of all images to obtain convolution kernels;
wherein the step of processing all face blocks of all images to obtain convolution kernels is specifically: the face block of image i at position j is denoted X_j^i, where i=1,2,……,N and j=1,2,3,4; on each face block, patches of k*k pixels are extracted with stride 1, so the face block is expressed as:
X_j^i = [x_1, x_2, ……, x_{m*n}], where 1<k<40, m=64-k+1, n=40-k+1; the mean of each patch is subtracted and the variance of each feature dimension is normalized, giving the representation of the face block of image i at position j; the face blocks of all N images at position j are processed in turn and the N resulting representations are concatenated, so the face blocks at position j of the N training images are expressed as X_j = [X̄_j^1, ……, X̄_j^N]; K-means clustering is applied to X_j, yielding K cluster centres, denoted D_j = [d_{j,1}, ……, d_{j,K}], where d_{j,t} denotes the t-th cluster centre, t=1,2,……,K; each cluster centre serves as an independent convolution kernel; the face blocks at every position j of all images are processed according to the above steps to learn convolution kernels, giving {D_j, j=1,2,3,4}.
2. The method as described in claim 1, wherein step (3) is specifically: performing a Pooling operation and a ReLU operation after the convolution operation on the facial image to obtain the feature representation.
3. The method as claimed in claim 2, wherein the convolution operation is specifically: each cluster centre d_{j,t} is used as a convolution kernel, inner product operations are performed, and the resulting convolution vectors are restored to two-dimensional images; the above inner product operation is carried out for all face blocks of the facial images in the training set, j=1,2,3,4, the face block at each position corresponding to the convolution kernels trained at the same position; in total 2*2*K convolved images are obtained, where the size of each convolved image is m*n.
4. The method as claimed in claim 3, wherein the Pooling operation is specifically: mean Pooling is used to extract invariant features; each m*n convolved image is divided into non-overlapping grids of 4*4 pixels and the mean of each grid is computed, giving the response map after mean Pooling; after the Pooling operation, the size of each response map is (m/4)*(n/4).
5. The method as claimed in claim 4, wherein the ReLU operation is specifically: negative response values in the response maps output by mean Pooling are set to 0 and positive responses are retained, the ReLU operation adding nonlinearity to the features; all response maps after the ReLU operation are stretched to give the face block representation, with feature dimension K*(m/4)*(n/4); the representations of the 2*2 positions are concatenated in turn, giving the feature representation of the training-set facial image, with feature dimension A = 2*2*K*(m/4)*(n/4).
6. The method as described in claim 1, wherein step (4) is specifically: performing Pooling and ReLU operations after the convolution operation on the input facial image to obtain its feature representation.
7. The method as claimed in claim 6, wherein the convolution operation is specifically: the input facial image is aligned to a 128*80 pixel facial image and divided into 2*2 face blocks of identical size, each face block being an image of 64*40 pixels, the face block at position j being denoted Y_j, j=1,2,3,4; on each face block, patches of k*k pixels are extracted with stride 1, so the face block is expressed as a matrix of stretched patches, where 1<k<40, m=64-k+1, n=40-k+1; the mean of each patch is subtracted and the variance of each feature dimension is normalized, giving the representation of the j-th face block of the image; each cluster centre d_{j,t} is used as a convolution kernel, inner product operations are performed, and the resulting convolution vectors are restored to two-dimensional images; the above inner product operation is carried out for all face blocks of the input facial image Y, j=1,2,3,4, the face block at each position corresponding to the convolution kernels trained at the same position; in total 2*2*K convolved images are obtained, where the size of each convolved image is m*n.
8. The method of claim 7, wherein the Pooling operation is specifically: mean Pooling is used to extract invariant features; each m*n convolved image is divided into non-overlapping grids of 4*4 pixels and the mean of each grid is computed, giving the response map after mean Pooling; after the Pooling operation, the size of each response map is (m/4)*(n/4).
9. The method as claimed in claim 8, wherein the ReLU operation is specifically: negative response values in the response maps output by mean Pooling are set to 0 and positive responses are retained, the ReLU operation adding nonlinearity to the features; all response maps after the ReLU operation are stretched to give the face block representation, with feature dimension K*(m/4)*(n/4); the representations of the 2*2 positions are concatenated in turn, giving the representation of the input facial image, with feature dimension A = 2*2*K*(m/4)*(n/4).
10. method as claimed in claim 9, wherein step (5) are specially:To facial image character representation by principal component point Analysis dimensionality reduction calculates and linear discriminant analysis obtains differentiating feature.
11. the step of method as claimed in claim 10, wherein principal component analysis dimensionality reduction calculate is specially:It is special to calculate training set Levy the covariance matrix represented, whereinIt is the mark sheet of all N width images in training set ShowMean value, enable, wherein It is correspondingThe corresponding feature vector of preceding r maximum eigenvalue, for input feature vector, after principal component analysis dimensionality reduction It arrives,, the character representation of the face images in training set is by dimensionality reduction postscript
12. The method as claimed in claim 10, wherein the linear discriminant analysis is specifically: compute the between-class scatter S_b and the within-class scatter S_w of the dimension-reduced feature representations, where S_b = Σ_c N_c (μ_c − μ)(μ_c − μ)^T and S_w = Σ_c Σ_{i belonging to person c} (y_i − μ_c)(y_i − μ_c)^T; here μ_c is the mean of the PCA-reduced feature vectors of the images belonging to the c-th person, μ is the mean of the feature representations y_i of all images in the training set, and the features have dimension PCA_Dim = r. Let W_lda = [v_1, ..., v_q], where v_1, ..., v_q are the eigenvectors corresponding to the q largest eigenvalues of the generalized eigenvalue problem S_b v = λ S_w v. Discriminant feature extraction is performed on y, giving the discriminant feature z = W_lda^T y.
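The PCA and LDA training steps of claims 11 and 12 can be sketched as below. The function names and the small ridge added to S_w (for numerical stability when classes have few samples) are assumptions of this sketch, not part of the patent; `np.cov` uses the N−1 divisor rather than the claim's 1/N, which leaves the eigenvectors unchanged:

```python
import numpy as np

def fit_pca(F, r):
    """F: (N, A) training features. Return the mean and top-r eigenbasis."""
    mu = F.mean(axis=0)
    C = np.cov(F - mu, rowvar=False)        # (A, A) covariance matrix
    vals, vecs = np.linalg.eigh(C)          # eigh returns ascending order
    W = vecs[:, ::-1][:, :r]                # keep the r largest eigenvalues
    return mu, W

def fit_lda(Y, labels, q):
    """Y: (N, r) PCA-reduced features. Solve S_b v = lambda S_w v and keep
    the eigenvectors of the q largest eigenvalues."""
    labels = np.asarray(labels)
    mu = Y.mean(axis=0)
    r = Y.shape[1]
    Sb = np.zeros((r, r))
    Sw = np.zeros((r, r))
    for c in np.unique(labels):
        Yc = Y[labels == c]
        d = (Yc.mean(axis=0) - mu)[:, None]
        Sb += len(Yc) * (d @ d.T)           # between-class scatter
        Dc = Yc - Yc.mean(axis=0)
        Sw += Dc.T @ Dc                     # within-class scatter
    # Reduce the generalized problem to an ordinary eigenproblem; the
    # small ridge on S_w keeps the solve well-conditioned
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(r), Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:q]]
```

At test time an input feature f is then projected as z = W_lda^T W_pca^T (f − f̄), matching claims 12 and 14.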
13. The method as claimed in claim 12, wherein step (6) is specifically: the feature representation of the input facial image is reduced in dimension by principal component analysis and its discriminant feature is obtained by linear discriminant analysis.
14. The method as claimed in claim 13, wherein the principal component analysis dimensionality reduction computation specifically comprises: for the feature representation f of the input image, compute y = W_pca^T (f − f̄); after obtaining y, compute the discriminant feature z = W_lda^T y.
15. The method as claimed in claim 13, wherein step (7) is specifically: the discriminant feature z of the input image is two-norm normalized to obtain ẑ = z / ||z||_2, and the discriminant feature z_i of a training-set facial image is two-norm normalized to obtain ẑ_i = z_i / ||z_i||_2; the inner product ẑ^T ẑ_i is then computed, giving the similarity between the input image and the training-set facial image.
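The similarity of claim 15 is the cosine similarity of the two discriminant features: two-norm normalization followed by an inner product. A minimal sketch (the function name is an illustration):

```python
import numpy as np

def similarity(z_query, z_gallery):
    """Two-norm normalize both discriminant features, then take the
    inner product; the result lies in [-1, 1]."""
    a = z_query / np.linalg.norm(z_query)
    b = z_gallery / np.linalg.norm(z_gallery)
    return float(a @ b)
```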
CN201410604043.0A 2014-11-03 2014-11-03 Face representation and similarity calculating method Expired - Fee Related CN104408405B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410604043.0A CN104408405B (en) 2014-11-03 2014-11-03 Face representation and similarity calculating method

Publications (2)

Publication Number Publication Date
CN104408405A CN104408405A (en) 2015-03-11
CN104408405B true CN104408405B (en) 2018-06-15

Family

ID=52646036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410604043.0A Expired - Fee Related CN104408405B (en) 2014-11-03 2014-11-03 Face representation and similarity calculating method

Country Status (1)

Country Link
CN (1) CN104408405B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104881798A (en) * 2015-06-05 2015-09-02 北京京东尚科信息技术有限公司 Device and method for personalized search based on commodity image features
CN105046272B (en) * 2015-06-29 2018-06-19 电子科技大学 A kind of image classification method based on succinct non-supervisory formula convolutional network
CN106372073A (en) * 2015-07-21 2017-02-01 北京大学 Mathematical formula retrieval method and apparatus
CN106611177A (en) * 2015-10-27 2017-05-03 北京航天长峰科技工业集团有限公司 Big data-based image classification method
CN105426914B (en) * 2015-11-19 2019-03-15 中国人民解放军信息工程大学 A Location Recognition-Oriented Image Similarity Detection Method
CN106803054B (en) * 2015-11-26 2019-04-23 腾讯科技(深圳)有限公司 Faceform's matrix training method and device
CN105590115A (en) * 2015-12-18 2016-05-18 北京畅景立达软件技术有限公司 Visual-angle-invariant human feature expression method
CN106355202A (en) * 2016-08-31 2017-01-25 广州精点计算机科技有限公司 Image feature extraction method based on K-means clustering
CN108805151B (en) * 2017-05-05 2021-05-25 杭州微禾迅科技有限公司 Image classification method based on depth similarity network
CN107609481B (en) * 2017-08-14 2020-11-20 百度在线网络技术(北京)有限公司 Method, apparatus and computer storage medium for generating training data for face recognition
CN109299746A (en) * 2018-10-22 2019-02-01 广州星唯信息科技有限公司 A kind of segment chord similar degree calculation method
CN112597862B (en) * 2020-12-16 2024-07-19 上海芯翌智能科技有限公司 Method and equipment for cleaning face data
CN116246311B (en) * 2022-10-25 2025-11-21 南京审计大学 Face recognition method based on multi-layer cascade boundary median discrimination projection

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008081152A3 (en) * 2006-12-22 2008-09-18 France Telecom System and method for recognizing an object in an image
CN101470896A (en) * 2007-12-24 2009-07-01 南京理工大学 Automotive target flight mode prediction technique based on video analysis
CN103020261A (en) * 2012-12-24 2013-04-03 南京邮电大学 Image automatic marking method
CN103530657A (en) * 2013-09-26 2014-01-22 华南理工大学 Deep learning human face identification method based on weighting L2 extraction

Also Published As

Publication number Publication date
CN104408405A (en) 2015-03-11

Similar Documents

Publication Publication Date Title
CN104408405B (en) Face representation and similarity calculating method
Zheng et al. Recent advances of deep learning for sign language recognition
Rekha et al. Shape, texture and local movement hand gesture features for indian sign language recognition
Sun et al. DL-SFA: Deeply-learned slow feature analysis for action recognition
Sivan et al. Computer vision based assistive technology for blind and visually impaired people
Dominio et al. Combining multiple depth-based descriptors for hand gesture recognition
Hasanuzzaman et al. Robust and effective component-based banknote recognition for the blind
Pigou et al. Gesture and sign language recognition with temporal residual networks
Ansari et al. Nearest neighbour classification of Indian sign language gestures using kinect camera
CN102938065B (en) Face feature extraction method and face identification method based on large-scale image data
CN104915673B (en) A kind of objective classification method and system of view-based access control model bag of words
CN112818850B (en) Cross-pose face recognition method and system based on progressive neural network and attention mechanism
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
Haque et al. Two-handed bangla sign language recognition using principal component analysis (PCA) and KNN algorithm
Zamani et al. Saliency based alphabet and numbers of American sign language recognition using linear feature extraction
Hu et al. Surveillance video face recognition with single sample per person based on 3D modeling and blurring
Mohane et al. Object recognition for blind people using portable camera
CN103336835A (en) Image retrieval method based on weight color-sift characteristic dictionary
Pradeep Kumar et al. Design and development of asl recognition by kinect using bag of features
CN108520215A (en) Single-sample face recognition method based on multi-scale joint feature encoder
Sadeghzadeh et al. Triplet loss-based convolutional neural network for static sign language recognition
Engoor et al. Occlusion-aware dynamic human emotion recognition using landmark detection
Umakanthan et al. Spatio temporal feature evaluation for action recognition
Dlshad et al. Kurdish sign language recognition system
CN108256405A (en) A kind of face identification method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100080, room 106, unit 5, Yi Sheng garden, No. 331 East Zhongguancun Road, Beijing, Haidian District

Applicant after: BEIJING CKLEADER SOFTWARE TECHNOLOGY Co.,Ltd.

Address before: 100080, room 1509, block F, Zhongguancun mansion, No. 16 Haidian street, Haidian District, Beijing

Applicant before: BEIJING CKLEADER SOFTWARE TECHNOLOGY Co.,Ltd.

GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180615

Termination date: 20211103