WO2006003270A1 - Method for face recognition using two-dimensional linear discriminant analysis - Google Patents
- Publication number
- WO2006003270A1 (PCT/FR2004/001395)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- matrix
- base
- face
- images
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/169—Holistic features and representations, i.e. based on the facial image taken as a whole
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2132—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
- G06F18/21322—Rendering the within-class scatter matrix non-singular
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
Definitions
- the present invention relates to the field of face recognition in digital images.
- Statistical dimension reduction models are constructed using a set of face images hereinafter referred to as "learning base” (or, in abbreviated notation, BA). The same person is preferentially represented several times in the learning base. We then call "class" the different representations of the face of the same person.
- the face to be recognized is compared to the faces of the learning base to assign an identity to it.
- Biometrics and video surveillance (CCTV), for example, are applications based on such authentication.
- A commonly used definition of the ACP (principal component analysis, PCA) is the one that associates, with a set of input vectors, a set of orthogonal principal axes (the "principal components") onto which the projected variance of the input vectors is maximal.
- Here, "vector" refers to a column vector. Moreover, we denote by X the vector to be projected, of size l·m. Its orthogonal projection is denoted X̂. This projection is performed onto the space of dimension k whose orthogonal basis is stored as columns in the matrix P. The matrix P is therefore of size (l·m) × k.
- the projection of the vector X is expressed by:
  X̂ = Pᵀ X (1)
- the matrix P is called the "projection matrix from the initial space into the space of the principal components".
- n denotes the number of images present in the learning base.
- S is the covariance matrix of the learning base BA, of size (l·m) × (l·m), given by the following relation, where X̄ is the mean vector of the learning base:
  S = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)(Xᵢ − X̄)ᵀ
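The PCA construction of relation (1) and of the covariance matrix S can be sketched as follows with numpy. This is a minimal illustration on synthetic data; all dimensions and the learning base here are hypothetical, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learning base: n images of l x m pixels, flattened to vectors of size l*m.
l, m, n, k = 6, 5, 20, 4
BA = rng.random((n, l * m))          # one row per image vector X_i

# Covariance matrix S of the learning base, size (l*m) x (l*m).
X_mean = BA.mean(axis=0)
S = (BA - X_mean).T @ (BA - X_mean) / n

# Principal components: eigenvectors of S for the k largest eigenvalues,
# stored as columns of the projection matrix P, size (l*m) x k.
eigvals, eigvecs = np.linalg.eigh(S)  # eigh returns eigenvalues in ascending order
P = eigvecs[:, ::-1][:, :k]

# Orthogonal projection of a vector X (relation (1)): X_hat = P^T X, size k.
X = BA[0]
X_hat = P.T @ X
```

`eigh` is used because S is symmetric; the column reversal selects the largest eigenvalues.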
- the ACP is commonly used to represent or recognize faces.
- the most well-known method of face recognition based on PCA has been proposed in the above document:
- the method requires a learning base consisting of a set of images presented as input - in the form of a vector per image.
- Each image Xᵢ, consisting of l rows and m columns of grayscale pixels, is thus reduced to a vector of size l·m by concatenating its rows of pixels.
- an ACP is performed directly on these vectors, giving a set of k principal components of the same size l·m as the initial image vectors; these are designated by the term "Eigenfaces" (in French, "visages propres").
- the number k of principal components to be retained can be fixed or determined from the eigenvalues.
- the comparison between two images of faces is made following a projection in the base of the eigenvectors according to relation (1) above.
- the two projected vectors are compared according to a measurement based on a predetermined criterion of similarity.
- the principal components constitute the subspace of dimension k minimizing the mean squared reconstruction error, defined as the L₂ distance between the learning base and its orthogonal projection onto the basis composed of the principal components.
- this basis does not necessarily offer an optimal classification of the data. Indeed, the principal components maximize the total variance of the learning base, without distinguishing the variations internal to each class from the variations between classes.
- the ADL (linear discriminant analysis, LDA) differs from the ACP in that the ADL is a supervised method: building the model requires, for each image of the learning base, both a vector and its class of membership.
- n_c is the number of images of the person corresponding to the class c and contained in the learning base,
- X̂ᵢ = Pᵀ Xᵢ is a vector of size k corresponding to the image Xᵢ projected onto the basis P according to equation (1) above.
- S b is the inter-class covariance matrix of the base
- the columns of P contain the k eigenvectors of the matrix S_w⁻¹S_b associated with the k largest eigenvalues, where S_w⁻¹ is the inverse of S_w.
- the size of the input vectors is much larger than the number of examples in the learning base (l·m ≫ n).
- the matrix S w is then singular and non-invertible.
- ACP + ADL
- the number of components to retain can then be determined in the same way as for the ACP described above.
- the classification is performed after orthogonal projection into the Fisherfaces space, in the same way as for the ACP.
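The ACP + ADL combination described above can be sketched as follows: a first PCA reduction makes S_w invertible, then the discriminant axes are the eigenvectors of S_w⁻¹S_b in the reduced space. This is a hedged toy illustration on synthetic data; the class counts and dimensions are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical base: C classes of n_c flattened image vectors each.
C, n_c, dim, d, k = 5, 4, 50, 10, 3
X = rng.random((C * n_c, dim))
labels = np.repeat(np.arange(C), n_c)

# Step 1: PCA reduction to d components (so the within-class matrix is invertible).
mu = X.mean(axis=0)
Xc = X - mu
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
P_pca = Vt[:d].T                     # dim x d
Y = Xc @ P_pca                       # projected base, (C*n_c) x d

# Step 2: LDA in the reduced space: within- and between-class scatter matrices.
mu_y = Y.mean(axis=0)
Sw = np.zeros((d, d))
Sb = np.zeros((d, d))
for c in range(C):
    Yc = Y[labels == c]
    mc = Yc.mean(axis=0)
    Sw += (Yc - mc).T @ (Yc - mc)
    Sb += n_c * np.outer(mc - mu_y, mc - mu_y)

# Columns of P_lda: eigenvectors of Sw^-1 Sb for the k largest eigenvalues.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
order = np.argsort(eigvals.real)[::-1]
P_lda = eigvecs[:, order[:k]].real   # d x k

# Combined projection from image space to the discriminant space ("Fisherfaces").
W = P_pca @ P_lda                    # dim x k
```

Note that at most C − 1 discriminant axes are meaningful, since S_b has rank C − 1.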
- the ADL gives better results than the ACP and, in particular, better handles the differences of illumination, of facial expression and of pose in the face shots.
- a model is constructed from all the images of the learning base, stored as pixel matrices with l rows and m columns.
- the model consists of k orthonormal vectors of length m such that the projection of the learning base onto this vector basis ensures maximum variance of the learning base.
- the projection of an image X of size l × m onto the matrix P of size m × k is given by the following linear relation, where X̂ is a matrix of size l × k:
  X̂ = X P (11)
- the criterion to be maximized is the following:
- S_P designates the covariance matrix (called "two-dimensional", as opposed to that of single-column vectors) of the n projected images of the learning base on the vector basis P. If we consider:
- S is the covariance matrix of the image columns and is calculated as follows:
  S = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)ᵀ (Xᵢ − X̄)
- X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ is the mean matrix, of size l × m, of the n images of the learning base.
- the criterion to be maximized according to relation (14) is called "generalized total dispersion criterion".
- the k vectors [P¹, …, Pᵏ] to be retained are the eigenvectors of the matrix S corresponding to the largest eigenvalues.
- P = [P¹, …, Pᵏ] is the projection matrix in the sense of relation (11). The projection of the image Xᵢ by P is denoted X̂ᵢ = [X̂ᵢ¹, …, X̂ᵢᵏ], where X̂ᵢʲ = Xᵢ Pʲ is a vector of length l (projection of the image Xᵢ onto the vector Pʲ).
- the number k of components to retain can be determined in the same way as for the ACP, seen above. As before, the comparison between two faces takes place in the projected space. Here, the projection of an image onto P no longer gives a vector but a matrix, so a measure of similarity between matrices X̂ᵢ of size l × k is used. The distance between the matrices X̂ᵢ and X̂ₜ can be the following sum of column-to-column L₂ distances:
  d(X̂ᵢ, X̂ₜ) = Σⱼ₌₁ᵏ ‖X̂ᵢʲ − X̂ₜʲ‖₂ (16)
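The two-dimensional PCA steps above (column covariance, projection (11), distance (16)) can be sketched as follows. The data and dimensions are synthetic placeholders, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical base of n images of size l x m, kept as matrices (no flattening).
l, m, n, k = 8, 6, 15, 3
imgs = rng.random((n, l, m))
X_bar = imgs.mean(axis=0)            # mean image, size l x m

# Two-dimensional covariance of the image columns, size m x m.
S = sum((Xi - X_bar).T @ (Xi - X_bar) for Xi in imgs) / n

# P: the k eigenvectors of S with the largest eigenvalues, size m x k.
eigvals, eigvecs = np.linalg.eigh(S)
P = eigvecs[:, ::-1][:, :k]

# Projection (relation (11)): X_hat = X P is a matrix of size l x k.
def project(Xi):
    return Xi @ P

# Distance between two projected images (relation (16)):
# sum of the L2 distances between corresponding columns.
def dist(A_hat, B_hat):
    return np.linalg.norm(A_hat - B_hat, axis=0).sum()

A_hat, B_hat = project(imgs[0]), project(imgs[1])
```

The covariance here is only m × m, which is why the 2D approach avoids the (l·m) × (l·m) matrices of the vector-based ACP.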
- the two-dimensional ACP (ACP2D), even if it does not have this disadvantage and even if it guarantees optimal conservation of the overall variance of the projected learning base, does not make it possible to distinguish variations between classes from variations within the classes.
- this technique is very suitable for reconstructing face images after compression by dimension reduction.
- like the ACP, it does not necessarily ensure good discrimination between classes.
- the present invention improves the situation.
- An object of the invention is to ensure better discrimination between the classes of the learning base.
- Another object of the invention is to provide a reduction of the calculations and the memory space required for the storage of the model built to be able to classify the images, compared to conventional discriminant methods such as ADL.
- the images are represented by matrices whose coefficients are pixel values.
- the method then comprises a pretreatment of the learning base in which:
- a matrix system representative of the variance between different classes and of the inverse of the variance within each of the classes is formed from these matrices, and a vector basis with discriminant components of the learning base is determined from this matrix system, aiming jointly to minimize the variance within each of the classes and to maximize the variance between different classes.
- the method is more robust to changes in facial expressions than the prior art ACP2D technique, and its implementation requires much less memory and computing capacity than the classical ADL technique.
- the pretreatment of the learning base comprises the following steps: - we estimate an average face of all the faces of the learning base;
- a second matrix, of inter-class covariance, corresponding to the mean of the matrices of the squared difference between the average face of each class and the average face of the learning base, is formed;
- the eigenvectors associated with a limited, chosen number of larger eigenvalues are advantageously selected.
- in a variant, the method which is the subject of the invention comprises a step, prior to the formation of the abovementioned matrix system, of applying a transposition to the matrices representing the images of the learning base. This variant leads to discriminant components denoted "2D-lines", whereas the succession of the steps above, without this transposition, leads to components denoted "2D-columns".
- the method may comprise a subsequent classification processing of a face image to be recognized, comprising the following steps:
- an identification matrix is constituted comprising vectors obtained by projecting the image of the face to be recognized on said vector base,
- a reference matrix comprising vectors obtained by the projection of a reference image on said vector base is constituted, and the identification matrix is compared with the reference matrix, according to a chosen distance criterion.
- this projection is performed by multiplying the matrix associated with the face to be recognized and / or the matrix associated with a reference face, by each of the selected eigenvectors.
- the distance criterion comprises the evaluation of a sum of the distances between the corresponding columns of the identification and reference matrices. This distance can be given by the relation (16) above but applied in the context of the invention.
- the distance criterion is set between the identification matrix and the projected average of each class of the learning base, as a reference matrix. This distance is estimated preferentially by the relation (24), given and commented on later.
- a reference image as the average image of a class.
- the reference image can instead be a real image directly drawn from the learning base.
- a possible application is the triggering of an alarm of a surveillance system if the comparison between a face to be recognized and one or more reference faces belonging to the learning base has a distance greater than a threshold.
- the reference image does not belong to the learning base but is taken from a previous shot.
- a possible application may be to know how many times a person appears on a video sequence, a previous image of that person then being the reference image.
- the method according to the invention may advantageously comprise an additional step of reconstructing the face images after compression of the image data by dimension reduction, since the choice of a limited number of discriminant components (or eigenvectors), as indicated above, allows this size reduction.
- the present invention may be implemented by a computer workstation running suitable software (or "computer program product").
- the present invention is directed to such a computer program product, intended to be stored in a memory of a processing unit, or to a removable memory medium and intended to cooperate with a reader of said processing unit.
- This program then comprises instructions for the implementation of all or part of the steps of the above method.
- FIG. 2 represents extracts from a general face database distributed online by Intelligent Multimedia Laboratory
- FIG. 3 represents extracts from the learning base (a) and extracts from the test base (b), these bases being used to test the robustness to pose,
- FIG. 4 represents extracts from the learning base (a) and from the test base (b) for testing the robustness to facial expressions, the non-neutral facial expressions being randomly distributed in the test base (two images per class) and in the learning base (two images per class), while one neutral facial expression per class has been introduced into the learning base,
- FIG. 5 represents the face recognition rates on the base of the poses described below, for the conventional ADL, with two distance criteria: the usual distance L₂ (squares) between the vectors of the projected images, and another distance, L₂-moyCl (triangles), given by the L₂ relation between the vector of the projected image to be identified and the average vector of the projected images of a class of images;
- FIG. 6 represents the face recognition rates of the base of the poses, with use of the method in the sense of the invention, for:
- FIG. 7 represents comparative CMC ("Cumulative Match Characteristic") variations of the classical ADL (dashed lines), for a criterion of distance L₂ between the vectors of the projected images, and of the ADL2D within the meaning of the invention (solid lines), for a distance criterion L₂ given by relation (16) above, for the base of the poses,
- FIG. 8 represents the face recognition rates on the base of the poses, for the ACP2D of the prior art (diamonds) and for the ADL2D within the meaning of the invention (squares), with the distance criterion L₂,
- FIG. 9 represents the face recognition rates on the base of the poses, for the ACP2D of the prior art (diamonds) and for the ADL2D within the meaning of the invention (squares), with the same distance criterion L₂-moyCl given by the relation (24) below,
- FIG. 10 represents comparative CMC variations of the ACP2D of the prior art (dashed lines) and of the ADL2D within the meaning of the invention (solid lines), for a distance criterion L₂ given by relation (16) above, for the base of the poses,
- FIG. 11 illustrates the reconstructions, within the meaning of the invention, of images of the base of the poses, with two 2D-column discriminant components (a), with three 2D-column discriminant components (b) and with twenty 2D-column discriminant components (c),
- FIG. 12 represents the face recognition rates on the base of the expressions described below, for the conventional ADL, with the distance criterion L₂ between vectors of the projected images (squares), and with the distance criterion L₂-moyCl given by the L₂ distance between the vector of the projected image to be recognized and the mean vector of the images of the same class (triangles),
- FIG. 13 represents the face recognition rates of the expression base, with use of the method in the sense of the invention, for:
- FIG. 15 represents the face recognition rates on the base of the expressions, for the ADL2D-lines (squares) and for the ACP2D of the prior art (diamonds), with the distance criterion L₂ given by relation (16) above,
- FIG. 16 represents the face recognition rates on the base of the expressions, for the ADL2D-lines (squares) and for the ACP2D of the prior art (diamonds), with the distance criterion L₂-moyCl given by the relation (24) below,
- FIG. 17 represents comparative CMC variations of the ACP2D of the prior art (dashed lines) and of the ADL2D-lines according to the invention (solid lines), for a distance criterion L₂ given by relation (16) above, for the base of the expressions,
- FIG. 18 represents an example of classification of a face image (a):
- FIG. 19 represents a reconstruction within the meaning of the invention of the images of the base of the expressions of FIG. 4 (a), with:
- Figure 20 schematically shows a device such as a computer workstation for the implementation of the method within the meaning of the present invention.
- a preferred embodiment of the present invention is a method based on a supervised statistical approach, hereinafter referred to as "two-dimensional linear discriminant analysis" (or "ADL2D").
- the ADL2D can come in two forms: the ADL2D-columns and the ADL2D-lines. These two methods can optionally be combined to cross-check the information and/or to compare the results that each of them gives.
- the ADL2D-columns consists of a method preferably comprising the main steps below.
- a learning base BA is provided, containing face images in the form of two-dimensional matrices, with matrix coefficient values preferably corresponding to gray levels in the images.
- the learning base BA contains several views of each of the persons (as indicated above, all the images representing the same person constitute a class);
- the average face of all the faces of the learning base is calculated (step 12);
- a mean face is calculated (step 13) for each CLi class
- the square matrix of 2D-column intra-class covariance (denoted COV(intra-CL)) is calculated;
- the square matrix of 2D-column inter-class covariance (denoted COV(inter-CL)), corresponding to the mean of the matrices of the squared difference between the average face of each class and the average face of the learning base, is calculated (step 15);
- the eigenvectors EIGEN(V), preferentially those associated with the largest eigenvalues, are determined (step 17) as 2D-column discriminant components.
- these vectors constitute a vector basis that will be used during the classification described below.
- a variant of the matrix multiplication step consists in calculating instead the product of the inverse of the 2D-column intra-class covariance matrix by the 2D-column inter-class covariance matrix. The eigensystem of this matrix product is then calculated as before.
- a projection of the image of the face to be recognized onto this basis (typically by multiplying the matrix of the face to be recognized by each of the eigenvectors) is carried out (step 18), the vectors thus obtained being grouped together to constitute a matrix associated with this face image; this matrix is compared (step 19) to the matrix of an image representing a reference face, calculated in the same way. It is indicated that the reference face may typically belong to the learning base.
- the comparison between the two matrices is performed using a sum of L₂ distances between the corresponding columns of these two matrices.
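The ADL2D-columns steps just described (class means, 2D intra- and inter-class covariances, eigensystem of their product, projection and column-wise comparison) can be sketched end to end as follows. All data, class counts and dimensions are synthetic and illustrative, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical learning base: C classes (persons), n_c images each, size l x m.
C, n_c, l, m, k = 4, 5, 10, 7, 3
BA = rng.random((C, n_c, l, m))

# Mean face of the whole base and of each class (steps 12-13).
global_mean = BA.mean(axis=(0, 1))             # l x m
class_means = BA.mean(axis=1)                  # C x l x m

# 2D-column intra-class and inter-class covariance matrices, size m x m.
Sw = np.zeros((m, m))
Sb = np.zeros((m, m))
for c in range(C):
    for i in range(n_c):
        D = BA[c, i] - class_means[c]
        Sw += D.T @ D
    Dc = class_means[c] - global_mean
    Sb += n_c * (Dc.T @ Dc)
Sw /= C * n_c
Sb /= C * n_c

# Discriminant components: eigenvectors of Sw^-1 Sb with the largest
# eigenvalues (the matrix-product variant of the method).
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
order = np.argsort(eigvals.real)[::-1]
P = eigvecs[:, order[:k]].real                 # m x k

# Classification: project a test face and compare, column by column, with the
# projected class means (sum of L2 distances between corresponding columns).
def classify(face):
    f_hat = face @ P
    dists = [np.linalg.norm(f_hat - cm @ P, axis=0).sum() for cm in class_means]
    return int(np.argmin(dists))

pred = classify(BA[2, 0])
```

Since S_w is only m × m, its inversion is cheap, which is the point made above about not needing a preliminary PCA step.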
- the ADL2D-lines consists of an equivalent process, but performed from transposed matrices of 2D images of faces.
- a matrix product involving intra-class and inter-class covariance matrices is evaluated to form a system of discriminant components denoted "2D-lines”.
- the general process of treatment may, advantageously but not necessarily, be concluded by a step 20 of reconstruction of the faces after dimension reduction.
- Ŝ_w is the intra-class covariance matrix of the learning base projected on P according to relation (11),
- n_c is the number of images of the class c,
- the projected matrices X̂ᵢ, X̂_c and X̂ are of size l × k.
- the matrix S_w (respectively the matrix S_b), of size m × m, is called the "2D-column intra-class covariance matrix" (respectively the "2D-column inter-class covariance matrix").
- the matrix S w is invertible. Therefore, the use of a first size reduction, as implemented in the ACP + ADL method described above, is unnecessary here.
- the two-dimensional linear discriminant analysis within the meaning of the invention can also be performed from the transposed image matrices of the learning base.
- the projected matrices X̂ are now of size m × k.
- S_w and S_b are henceforth calculated as follows:
  S_w = (1/n) Σ_c Σ_{Xᵢ∈c} (Xᵢ − X̄_c)(Xᵢ − X̄_c)ᵀ, S_b = (1/n) Σ_c n_c (X̄_c − X̄)(X̄_c − X̄)ᵀ (23)
- the process then continues as described above. It can be considered that the number m of columns of the images of the learning base is assumed to be smaller than the number n of examples present in the learning base.
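The ADL2D-lines variant amounts to feeding transposed images into the same machinery, so that the covariance matrices become l × l and the discriminant components have length l. A minimal sketch on synthetic data (hypothetical dimensions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy base of C classes with n_c images of l x m pixels; transposing each image
# turns the 2D-columns computation into the 2D-lines one.
C, n_c, l, m = 3, 4, 9, 6
BA_t = rng.random((C, n_c, l, m)).transpose(0, 1, 3, 2)   # images are now m x l

global_mean = BA_t.mean(axis=(0, 1))
class_means = BA_t.mean(axis=1)

# Same intra-/inter-class covariance construction as for 2D-columns,
# but the resulting matrices are l x l.
Sw = np.zeros((l, l))
Sb = np.zeros((l, l))
for c in range(C):
    for i in range(n_c):
        D = BA_t[c, i] - class_means[c]
        Sw += D.T @ D
    Dc = class_means[c] - global_mean
    Sb += n_c * (Dc.T @ Dc)
```

This matches the sizes reported in the experiments below, where 2D-line components have 150 pixels (the number of rows l) while 2D-column components have 130 pixels (the number of columns m).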
- the general method (using ADL2D-lines or ADL2D-columns) continues with a classification of the faces of the learning base (frame B in Figure 1).
- each image Xᵢ of the learning base is projected onto the basis in the sense of relation (11) to obtain one matrix X̂ᵢ per image, of size l × k, corresponding intuitively to the position of the image Xᵢ in the basis P of the 2D-column discriminant components.
- X̂_c = (1/n_c) Σ_{Xᵢ∈c} X̂ᵢ is the mean of the projected images of class c; the distance between a projected image X̂ and this class mean is then given by:
  d(X̂, X̂_c) = Σⱼ₌₁ᵏ ‖X̂ʲ − X̂_cʲ‖₂ (24)
- the face image to be recognized will subsequently be considered as belonging to the nearest class in the sense of the distance according to formula (24), or to the same class of membership as the nearest image in the sense of the column-wise L₂ distance.
- the reconstruction of the faces (frame C of FIG. 1), performed optionally after the size reduction, can be carried out as follows.
- the projection X̂ᵢ of the image Xᵢ and the corresponding 2D discriminant components [P¹, …, Pᵏ] (rows or columns) can be combined to reconstruct a face image in the initial image space.
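One standard way to combine a projection with its components for reconstruction, sketched here under the assumption of orthonormal components P of size m × k, is X_rec = X̂ Pᵀ; with k = m the image is recovered exactly, and smaller k gives the compressed renderings illustrated in the figures. The basis below is a random orthonormal stand-in, not the patent's discriminant components:

```python
import numpy as np

rng = np.random.default_rng(5)

l, m, k = 8, 6, 6
X = rng.random((l, m))                    # a hypothetical l x m face image

# Stand-in orthonormal basis (QR of a random matrix); keep k columns.
Q, _ = np.linalg.qr(rng.random((m, m)))
P = Q[:, :k]

# Project (relation (11)) then send back to image space.
X_hat = X @ P                             # l x k
X_rec = X_hat @ P.T                       # l x m reconstruction
```

With k < m, X_rec is only an approximation of X, which is the dimension-reduction trade-off described above.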
- a learning base and a test database are created, containing faces to be recognized.
- the Intelligent Multimedia Laboratory (IML) database downloadable from http://nova.postech.ac.kr/ was chosen.
- Figure 2 shows a sample of this face base.
- the base contains color images of one hundred and seven people, including fifty-six men and fifty-one women. Each person is represented by seventeen views:
- the base has variations of scale. Occultations such as wearing glasses, wearing a hat, or other, are also represented.
- the tests conducted use two subsets of this IML database.
- the first subset illustrates the robustness of the evaluated methods to pose (hereinafter called the "base of the poses").
- the second subset illustrates the robustness of the evaluated methods to facial expression (hereinafter referred to as the "base of the expressions").
- a learning base consisting of three hundred and twenty-one face images (three frontal views per person) was used.
- one view represents a neutral facial expression; the other two are chosen at random from the four facial expressions proposed in the base.
- the test base, on the other hand, consists of two hundred and fourteen images (two views per person), representing the remaining facial expressions.
- all the images of the learning base, like those of the test base, have been preprocessed so that the images are preferably of size 150×130 pixels, with a distance of 70 pixels between the two pupil centers.
- the right eye is located at a distance of 30 pixels along the abscissa axis and 45 pixels along the ordinate axis from the top left corner of the image.
- the images, initially in color, are simply transformed into luminance images, grayscale, without any other pretreatment.
- the considered variables are digital images, thus presenting themselves as discrete two-dimensional variables.
- the processed images are, for example, sampled on a discrete grid of 150 lines and 130 columns, with luminance values between 0 and 255.
- the construction of the ADL2D-columns model is now described. From the images of the learning base stored in the form of matrices of 150 × 130 pixels, the covariance matrices S_w and S_b given by relation (23) are calculated. In particular, the eigensystem of S_w⁻¹S_b is computed using a singular value decomposition. This yields the matrix P = [P¹, …, Pᵏ] of 2D-column discriminant components. The projections of the learning-base images can then be deduced by applying relation (11).
- the test image will simply be assigned to the nearest class c according to the chosen distance criterion.
- the similarity measures are given by the column-wise L₂ distance of relation (16) and by the other distance estimate given by relation (24).
- Class assignment and face reconstruction are as described above.
- the results of the ADL2D within the meaning of the invention are compared with the results of the conventional ADL, calculated according to the ACP + ADL combination method.
- the ACP was first conducted on the pose base. Beyond 200 principal components, the recognition rates decline substantially. It was therefore chosen to retain 200 principal components to build the classic ADL.
- L₂: vector-to-vector L₂ distance,
- L₂-moyCl: L₂ distance to the arithmetic mean of the projected vectors of each class.
- Figure 5 shows the recognition rates obtained by the ADL, compared for these two distances.
- the model is constructed using the ACP + ADL combination processing with 200 principal components.
- the best recognition rate (93.7%) is achieved for the L₂ distance with 40 discriminant components of 19500 pixels each.
- Figure 6 shows the recognition rates of the ADL2D, both ADL2D-columns and ADL2D-lines.
- the distances compared are the distance L₂ of relation (16) (denoted "L₂") and the distance according to relation (24) (denoted "L₂-moyCl").
- the best recognition rate of the ADL2D for the distance L₂ is 94.4% and is reached with only 8 components.
- for the distance L₂-moyCl, the best recognition rate is 95.8% and is reached with only 13 components.
- the conventional ADL with L₂ achieves its best recognition rate, 93.7%, which is 2.1% less than the best rate of the ADL2D.
- the ADL2D-columns gives significantly better results than the conventional ADL, and above all with a much smaller storage space required for the model.
- FIG. 7 gives the variations called "Cumulative Match Characteristic" (CMC) of the ADL2D (L₂) with 8 2D-column discriminant components and of the ADL (L₂) with 40 discriminant components.
- a CMC curve gives the recognition rates at different ranks, in the sense that a face has been recognized at rank n if an image belonging to the right class is the n-th closest in the sense of the distance L₂, after projection.
- the x-axis therefore indicates the rank n, while the y-axis indicates the cumulative recognition rates of ranks less than or equal to the rank n.
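The CMC computation described above (cumulative recognition rate over ranks) can be sketched as follows. The distance matrix and labels are invented toy values, not the patent's experimental data:

```python
import numpy as np

# dists[i, j] is the distance from test image i to gallery image j;
# labels give the class of each image.
def cmc(dists, test_labels, gallery_labels):
    ranks = []
    for i in range(dists.shape[0]):
        order = np.argsort(dists[i])                # gallery sorted by distance
        match = gallery_labels[order] == test_labels[i]
        ranks.append(int(np.argmax(match)) + 1)     # rank of first correct match
    ranks = np.asarray(ranks)
    # cumulative recognition rate at ranks 1..len(gallery)
    return np.array([(ranks <= r).mean()
                     for r in range(1, len(gallery_labels) + 1)])

# Tiny worked example: 2 test images, 3 gallery images.
dists = np.array([[0.1, 0.5, 0.9],     # test 0: nearest gallery item has class 0
                  [0.4, 0.2, 0.3]])    # test 1: nearest gallery item has class 1
curve = cmc(dists, np.array([0, 1]), np.array([0, 1, 2]))
```

By construction the curve is non-decreasing in the rank, which matches the cumulative reading of the y-axis described above.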
- FIG. 8 gives the comparative recognition rates of the ADL2D and the ACP2D as a function of the number of components selected, with an L₂ distance classification. For any number of components k fixed between 1 and 15, the ADL2D gives better results than the ACP2D.
- the components of the ACP2D and the ADL2D both have 130 pixels. The best recognition rate for the ACP2D (92.3%) is achieved with 8 components of 130 pixels each, and is 2.1% lower than the best ADL2D recognition rate, also obtained with 8 components.
- FIG. 9 gives the comparative recognition rates of the ADL2D and the ACP2D as a function of the number of components selected, with an L₂-moyCl distance classification. For any number of components k fixed between 1 and 15, the ADL2D gives better results than the ACP2D.
- the components of the ACP2D and the ADL2D have 130 pixels. The best recognition rate for the ACP2D (94.4%) is reached with 9 components of 130 pixels each, and is 1.4% lower than the best recognition rate of the ADL2D, obtained with 13 components.
- Figure 10 gives the comparative CMC variations of the ADL2D and the ACP2D with a distance L₂, both curves being constructed with 8 components.
- Figures 8, 9 and 10 show that the ADL2D gives better recognition rates than the ACP2D, both at rank 1 for the same number of selected components ranging from 1 to 15 (Figures 8 and 9) and at higher ranks (Figure 10). The ADL2D is therefore more robust to pose changes than the ACP2D.
- FIG. 11 shows the reconstructions of the images shown in FIG. 3. It can be seen that from only three components the facial features (eyes, nose, mouth) already appear, but without the poses being distinguishable. It will also be noted that a satisfactory reconstruction of the initial image is already obtained with 20 components.
- Figure 12 gives the recognition rates of conventional ADL.
- the distances tested are the distance L 2 and the distance L 2 moyCl.
- the model is constructed by ACP + ADL processing with 200 principal components. The number of principal components to be retained was determined in the same way as previously. The best recognition rate (60.7%) is achieved for the distance L₂-moyCl with 60 discriminant components of 19500 pixels each.
- the best recognition rate for the distance L₂ (57.5%) is reached with 30 discriminant components of 19500 pixels each.
- Figure 13 gives the recognition rates of the ADL2D-lines and ADL2D-columns with the distances L₂ and L₂-moyCl. It can be seen that, unlike the results obtained on the base intended to test robustness to pose changes, the ADL2D-lines gives better results here than the ADL2D-columns.
- comparing FIGS. 12 and 13 shows, as previously, that the ADL2D makes it possible to obtain better results than the conventional ADL with a much smaller number of components.
- the best recognition rates of the conventional ADL are obtained for 30 components.
- Figure 14 gives comparative CMC variations of the classical ADL and the ADL2D.
- the CMC variations are constructed with 5 components of 150 pixels each for the ADL2D-lines and 30 discriminant components of 19500 pixels each for the ADL.
- the recognition rates obtained by the ADL2D-lines are better not only at rank 1 but also at higher ranks. At low ranks, the ADL2D is much more efficient than the conventional ADL: the recognition rate is 11.95% higher on average over the first seven ranks.
- Figures 12, 13 and 14 show that the ADL2D gives better results on the base of the expressions than the conventional ADL, both at the first rank (comparison of Figures 12 and 13) and at higher ranks (Figure 14). The ADL2D is more robust to changes in facial expressions than the conventional ADL.
- FIG. 15 gives the comparative recognition rates of the ADL2D and the ACP2D as a function of the number of components selected, with an L₂ distance classification. For any fixed number of components k ranging from 1 to 11, the ADL2D-lines gives better results than the ACP2D. The best recognition rate of the ACP2D (61.2%) is achieved with 18 components.
- the ADL2D-lines gives much better results with fewer components.
- the improvement in performance (up to a 19.7% gap at 6 components) as well as the smaller number of components required largely offsets the fact that the 2D-line discriminant components comprise 150 pixels while the ACP2D components comprise only 130 pixels.
- FIG. 16 gives the comparative recognition rates of the ADL2D and the ACP2D as a function of the number of selected components, with an L2moyCl-distance classification. For any fixed number k of components ranging from 1 to 15, the row-wise ADL2D gives much better results than the ACP2D. The best recognition rate of the ACP2D
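The two classification rules compared here can be sketched as follows. This is an illustrative reading, not the patent's exact procedure: L2 is taken as the Frobenius distance between feature matrices with a nearest-neighbour decision, and L2moyCl is assumed ("moyenne par classe") to be the mean L2 distance to each class's gallery images, with the nearest class winning.

```python
import numpy as np

def l2(Y1, Y2):
    # L2 (Frobenius) distance between two projected feature matrices
    return np.linalg.norm(Y1 - Y2)

def classify_l2(Y, gallery, gallery_labels):
    # nearest neighbour on the L2 distance
    d = [l2(Y, G) for G in gallery]
    return gallery_labels[int(np.argmin(d))]

def classify_l2_moycl(Y, gallery, gallery_labels):
    # assumed reading of L2moyCl: mean L2 distance to each class's
    # gallery images; the nearest class is returned
    labels = np.unique(gallery_labels)
    means = [np.mean([l2(Y, G)
                      for G, l in zip(gallery, gallery_labels) if l == c])
             for c in labels]
    return labels[int(np.argmin(means))]
```

Averaging over each class makes L2moyCl less sensitive to a single atypical gallery image than plain nearest-neighbour L2.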
- Figure 17 gives the compared CMC curves of the row-wise ADL2D with the L2 distance and the ACP2D with the L2 distance.
- the CMC curve of the row-wise ADL2D is constructed with 5 2D-row discriminant components.
- the CMC curve of the ACP2D is constructed with 18 components.
- the row-wise ADL2D provides a large improvement over the ACP2D results (11.4% average improvement over the first two ranks).
- Figures 15, 16 and 17 show that the ADL2D gives better recognition rates than the ACP2D, both at rank 1 for the same number of selected components ranging from 1 to 15 (Figures 15 and 16) and at higher ranks (Figure 17).
- the three images of the learning base corresponding to this same person are represented in FIGS. 18(b.1), 18(b.2) and 18(b.3). Instead of one of these three faces, the ACP2D gives the face shown in Figure 18(c) as being closest to the face image to be recognized.
- the 5-component row-wise ADL2D assigns to the test face image given in Figure 18(a) the face image given in Figure 18(b.1).
- the face is thus correctly recognized by the ADL2D.
- the ACP2D therefore seems too strongly influenced by intra-class variations, compared with the ADL2D within the meaning of the invention.
- with the ACP2D, which maximizes the total covariance without distinguishing the inter-class covariance from the intra-class covariance, groupings of images in the subspace would occur not in terms of identity but of facial expression, which leads to poor class separation.
- the ADL2D, which jointly minimizes the intra-class variance and maximizes the inter-class variance, is therefore more robust to changes in facial expression than the ACP2D.
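A minimal sketch of a row-wise two-dimensional LDA of the kind discussed above is given below. One common convention is shown (scatter matrices built from image rows, generalized eigenproblem solved via S_w^{-1} S_b); the function names and exact normalizations are choices of this sketch, not the patent's claimed procedure. Note the economy the text emphasizes: for 150x130 images, each 2D-row component has 150 pixels, versus 19500-pixel components for the classical (vectorized) ADL.

```python
import numpy as np

def adl2d_rows(images, labels, k):
    """Row-wise 2D-LDA (illustrative sketch).

    images: (n, h, w) array -- each face kept as a 2D matrix
    labels: (n,) identity labels
    k: number of discriminant components to keep
    Returns V of shape (h, k): each column is a 2D-row discriminant
    component of h pixels.
    """
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    global_mean = images.mean(axis=0)              # (h, w)
    h = images.shape[1]
    S_b = np.zeros((h, h))                         # inter-class scatter
    S_w = np.zeros((h, h))                         # intra-class scatter
    for c in np.unique(labels):
        cls = images[labels == c]
        m_c = cls.mean(axis=0)
        d = m_c - global_mean
        S_b += len(cls) * d @ d.T
        for a in cls:
            e = a - m_c
            S_w += e @ e.T
    # maximize inter-class scatter while minimizing intra-class scatter:
    # generalized eigenproblem S_b v = lambda S_w v
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs.real[:, order[:k]]

def project(images, V):
    # feature matrices V^T A, of shape (k, w) for each image A
    return np.einsum('hk,nhw->nkw', V, np.asarray(images, dtype=float))
```

Because the scatter matrices are only h x h (instead of hw x hw for the classical ADL), S_w is far less likely to be singular and the eigenproblem is much cheaper, which is the computational gain discussed throughout.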
- FIG. 19 shows the reconstructions of the images shown in FIG. 4.
- the facial features (eyes, nose, mouth) are already visible with few components; with more components (typically 20 components or more), a visually satisfactory reconstruction of the initial faces can be achieved.
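As an illustration of such reconstructions (not the patent's exact procedure), an image can be approximated from its row-wise projection. The discriminant vectors of a 2D-LDA are not orthogonal in general, so this sketch first orthonormalizes them so that the projection can simply be transposed back.

```python
import numpy as np

def reconstruct(image, V):
    """Approximate an image from its 2D-row projection (sketch).

    V: (h, k) components; a QR factorization gives an orthonormal
    basis Q of their span, so Q Q^T acts as a projector. This is an
    illustrative simplification, not the claimed reconstruction step.
    """
    Q, _ = np.linalg.qr(V)      # orthonormal basis of span(V)
    Y = Q.T @ image             # (k, w) feature matrix
    return Q @ Y                # (h, w) reconstructed image
```

With k equal to the image height the reconstruction is exact; with the small k values used above (5 to 30 components), only the dominant facial structure is recovered, consistent with Figure 19.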
- the two-dimensional ADL, which offers satisfactory classification through dimensionality reduction, combines the discriminating power of the conventional ADL with the computational gain of the ACP2D.
- the ADL2D is more robust to pose changes and to facial expressions, which are two major issues in face recognition.
- FIG. 20 shows a computing device for implementing the present invention, equipped for example with display means such as a screen ECR, input devices (keyboard CLA, mouse SOU, or others), a central unit UC equipped with a processor PROC, a random access memory RAM, a read-only memory MEM, a graphics card CG connected to the display means ECR, and communication means COM (such as a modem) for communicating, for example, with a server SER via a communication network (for example the Internet), in order to receive, where appropriate, images of faces to be recognized and/or to transmit reconstructed images.
- the memory MEM of the device is arranged to store at least:
- the processor PROC is able to cooperate with this memory MEM to process the images of the learning base and / or reference images, as well as images to be recognized and subsequently compared to the images of the learning base and / or the reference images.
- the device comprises an LEC reader of a removable memory medium (CD-ROM, ZIP, USB key, or other) suitable for storing the program product and / or the learning base.
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/FR2004/001395 WO2006003270A1 (fr) | 2004-06-04 | 2004-06-04 | Procede pour la reconnaissance de visages, a analyse discriminante lineaire bidimensionnelle |
| US11/628,321 US7856123B2 (en) | 2004-06-04 | 2004-06-04 | Method for recognising faces by means of a two-dimensional linear discriminant analysis |
| EP04767263A EP1751689A1 (fr) | 2004-06-04 | 2004-06-04 | Procede pour la reconnaissance de visages, a analyse discriminante lineaire bidimensionnelle |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/FR2004/001395 WO2006003270A1 (fr) | 2004-06-04 | 2004-06-04 | Procede pour la reconnaissance de visages, a analyse discriminante lineaire bidimensionnelle |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2006003270A1 true WO2006003270A1 (fr) | 2006-01-12 |
Family
ID=34958394
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/FR2004/001395 Ceased WO2006003270A1 (fr) | 2004-06-04 | 2004-06-04 | Procede pour la reconnaissance de visages, a analyse discriminante lineaire bidimensionnelle |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US7856123B2 (fr) |
| EP (1) | EP1751689A1 (fr) |
| WO (1) | WO2006003270A1 (fr) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104951756A (zh) * | 2015-06-08 | 2015-09-30 | 浙江科技学院 | 一种基于压缩感知的人脸识别方法 |
| CN109815990A (zh) * | 2018-12-28 | 2019-05-28 | 天津大学 | 基于中心化权重的主成分分析系统 |
| CN111652021A (zh) * | 2019-04-30 | 2020-09-11 | 上海铼锶信息技术有限公司 | 一种基于bp神经网络的人脸识别方法及系统 |
| CN115830678A (zh) * | 2022-11-28 | 2023-03-21 | 浙江大华技术股份有限公司 | 表情特征提取方法、表情识别方法和电子设备 |
Families Citing this family (28)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7840060B2 (en) * | 2006-06-12 | 2010-11-23 | D&S Consultants, Inc. | System and method for machine learning using a similarity inverse matrix |
| US7936906B2 (en) * | 2007-06-15 | 2011-05-03 | Microsoft Corporation | Face recognition using discriminatively trained orthogonal tensor projections |
| US8077994B2 (en) * | 2008-06-06 | 2011-12-13 | Microsoft Corporation | Compression of MQDF classifier using flexible sub-vector grouping |
| US7996343B2 (en) * | 2008-09-30 | 2011-08-09 | Microsoft Corporation | Classification via semi-riemannian spaces |
| US20110183764A1 (en) * | 2010-01-20 | 2011-07-28 | Gregg Franklin Eargle | Game process with mode of competition based on facial similarities |
| JP5900052B2 (ja) * | 2012-03-15 | 2016-04-06 | オムロン株式会社 | 登録判定装置、その制御方法および制御プログラム、並びに電子機器 |
| CN102831389B (zh) * | 2012-06-28 | 2015-03-11 | 北京工业大学 | 基于判别分量分析的人脸表情识别算法 |
| US8879805B2 (en) * | 2012-09-12 | 2014-11-04 | Academia Sinica | Automated image identification method |
| US20140341443A1 (en) * | 2013-05-16 | 2014-11-20 | Microsoft Corporation | Joint modeling for facial recognition |
| CN103514445B (zh) * | 2013-10-15 | 2016-09-14 | 武汉科技大学 | 基于多流形学习的带钢表面缺陷识别方法 |
| US9614724B2 (en) | 2014-04-21 | 2017-04-04 | Microsoft Technology Licensing, Llc | Session-based device configuration |
| US9639742B2 (en) | 2014-04-28 | 2017-05-02 | Microsoft Technology Licensing, Llc | Creation of representative content based on facial analysis |
| US9773156B2 (en) | 2014-04-29 | 2017-09-26 | Microsoft Technology Licensing, Llc | Grouping and ranking images based on facial recognition data |
| CN103942545A (zh) * | 2014-05-07 | 2014-07-23 | 中国标准化研究院 | 一种基于双向压缩数据空间维度缩减的人脸识别方法和装置 |
| CN103942572A (zh) * | 2014-05-07 | 2014-07-23 | 中国标准化研究院 | 一种基于双向压缩数据空间维度缩减的面部表情特征提取方法和装置 |
| US9430667B2 (en) | 2014-05-12 | 2016-08-30 | Microsoft Technology Licensing, Llc | Managed wireless distribution network |
| US10111099B2 (en) | 2014-05-12 | 2018-10-23 | Microsoft Technology Licensing, Llc | Distributing content in managed wireless distribution networks |
| US9384335B2 (en) | 2014-05-12 | 2016-07-05 | Microsoft Technology Licensing, Llc | Content delivery prioritization in managed wireless distribution networks |
| US9384334B2 (en) | 2014-05-12 | 2016-07-05 | Microsoft Technology Licensing, Llc | Content discovery in managed wireless distribution networks |
| US9874914B2 (en) | 2014-05-19 | 2018-01-23 | Microsoft Technology Licensing, Llc | Power management contracts for accessory devices |
| US9367490B2 (en) | 2014-06-13 | 2016-06-14 | Microsoft Technology Licensing, Llc | Reversible connector for accessory devices |
| US9460493B2 (en) | 2014-06-14 | 2016-10-04 | Microsoft Technology Licensing, Llc | Automatic video quality enhancement with temporal smoothing and user override |
| US9373179B2 (en) | 2014-06-23 | 2016-06-21 | Microsoft Technology Licensing, Llc | Saliency-preserving distinctive low-footprint photograph aging effect |
| WO2016011204A1 (fr) * | 2014-07-15 | 2016-01-21 | Face Checks Llc | Système et procédé de reconnaissance faciale à base d'algorithmes multiples avec partitionnement d'ensemble de données optimal pour un environnement en nuage |
| JP6448325B2 (ja) * | 2014-11-19 | 2019-01-09 | キヤノン株式会社 | 画像処理装置、画像処理方法及びプログラム |
| WO2016119076A1 (fr) * | 2015-01-27 | 2016-08-04 | Xiaoou Tang | Procédé et système de reconnaissance faciale |
| CN106803054B (zh) * | 2015-11-26 | 2019-04-23 | 腾讯科技(深圳)有限公司 | 人脸模型矩阵训练方法和装置 |
| CN111259780B (zh) * | 2020-01-14 | 2022-06-24 | 南京审计大学 | 一种基于分块线性重构鉴别分析的单样本人脸识别方法 |
- 2004
- 2004-06-04 WO PCT/FR2004/001395 patent/WO2006003270A1/fr not_active Ceased
- 2004-06-04 US US11/628,321 patent/US7856123B2/en active Active
- 2004-06-04 EP EP04767263A patent/EP1751689A1/fr not_active Ceased
Non-Patent Citations (7)
| Title |
|---|
| BELHUMEUR P N ET AL: "EIGENFACES VS. FISHERFACES: RECOGNITION USING CLASS SPECIFIC LINEAR PROJECTION", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, IEEE INC. NEW YORK, US, vol. 19, no. 7, July 1997 (1997-07-01), pages 711 - 720, XP000698170, ISSN: 0162-8828 * |
| KE LIU ET AL: "Algebraic feature extraction for image recognition based on an optimal discriminant criterion", PATTERN RECOGNITION UK, vol. 26, no. 6, 1993, pages 903 - 911, XP008040298, ISSN: 0031-3203 * |
| M. LI & B. YUAN: "A novel statistical linear discriminant analysis for image matrix: Two-dimensional fisherfaces", 7TH INT. CONF. ON SIGNAL PROCESSING PROC. ICSP'04, vol. 2, 31 August 2004 (2004-08-31), pages 1419 - 1422, XP008040329 * |
| SIROVICH L ET AL: "LOW-DIMENSIONAL PROCEDURE FOR THE CHARACTERIZATION OF HUMAN FACES", JOURNAL OF THE OPTICAL SOCIETY OF AMERICA - A, OPTICAL SOCIETY OF AMERICA, WASHINGTON, US, vol. 4, no. 3, 1 March 1987 (1987-03-01), pages 519 - 524, XP000522491, ISSN: 1084-7529 * |
| TURK M ET AL: "EIGENFACES FOR RECOGNITION", JOURNAL OF COGNITIVE NEUROSCIENCE, CAMBRIDGE, MA, US, vol. 3, no. 1, January 1991 (1991-01-01), pages 71 - 86, XP000490270 * |
| YANG J ET AL: "TWO-DIMENSIONAL PCA: A NEW APPROACH TO APPEARANCE-BASED FACE REPRESENTATION AND RECOGNITION", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, IEEE INC. NEW YORK, US, vol. 26, no. 1, January 2004 (2004-01-01), pages 131 - 137, XP001185863, ISSN: 0162-8828 * |
| YANG J ET AL: "UNCORRELATED PROJECTION DISCRIMINANT ANALYSIS AND ITS APPLICATION TO FACE IMAGE FEATURE EXTRACTION", INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, WORLD SCIENTIFIC PUBLISHING COMPAGNY, SINGAPORE, SI, vol. 17, no. 8, December 2003 (2003-12-01), pages 1325 - 1347, XP001186868, ISSN: 0218-0014 * |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104951756A (zh) * | 2015-06-08 | 2015-09-30 | 浙江科技学院 | 一种基于压缩感知的人脸识别方法 |
| CN109815990A (zh) * | 2018-12-28 | 2019-05-28 | 天津大学 | 基于中心化权重的主成分分析系统 |
| CN109815990B (zh) * | 2018-12-28 | 2023-06-30 | 天津大学 | 基于中心化权重的主成分分析系统 |
| CN111652021A (zh) * | 2019-04-30 | 2020-09-11 | 上海铼锶信息技术有限公司 | 一种基于bp神经网络的人脸识别方法及系统 |
| CN111652021B (zh) * | 2019-04-30 | 2023-06-02 | 上海铼锶信息技术有限公司 | 一种基于bp神经网络的人脸识别方法及系统 |
| CN115830678A (zh) * | 2022-11-28 | 2023-03-21 | 浙江大华技术股份有限公司 | 表情特征提取方法、表情识别方法和电子设备 |
Also Published As
| Publication number | Publication date |
|---|---|
| EP1751689A1 (fr) | 2007-02-14 |
| US20080014563A1 (en) | 2008-01-17 |
| US7856123B2 (en) | 2010-12-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP1751689A1 (fr) | Procede pour la reconnaissance de visages, a analyse discriminante lineaire bidimensionnelle | |
| Nishiyama et al. | Facial deblur inference using subspace analysis for recognition of blurred faces | |
| Rao et al. | Learning discriminative aggregation network for video-based face recognition and person re-identification | |
| EP3640843B1 (fr) | Procédé d'extraction de caractéristiques d'une empreinte digitale représentée par une image d'entrée | |
| WO2006103240A1 (fr) | Procédé d'identification de visages à partir d'images de visage, dispositif et programme d'ordinateur correspondants | |
| Li et al. | Vehicle and person re-identification with support neighbor loss | |
| EP3655893A1 (fr) | Systeme d'apprentissage machine pour diverses applications informatiques | |
| Chokkadi et al. | A Study on various state of the art of the Art Face Recognition System using Deep Learning Techniques | |
| EP3018615B1 (fr) | Procede de comparaison de donnees ameliore | |
| Vignesh et al. | Face image quality assessment for face selection in surveillance video using convolutional neural networks | |
| EP3582141B1 (fr) | Procédé d'apprentissage de paramètres d'un réseau de neurones à convolution | |
| CN112633221B (zh) | 一种人脸方向的检测方法及相关装置 | |
| Mansourifar et al. | One-shot gan generated fake face detection | |
| EP4099228A1 (fr) | Apprentissage automatique sans annotation ameliore par regroupements adaptatifs en ensemble ouvert de classes | |
| Juefei-Xu et al. | Facial ethnic appearance synthesis | |
| Mukhopadhyay et al. | Real time facial expression and emotion recognition using eigen faces, LBPH and fisher algorithms | |
| FR3103045A1 (fr) | Procédé d’augmentation d’une base d’images d’apprentissage représentant une empreinte sur un arrière-plan au moyen d’un réseau antagoniste génératif | |
| Wu et al. | Heterogeneous feature selection by group lasso with logistic regression | |
| Praveenbalaji et al. | ID photo verification by face recognition | |
| Knoche et al. | Susceptibility to image resolution in face recognition and trainings strategies | |
| FR3112008A1 (fr) | Procédé de détection d’au moins un trait biométrique visible sur une image d’entrée au moyen d’un réseau de neurones à convolution | |
| EP2804175A1 (fr) | Procédé de reconnaissance vocale visuelle par suivi des déformations locales d'un ensemble de points d'intérêt de la bouche du locuteur | |
| EP3825915A1 (fr) | Procede de classification d'une empreinte biometrique representee par une image d'entree | |
| Yifrach et al. | Improved nuisance attribute projection for face recognition | |
| CN111353353A (zh) | 跨姿态的人脸识别方法及装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
| AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 11628321 Country of ref document: US |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| WWW | Wipo information: withdrawn in national office |
Country of ref document: DE |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 2004767263 Country of ref document: EP |
|
| WWP | Wipo information: published in national office |
Ref document number: 2004767263 Country of ref document: EP |
|
| WWP | Wipo information: published in national office |
Ref document number: 11628321 Country of ref document: US |