
GB2402311A - Facial recognition using synthetic images - Google Patents

Facial recognition using synthetic images

Info

Publication number
GB2402311A
Authority
GB
United Kingdom
Prior art keywords
face
data
face recognition
synthetic
image data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB0312096A
Other versions
GB2402311B (en)
GB0312096D0 (en)
Inventor
Simon Michael Rowe
Richard Ian Taylor
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to GB0312096A priority Critical patent/GB2402311B/en
Publication of GB0312096D0 publication Critical patent/GB0312096D0/en
Publication of GB2402311A publication Critical patent/GB2402311A/en
Application granted granted Critical
Publication of GB2402311B publication Critical patent/GB2402311B/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06K9/00
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)

Abstract

In an image processing apparatus 2, synthetic training images of a face are generated for a face recogniser by rendering a three-dimensional computer model in accordance with different viewing positions, viewing directions, backgrounds and/or lighting conditions. The synthetic training image data is used to train the face recogniser by generating representation data for the face recogniser characterising each face in the training images. In this way, the trained face recogniser is operable to process input images using the generated representation data to recognise the face in input images. The apparatus is arranged to update the representation data for the face recogniser using non-synthetic image data on which face recognition processing has been performed by the face recogniser. In this way, the face recogniser is "bootstrapped" using synthetic image data so that it can begin face recognition processing, and real image data is subsequently used to improve the processing by the face recogniser when it is available.

Description

IMAGE PROCESSING
The present invention relates to the field of image processing and, more particularly, to the processing of image data by an image processing apparatus to perform face recognition to identify a face in the image.
Many different types of face recognition system are known. These include, for example, exemplar-based systems (for example as described in "Exemplar-based Face Recognition from Video" by Krueger and Zhou in ECCV 2002, Seventh European Conference on Computer Vision, Proceedings Part IV, pages 732-746), neural network systems (for example as described in "Face Recognition: A Convolutional Neural Network Approach" by Lawrence et al in IEEE Transactions on Neural Networks, Special Issue on Neural Networks and Pattern Recognition, Volume 8, Number 1, pages 98-113, 1997, and "Multilayer Perceptron in Face Recognition" by Oravec, available at www.electronicsletters.com, paper 10/11/2001, ISSN 1213-161X) and eigenface systems (for example as described in "Eigenfaces for Recognition" by Turk and Pentland in the Journal of Cognitive Neuroscience, Volume 3, Number 1, pages 71-86).
All of these systems require training data, comprising images of each face to be recognized, to train the face recogniser. This training data is processed to generate representation data for the face recogniser comprising data which characterizes each face to be recognised by the system.
The present invention has been made with this in mind.
According to the present invention, there is provided a face recognition apparatus comprising a face recogniser operable to process image data to identify a face therein in accordance with representation data, a representation data generator operable to generate representation data for the face recogniser using training image data of faces, and a synthetic image generator operable to generate synthetic training images for the representation data generator.
These features provide the advantage that a sufficient number of training images can be quickly and accurately generated.
Preferably, the synthetic image data is generated by processing a three-dimensional computer model of at least the face of each person for which representation data is to be generated.
In this way, synthetic images representing views of the three-dimensional computer model with different viewing parameters, backgrounds and/or different lighting conditions may be generated, providing a wide variety of input images for training. Further, the three-dimensional computer model may be animated or morphed so that synthetic images representing different facial expressions can be generated as training images.
The present invention also provides face recognition apparatus comprising a face recogniser operable to process image data in accordance with representation data to determine if the image contains a face defined by the representation data, and a representation data generator operable to generate representation data for the face recogniser using training image data comprising synthetic image data defining a plurality of images of a face, wherein the representation data generator is operable to update the representation data for the face recogniser using non-synthetic image data which has been processed to recognise a face therein.
This provides the advantage that synthetic image data can be used to "bootstrap" the face recogniser by generating preliminary representation data to enable the face recogniser to operate, and then the representation data can be updated using "real" image data (that is, non-synthetic image data) when such image data is received.
In this way, the reliability of the representation data can be improved, increasing the accuracy of subsequent face recognition processing by the face recogniser.
The present invention also provides a computer program product, embodied for example as a storage medium carrying instructions or as a signal carrying instructions, comprising instructions for causing a programmable processing apparatus to become configured as an apparatus as set out above.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which: Figure 1 schematically shows the components of an embodiment of the invention, together with the notional functional processing units and data stores into which the processing apparatus component may be thought of as being configured when programmed by programming instructions; Figure 2 shows the processing operations performed to train the face recogniser in the processing apparatus of Figure 1; and Figures 3a and 3b show the processing operations performed by the apparatus of Figure 1 during face recognition processing of image data.
Referring to Figure 1, an embodiment of the invention comprises a programmable processing apparatus 2, such as a personal computer, containing, in a conventional manner, one or more processors, memories, graphics cards etc., together with a display device 4 and user input devices 6, such as a keyboard, mouse etc. The processing apparatus 2 is programmed to operate in accordance with programming instructions input, for example, as data stored on a data storage medium 12 (such as an optical CD ROM, semiconductor ROM, or magnetic recording medium, etc.), and/or as a signal 14 (for example an electrical or optical signal input to the processing apparatus 2, for example from a remote database, by transmission over a communication network such as the Internet or by transmission through the atmosphere), and/or entered by a user via a user input device 6 such as a keyboard.
As will be described in detail below, the programming instructions comprise instructions to program the processing apparatus 2 to become configured to generate synthetic training images of a face for a face recogniser by rendering a three-dimensional computer model in accordance with different viewing positions, viewing directions, backgrounds and/or lighting conditions. The synthetic training image data is used to train the face recogniser by generating representation data for the face recogniser characterizing each face in the training images. Synthetic training image data and representation data is generated for faces of a plurality of different people. In this way, the trained face recogniser is operable to process input images using the generated representation data to recognize different faces in the input images. Each input image processed by the trained face recogniser is stored in an image database together with data defining the name of each person's face recognized in the image. The database can then be searched in accordance with a person's name to retrieve images of that person. The apparatus is arranged to update the representation data for the face recogniser using non-synthetic image data on which face recognition processing has been performed by the face recogniser.
In this way, the face recogniser is "bootstrapped" using synthetic image data so that it can begin face recognition processing, and real image data is subsequently used to improve the processing by the face recogniser when it is available.
When programmed by the programming instructions, the processing apparatus 2 can be thought of as being configured as a number of functional units for performing processing operations and a number of data stores configured to store data. Examples of such functional units and data stores together with their interconnections are shown in Figure 1. The functional units, data stores and interconnections illustrated in Figure 1 are, however, notional, and are shown for illustration purposes only to assist understanding; as will be appreciated by the skilled person, they do not necessarily represent the units, data stores and connections into which the processors, memories, etc. of the processing apparatus 2 actually become configured.
Referring to the functional units shown in Figure 1, central controller 20 is arranged to process inputs from the user input devices 6, and also to provide control and processing for the other functional units. Working memory 30 is provided for use by central controller 20 and the other functional units.
Input data interface 40 is arranged to receive, and write to memory, data defining a three-dimensional computer model of the head of each person for which the face recogniser is to be trained to perform face recognition, together with data defining the name of each person for which a three-dimensional computer head model is input.
Input data interface 40 is also arranged to receive, and write to memory, input image data defining each image on which face recognition is to be performed by the trained face recogniser.
The input data defining each three-dimensional computer head model, the name data and the images on which face recognition is to be performed may be input to processing apparatus 2 as data stored on a storage medium 46, or as data carried by a signal 44. Each three-dimensional computer head model may be generated, for example, by laser scanning the head of the subject person in a conventional way.
3D computer head model store 50 is configured to store the input data defining each of the three-dimensional computer head models.
Image data store 60 is configured to store the input image data defining images on which face recognition is to be performed by the trained face recogniser.
Renderer 70 is operable to render images of each three dimensional computer model with different viewing positions, viewing directions, lighting conditions and backgrounds behind the three-dimensional computer model, thereby generating synthetic training image data for the face recogniser.
Training data store 80 is configured to store the synthetic training image data generated by renderer 70.
Skin pixel detector 90 is operable to process synthetic image data from training data store 80 and non-synthetic image data from image data store 60 to detect areas within each image which represent human skin.
Face recogniser 100 is operable to use the training data from training data store 80 to generate representation data characterizing each person's face in the training data. The generation of representation data by the face recogniser 100 is referred to as training to generate a trained face recogniser 100. The trained face recogniser is operable to process image data from image data store 60 using the representation data to determine whether the image contains a face defined by the representation data and, if it does, to identify which of the faces the image contains.
The processing performed to train face recogniser 100 to generate representation data, the content of the representation data itself, and the processing performed by the trained face recogniser 100 to recognise a face in an input image will vary depending upon the type of face recogniser 100. In subsequent description, examples of the processing and representation data will be given for an exemplar-based face recogniser 100, a neural network face recogniser 100 and an eigenface face recogniser 100, although face recogniser 100 is not restricted to these types and other types of face recogniser 100 are possible.
Representation data store 110 is configured to store representation data for face recogniser 100.
Image database 120 is configured to store image data from image data store 60 which has been processed by face recogniser 100 and in which at least one face has been recognised. Image database 120 is also configured to store name data associated with each image identifying the people whose faces have been recognized in the image.
Database search engine 130 is operable to search the data in the image database 120 in accordance with a name input by a user using a user input device 6 such as a keyboard, to identify each image in the image database 120 which contains the face of the person with the input name.
Database search engine 130 is further operable to enable a user to select one or more of the identified images from image database 120 and to display the selected image(s) on display device 4.
Display controller 140, under the control of central controller 20, is operable to control display device 4 to display image data received as input image data, and to display image data retrieved from image database 120.
Output data interface 150 is operable to output data from processing apparatus 2 for example as data on a storage medium 152 (such as an optical CD ROM, semiconductor ROM or magnetic recording medium, etc.) and/or as a signal 154 (for example an electrical or optical signal transmitted over a communication network such as the Internet or through the atmosphere). In this embodiment, the output data comprises data defining the representation data from representation data store 110 and, optionally, data defining the face recogniser 100.
A recording of the output data may be made by recording the output signal 154 either directly or indirectly (for example by making a recording and then making a subsequent copy recording) using recording apparatus (not shown).
Figure 2 shows the processing operations performed by processing apparatus 2 to train face recogniser 100 in this embodiment.
Referring to Figure 2, at step S2, input data interface 40 stores each three-dimensional computer head model input to processing apparatus 2 in 3D computer head model store 50, together with input name data defining the respective name of the person for each 3D computer head model.
At step S4, renderer 70 renders a plurality of images of each 3D computer head model stored at step S2. Each image of a respective head model is rendered from a different respective viewing position and/or viewing direction, with a different background behind the three-dimensional computer model, and/or with different lighting conditions. This ensures that the images of each head model are sufficiently different to facilitate reliable training of face recogniser 100. The image data generated by renderer 70 therefore comprises synthetic image data defining "n" training images of each face that face recogniser 100 is to be trained to recognise.
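By way of illustration only, the following sketch shows the kind of parameter sweep that step S4 describes. The renderer itself is not specified here, so `render_head_model` is a hypothetical stand-in, and the particular viewing directions, distances, lighting labels and backgrounds are illustrative values, not values defined by the embodiment.

```python
# Sketch of the parameter sweep performed by renderer 70 at step S4.
# `render_head_model` is a hypothetical stand-in for an actual renderer;
# the parameter lists below are illustrative only.
import itertools

VIEW_DIRECTIONS = [(0, 0), (15, 0), (-15, 0), (0, 15), (0, -15)]   # (yaw, pitch) in degrees
VIEW_DISTANCES  = [0.5, 1.0, 2.0]                                   # camera distance from the face
LIGHTING        = ["ambient", "key_left", "key_right", "backlit"]
BACKGROUNDS     = ["plain", "office", "outdoor"]

def render_head_model(model, yaw, pitch, distance, lighting, background):
    """Hypothetical renderer: returns a 2D image of the head model."""
    raise NotImplementedError("replace with a real renderer")

def generate_synthetic_training_images(model, person_name):
    """Enumerate viewing/lighting/background combinations -> "n" training images."""
    images = []
    for (yaw, pitch), dist, light, bg in itertools.product(
            VIEW_DIRECTIONS, VIEW_DISTANCES, LIGHTING, BACKGROUNDS):
        img = render_head_model(model, yaw, pitch, dist, light, bg)
        images.append({"person": person_name, "image": img, "synthetic": True})
    return images
```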
At step S6, skin pixel detector 90 processes each synthetic training image generated at step S4 to detect skin pixels in the image. This processing is performed in a conventional way, for example as described in JP-A-1194051 or EP-A-1211638. The result of this processing is a respective skin pixel image (comprising the skin coloured pixels extracted from the input image data) for the face in each input image.
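The skin pixel detection itself is performed by the conventional methods cited above; purely as an illustration of the kind of processing involved, the sketch below uses a simple colour-space threshold as a generic stand-in (not the cited methods), and the threshold values are illustrative only.

```python
# Generic skin-pixel extraction sketch (a simple YCrCb threshold),
# standing in for the conventional methods cited at step S6.
import cv2
import numpy as np

def extract_skin_pixels(bgr_image: np.ndarray) -> np.ndarray:
    """Return the input image with non-skin pixels zeroed out."""
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    # Commonly used Cr/Cb bounds for skin tones; illustrative values only.
    lower = np.array([0, 133, 77], dtype=np.uint8)
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    return cv2.bitwise_and(bgr_image, bgr_image, mask=mask)
```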
At step S8, each skin pixel image generated at step S6 is stored in training data store 80 together with data identifying it as a synthetic image.
At step S10, face recogniser 100 is trained using the synthetic skin pixel image data stored at step S8 to generate representation data for subsequent use in face recognition processing.
The processing performed at step S10 and the representation data generated by the processing are dependent upon the type of the face recogniser 100.
For example, in an exemplar-based face recogniser 100, the processing at step S10 comprises storing synthetic image data defining each skin pixel image stored at step S8 as representation data for the face recogniser 100.
More particularly, a respective set of exemplars is stored for each person comprising the synthetic skin pixel images stored at S8 for that person. Processing may be performed to select exemplars which are sufficiently different from each other for storage as representation data. However, in this embodiment, renderer 70 is arranged to generate images of each 3D computer model which are sufficiently different by selection of different viewing parameters, backgrounds and/or lighting conditions. Accordingly, processing to select exemplars at step S10 is not performed in this embodiment.
In a neural network face recogniser 100, the processing at step S10 comprises processing to determine the synaptic weights for the links between the neurons in the neural network. This is performed, for example, using a back-propagation technique to generate synaptic weights which give the same output value(s) from the neural network for each input synthetic skin pixel image of the same person. The representation data stored in representation data store 110 therefore comprises, for each person to be recognised, a set of synaptic weights and the output value(s) generated by the neural network for that person. Suitable processing for training a neural network face recogniser at step S10 is described, for example, in "Face Recognition: A Convolutional Neural Network Approach" by Lawrence et al in IEEE Transactions on Neural Networks, Special Issue on Neural Networks and Pattern Recognition, Volume 8, Number 1, pages 98-113, 1997, and "Multilayer Perceptron in Face Recognition" by Oravec, available at www.electronicsletters.com, paper 10/11/2001, ISSN 1213-161X.
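As an illustrative sketch only, the following trains a plain multilayer perceptron on flattened skin pixel images using an off-the-shelf library; it is a simplified stand-in for the convolutional and perceptron approaches cited above, and the layer size and iteration count are arbitrary choices.

```python
# Stand-in sketch for step S10 with a neural-network recogniser:
# a plain multilayer perceptron trained on flattened skin-pixel images.
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_mlp_recogniser(skin_images, labels):
    """skin_images: equally sized greyscale arrays; labels: person names."""
    X = np.stack([img.reshape(-1) / 255.0 for img in skin_images])
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
    clf.fit(X, labels)    # back-propagation adjusts the synaptic weights
    return clf            # the fitted weights act as the representation data
```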
For an eigenface face recogniser 100, the processing at step S10 comprises calculating the "eigenfaces" which characterize the variation in the synthetic skin pixel images stored at step S8, these eigenfaces defining a multi-dimensional "face space". This is performed in a conventional way, for example as described in "Eigenfaces for Recognition" by Turk and Pentland in the Journal of Cognitive Neuroscience, Volume 3, Number 1, pages 71-86. The processing comprises calculating an average face (represented by a vector) from the faces in the skin pixel training images, calculating a respective difference vector for each skin pixel image defining the difference between the skin pixel image and the average face, arranging the difference vectors in an "m" by "m" matrix (where m is the total number of skin pixel images), calculating the eigenvectors and eigenvalues of the difference matrix, selecting the eigenvectors with the largest associated eigenvalues, and linearly combining the skin pixel images in accordance with the selected eigenvectors to define a set of "eigenfaces" which define a "face space". For each person's face represented in the training images, a class vector in the "face space" is then calculated by transforming each skin pixel image for that face into its eigenface components and calculating a vector that describes the contribution of each eigenface representing the face. An average of the calculated vectors for each person's face is then calculated to define a class vector for that face. In effect, the class vector for a person's face defines a region of face space characterizing the face. A threshold value is then set defining a distance within the "face space" from any face class within which a vector calculated for a face to be recognised must lie to be identified as a face in that class (that is, to recognise the person as the person defined by that class vector). Accordingly, in an eigenface face recogniser 100, the representation data comprises data defining the eigenfaces, the class vectors for each person, and the threshold distance.
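The following sketch illustrates the eigenface training steps just described (average face, difference vectors, the small m-by-m eigenproblem, eigenfaces, class vectors and a threshold). The variable names and the particular threshold heuristic are illustrative, not part of the embodiment.

```python
# Eigenface training sketch for step S10, following the steps described above.
import numpy as np

def train_eigenfaces(skin_images, labels, n_components=20):
    """skin_images: equally sized greyscale arrays; labels: person names."""
    X = np.stack([img.reshape(-1).astype(float) for img in skin_images])   # m x p
    mean_face = X.mean(axis=0)
    A = X - mean_face                                   # difference vectors, m x p
    # Solve the small m x m eigenproblem instead of the full p x p one.
    L = A @ A.T
    eigvals, eigvecs = np.linalg.eigh(L)
    order = np.argsort(eigvals)[::-1][:n_components]    # largest eigenvalues first
    eigenfaces = (A.T @ eigvecs[:, order]).T            # linear combinations of the images
    eigenfaces /= np.linalg.norm(eigenfaces, axis=1, keepdims=True)

    # Class vector per person: average projection of that person's training images.
    weights = A @ eigenfaces.T                          # m x n_components
    class_vectors = {
        person: weights[[i for i, l in enumerate(labels) if l == person]].mean(axis=0)
        for person in set(labels)
    }
    threshold = 0.5 * weights.std() * np.sqrt(n_components)   # illustrative choice only
    return {"mean": mean_face, "eigenfaces": eigenfaces,
            "classes": class_vectors, "threshold": threshold}
```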
Figures 3a and 3b show the processing operations performed by processing apparatus 2 containing the trained face recogniser 100 to perform face recognition processing on input image data.
It should be noted that a time delay may occur between the processing operations of Figure 2 and the processing operations of Figure 3 because the user may delay inputting image data upon which face recognition is to be performed.
Referring to Figure 3, at step S11, input data interface 40 stores non-synthetic image data input to processing apparatus 2 in image data store 60 as image data on which face recognition is to be performed by the trained face recogniser 100. It should be noted that each image stored at step S11 may comprise a frame of video data or a "still" image.
At step S12, image data for the next non-synthetic image to be processed is read from image data store 60 (this being image data for the first image the first time step S12 is performed).
At step S14, skin pixel detector 90 detects skin pixels in the image data using processing the same as that performed at step S6, to generate a respective skin pixel image for each face in the input image. Accordingly, if there is more than one face in the input image, then more than one skin pixel image is generated at step S14.
At step S16, face recogniser 100 processes each skin pixel image generated at step S14 using the representation data stored in representation data store 110 to perform face recognition.
As with step S10, the processing performed at step S16 will be dependent upon the type of the face recogniser 100.
For example, for an exemplar-based face recogniser 100, the processing at step S16 comprises comparing the image data for each skin pixel image generated at step S14 with each exemplar image in the representation data using conventional image comparison techniques. Such comparison techniques may comprise, for example, one or more of a pixel-by-pixel intensity value comparison of the image data, an adaptive least squares correlation technique (for example as described in "Adaptive Least Squares Correlation: A Powerful Image Matching Technique" by Gruen in Photogrammetry, Remote Sensing and Cartography, 1985, pages 175-187), and detection of edges or other salient features in each image and processing to determine whether the detected edges/features align.
In this way, for each skin pixel image generated at step S14, a respective match score is calculated for each exemplar defining the accuracy of the match between the exemplar and the skin pixel image.
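As an illustration of this exemplar matching, the sketch below computes a normalised cross-correlation match score per exemplar and applies the threshold decision described at step S18 below; the comparison technique and threshold value are illustrative choices, not the only conventional techniques the embodiment may use.

```python
# Minimal exemplar-matching sketch for steps S16 and S18 (exemplar-based case).
import numpy as np

def match_score(image_a: np.ndarray, image_b: np.ndarray) -> float:
    """Normalised cross-correlation between two equally sized skin-pixel images."""
    a = image_a.reshape(-1).astype(float); a -= a.mean()
    b = image_b.reshape(-1).astype(float); b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def recognise_exemplar(skin_image, exemplars, threshold=0.8):
    """exemplars: list of (person_name, exemplar_image). Returns a name or None."""
    scores = [(match_score(skin_image, ex), name) for name, ex in exemplars]
    best_score, best_name = max(scores)
    return best_name if best_score > threshold else None   # None: face not recognised
```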
For a neural network face recogniser 100, the processing performed at step S16 comprises, for each skin pixel image generated at step S14, processing the image data using the neural network and each respective set of synaptic weights stored in the representation data to generate one or more output values for each set of synaptic weights.
For an eigenface face recogniser 100, processing may be performed at step S16 for example as described in "Eigenfaces for Recognition" by Turk and Pentland in the Journal of Cognitive Neuroscience, Volume 3, Number 1, pages 71-86. This processing effectively comprises, for each skin pixel image generated at step S14, projecting the skin pixel image data into the face space defined by the eigenfaces in the representation data, and then classifying the skin pixel image data by comparing its position in face space with the positions of people known to the system. To do this, for each skin pixel image, the image data generated at step S14 is transformed into its eigenface components, a vector is calculated describing the contribution of each eigenface representing the face in the image data, and the respective difference between the calculated vector and each class vector stored in the representation data is calculated (each of these differences effectively representing a distance in face space).
At step S18 face recogniser 100 determines whether a face has been recognised as a result of the processing at step S16.
In an exemplar-based face recogniser 100, the processing at step S18 comprises, for each skin pixel image generated at step S14, selecting the highest match score calculated at step S16 and determining whether the selected match score is above a threshold. In the event that the highest match score is above the threshold, then it is determined that the face of the person to which the matching exemplar relates has been identified in the input image.
For a neural network face recogniser 100, the processing at step S18 comprises, for each skin pixel image generated at step S14, calculating the difference between the output value(s) of the neural network at step S16 and the output value(s) for each person stored in the representation data to generate one or more difference values for each person. The smallest difference value(s) are then selected and compared with a threshold to determine whether the difference(s) are sufficiently small. If it is determined that the difference(s) are less than the threshold, then it is determined that the face of the person to which the representation data for the smallest difference(s) relates has been recognized in the input image.
For an eigenface face recogniser 100, the processing at step S18 comprises, for each skin pixel image generated at step S14, selecting the smallest distance value calculated at step S16 and determining whether it is within the threshold distance defined in the representation data (for example as described in "Eigenfaces for Recognition" by Turk and Pentland in the Journal of Cognitive Neuroscience, Volume 3, Number 1, pages 71-86). If it is determined that the distance is less than the threshold, then it is determined that the face of the person corresponding to the class vector to which the smallest distance relates has been recognised in the input image.
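As an illustration of the eigenface case of steps S16 and S18, the sketch below projects a skin pixel image into face space and applies the threshold test, reusing the representation dictionary produced by the `train_eigenfaces` sketch given earlier; the structure and names are illustrative only.

```python
# Eigenface recognition sketch for steps S16 and S18, using the representation
# data produced by train_eigenfaces() above (names are illustrative).
import numpy as np

def recognise_eigenface(skin_image, rep):
    phi = skin_image.reshape(-1).astype(float) - rep["mean"]
    w = rep["eigenfaces"] @ phi                       # project into face space
    distances = {person: np.linalg.norm(w - cv)       # distance to each class vector
                 for person, cv in rep["classes"].items()}
    person, d = min(distances.items(), key=lambda kv: kv[1])
    return person if d < rep["threshold"] else None   # None: face not recognised
```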
It should be noted that the processing at step S14 may detect the skin pixels from more than one face and that, consequently, the processing at steps S16 and S18 may recognise more than one face in the input image.
If it is determined at step S18 that a face has not been recognised, then processing proceeds to step S32 to determine whether there is another input image on which face recognition processing is to be performed.
On the other hand, if it is determined at step S18 that a face has been recognised in the input image, then processing proceeds to step S20.
At step S20, the image data read at step S12 is stored in the image database 120 together with data defining the name of each person whose face was recognised by the processing at steps S16 and S18. In this way, image data and name data is stored in the image database 120 for subsequent searching and retrieval by the database search engine 130.
At step S22, face recogniser 100 determines, for each person identified at step S18, the training image within training data store 80 which most closely matches the identified face of the person in the input image data read at step S12.
For an exemplar-based face recogniser 100, the processing at step S22 comprises identifying the corresponding training image for each exemplar selected at step S18 as an exemplar matching a face in the input image.
For a neural network face recogniser 100 and an eigenface face recogniser 100, the processing at step S22 comprises comparing the skin pixels of each face detected at step S14 in the input image with each training image stored in training data store 80 using processing the same as that performed by exemplar-based face recogniser 100 at step S16. In other words, the comparison performed at step S22 is not neural network-based or eigenface-based for these types of system.
At step S24, face recogniser 100 determines whether any training image identified at step S22 is a synthetic image (this being performed by reading the data stored in training data store 80 identifying whether the training image comprises synthetic image data).
If it is determined at step S24 that a matching training image is a synthetic image, then, at step S26, the image data for that synthetic image is removed from the training data store 80.
On the other hand, if it is determined at step S24 that no matching training image is a synthetic image, then step S26 is omitted.
At step S28, the non-synthetic skin pixel image data generated at step S14 is stored as training image data for face recogniser 100 in training data store 80.
At step S30, face recogniser 100 is re-trained using the training images now stored in training data store 80 to generate updated representation data for the face recogniser 100.
For an exemplar-based system, the processing at step S30 comprises replacing the existing exemplars in the representation data store 110 with the updated training images from training data store 80.
For a neural network face recogniser 100, the processing at step S30 comprises, for example, performing at least one iteration of back-propagation using the output value(s) generated by the face recognition processing at step S16, or discarding the current representation data and re-training the neural network using the updated training data and the same processing as that described above at step S10.
In the case of an eigenface face recogniser 100, the processing at step S30 comprises discarding the existing representation data and re-training the face recogniser to generate new representation data using the updated training data and the same processing as that described above at step S10.
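The following sketch illustrates the update cycle of steps S22 to S30 as a whole, modelling the training data store as a simple list of records and delegating re-training to any of the training sketches given earlier via a `retrain_fn` callback; it reuses the illustrative `match_score` function from the exemplar sketch and is an illustrative structure only, not the embodiment's actual implementation.

```python
# Sketch of the bootstrap update in steps S22-S30: replace the closest
# synthetic training image with the newly recognised real one, then re-train.
# training_store: list of dicts like those produced by
# generate_synthetic_training_images(); retrain_fn: any training sketch above.
def update_training_data(training_store, person, real_skin_image, retrain_fn):
    candidates = [(i, t) for i, t in enumerate(training_store) if t["person"] == person]
    if candidates:
        # Step S22: find the training image closest to the recognised face.
        best_i, best = max(candidates,
                           key=lambda it: match_score(real_skin_image, it[1]["image"]))
        # Steps S24/S26: remove it only if it is synthetic.
        if best["synthetic"]:
            del training_store[best_i]
    # Step S28: store the non-synthetic skin pixel image as new training data.
    training_store.append({"person": person, "image": real_skin_image, "synthetic": False})
    # Step S30: re-train to produce updated representation data.
    images = [t["image"] for t in training_store]
    labels = [t["person"] for t in training_store]
    return retrain_fn(images, labels)
```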
At step S32, a check is carried out to determine whether image data was received at step S11 for another image on which face recognition is to be performed. Steps S12 to S32 are repeated until face recognition processing has been performed for each input image stored at step S11.
Many modifications and variations can be made to the embodiment described above within the scope of the claims.
For example, renderer 70 may be arranged to animate each 3D computer head model at step S4 to provide different facial expressions of the 3D computer head model, and to render image data of each 3D computer head model with different facial expressions to generate synthetic training images of varying facial expressions for each person.
In the embodiment above, 3D computer head model store 50 and renderer 70 are provided as part of processing apparatus 2. However, instead, 3D computer head model store 50 and renderer 70 may be provided as part of a separate apparatus, and step S4 may be performed in the separate apparatus, with the resulting synthetic training image data and associated person identification data being input to processing apparatus 2.
In the embodiment described above, the synthetic image data for training the face recogniser 100 is generated by inputting a three-dimensional computer head model (generated for example by laser scanning) into processing apparatus 2 and rendering images of the 3D computer head model. However, the synthetic image data may be generated in any other way instead. For example, a generic, morphable three-dimensional face model may be stored in processing apparatus 2, one or more non-synthetic images of a person that the face recogniser is to be trained to recognise may be input, the three-dimensional face model may be matched to the face in the input image(s) and morphed into a three-dimensional computer face model of the subject person, and then images of the three-dimensional computer face model may be rendered to generate different, synthetic images of the person. The synthetic images may then be used alone, or together with the starting non-synthetic images, to train the face recogniser. Morphing processing is described, for example, in "A Morphable Model for the Synthesis of 3D Faces" by Blanz and Vetter in Proceedings of SIGGRAPH 99, pages 187-194. Other methods of generating synthetic images starting with one or more real images could, of course, be used instead. The synthetic image data may be generated by an apparatus separate from processing apparatus 2 and then input to processing apparatus 2 to train the face recogniser 100 therein.
Depending upon the type of face recogniser 100, skin pixel detector 90 and the processing operations performed thereby at steps S6 and S14 may be omitted from the embodiment above.
At steps S22 to S28 in the embodiment above, at most one synthetic training image for each person is deleted and replaced with a non-synthetic training image (that is, only the synthetic training image which most closely matches the face in the input image for that person may be replaced). However, instead, the processing may be modified to replace more than one synthetic training image for each person. For example, for each person identified in an input image, every synthetic training image for that person which matches sufficiently closely the face of that person in the input image may be replaced.
In the embodiment above, if it is determined at step S18 that no face has been recognised for the current input image, then processing proceeds to step S32 to perform face recognition for another input image. However, instead, if it is determined at step S18 that a face has not been recognised, then the non-synthetic image data read at step S12 may be used to generate a plurality of different synthetic training images of each face in the non-synthetic image data. This may be done, for example, by pre-storing a morphable three-dimensional computer face model in processing apparatus 2, matching features of the face model with features of each face in the non-synthetic image data read at step S12, morphing the three-dimensional computer face model on the basis of the matched features into a respective three-dimensional computer face model of each person in the image read at step S12, and rendering images thereof with different viewing positions, directions, lighting, backgrounds, etc. Suitable processing is described, for example, in "A Morphable Model for the Synthesis of 3D Faces" by Blanz and Vetter in Proceedings of SIGGRAPH 99, pages 187-194. Other techniques which could be used are described in co-pending UK patent application 0128158.3 and co-pending US patent application 10/301,748, the full contents of which are incorporated herein by cross reference. The user of the apparatus may then be prompted to identify each person in the image data read at step S12, and the face recogniser 100 may be re-trained using the synthetic images generated from the non-synthetic image to generate updated representation data.
In the embodiment above, if it is determined at step S24 that no matching training image is a synthetic image, then processing proceeds to step S28. However, instead, processing may proceed to step S32, thereby omitting steps S28 and S30.
In the embodiment described above, processing is performed by a computer using processing routines defined by software programming instructions. However, some, or all, of the processing could, of course, be performed using hardware or firmware.

Claims (28)

1. Face recognition apparatus, comprising: face recognition means, trainable using image data defining a plurality of images of a face to generate face recognition means storing representation data characterizing the face and operable to process image data defining an image using the representation data to determine if the image contains the face; synthetic data generating means operable to generate image data defining a plurality of different synthetic images of a person's face; and control means for controlling the apparatus to: - train the face recognition means using the synthetic image data generated by the synthetic data generating means; - store at least some of the data used to train the face recognition means as training data such that the data is identifiable as synthetic training data; and - in response to the processing of non-synthetic image data by the face recognition means to determine if an image contains the person's face and a determination that the image does contain the person's face, update the stored training data in dependence upon the non-synthetic image data and re-train the face recognition means using the updated training data to generate updated representation data for use in subsequent face recognition processing by the face recognition means.
2. Apparatus according to claim 1, wherein the control means is operable to control the apparatus in response to the processing of non-synthetic image data by the face recognition means to determine if an image contains the person's face and a determination that the image does contain the person's face, to update the stored training data in dependence upon the non-synthetic image data by replacing the synthetic training data for at least one image with training data for the non-synthetic image, and to re-train the face recognition means using the updated training data.
3. Apparatus according to claim 1 or claim 2, wherein the synthetic data generating means comprises rendering means operable to render a three-dimensional computer model of at least the face of the person to generate the synthetic image data.
4. Apparatus according to claim 3, wherein the rendering means is operable to render images from different viewing directions relative to the three dimensional computer model to generate image data for a plurality of different synthetic images for training the face recognition means.
5. Apparatus according to claim 3 or claim 4, wherein the rendering means is operable to render images from different viewing positions relative to the three dimensional computer model to generate image data for a plurality of different synthetic images for training the face recognition means.
6. Apparatus according to any of claims 3 to 5, wherein the rendering means is operable to light the three dimensional computer model with different lighting conditions, and to render images of the threedimensional computer model with the different lighting conditions to generate image data for a plurality of different synthetic images for training the face recognition means.
7. Apparatus according to any of claims 3 to 6, wherein the rendering means is operable to change the three-dimensional computer model to represent different facial expressions, and to render images of the three-dimensional computer model with different facial expressions to generate image data for a plurality of different synthetic images for training the face recognition means.
8. Apparatus according to any of claims 3 to 7, wherein the rendering means is operable to change the background behind the three-dimensional computer model, and to render images of the three-dimensional computer model with different backgrounds to generate image data for a plurality of different synthetic images for training the face recognition means.
9. Face recognition apparatus, comprising: receiving means for receiving synthetic image data comprising a plurality of images of a person's face; face recognition means, trainable using the synthetic image data to generate face recognition means storing representation data characterizing the face and operable to process image data defining an image using the representation data to determine if the image contains the face; and control means for controlling the apparatus to: - train the face recognition means using the received synthetic image data; - store at least some of the data used to train the face recognition means as training data such that the data is identifiable as synthetic training data; and - in response to the processing of non-synthetic image data by the face recognition means to determine if an image contains the person's face and a determination that the image does contain the person's face, update the stored training data in dependence upon the non-synthetic image data by replacing the synthetic training data for at least one image with training data for the non-synthetic image, and to re-train the face recognition means using the updated training data to generate updated representation data for use in subsequent face recognition processing by the face recognition means.
10. Apparatus according to any preceding claim, wherein: the face recognition means comprises exemplar-based face recognition means; and the control means is operable to train the face recognition means by storing the image data of the synthetic images as exemplars to generate the representation data for the face recognition means.
11. Apparatus according to any of claims 1 to 9, wherein: the face recognition means comprises a neural network; and the control means is operable to train the face recognition means by processing the image data defining the synthetic images to calculate synaptic weights for the links between the neurons in the neural network to generate the representation data for the face recognition means.
12. Apparatus according to any of claims 1 to 9, wherein: the face recognition means comprises eigenface face recognition means; and the control means is operable to train the face recognition means to generate representation data by processing the image data defining the synthetic images to calculate eigenfaces and a class vector for the face.
13. Apparatus according to any preceding claim, further comprising: a database configured to store image data processed by the face recognition means and associated identification information identifying each person recognised by the face recognition means in each image; and means for writing image data and identification information to the database in dependence upon the result of face recognition processing by the face recognition means.
14. An image processing method, comprising: generating image data defining a plurality of different synthetic images of a person's face; processing the synthetic image data to train a face recognition apparatus to generate representation data characterizing the face so that the face recognition apparatus is operable to process image data defining an image using the representation data to determine if the image contains the face; storing at least some of the data used to train the face recognition apparatus as training data such that the data is identifiable as synthetic training data; and processing non-synthetic image data using the face recognition apparatus to determine if an image contains the person's face and, in response to a determination that the image does contain the person's face, updating the stored training data in dependence upon the non-synthetic image data and re-training the face recognition apparatus using the updated training data to generate updated representation data for use in subsequent face recognition processing by the face recognition apparatus.
15. A method according to claim 14, wherein, in response to the processing of non-synthetic image data by the face recognition apparatus to determine if an image contains the person's face and a determination that the image does contain the person's face, the stored training data is updated in dependence upon the non-synthetic image data by replacing the synthetic training data for at least one image with training data for the non-synthetic image, and the face recognition apparatus is re-trained using the updated training data.
16. A method according to claim 14 or claim 15, wherein the synthetic image data is generated by rendering a three-dimensional computer model of at least the face of the person.
17. A method according to claim 16, wherein the synthetic image data is generated by rendering images from different viewing directions relative to the three dimensional computer model to generate image data for a plurality of different synthetic images for training the face recognition apparatus.
18. A method according to claim 16 or claim 17, wherein synthetic image data is generated by rendering images from different viewing positions relative to the three dimensional computer model to generate image data for a plurality of different synthetic images for training the face recognition apparatus.
19. A method according to any of claims 16 to 18, wherein the synthetic image data is generated by rendering images of the three-dimensional computer model with different lighting conditions to generate image data for a plurality of different synthetic images for training the face recognition apparatus.
20. A method according to any of claims 16 to 19, wherein the synthetic image data is generated by rendering images of the three-dimensional computer model with different facial expressions to generate image data for a plurality of different synthetic images for training the face recognition apparatus.
21. A method according to any of claims 16 to 20, wherein the synthetic image data is generated by rendering images of the three-dimensional computer model with different backgrounds to generate image data for a plurality of different synthetic images for training the face recognition apparatus.
22. An image processing method, comprising: receiving synthetic image data comprising a plurality of images of a person's face; training a face recognition apparatus using the synthetic image data to generate a face recognition apparatus storing representation data characterizing the face and operable to process image data defining an image using the representation data to determine if the image contains the face; storing at least some of the data used to train the face recognition apparatus as training data such that the data is identifiable as synthetic training data; and processing non-synthetic image data using the face recognition apparatus and the representation data to determine if an image contains the person's face and, in response to a determination that the image does contain the person's face, updating the stored training data in dependence upon the non-synthetic image data by replacing the synthetic training data for at least one image with training data for the non-synthetic image, and re-training the face recognition apparatus using the updated training data to generate updated representation data for use in subsequent face recognition processing by the face recognition apparatus.
23. A method according to any of claims 14 to 22, wherein the synthetic image data is used to train an exemplar-based face recognition apparatus by storing the image data of the synthetic images as exemplars to generate the representation data for the face recognition apparatus.
24. A method according to any of claims 14 to 22, wherein the synthetic image data is used to train a neural network by processing the image data defining the synthetic images to calculate synaptic weights for the links between the neurons in the neural network to generate the representation data for the face recognition apparatus.
25. A method according to any of claims 14 to 22, wherein the synthetic image data is used to train an eigenface face recognition apparatus by processing the image data defining the synthetic images to calculate eigenfaces and a class vector for the face.
26. A method according to any of claims 14 to 25, further comprising: storing in a database image data processed by the face recognition apparatus and identification information identifying each person recognised by the face recognition apparatus in the image data.
27. A storage medium storing computer program instructions for programming a programmable processing apparatus to become configured as an apparatus as set out in at least one of claims 1 to 13.
28. A signal carrying computer program instructions for programming a programmable processing apparatus to become configured as an apparatus as set out in at least one of claims 1 to 13.
GB0312096A 2003-05-27 2003-05-27 Image processing Expired - Fee Related GB2402311B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB0312096A GB2402311B (en) 2003-05-27 2003-05-27 Image processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0312096A GB2402311B (en) 2003-05-27 2003-05-27 Image processing

Publications (3)

Publication Number Publication Date
GB0312096D0 GB0312096D0 (en) 2003-07-02
GB2402311A true GB2402311A (en) 2004-12-01
GB2402311B GB2402311B (en) 2006-03-08

Family

ID=9958796

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0312096A Expired - Fee Related GB2402311B (en) 2003-05-27 2003-05-27 Image processing

Country Status (1)

Country Link
GB (1) GB2402311B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001037222A1 (en) * 1999-11-18 2001-05-25 Anthropics Technology Limited Image processing system
US20020106114A1 (en) * 2000-12-01 2002-08-08 Jie Yan System and method for face recognition using synthesized training images
US20030123713A1 (en) * 2001-12-17 2003-07-03 Geng Z. Jason Face recognition system and method

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020131970A1 (en) * 2018-12-17 2020-06-25 Bodygram, Inc. Methods and systems for automatic generation of massive training data sets from 3d models for training deep learning networks
US11507781B2 (en) 2018-12-17 2022-11-22 Bodygram, Inc. Methods and systems for automatic generation of massive training data sets from 3D models for training deep learning networks
CN116783630A (en) * 2020-09-11 2023-09-19 西门子股份公司 Object recognition method and object recognition system
US12488580B2 (en) * 2020-09-11 2025-12-02 Siemens Aktiengesellschaft Method and system for identifying objects
DE102021204611A1 (en) 2021-05-06 2022-11-10 Continental Automotive Technologies GmbH Computer-implemented method for generating training data for use in the field of vehicle occupant observation
US20230282028A1 (en) * 2022-03-04 2023-09-07 Opsis Pte., Ltd. Method of augmenting a dataset used in facial expression analysis
US12142077B2 (en) * 2022-03-04 2024-11-12 Opsis Pte., Ltd. Method of augmenting a dataset used in facial expression analysis

Also Published As

Publication number Publication date
GB2402311B (en) 2006-03-08
GB0312096D0 (en) 2003-07-02

Similar Documents

Publication Publication Date Title
Van Kuilenburg et al. A model based method for automatic facial expression recognition
CN110909651B (en) Video subject identification method, device, equipment and readable storage medium
GB2402535A (en) Face recognition
CN113128271B (en) Forgery Detection in Facial Images
Boughrara et al. Facial expression recognition based on a mlp neural network using constructive training algorithm
Bartlett et al. Real time face detection and facial expression recognition: development and applications to human computer interaction.
JP7007829B2 (en) Information processing equipment, information processing methods and programs
Kumar et al. An object detection technique for blind people in real-time using deep neural network
Wang et al. Expression of Concern: Facial feature discovery for ethnicity recognition
CN110674748A (en) Image data processing method, image data processing device, computer equipment and readable storage medium
US20080144891A1 (en) Method and apparatus for calculating similarity of face image, method and apparatus for retrieving face image, and method of synthesizing face image
EP2091021A1 (en) Face authentication device
US20030179911A1 (en) Face detection in digital images
Wimmer et al. Low-level fusion of audio and video feature for multi-modal emotion recognition
US11645328B2 (en) 3D-aware image search
CN116964619A (en) Electronic device for enhancing image quality and method of enhancing image quality by using the electronic device
CN110675312A (en) Image data processing method, image data processing device, computer equipment and storage medium
Okokpujie et al. Development of an adaptive trait-aging invariant face recognition system using convolutional neural networks
GB2402311A (en) Facial recognition using synthetic images
Paterson et al. 3D head tracking using non-linear optimization.
Jatain et al. Automatic human face detection and recognition based on facial features using deep learning approach
WO2023215253A1 (en) Systems and methods for rapid development of object detector models
Sanchez-Ruiz et al. Face expression recognition using recurrent neural networks
Rehman et al. Smart monitoring: employing person re-identification to uncover suspicious behavior
Abate et al. Face authentication using speed fractal technique

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20190527