WO2006061365A1 - Facial recognition using features considered along isoradial contours - Google Patents
Facial recognition using features considered along isoradial contours (original French title: "Reconnaissance faciale au moyen de caractéristiques considérées le long de contours isoradiaux")
- Publication number
- WO2006061365A1 (PCT/EP2005/056470; EP2005056470W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- interest
- point
- contour
- data
- irad
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
Definitions
- the present invention relates to the representation of 3D objects and is also concerned, although not exclusively, with the alignment, recognition and/or verification of 3D objects.
- Preferred embodiments of the invention are concerned with the representation of three-dimensional (3D) structural data and its associated registered two-dimensional (2D) colour-intensity image data.
- 3D structural data may be acquired from any 3D measurement system — for example, stereo camera systems, projected light ranging systems, shape from shading systems and so on.
- 3D data and 3D data models: in order to recognise the shape of an object, or verify that an object belongs to a particular class of objects, using 3D data and 3D data models, it is necessary to acquire 3D data using some form of 3D sensor and then compare the acquired data with examples of 3D data or models stored in a database.
- the way in which the 3D data is modelled and represented is crucial, since it dictates the manner in which the comparison between captured 3D data and pre-stored 3D data is made. Thus, ultimately, it has a large influence on the performance of the system, for example, in terms of the system's overall recognition performance.
- Typical approaches for matching 3D objects first attempt to align the object to a standard orientation, consistent with the orientation of the models stored in a database. There are many ways of encoding 3D structure in the public domain.
- a 3D curve C is defined by intersecting a sphere of radius r, centred on a feature point-of- interest, with the captured 3D facial surface.
- a best-fit plane is then fitted to this curve, although the curve is unlikely to be planar and in some cases may be highly non-planar.
- a second plane is then defined by translating the extracted plane such that it contains the point-of-interest.
- the orthogonal projection of the curve C onto this plane is then denoted by the planar curve C', and the orthogonal projection distance between C and its corresponding curve C' forms a (signed) distance profile, which is sampled at regular angles through the full range of 360 degrees.
- the best fit plane will also be sensitive to changes in facial structure, such as those caused by changes in expression. Hence, any local change in surface structure will affect the whole representation of the surface around the point-of-interest.
- Preferred embodiments of the present invention aim to provide a method that maintains a consistent signal for all rigid sections of the surface, regardless of any structural changes in other sections. For example, the part of a contour passing through the rigid forehead is not affected by the same contour passing through the malleable mouth area.
- a method of representing 3D object data comprising the steps of determining a point-of-interest in a predetermined position relative to the object, and generating for that point-of-interest a set of multiple isoradius surface contours, each of which comprises a locus of a set of points on the surface of the object that are at a constant predetermined distance from the point-of-interest, which distance is different for each contour of the set.
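As a rough illustration of this idea only (not the patent's own implementation, which intersects spheres with a triangular mesh), the sketch below selects point-cloud samples lying within a tolerance band of each radius. The function name `isoradius_bands`, the tolerance `tol` and the toy planar point set are all hypothetical.

```python
import numpy as np

def isoradius_bands(points, poi, radii, tol=0.5):
    """For each radius r, return the surface points whose distance from the
    point-of-interest lies within +/- tol of r (a crude point-cloud stand-in
    for the exact sphere/surface intersection described in the text)."""
    d = np.linalg.norm(points - poi, axis=1)  # distance of every point from the POI
    return {r: points[np.abs(d - r) <= tol] for r in radii}

# Toy example: a flat grid of surface points in the plane z = 0, POI at the origin.
pts = np.array([[x, y, 0.0] for x in range(-10, 11) for y in range(-10, 11)], float)
bands = isoradius_bands(pts, np.array([0.0, 0.0, 0.0]), radii=[3.0, 6.0], tol=0.3)
```

Each entry of `bands` is one (discretised) isoradius contour; on a real face surface the selected points would meander over the sphere rather than lie in a plane.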
- Said point-of-interest may be located in or on said object.
- Said point-of-interest may be located on a surface of said object.
- a method as above may include the step of determining one or more properties of the object at points along each said contour.
- Said properties may include at least one of: curvature of the respective contour; object surface orientation; local gradient of object surface orientation along the respective contour; and object surface curvature along the respective contour.
- Said properties may include colour and/or colour-intensity.
- a plurality of properties of the object are determined at points along each said contour, and a plurality of aligned 1D signals are derived therefrom.
- said object is a human face.
- said point-of-interest is a nose-tip.
- a method according to any of the preceding aspects of the invention may include the further step of comparing the represented 3D data with stored data to recognise or verify the object.
- Such a method may further include at least one of the following steps: a. size-shape vector prefiltering;
- a method according to any of the preceding aspects of the invention may comprise the steps of determining a plurality of points-of-interest as aforesaid, and generating for each point-of-interest a set of multiple isoradius surface contours as aforesaid.
- the invention also extends to apparatus for representing 3D data, the apparatus being adapted to carry out a method according to any of the preceding aspects of the invention.
- the data would be in the form of a set of 3D point coordinates. This is often called a 'point-cloud' representation.
- this point cloud is converted into a mesh representation, where each point is connected to several neighbours, creating a mesh of triangular planar facets. If a corresponding standard colour-intensity image can be acquired at the same time as the 3D model, this colour-intensity information can be aligned to the captured 3D structure and texture-mapped to the surface of the model using the known facets in the mesh representation. In this way the 2D image is said to be registered to the 3D data.
- Example embodiments of the invention can be divided into two parts, namely:
- a "point-of-interest” (POI) is extracted.
- the first stage in extracting the representation is to locate one or more interest points on the captured 3D object surface.
- a crucial point is that the 3D position of such interest points should be detected reliably, such that there is high repeatability (low variance in 3D position).
- Good interest points are those that have a local maximum in 3D surface curvature.
- the tip of the nose is a good choice of interest point.
- the absolute orientation of the face is immaterial, since contours are generated by intersecting several spheres of different radius with the object surface.
- each point-of-interest used in the method must be "semantic" in the sense that it is labelled ("nose-tip", "eye-corner", etc.), and one must be able to match points-of-interest between test data and data in a stored database.
- An advantage of using a single point-of-interest is that matching is implicit as it is one-to-one.
- POIs could be detected using colour-texture data (from a standard 2D image) as well, or may be combined with the use of structural information.
- an IRAD is the locus in 3D space of the points on the surface of the object that lie at a constant distance from a "point-of-interest".
- IRAD loci can take many forms.
- Typical methods include (i) simple approaches such as linear interpolation across a 3D mesh, (ii) accurate but computationally expensive approaches such as the direct use of a parametric Gaussian smoothing function, and (iii) more complex but faster approaches such as the generation of a gridded depth map and the subsequent use of a 3D interpolation scheme, such as those based on the generation of 3D Hermite patches.
- a typical set of properties might include shape properties, for example:-
- the shape of the IRAD itself, expressed for example as a curvature property (see below) or an orientation property relative to some reference orientation. Note that an orientation signal has less noise than a curvature signal, as it is a first-order difference rather than a second-order difference.
- Depth properties may be included, as may colour-intensity properties, for example: 1. Intensity (possibly normalised).
- 2. Red, green or blue (RGB) colour channel, possibly normalised, for example with respect to intensity.
- 3. Any other transformation of RGB data (such as HSV, CIE, etc.).
- the final representation is thus a set of 1D signals.
- Extraction of a set of 1D signals facilitates matching to pre-stored (database) 3D data and models through a process of correlation, although we do not preclude using the representation in other forms of matching process. For example, if several interest points were used, graph matching approaches may be appropriate.
- 1. Size-shape prefilter
- 2. 1D signal correlation
- 3. LDA (Linear Discriminant Analysis)
- the length of an isoradius contour depends on how much it meanders across its spherical surface, although obviously one generally expects contour lengths at lower radii to be shorter than contour lengths at higher radii.
- a distance measure, e.g. Euclidean or Mahalanobis
- the prefilter should be implemented as a weak constraint, so that the expected number of false rejects of the prefilter is zero.
- a slightly more sophisticated prefilter would use the ratio of contour lengths associated with different IRAD radii. This would prevent scaled versions of the same object shape being rejected. Such scalings could occur with slight variations in camera calibration, such as the value used for the stereo baseline.
- the method can only be applied to contours that do not intersect any holes in the 3D surface. Where a contour does cross a hole, its length cannot be measured, and it should be eliminated from the feature vector before the feature-vector match is made.
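A minimal sketch of such a prefilter, under the assumption that hole-affected contour lengths are marked as NaN so they can be dropped before the Euclidean comparison. The function name `prefilter_pass`, the threshold and the sample lengths are all hypothetical.

```python
import numpy as np

def prefilter_pass(test_lengths, model_lengths, threshold):
    """Weak size-shape prefilter: compare contour-length feature vectors with
    a Euclidean distance, ignoring contours whose length is unknown (NaN)
    because the contour crossed a hole in the scanned surface."""
    t, m = np.asarray(test_lengths, float), np.asarray(model_lengths, float)
    valid = ~(np.isnan(t) | np.isnan(m))     # drop hole-affected contours
    dist = np.linalg.norm(t[valid] - m[valid])
    return dist <= threshold, dist

# One contour length is unmeasurable (hole), so only two components are compared.
ok, d = prefilter_pass([60.0, np.nan, 180.0], [61.0, 125.0, 178.0], threshold=5.0)
```

Setting the threshold generously, as the text recommends, keeps the constraint weak so that genuine matches are not rejected at this stage.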
- 1D signal correlation: the process of 1D signal correlation is standard and well documented. It forms the core of the matching process. Much of the power of the disclosed embodiments of the invention lies in the fact that the dense, comprehensive multi-contour, multi-feature representation employed makes this technique central to the matching process. It is noted that, for fast searching of large databases, the correlation process is likely to be implemented directly in hardware.
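As a sketch of what such a correlation stage might look like for signals sampled around a closed contour, the following computes a normalised circular cross-correlation at every shift; the best shift corresponds to a rotation of the contour about the point-of-interest. The function name `circular_ncc` and the toy sine signal are illustrative, not from the patent.

```python
import numpy as np

def circular_ncc(a, b):
    """Normalised cross-correlation of two closed-contour 1D signals at every
    circular shift; returns (best_score, best_shift). Assumes equal length."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    scores = np.array([np.dot(a, np.roll(b, s)) / len(a) for s in range(len(a))])
    best = int(np.argmax(scores))
    return scores[best], best

# Same signal rotated by 45 samples: correlation should peak at a score of ~1.0.
sig = np.sin(np.linspace(0, 2 * np.pi, 360, endpoint=False))
score, shift = circular_ncc(sig, np.roll(sig, 45))
```

In a real system the brute-force loop would typically be replaced by an FFT-based correlation (or, as the text notes, dedicated hardware) for speed.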
- This weighting scheme is applied by computing the inner (dot) product of correlation scores with a predetermined weights vector.
- the element values of these weight vectors will be dependent on the application scenario in mind and calculated using such methods as LDA applied to typical examples of the target data.
- Such a process produces a weighting scheme in which between-class variance of correlation scores is maximised and within-class variance (e.g. due to facial expression and other factors) is minimised.
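The weighted combination described above reduces to a single inner product. The scores and weights below are purely illustrative values, not LDA outputs from any real dataset.

```python
import numpy as np

# Hypothetical per-signal correlation scores (one per IRAD feature signal) and
# a weights vector of the kind LDA-style training might produce: the more
# discriminative signals carry larger weights.
corr_scores = np.array([0.92, 0.81, 0.40, 0.88])
weights     = np.array([0.40, 0.30, 0.05, 0.25])

combined = float(np.dot(corr_scores, weights))  # single combined match score
```

A weight of (near) zero realises the observation below that some IRAD features may simply be omitted from matching, or never computed at all.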
- results of LDA on a particular database may indicate certain features of certain IRADs are not helpful in that they do not provide sufficient discriminatory information and should therefore not be used in the matching process, or even computed in the live operation of the system.
- the method handles this situation by maintaining multiple signal fragments along a single IRAD in the face representation and generating correlation scores for all fragments independently. The aim then is to determine the maximum correlation score across all fragments of all IRADs that is consistent with a single 3D rotation.
- the 1D correlation processes proceed independently on a number of counts: 1. Every feature (colour, shape) on every IRAD is correlated independently.
- Fragmented IRADs due to holes in the surface patch have multiple segments, which are correlated independently. What we are then looking for is the set of correlations, both within IRADs (due to multiple features and multiple fragments) and across the whole set of IRADs, which has a consistent orientation alignment between test data and model being matched.
- orientation alignment is a 3 degree-of-freedom (dof) rotation about the point-of-interest (nose-tip). The rotation is determined from the known correlation between test data surface coordinates and database model surface coordinates on the specific IRAD.
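The text does not specify how the rotation is recovered from the correspondences; one standard way to solve such a 3-dof problem is the Kabsch/SVD least-squares fit, sketched here under the assumption that both point sets are already expressed relative to the point-of-interest (so no translation is involved). This is an illustrative method, not necessarily the patent's.

```python
import numpy as np

def best_rotation(P, Q):
    """Least-squares rotation R (about the origin, i.e. the point-of-interest)
    mapping point set P onto corresponded point set Q: the Kabsch/SVD method."""
    H = P.T @ Q                                 # 3x3 cross-covariance of the sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T

# Toy check: recover a known 90-degree rotation about the z axis.
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
P = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
R = best_rotation(P, P @ Rz.T)                  # rows of P @ Rz.T are Rz @ p
```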
- the output of the system may be displayed as a list of descending scores, with some sort of cut-off below which it is unlikely that a correct match has been achieved.
- facial range/colour/intensity images of a subject need to be collected, analysed and stored.
- the representation needs to be augmented, such that it captures all variations of facial structure in a wide range of expressions. Given that a face in a semantically known expression (smile, frown, wrinkle nose) will match, the system will also be capable of outputting facial expression.
- PCA (Principal Component Analysis)
- the IRAD technique may be used for alignment to a standard pose.
- the IRAD representation is used to simultaneously perform recognition and alignment, where the alignment is between a pair of 3D or 3D/2D images (one captured image and a database image).
- the IRAD technique may also be used as a means of aligning data to a standard pose (position and orientation), which is used as a precursor to other recognition techniques, such as LDA-based recognition and verification.
- a face may be aligned to a forward looking pose, with some feature or features in a standard position.
- the IRAD representation may be used to align the 3D face to a standard forward looking pose in the following way:
- the "point signature method” uses a single contour around a point-of-interest.
- the IRAD representation uses multiple contours.
- PSM uses multiple points-of-interest to identify features in the face.
- the IRAD methods may typically use one point-of-interest as a reference point to encode the whole face (with multiple contours).
- PSM uses a single contour as a marker to encode local surface shape by measuring orthogonal depth to a reference plane.
- the IRAD methods encode IRAD contours themselves by measuring IRAD contour curvature. They also use IRAD contours in a similar way to the PSM method, i.e. to act as a repeatable surface position locator (a "surface marker"), but the way in which the data is encoded along the contour is different.
- the IRAD methods measure the way in which the surface normal is changing along the contour and measure three colour-intensity signals along the contour (namely RGB values or any colour-space transformation of these values, such as HSV, CIE).
- PSM uses the encoded depth to match point signatures using Euclidean distance on an ordered feature vector.
- the ordering of the feature vector is chosen from a distinctive point on the (orthonormally projected) IRAD curve, such as the point of maximum curvature.
- the IRAD methods use a signal correlation process, before any feature vector matching is used.
- PSM matches a single signal/ feature vector.
- the IRAD methods match multiple signals/feature vectors, due to the representation employing:
- the first task is to find the nose tip. This may be achieved by finding the local maxima in surface curvature.
- the nose forms a distinct ridge, which may be helpful to locate the nose tip.
- Another approach is to determine surface regions above a certain threshold of curvature and locate the centroid of that surface patch.
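That thresholded-centroid approach can be sketched as follows, assuming per-vertex curvature values have already been computed by some other means; the function name, vertices and curvature values are all illustrative.

```python
import numpy as np

def nose_tip_estimate(vertices, curvature, threshold):
    """One of the approaches mentioned in the text: keep vertices whose
    (precomputed) surface curvature exceeds a threshold and return the
    centroid of that surface patch as the point-of-interest estimate."""
    patch = vertices[curvature > threshold]
    return patch.mean(axis=0)

# Three high-curvature vertices near a notional nose tip, one low-curvature
# vertex elsewhere on the face (all values are made up for illustration).
verts = np.array([[0.0, 0.0, 10.0], [1.0, 0.0, 9.5], [-1.0, 0.0, 9.5], [5.0, 5.0, 0.0]])
curv  = np.array([0.9, 0.8, 0.8, 0.05])
tip   = nose_tip_estimate(verts, curv, threshold=0.5)
```

In practice the estimate would be refined, e.g. by snapping the centroid back onto the nearest surface vertex.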
- Other approaches may use physical analogies, such as artificial potential fields.
- IRAD contours are then generated.
- Consider the extraction of a specific IRAD contour: that is, we wish to identify the locus of points on the face surface that lie at a fixed radius from the nose-tip point-of-interest. This set of points meanders across the surface of a sphere centred on the point-of-interest. Assume that this sphere has radius R.
- For any facet in the mesh, we can detect whether it straddles or touches the intersecting sphere by checking the distances of its three vertices from the nose-tip. At sufficiently high resolution, we are unlikely to find many instances where a vertex lies exactly on the sphere. More likely are instances where two vertices of a facet lie inside the sphere and one outside, or one inside and two outside. In such cases, linear interpolation along the two straddling edges of the facet generates two connected points lying on both the facial surface and the sphere surface, thus forming part of the IRAD contour. The information required to link such pairs of connected points is gathered via the known 3D mesh: it is known from the mesh data which facets neighbour which others, so it is straightforward to link pairs of connected points into a chain that represents the IRAD contour.
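The edge-interpolation step described above can be sketched per facet as follows. Note that linear interpolation places points only approximately on the sphere (exactly so when the straddling edge happens to be radial); the function name and the toy facet are illustrative.

```python
import numpy as np

def facet_sphere_points(tri, poi, R):
    """For one triangular facet (3x3 array of vertex coords), return the 0 or 2
    points where its edges cross the sphere of radius R centred on the POI,
    using linear interpolation along each straddling edge."""
    d = np.linalg.norm(tri - poi, axis=1) - R   # signed distance of each vertex to the sphere
    pts = []
    for i, j in ((0, 1), (1, 2), (2, 0)):
        if d[i] * d[j] < 0:                     # opposite signs: edge straddles the sphere
            t = d[i] / (d[i] - d[j])            # linear interpolation parameter along the edge
            pts.append(tri[i] + t * (tri[j] - tri[i]))
    return pts

# Toy facet with one vertex inside the unit sphere and two outside:
tri = np.array([[0.5, 0.0, 0.0], [2.0, 0.0, 0.0], [0.5, 2.0, 0.0]])
pts = facet_sphere_points(tri, poi=np.zeros(3), R=1.0)   # yields two crossing points
```

Chaining such point pairs across neighbouring facets, as the text describes, yields the full IRAD contour.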
- the first of these is addressed by re-sampling the IRAD contour through a process of interpolation, to generate what are called reference points along the IRAD contour.
- the second of these requires us to maintain multiple signal fragments in the representation, as discussed previously.
- Encoding the face shape may be achieved in a number of ways, the most obvious of which are:
- contour in the direction ⁇ is computed as the cross-product of the two vectors
- the pixels from the standard 2D colour image that correspond to the 3D data along an IRAD can easily be extracted, so that an IRAD effectively consists of a set of 1D signals, registered in terms of their position on the IRAD contour.
- These signals could be the raw RGB values (registered to the 3D mesh) from the colour camera or derivatives of this information, such as HSV or CIE colour space.
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP05813319A EP1828959A1 (fr) | 2004-12-06 | 2005-12-05 | Reconnaissance faciale au moyen de caracteristiques considerees le long de contours isoradiaux |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GBGB0426595.5A GB0426595D0 (en) | 2004-12-06 | 2004-12-06 | Representation of 3D objects |
| GB0426595.5 | 2004-12-06 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2006061365A1 true WO2006061365A1 (fr) | 2006-06-15 |
Family
ID=34044031
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2005/056470 Ceased WO2006061365A1 (fr) | 2004-12-06 | 2005-12-05 | Reconnaissance faciale au moyen de caracteristiques considerees le long de contours isoradiaux |
Country Status (3)
| Country | Link |
|---|---|
| EP (1) | EP1828959A1 (fr) |
| GB (1) | GB0426595D0 (fr) |
| WO (1) | WO2006061365A1 (fr) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101894254A (zh) * | 2010-06-13 | 2010-11-24 | 南开大学 | 一种基于等高线法的三维人脸识别方法 |
| WO2010133938A1 (fr) * | 2009-05-22 | 2010-11-25 | Nokia Corporation | Procédé et appareil pour l'extraction de caractéristiques au moyen d'un code primitif local |
| US8532344B2 (en) | 2008-01-09 | 2013-09-10 | International Business Machines Corporation | Methods and apparatus for generation of cancelable face template |
| US8538096B2 (en) | 2008-01-09 | 2013-09-17 | International Business Machines Corporation | Methods and apparatus for generation of cancelable fingerprint template |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112784680B (zh) * | 2020-12-23 | 2024-02-02 | 中国人民大学 | 一种人流密集场所锁定密集接触者的方法和系统 |
-
2004
- 2004-12-06 GB GBGB0426595.5A patent/GB0426595D0/en not_active Ceased
-
2005
- 2005-12-05 WO PCT/EP2005/056470 patent/WO2006061365A1/fr not_active Ceased
- 2005-12-05 EP EP05813319A patent/EP1828959A1/fr not_active Withdrawn
Non-Patent Citations (1)
| Title |
|---|
| WANG Y ET AL: "Facial feature detection and face recognition from 2D and 3D images", PATTERN RECOGNITION LETTERS, NORTH-HOLLAND PUBL. AMSTERDAM, NL, vol. 23, no. 10, August 2002 (2002-08-01), pages 1191 - 1202, XP004349766, ISSN: 0167-8655 * |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8532344B2 (en) | 2008-01-09 | 2013-09-10 | International Business Machines Corporation | Methods and apparatus for generation of cancelable face template |
| US8538096B2 (en) | 2008-01-09 | 2013-09-17 | International Business Machines Corporation | Methods and apparatus for generation of cancelable fingerprint template |
| WO2010133938A1 (fr) * | 2009-05-22 | 2010-11-25 | Nokia Corporation | Procédé et appareil pour l'extraction de caractéristiques au moyen d'un code primitif local |
| CN102439606A (zh) * | 2009-05-22 | 2012-05-02 | 诺基亚公司 | 用于使用局部原码执行特征提取的方法和装置 |
| US8571273B2 (en) | 2009-05-22 | 2013-10-29 | Nokia Corporation | Method and apparatus for performing feature extraction using local primitive code |
| CN101894254A (zh) * | 2010-06-13 | 2010-11-24 | 南开大学 | 一种基于等高线法的三维人脸识别方法 |
Also Published As
| Publication number | Publication date |
|---|---|
| EP1828959A1 (fr) | 2007-09-05 |
| GB0426595D0 (en) | 2005-01-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Li et al. | Towards 3D face recognition in the real: a registration-free approach using fine-grained matching of 3D keypoint descriptors | |
| Perakis et al. | 3D facial landmark detection under large yaw and expression variations | |
| CN106682598B (zh) | 一种基于级联回归的多姿态的人脸特征点检测方法 | |
| Bronstein et al. | Three-dimensional face recognition | |
| Mian et al. | An efficient multimodal 2D-3D hybrid approach to automatic face recognition | |
| Passalis et al. | Using facial symmetry to handle pose variations in real-world 3D face recognition | |
| Papazov et al. | Real-time 3D head pose and facial landmark estimation from depth images using triangular surface patch features | |
| Gökberk et al. | 3D shape-based face representation and feature extraction for face recognition | |
| Bustard et al. | Toward unconstrained ear recognition from two-dimensional images | |
| Mian et al. | Automatic 3d face detection, normalization and recognition | |
| US20090185746A1 (en) | Image recognition | |
| US20150177846A1 (en) | Hand pointing estimation for human computer interaction | |
| Davies et al. | Advanced methods and deep learning in computer vision | |
| JPH10177650A (ja) | 画像特徴抽出装置,画像特徴解析装置,および画像照合システム | |
| Chowdhary | 3D object recognition system based on local shape descriptors and depth data analysis | |
| Szeliski | Feature detection and matching | |
| US20090028442A1 (en) | Method And Apparatus For Determining Similarity Between Surfaces | |
| Russ et al. | 3D facial recognition: a quantitative analysis | |
| Salah et al. | Registration of three-dimensional face scans with average face models | |
| Al-Osaimi | A novel multi-purpose matching representation of local 3D surfaces: A rotationally invariant, efficient, and highly discriminative approach with an adjustable sensitivity | |
| US7542624B1 (en) | Window-based method for approximating the Hausdorff in three-dimensional range imagery | |
| Perakis et al. | Partial matching of interpose 3D facial data for face recognition | |
| Tang et al. | 3D face recognition with asymptotic cones based principal curvatures | |
| Boukamcha et al. | 3D face landmark auto detection | |
| EP1828959A1 (fr) | Reconnaissance faciale au moyen de caracteristiques considerees le long de contours isoradiaux |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KN KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
| AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 2005813319 Country of ref document: EP |
|
| WWP | Wipo information: published in national office |
Ref document number: 2005813319 Country of ref document: EP |