CN112084816A

CN112084816A - A cross-domain facial recognition method

Info

Publication number: CN112084816A
Application number: CN201910510836.9A
Authority: CN
Inventors: 刘若鹏; 栾琳; 季春霖; 陈�峰
Original assignee: Hangzhou Guangqi Artificial Intelligence Research Institute
Current assignee: Foshan Shunde Guangqi Advanced Equipment Co ltd
Priority date: 2019-06-13
Filing date: 2019-06-13
Publication date: 2020-12-15

Abstract

The invention provides a cross-domain face recognition method, which comprises the following steps: preparing source sample data and target sample data; a fixed feature extractor E, which maximizes a domain discriminator G and discriminates the difference between the source sample data and the target sample data; a fixed domain discriminator G for minimizing the feature extractor E and discriminating the difference between the source sample data and the target sample data; and the maximization of the domain discriminator G and the minimization of the feature extractor E reach a balance point, and a feature value without inter-domain difference between the source sample data and the target sample data is extracted from the feature extractor E. On the basis of facenet, the feature extractor E adopts an increment-Resnet-V1 network structure, and the source classification loss function and the target classification loss function adopt triple loss functions, and then a domain confusion loss function is added, so that the problem of cross-domain facial recognition under the condition that the image to be recognized and the training sample image have different statistical distribution characteristics is solved, the facial features in the same domain are closer, and the cross-domain recognition accuracy is improved.

Description

A cross-domain facial recognition method

【技术领域】【Technical field】

本发明涉及面部识别技术领域，尤其涉及一种跨域面部识别方法。The invention relates to the technical field of facial recognition, in particular to a cross-domain facial recognition method.

【背景技术】【Background technique】

面部识别，是基于人的脸部特征信息进行身份识别的一种生物识别技术。用摄像机或摄像头采集含有人脸的图像或视频流，并自动在图像中检测和跟踪人脸，进而对检测到的人脸进行脸部识别的一系列相关技术，通常也叫做人像识别、面部识别。Facial recognition is a biometric technology for identifying people based on facial feature information. A series of related technologies that use cameras or cameras to collect images or video streams containing faces, and automatically detect and track faces in the images, and then perform face recognition on the detected faces. .

现有面部识别算法大多可以解决单域面部识别(即待识别图像和训练样本图像具有相同的统计分布特性)问题，例如，FaceNet,DeepId等算法，在单域面部识别任务上取得良好的效果。Most of the existing face recognition algorithms can solve the problem of single-domain face recognition (that is, the image to be recognized and the training sample image have the same statistical distribution characteristics). For example, algorithms such as FaceNet and DeepId have achieved good results in single-domain face recognition tasks.

但在实际应用场景下，例如识别监控摄像头下的重点监控对象任务，通过监控摄像头拍摄的视频中提取的人脸照片，与一组对象的身份证照片比对，判别是否是其中的一位。它的难点在于，摄像头的图像与身份证照片差别很大，身份证照片是限制域的，需要对象的配合，对分辨率、角度、光线、表情等有严格的要求，而监控摄像头是开放域的，所摄取的人脸照片的分辨率、角度、光线、表情不受任何限制，因此，摄像头拍摄的人脸照片与身份证照片这两个样本的统计规律很不相同，可以界定为跨域面部识别问题。对于跨域识别，FaceNet,DeepId等基于CNN(卷积神经网络)算法并不能取的很好效果。However, in practical application scenarios, such as the task of identifying key monitoring objects under surveillance cameras, the face photos extracted from the videos captured by the surveillance cameras are compared with the ID photos of a group of objects to determine whether it is one of them. The difficulty is that the image of the camera is very different from the ID card photo. The ID card photo is in the restricted domain and requires the cooperation of the object. It has strict requirements on resolution, angle, light, expression, etc., while the surveillance camera is an open domain. The resolution, angle, light, and expression of the captured face photos are not subject to any restrictions. Therefore, the statistical laws of the face photos captured by the camera and the ID card photos are very different, and can be defined as cross-domain Facial recognition problem. For cross-domain recognition, FaceNet, DeepId and other algorithms based on CNN (Convolutional Neural Network) cannot achieve good results.

【发明内容】[Content of the invention]

本发明所要解决的技术问题是提供一种跨域面部识别方法，能够解决待识别图像与训练样本图像具有不同统计分布特性的条件下的面部跨域识别问题，使得同一域间的面部特征更为接近，提升了跨域识别准确性。The technical problem to be solved by the present invention is to provide a cross-domain face recognition method, which can solve the problem of face cross-domain recognition under the condition that the image to be recognized and the training sample image have different statistical distribution characteristics, so that the facial features in the same domain are more accurate. close, improving the accuracy of cross-domain recognition.

为解决上述技术问题，本发明一实施例提供了一种跨域面部识别方法，其特征在于，包括：In order to solve the above technical problems, an embodiment of the present invention provides a cross-domain facial recognition method, which is characterized by comprising:

准备源样本数据和目标样本数据；Prepare source sample data and target sample data;

固定特征提取器E，让域判别器G最大化，判别源样本数据和目标样本数据间的差异；Fix the feature extractor E, maximize the domain discriminator G, and discriminate the difference between the source sample data and the target sample data;

固定域判别器G，让特征提取器E最小化，判别源样本数据和目标样本数据间的差异；Fix the domain discriminator G, minimize the feature extractor E, and discriminate the difference between the source sample data and the target sample data;

域判别器G最大化和特征提取器E最小化达到平衡点，从特征提取器E提取出源样本数据和目标样本数据间没有域间差异的特征值。The domain discriminator G is maximized and the feature extractor E is minimized to reach a balance point, and feature values that have no inter-domain difference between the source sample data and the target sample data are extracted from the feature extractor E.

优选地，准备源样本数据和目标样本数据包括：Preferably, preparing the source sample data and the target sample data includes:

将源样本数据和目标样本数据分别进行面部特征对齐。The source sample data and the target sample data are respectively aligned with facial features.

优选地，所述域判别器G和特征提取器E对不同的域共享参数。Preferably, the domain discriminator G and feature extractor E share parameters for different domains.

对源样本数据和目标样本数据进行处理，增加样本数据的多样性。The source sample data and the target sample data are processed to increase the diversity of the sample data.

优选地，固定特征提取器E，让域判别器G最大化，判别源样本数据和目标样本数据间的差异之前包括：确定损失函数。Preferably, the feature extractor E is fixed, the domain discriminator G is maximized, and before discriminating the difference between the source sample data and the target sample data, the method includes: determining a loss function.

优选地，对源样本数据和目标样本数据进行处理包括：对源样本数据和目标样本数据进行随机裁剪、翻转、调整亮度、对比度、色相、饱和度处理。Preferably, processing the source sample data and the target sample data includes: randomly cropping, flipping, adjusting brightness, contrast, hue, and saturation on the source sample data and the target sample data.

优选地，所述损失函数包括分类损失函数和域混淆损失函数。Preferably, the loss function includes a classification loss function and a domain confusion loss function.

优选地，所述损失函数为分类损失函数和域混淆损失函数的加权和。Preferably, the loss function is a weighted sum of a classification loss function and a domain confusion loss function.

优选地，总损失＝源分类损失+目标分类损失－KL散度，其中KL散度用于衡量源样本数据和目标样本数据特征概率分布间的差异。Preferably, total loss=source classification loss+target classification loss−KL divergence, where KL divergence is used to measure the difference between the feature probability distributions of the source sample data and the target sample data.

优选地，所述KL散度又称为相对熵，为源样本数据和目标样本数据特征概率分布间的度量。Preferably, the KL divergence is also called relative entropy, which is a measure between the feature probability distributions of the source sample data and the target sample data.

与现有技术相比，上述技术方案具有以下优点：在facenet原有技术方案的基础上，通过特征提取器E采用Inception-Resnet-V1网络结构，源分类损失函数和目标分类损失函数采用三元组损失函数triplet loss的基础上，加入域混淆损失函数，从而解决待识别图像与训练样本图像具有不同统计分布特性的条件下的面部跨域识别问题，使得同一域间的面部特征更为接近，减少了误识率，提升了跨域识别准确性。Compared with the prior art, the above technical solution has the following advantages: on the basis of the original technical solution of facenet, the Inception-Resnet-V1 network structure is adopted through the feature extractor E, and the source classification loss function and the target classification loss function adopt ternary On the basis of the group loss function triplet loss, the domain confusion loss function is added to solve the problem of face cross-domain recognition under the condition that the image to be recognized and the training sample image have different statistical distribution characteristics, so that the facial features in the same domain are closer. The false recognition rate is reduced and the cross-domain recognition accuracy is improved.

【附图说明】【Description of drawings】

为了更清楚地说明本发明实施例中的技术方案，下面将对实施例描述中所需要使用的附图作简单地介绍，显而易见地，下面描述中的附图仅仅是本发明的一些实施例，对于本领域普通技术人员来讲，在不付出创造性劳动性的前提下，还可以根据这些附图获得其它的附图。In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings used in the description of the embodiments. Obviously, the accompanying drawings in the following description are only some embodiments of the present invention. For those of ordinary skill in the art, other drawings can also be obtained based on these drawings without any creative effort.

图1是本发明一种跨域面部识别方法流程图。FIG. 1 is a flow chart of a cross-domain facial recognition method according to the present invention.

图2是本发明一种跨域面部识别方法一优选实施例流程图。FIG. 2 is a flow chart of a preferred embodiment of a cross-domain facial recognition method according to the present invention.

图3是本发明一种跨域面部识别方法网络架构图。FIG. 3 is a network architecture diagram of a cross-domain facial recognition method according to the present invention.

图4是本发明一种跨域面部识别方法中训练20个epoch后的t-SNE示意图。FIG. 4 is a schematic diagram of t-SNE after training 20 epochs in a cross-domain face recognition method of the present invention.

图5是本发明一种跨域面部识别方法中训练50个epoch后的t-SNE示意图。FIG. 5 is a schematic diagram of t-SNE after training 50 epochs in a cross-domain face recognition method of the present invention.

【具体实施方式】【Detailed ways】

下面将结合本发明实施例中的附图，对本发明实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本发明一部分实施例，而不是全部的实施例。基于本发明中的实施例，本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其它实施例，都属于本发明保护的范围。The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

实施例一Example 1

图1是本发明一种跨域面部识别方法流程图。如图1所示，一种跨域面部识别方法，包括步骤：FIG. 1 is a flow chart of a cross-domain facial recognition method according to the present invention. As shown in Figure 1, a cross-domain facial recognition method includes steps:

S11、准备源样本数据和目标样本数据。准备源样本数据和目标样本数据包括：将源样本数据和目标样本数据分别进行面部特征对齐，对源样本数据和目标样本数据进行处理，增加样本数据的多样性。对源样本数据和目标样本数据进行处理包括：对源样本数据和目标样本数据进行随机裁剪、翻转、调整亮度、对比度、色相、饱和度处理。S11. Prepare source sample data and target sample data. The preparation of the source sample data and the target sample data includes: aligning the facial features of the source sample data and the target sample data respectively, processing the source sample data and the target sample data, and increasing the diversity of the sample data. The processing of the source sample data and the target sample data includes: randomly cropping, flipping, adjusting the brightness, contrast, hue and saturation of the source sample data and the target sample data.

S12、固定特征提取器E，让域判别器G最大化源样本数据和目标样本数据间的差异。S12. Fix the feature extractor E, and let the domain discriminator G maximize the difference between the source sample data and the target sample data.

S13、固定域判别器G，让特征提取器E最小化源样本数据和目标样本数据间的差异；S13, fix the domain discriminator G, and let the feature extractor E minimize the difference between the source sample data and the target sample data;

S14、域判别器G最大化和特征提取器E最小化达到平衡点，从特征提取器E提取出源样本数据和目标样本数据间没有域间差异的特征值。S14, the domain discriminator G is maximized and the feature extractor E is minimized to reach a balance point, and feature values with no inter-domain difference between the source sample data and the target sample data are extracted from the feature extractor E.

所述域判别器G和特征提取器E对不同的域共享参数。The domain discriminator G and feature extractor E share parameters for different domains.

具体实施时，固定特征提取器E，让域判别器G最大化，判别源样本数据和目标样本数据间的差异之前包括：确定损失函数。During specific implementation, the feature extractor E is fixed, the domain discriminator G is maximized, and before discriminating the difference between the source sample data and the target sample data, the method includes: determining a loss function.

所述损失函数包括分类损失函数和域混淆损失函数。损失函数为分类损失函数和域混淆损失函数的加权和。总损失＝源分类损失+目标分类损失－KL散度，其中KL散度用于衡量源样本数据和目标样本数据特征概率分布间的差异。所述KL散度又称为相对熵，为源样本数据和目标样本数据特征概率分布间的度量。The loss functions include a classification loss function and a domain confusion loss function. The loss function is the weighted sum of the classification loss function and the domain confusion loss function. Total loss = source classification loss + target classification loss - KL divergence, where KL divergence is used to measure the difference between the feature probability distributions of source sample data and target sample data. The KL divergence, also called relative entropy, is a measure between the feature probability distributions of the source sample data and the target sample data.

图2是本发明一种跨域面部识别方法一优选实施例流程图。一种跨域面部识别方法，包括步骤：FIG. 2 is a flow chart of a preferred embodiment of a cross-domain facial recognition method according to the present invention. A cross-domain facial recognition method, comprising steps:

步骤21：准备数据。准备source domain即身份证照片和target domian即摄像头照片，并对照片作人脸对齐。可以采用用MTCNN算法对人脸进行对齐。Step 21: Prepare the data. Prepare the source domain, which is the ID card photo, and the target domian, which is the camera photo, and align the photo with the face. The faces can be aligned using the MTCNN algorithm.

步骤22：对数据进行处理。对2个不同域的照片分别作随机裁剪、翻转、调整亮度、对比度、色相、饱和度等，以增加样本的多样性。Step 22: Process the data. Randomly crop, flip, and adjust brightness, contrast, hue, saturation, etc. for photos in two different domains to increase the diversity of samples.

步骤23：确定网络结构。图3是本发明一种跨域面部识别方法网络架构图。其中的源样本数据输入和目标样本数据输入都是同一指定像素如但是不限于160像素×160像素的图像，特征提取器E是多层卷积神经网络，域判别器G是多层全连接神经网络。特征提取器E和域判别器G对不同域是共享参数的，实际上只有一个网络。本算法对各种经典网络结构，如特征提取器E的网络结构可以是Inception、ResNet等，如Inception-Resnet-V1网络结构。当然，原始输入图片可能会大小不一，但是输入图片优选为长宽差别不大，否则如果统一缩放为同一大小，则有些输出图片失真太多，则失去这样图片处理的意义。Step 23: Determine the network structure. FIG. 3 is a network architecture diagram of a cross-domain facial recognition method according to the present invention. The source sample data input and the target sample data input are the same specified pixel image, such as but not limited to an image of 160 pixels × 160 pixels, the feature extractor E is a multi-layer convolutional neural network, and the domain discriminator G is a multi-layer fully connected neural network. network. Feature extractor E and domain discriminator G share parameters for different domains, and there is actually only one network. This algorithm can be used for various classical network structures, such as the network structure of the feature extractor E, which can be Inception, ResNet, etc., such as the Inception-Resnet-V1 network structure. Of course, the original input images may vary in size, but the input images preferably have little difference in length and width. Otherwise, if they are uniformly scaled to the same size, some output images will be distorted too much, and the meaning of such image processing will be lost.

步骤24：确定损失函数。Step 24: Determine the loss function.

良好的domain adaptation方法应该基于源域和目标域尽可能相似的特征，同时尽可能地减少源域中的预测误差。因此我们的loss包含2个部分。A good domain adaptation method should be based on features that are as similar as possible in the source and target domains while minimizing the prediction error in the source domain. So our loss consists of 2 parts.

步骤241：确定分类损失函数。classification loss即分类损失函数，使模型具备分类能力，使类内(intra-class)的特征距离尽可能小，类间(inter-class)的特征距离尽可能大。本算法对损失函数是普适的，source和target的分类损失函数可以是softmax、triplet等。triplet loss是深度学习中的一种损失函数，用于训练差异性较小的样本，如人脸等，Feed数据包括锚(Anchor)示例、正(Positive)示例、负(Negative)示例，通过优化锚示例与正示例的距离小于锚示例与负示例的距离，实现样本的相似性计算。Step 241: Determine the classification loss function. Classification loss is a classification loss function, which enables the model to have the ability to classify, so that the feature distance within the class (intra-class) is as small as possible, and the feature distance between the classes (inter-class) is as large as possible. This algorithm is universal to the loss function, and the classification loss function of source and target can be softmax, triplet, etc. Triplet loss is a loss function in deep learning, used to train samples with small differences, such as faces, etc. Feed data includes anchor (Anchor) examples, positive (Positive) examples, negative (Negative) examples, through optimization The distance between the anchor example and the positive example is smaller than the distance between the anchor example and the negative example, and the similarity calculation of the samples is realized.

步骤242：确定域混淆函数。domain confusionloss，即域混淆函数，利用特征提取器E和域判别器G之间的对抗。Step 242: Determine the domain obfuscation function. domain confusionloss, the domain confusion function, exploits the confrontation between the feature extractor E and the domain discriminator G.

步骤2421：固定特征提取器E，让域判别器G最大化地判别出source domain和target domain的样本间的差异。Step 2421: Fix the feature extractor E, and let the domain discriminator G discriminate the difference between the samples of the source domain and the target domain to the maximum extent.

步骤2422：固定域判别器G，让特征提取器E最小化地判别出source domain和target domain的样本间的差异。Step 2422: Fix the domain discriminator G, and let the feature extractor E discriminate the difference between the samples of the source domain and the target domain in a minimal manner.

上述差异可以常用的KL散度来表示，训练的目标为让域判别器G最大化地判别出样本来自于source domain还是target domain的样本间的差异，让特征提取器E最小化source domain样本与target domain样本间的差异，即是min_Gmax_EKL。The above differences can be represented by the commonly used KL divergence. The training goal is to allow the domain discriminator G to maximize the difference between the samples from the source domain or the target domain, and to let the feature extractor E minimize the source domain samples and The difference between the target domain samples is min _G max _E KL.

步骤25：特征提取器E和域判别器G优化。训练时，以facenet的业界预训练模型初始化特征提取器E，分成2个步骤交替进行。Step 25: Feature extractor E and domain discriminator G are optimized. During training, the feature extractor E is initialized with the industry pre-training model of facenet, and it is divided into two steps to alternate.

步骤251优化特征提取器E。总的loss是上述2个域分类损失函数classificationloss和domain confusionloss的加权和，即total loss＝source classification loss+target classification loss-KL，其中alpha是目标分类损失函数targetclassification loss的相对权重。以最小化该loss为目标，用SGD(stochastic gradientdescent)优化器训练特征提取器E。KL前面取负号，是因为特征提取器E的目标是最大化KL，而优化器是最小化loss。Step 251 optimizes the feature extractor E. The total loss is the weighted sum of the above two domain classification loss functions classificationloss and domain confusionloss, namely total loss=source classification loss+target classification loss-KL, where alpha is the relative weight of the target classification loss function targetclassification loss. With the goal of minimizing this loss, a feature extractor E is trained with an SGD (stochastic gradient descent) optimizer. The negative sign in front of KL is because the goal of feature extractor E is to maximize KL, and the optimizer is to minimize loss.

步骤252优化域判别器G。总的loss为KL。以最小化该loss为目标，用SGD(stochastic gradient descent)优化器训练域判别器G。Step 252 optimizes the domain discriminator G. The total loss is KL. With the goal of minimizing this loss, the domain discriminator G is trained with an SGD (stochastic gradient descent) optimizer.

步骤S26、获取目标。这样交替进行，则达到了不同的人特征分布远离和同一个人在不同域中特征分布接近的双重目标。Step S26, acquiring the target. In this way, the dual goal of the feature distribution of different people being far away from the feature distribution of the same person in different domains is achieved.

图4是本发明一种跨域面部识别方法中训练20个epoch后的t-SNE示意图。如图4所示，在训练过程中，可以以直观的方式验证训练的效果，把特征提取器E的多维输出通过T-sne降成二维显示，以显示特征的类内距离和类间距离。FIG. 4 is a schematic diagram of t-SNE after training 20 epochs in a cross-domain face recognition method of the present invention. As shown in Figure 4, during the training process, the effect of training can be verified in an intuitive way, and the multi-dimensional output of the feature extractor E is reduced to two-dimensional display through T-sne to display the intra-class distance and inter-class distance of features .

特征提取器E网络设为Inception-ResNet V1，输出的特征为128维，域判别器G网络设为2个全连接层和sigmoid，classification loss设为triplet loss，刚开始时我们加载用业界图片webface训练过的预训练模型。The feature extractor E network is set to Inception-ResNet V1, the output features are 128 dimensions, the domain discriminator G network is set to 2 fully connected layers and sigmoid, and the classification loss is set to triplet loss. At the beginning, we loaded the industry image webface The trained pretrained model.

图4上每个点表示一张脸部图像的特征，奇数代表源域source domain，偶数代表目标域target domain，相邻序号的奇数和偶数(例如1和2，3和4，等等)分别代表同一个人的源域source domain和目标域target domain图像。点间距离近表示特征相近，跨域识别的理想效果是同一个人的所有点高度聚集，不同人间的点距离较远。Each point on Figure 4 represents the feature of a face image, the odd number represents the source domain, the even number represents the target domain, and the odd and even numbers of adjacent serial numbers (such as 1 and 2, 3 and 4, etc.) are respectively Source domain and target domain images representing the same person. The short distance between points means that the features are similar. The ideal effect of cross-domain recognition is that all points of the same person are highly clustered, and points of different people are far apart.

图5是本发明一种跨域面部识别方法中训练50个epoch后的t-SNE示意图。图5上每个点表示一张脸部图像的特征，奇数代表源域source domain，偶数代表目标域targetdomain，相邻序号的奇数和偶数(例如1和2，3和4，等等)分别代表同一个人的源域sourcedomain和目标域target domain图像。点间距离近表示特征相近，跨域识别的理想效果是同一个人的所有点高度聚集，不同人间的点距离较远。FIG. 5 is a schematic diagram of t-SNE after training 50 epochs in a cross-domain face recognition method of the present invention. Each point on Figure 5 represents the feature of a face image, the odd number represents the source domain, the even number represents the target domain, and the odd and even numbers of adjacent serial numbers (such as 1 and 2, 3 and 4, etc.) represent respectively Source domain sourcedomain and target domain target domain images of the same person. The short distance between points means that the features are similar. The ideal effect of cross-domain recognition is that all points of the same person are highly clustered, and points of different people are far apart.

通过图4与图5的比较，随着训练epoch的个数的增多，我们可以发现，随着训练的进行，不仅同一个人域内的特征更为聚集，同一个人域间的特征也更为接近。Through the comparison between Figure 4 and Figure 5, as the number of training epochs increases, we can find that as the training progresses, not only the features within the same human domain are more aggregated, but also the features between the same human domain are closer.

由上述说明可知，使用根据本发明的跨域面部识别方法，其有益技术效果为：在facenet原有技术方案的基础上，通过特征提取器E采用Inception-Resnet-V1网络结构，源分类损失函数和目标分类损失函数采用三元组损失函数triplet loss的基础上，加入域混淆损失函数，从而解决待识别图像与训练样本图像具有不同统计分布特性的条件下的面部跨域识别问题，使得同一域间的面部特征更为接近，减少了误识率，提升了跨域识别准确性。It can be seen from the above description that using the cross-domain face recognition method according to the present invention, the beneficial technical effects are: on the basis of the original technical solution of facenet, the feature extractor E adopts the Inception-Resnet-V1 network structure, and the source classification loss function On the basis of the triplet loss function and the target classification loss function, the domain confusion loss function is added to solve the problem of face cross-domain recognition under the condition that the image to be recognized and the training sample image have different statistical distribution characteristics, so that the same domain The facial features between them are closer, which reduces the misrecognition rate and improves the accuracy of cross-domain recognition.

以上对本发明实施例进行了详细介绍，本文中应用了具体个例对本发明的原理及实施方式进行了阐述，以上实施例的说明只是用于帮助理解本发明的方法及其核心思想；同时，对于本领域的一般技术人员，依据本发明的思想，在具体实施方式及应用范围上均会有改变之处，综上所述，本说明书内容不应理解为对本发明的限制。The embodiments of the present invention have been introduced in detail above, and specific examples are used to illustrate the principles and implementations of the present invention. The descriptions of the above embodiments are only used to help understand the methods and core ideas of the present invention; at the same time, for Persons of ordinary skill in the art, according to the idea of the present invention, will have changes in the specific embodiments and application scope. To sum up, the contents of this specification should not be construed as limiting the present invention.

Claims

1. a cross-domain facial recognition method, is characterized in that, comprises:

Prepare source sample data and target sample data;

Fix the feature extractor E, let the domain discriminator G maximize the difference between the source sample data and the target sample data;

Fix the domain discriminator G, and let the feature extractor E minimize the difference between the source sample data and the target sample data;

The domain discriminator G is maximized and the feature extractor E is minimized to reach a balance point, and feature values that have no inter-domain difference between the source sample data and the target sample data are extracted from the feature extractor E.

2. The cross-domain face recognition method according to claim 1, wherein preparing the source sample data and the target sample data comprises:

The source sample data and the target sample data are respectively aligned with facial features.

3 . The cross-domain facial recognition method according to claim 1 , wherein the domain discriminator G and the feature extractor E share parameters for different domains. 4 .

4. cross-domain facial recognition method according to claim 1, is characterized in that, preparing source sample data and target sample data comprises:

The source sample data and the target sample data are processed to increase the diversity of the sample data.

5. The cross-domain facial recognition method according to claim 1, wherein the feature extractor E is fixed, the domain discriminator G is maximized, and before discriminating the difference between the source sample data and the target sample data, comprising: determining a loss function .

6. The cross-domain facial recognition method according to claim 4, wherein processing the source sample data and the target sample data comprises: randomly cropping, flipping, adjusting brightness, contrast, Hue and saturation processing.

7. The cross-domain facial recognition method according to claim 5, wherein the loss function comprises a classification loss function and a domain confusion loss function.

8. The cross-domain facial recognition method according to claim 7, wherein the loss function is a weighted sum of a classification loss function and a domain confusion loss function.

9. The cross-domain facial recognition method according to claim 8, wherein the total loss=source classification loss+target classification loss−KL divergence, wherein KL divergence is used to measure the feature probability of source sample data and target sample data difference between distributions.

10 . The cross-domain facial recognition method according to claim 9 , wherein the KL divergence, also called relative entropy, is a measure between the feature probability distributions of the source sample data and the target sample data. 11 .