
CN111626091A - Face image annotation method and device and computer readable storage medium - Google Patents

Face image annotation method and device and computer readable storage medium

Info

Publication number
CN111626091A
CN111626091A
Authority
CN
China
Prior art keywords
face
face feature
feature vectors
labeling
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010155962.XA
Other languages
Chinese (zh)
Other versions
CN111626091B (en)
Inventor
程星星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Culture Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202010155962.XA priority Critical patent/CN111626091B/en
Publication of CN111626091A publication Critical patent/CN111626091A/en
Application granted granted Critical
Publication of CN111626091B publication Critical patent/CN111626091B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present invention relate to the field of computer machine learning and disclose a face image labeling method, a device, and a computer-readable storage medium. The face image labeling method includes: acquiring multiple face region images from an original image of a person; performing feature extraction on the multiple face region images to obtain multiple face feature vectors that characterize the person's identity, where one face region image corresponds to one face feature vector; performing feature clustering on the multiple face feature vectors to obtain the category to which each face feature vector belongs, where the categories include a positive class and a negative class; and labeling the face region images corresponding to the face feature vectors belonging to the positive class. The face image labeling method, device, and computer-readable storage medium provided by the present invention can improve image labeling efficiency and ensure labeling accuracy while reducing the labor cost of image labeling.

Description

Face image labeling method, device, and computer-readable storage medium

Technical Field

Embodiments of the present invention relate to the field of computer machine learning, and in particular to a face image labeling method, a device, and a computer-readable storage medium.

Background

In large-scale face recognition applications, ensuring high recognition accuracy, that is, accurately identifying face images of the same person across age groups, viewing angles, illumination, and contrast, requires a large amount of data cleaning and labeling work during the application development stage: hundreds of standard face images (for example, 112*112 in size) must be prepared for each person. To collect standard face data, existing solutions mainly use crawler tools to crawl large numbers of public images from the Internet, use a face detection algorithm to crop out all detected face images in batches, and then have a professional data annotation team or a data annotation crowdsourcing platform complete the image screening. Take crawling 200 images for one person as an example: assuming five people appear in each image, 1000 face images of size 112*112 can be cropped in the detection stage. Of these 1000 images, at least 800 are invalid and must be deleted by manual annotation.

The inventor found at least the following problems in the prior art: manual deletion by human annotators incurs high labor cost and low labeling efficiency, and labeling quality cannot be effectively guaranteed, which is insufficient to support the rapid deployment of large-scale face recognition applications.

Summary of the Invention

The purpose of the embodiments of the present invention is to provide a face image labeling method, a device, and a computer-readable storage medium that improve image labeling efficiency and ensure labeling accuracy while reducing the labor cost of image labeling.

To solve the above technical problem, an embodiment of the present invention provides a face image labeling method, including:

acquiring multiple face region images from an original image of a person; performing feature extraction on the multiple face region images to obtain multiple face feature vectors that characterize the person's identity, where one face region image corresponds to one face feature vector; performing feature clustering on the multiple face feature vectors to obtain the category to which each of the multiple face feature vectors belongs, where the categories include a positive class indicating that the person identity corresponding to a face feature vector is the target person and a negative class indicating that the person identity corresponding to a face feature vector is not the target person; and labeling the face region images corresponding to the face feature vectors belonging to the positive class.

An embodiment of the present invention also provides a face image labeling apparatus, including: at least one processor; and a memory communicatively connected to the at least one processor; where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the above face image labeling method.

An embodiment of the present invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above face image labeling method.

Compared with the prior art, the embodiments of the present invention perform feature extraction on the multiple face region images to obtain multiple face feature vectors that characterize the person's identity; that is, the face region images, which are difficult to compute on directly, are digitized so that the subsequent steps can proceed smoothly. By performing feature clustering on the multiple face feature vectors to obtain the category to which each face feature vector belongs, the method can judge from the clustering result whether the person identity corresponding to a face feature vector is the target person, and can therefore quickly and accurately distinguish noise data from valid data among the face images. Finally, the face region images corresponding to the face feature vectors belonging to the positive class are labeled, completing the annotation of the face image data. This improves labeling efficiency and effectively guarantees labeling accuracy while reducing both the time cost of manual annotation and the labor cost, providing support for the rapid construction of large-scale face recognition applications.

In addition, before labeling the face region images corresponding to the face feature vectors belonging to the positive class, the method further includes: deleting the face feature vectors belonging to the negative class; performing the feature clustering again on the face feature vectors belonging to the positive class; judging whether any face feature vector belonging to the negative class exists among the re-clustered face feature vectors; and if so, repeating the above steps until no re-clustered face feature vector belongs to the negative class.

In addition, performing feature clustering on the multiple face feature vectors specifically includes: taking each of the N face feature vectors in turn as a cluster center, and, when the i-th face feature vector serves as the cluster center, computing the metric distance from each of the other N-1 face feature vectors to the cluster center, where N is an integer greater than 1 and i is an integer less than or equal to N; judging whether there exist metric distances smaller than a preset threshold; if none exist, determining that the i-th face feature vector belongs to the negative class; if they exist, judging whether the number of metric distances smaller than the preset threshold is greater than a preset number; if it is greater, determining that the i-th face feature vector belongs to the positive class; otherwise, determining that the i-th face feature vector belongs to the negative class.
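
The counting rule above can be sketched as follows. This is a minimal illustration, assuming Euclidean distance as the metric distance (the text does not fix a particular metric); `threshold` and `min_count` stand for the preset threshold and preset number:

```python
import numpy as np

def classify_vectors(features, threshold, min_count):
    """Take each of the N vectors as a cluster centre, count how many of
    the other N-1 vectors lie within `threshold` of it, and mark the
    vector positive if the count exceeds `min_count`, negative otherwise."""
    feats = np.asarray(features, dtype=float)
    classes = []
    for i in range(len(feats)):
        others = np.delete(feats, i, axis=0)
        dists = np.linalg.norm(others - feats[i], axis=1)
        close = int((dists < threshold).sum())
        classes.append("positive" if close > min_count else "negative")
    return classes
```

Vectors of the target person form a dense cluster, so each of them sees many close neighbours, while noise vectors (other identities) see few and fall into the negative class.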

In addition, before taking each of the N face feature vectors as a cluster center, the method further includes: setting a sliding window size and a sliding step. Taking each of the N face feature vectors as a cluster center specifically includes: building multiple sliding windows according to the sliding window size, the sliding step, and the N face feature vectors, where the number of face feature vectors in each sliding window equals the sliding window size; and taking each face feature vector within each sliding window in turn as the cluster center.
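
Building the windows can be sketched in a few lines; the helper below is illustrative and ignores any leftover tail shorter than one window:

```python
def sliding_windows(vectors, window_size, step):
    """Split the list of N feature vectors into windows of `window_size`
    items, advancing the start position by `step` each time."""
    windows = []
    for start in range(0, len(vectors) - window_size + 1, step):
        windows.append(vectors[start:start + window_size])
    return windows
```

With a step smaller than the window size, consecutive windows overlap, so each feature vector is clustered against several different neighbourhoods.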

In addition, before building the multiple sliding windows according to the sliding window size, the sliding step, and the N face feature vectors, the method further includes: randomizing the order of the N face feature vectors.

In addition, performing feature extraction on the multiple face region images to obtain multiple face feature vectors that characterize the person's identity specifically includes: inputting the multiple face region images into a preset neural network model in turn to obtain the face feature vectors.

In addition, the preset neural network model includes a first-level neural network and a second-level neural network. A face feature vector is computed as follows: the face region image is input into the first-level neural network to obtain an initial vector; the initial vector is input into the second-level neural network and trained with the weight vectors in the second-level neural network and a preset feature parameter to obtain the face feature vector, where the feature parameter is a constant greater than 0.

In addition, before performing feature extraction on the multiple face region images, the method further includes: preprocessing the face image data of the multiple face region images to obtain face region images whose resolution meets a preset requirement. Performing feature extraction on the multiple face region images then specifically includes: performing feature extraction on the face region images whose resolution meets the preset requirement.

Description of the Drawings

One or more embodiments are illustrated by the figures in the corresponding drawings. These illustrations do not limit the embodiments; elements with the same reference numerals in the drawings denote similar elements; and unless otherwise stated, the figures are not drawn to scale.

Figure 1 is a flowchart of the face image labeling method according to the first embodiment of the present invention;

Figure 2 is a flowchart of MTCNN face detection according to the first embodiment of the present invention;

Figure 3 is a flowchart of face region image feature extraction according to the first embodiment of the present invention;

Figure 4 is a schematic diagram of face identity recognition according to the first embodiment of the present invention;

Figure 5 is a flowchart of the face image labeling method according to the second embodiment of the present invention;

Figure 6 is a flowchart of the face image labeling method according to the third embodiment of the present invention;

Figure 7 is a schematic diagram of clustering with the K-nearest-neighbor algorithm according to the third embodiment of the present invention;

Figure 8 is a schematic structural diagram of the face image labeling apparatus according to the fourth embodiment of the present invention.

Detailed Description

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. A person of ordinary skill in the art will understand, however, that many technical details are set forth in the embodiments to help the reader better understand the present invention; the technical solutions claimed by the present invention can be realized even without these technical details and with various changes and modifications based on the following embodiments.

Unless the context clearly requires otherwise, words such as "include" and "comprise" throughout the specification and claims should be construed in an inclusive rather than an exclusive or exhaustive sense; that is, in the sense of "including but not limited to".

In the description of the present disclosure, it should be understood that terms such as "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance. Furthermore, in the description of the present disclosure, unless stated otherwise, "multiple" means two or more.

The first embodiment of the present invention relates to a face image labeling method. The specific flow, shown in Figure 1, includes:

S101: Acquire multiple face region images from an original image of a person.

Specifically, in this embodiment a crawler tool crawls original images of a person (such as actor stills, portraits, and work photos). An original image contains one or more face images, and multiple face images in one original image may belong to the same identity or to different identities.

It is worth mentioning that, as shown in Figure 2, this embodiment uses MTCNN (Multi-Task Cascaded Convolutional Networks), a cascaded face detection algorithm based on multi-level neural networks, to detect face region images in the original images. For ease of understanding, MTCNN is described in detail below:

The P-Net network predicts bounding boxes of the face regions in the original image. The image within each bounding box is cropped, scaled to 24*24, and input into the R-Net network, which generates corrected bounding boxes. The image within each bounding box generated by R-Net is then cropped, scaled to 48*48, and input into the O-Net network, which generates the corrected bounding box coordinates, the coordinates of the facial landmarks, and the probability that the bounding box region contains a face. The main steps of MTCNN face detection are as follows:
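
The three-stage cascade can be sketched as follows. This is an illustrative skeleton only: `p_net`, `r_net`, and `o_net` are hypothetical stand-ins for the trained networks, and the crop-and-resize helper uses naive nearest-neighbour sampling instead of a real image library:

```python
import numpy as np

def crop_and_resize(image, box, size):
    """Crop box = (x, y, w, h) from `image` and resize to size*size using
    nearest-neighbour index sampling (placeholder for real resizing)."""
    x, y, w, h = box
    patch = image[y:y + h, x:x + w]
    rows = np.linspace(0, h - 1, size).astype(int)
    cols = np.linspace(0, w - 1, size).astype(int)
    return patch[np.ix_(rows, cols)]

def mtcnn_cascade(image, p_net, r_net, o_net):
    """Run the P-Net / R-Net / O-Net cascade described above.
    Each stage callable is a stand-in for a trained network."""
    boxes = p_net(image)                    # candidate face bounding boxes
    results = []
    for box in boxes:
        patch24 = crop_and_resize(image, box, 24)
        box = r_net(patch24, box)           # corrected box, or None to reject
        if box is None:
            continue
        patch48 = crop_and_resize(image, box, 48)
        # (corrected box, 5 landmark coordinates, face probability)
        results.append(o_net(patch48, box))
    return results
```

Each stage filters and refines the candidates of the previous one, which is what makes the cascade fast: the cheap P-Net discards most of the image before the more expensive stages run.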

(1) Determine whether a face region exists in the original image of the person:

Whether a face exists in a region is a binary classification problem, which is evaluated with the logistic regression loss function:

$$L_i^{det} = -\left(y_i^{det}\log(p_i) + (1 - y_i^{det})\log(1 - p_i)\right)$$

where $y_i^{det}$ is the ground-truth face probability of the model training sample, $y_i^{det} \in \{0, 1\}$; $p_i$ is the face probability predicted by the model, $p_i \in [0, 1]$; and $L_i^{det}$ expresses the degree of deviation between $y_i^{det}$ and $p_i$: the larger the deviation, the larger $L_i^{det}$.

(2) Determine whether the position of the face region is accurate.

Specifically, whether the position of the face region is accurate is judged by the following formula:

$$L_i^{box} = \left\|\hat{y}_i^{box} - y_i^{box}\right\|_2^2$$

where $y_i^{box}$ are the ground-truth face region coordinates of the model training sample and $\hat{y}_i^{box}$ are the face region coordinates predicted by the model. Both $y_i^{box}$ and $\hat{y}_i^{box}$ are defined by the starting vertex coordinates of the region together with the region's width and height, and the squared Euclidean distance measures the degree of deviation between the true coordinates $y_i^{box}$ and the predicted coordinates $\hat{y}_i^{box}$.

(3) Determine whether the coordinates of the facial landmarks are accurate.

Specifically, whether the coordinates of the facial landmarks are accurate is judged by the following formula:

$$L_i^{landmark} = \left\|\hat{y}_i^{landmark} - y_i^{landmark}\right\|_2^2$$

where $y_i^{landmark}$ are the ground-truth facial landmark coordinates of the model training sample and $\hat{y}_i^{landmark}$ are the landmark coordinates predicted by the model; the squared Euclidean distance measures the degree of deviation between the true and predicted coordinates. Through the above processing, the position of the face image and the coordinates of the facial landmarks are extracted from the image.
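
The three per-sample losses can be computed directly from the formulas above. A minimal numpy sketch (illustrative, not the patent's training code):

```python
import numpy as np

def det_loss(y_true, p_pred, eps=1e-12):
    """Cross-entropy face/non-face loss L_det for one sample.
    `y_true` is the ground-truth label (0 or 1), `p_pred` the predicted
    face probability; clipping avoids log(0)."""
    p = np.clip(p_pred, eps, 1 - eps)
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def box_loss(box_pred, box_true):
    """Squared Euclidean distance between predicted and ground-truth
    (x, y, w, h) bounding-box coordinates."""
    d = np.asarray(box_pred, float) - np.asarray(box_true, float)
    return float(d @ d)

def landmark_loss(lm_pred, lm_true):
    """Squared Euclidean distance over the five landmark coordinates."""
    d = np.asarray(lm_pred, float).ravel() - np.asarray(lm_true, float).ravel()
    return float(d @ d)
```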

Preferably, after a face region is detected, this embodiment may also perform face correction on the detected face region. Specifically, face correction, also called face alignment, rotates the face uniformly to a horizontal position. Based on the face region detected in the preceding steps and the coordinates of the five facial landmarks (left eye, right eye, nose, left mouth corner, right mouth corner), an affine transformation is applied to the face image so that the transformed face is horizontal, that is, the line between the two eyes stays horizontal, and the corrected image is scaled to 112*112.
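
The alignment step can be sketched as follows. This illustrative helper computes the rotation about the midpoint between the eyes that levels the inter-eye line; applying the resulting 2x3 matrix to the image (for example with an image library's affine-warp routine) and the final 112*112 rescale are omitted, and image row/column conventions are ignored:

```python
import numpy as np

def alignment_matrix(left_eye, right_eye):
    """Return (angle_degrees, 2x3 affine matrix) for rotating the face
    about the midpoint between the eyes so the inter-eye line becomes
    horizontal."""
    lx, ly = left_eye
    rx, ry = right_eye
    angle = np.degrees(np.arctan2(ry - ly, rx - lx))  # eye-line inclination
    cx, cy = (lx + rx) / 2.0, (ly + ry) / 2.0          # rotation centre
    theta = np.radians(-angle)                         # undo the inclination
    c, s = np.cos(theta), np.sin(theta)
    # rotation about (cx, cy): R v + (centre - R centre)
    m = np.array([[c, -s, cx - c * cx + s * cy],
                  [s,  c, cy - s * cx - c * cy]])
    return angle, m
```

Applying the matrix to the two eye coordinates maps them onto the same horizontal line, which is exactly the alignment condition stated above.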

More preferably, since the minimum face size detected by MTCNN is 12 pixels, scaling a 12*12 image to 112*112 causes severe distortion, and distorted images do not help face recognition accuracy. Therefore, this embodiment also preprocesses the face image data of the multiple face region images to obtain face region images whose resolution meets a preset requirement. Specifically, this embodiment can filter the images by file size, 4*1024 bytes (4 kb), removing low-resolution images (that is, images smaller than 4 kb). It should be understood that this embodiment does not specifically limit the criterion for judging image resolution; removing images smaller than 4 kb, or smaller than 5 kb or 6 kb, achieves the same technical effect. In this way, distorted images are removed before feature extraction is performed on the face region images, reducing the workload of subsequent steps and further improving the efficiency of the face image labeling method.
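
The file-size filter can be sketched in a few lines; the 4 kb threshold is the example value given above:

```python
import os

def filter_low_resolution(paths, min_bytes=4 * 1024):
    """Drop face crops whose file size falls below the threshold,
    treating them as too low-resolution to keep."""
    return [p for p in paths if os.path.getsize(p) >= min_bytes]
```

File size is only a proxy for resolution here, which is why the text notes that other thresholds (5 kb, 6 kb) work equally well.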

S102: Perform feature extraction on the multiple face region images to obtain multiple face feature vectors that characterize the person's identity.

Specifically, in this embodiment the face feature vector can be obtained as follows: the face region image is input into the first-level neural network to obtain an initial vector; the initial vector is then input into the second-level neural network and trained with the weight vectors in the second-level neural network and a preset feature parameter to obtain the face feature vector, where the feature parameter is a constant greater than 0.

For ease of understanding, how the second-level neural network trains the initial vector is described in detail below:

Suppose the initial vector is a 512-dimensional vector. As shown in Figure 3, x_i is the 512-dimensional feature vector output by the convolutional neural network, and w_j is a weight vector. By iteratively training w_j and x_i, the angle θ between the vectors w_j and x_i is reduced; this increases the cosine value cos θ and hence the vector product w_j x_i, so that the person identity represented by the weight vector w_j obtains a higher predicted probability. An additional parameter m is introduced during training, making the algorithm more discriminative between different person identities.

Specifically, w_j is a randomly generated set of weight vectors used to judge the person identity of the face feature vectors input into the ArcFace algorithm. For example, if two input feature vectors x_1 and x_2 are both arbitrarily close to w_1 in the weight vector set, x_1 and x_2 can be judged to belong to the same person identity; if a large separation margin appears, they can be judged to belong to different identities.

As shown in Figure 4, the face features extracted by the ArcFace algorithm further exhibit high cohesion (for features of the same identity) and large separation margins (between features of different identities). The vector x denotes the feature vector of a face image, w_1 and w_2 are trained weight vectors of the ArcFace algorithm, the angle between x and w_1 is θ_1, the angle between x and w_2 is θ_2, and θ_1 < θ_2. The probability of the person identity to which the feature vector x belongs is computed as follows:

$$w_1 x = \|w_1\|\,\|x\|\cos(\theta_1); \qquad w_2 x = \|w_2\|\,\|x\|\cos(\theta_2);$$

$$\|w_1\|\,\|x\|\cos(\theta_1 + m) > \|w_2\|\,\|x\|\cos(\theta_2);$$

$$\|w_1\|\,\|x\|\cos(\theta_1) > \|w_2\|\,\|x\|\cos(\theta_2).$$

Here w_1 x and w_2 x denote the probabilities that the feature vector x belongs to the two person identities, and the blank area in the figure represents the improvement in identity discrimination contributed by the additional parameter m. The algorithm exploits the fact that the cosine function decreases monotonically on the interval [0, π] and adds a non-negative parameter m during training, widening the separation margin between faces of different identities. Among feature vectors extracted by the ArcFace algorithm, features of the same identity have higher cohesion, and features of different identities have larger separation margins.
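
The additive angular margin can be sketched with numpy. The function below is an illustrative forward pass of the margin logit only (not ArcFace training); the parameter names are assumptions: `margin` stands for m, and `scale` for the usual ArcFace scaling factor s, which the text does not mention. Vectors are L2-normalised, as the cosine formulation assumes:

```python
import numpy as np

def arcface_logits(x, weights, true_class, margin=0.5, scale=64.0):
    """Compute per-class logits where the ground-truth class uses
    cos(theta + m) instead of cos(theta), shrinking its logit during
    training and forcing a wider margin between identities."""
    x = x / np.linalg.norm(x)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = w @ x                                    # cos(theta_j) per class
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    logits = cos.copy()
    logits[true_class] = np.cos(theta[true_class] + margin)
    return scale * logits
```

Because cosine is decreasing on [0, π], cos(θ + m) < cos(θ), so the network must pull x even closer to its identity's weight vector to keep the true-class logit dominant, which is exactly the "high cohesion, large margin" property described above.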

S103: Perform feature clustering on the plurality of face feature vectors to obtain the category to which each of the face feature vectors belongs.

Specifically, the categories include a positive class, indicating that the person identity corresponding to a face feature vector is the target person, and a negative class, indicating that it is a non-target person. This embodiment may adopt a sliding-window feature clustering method: by adjusting the sliding stride and the window size, the average intra-class distance of the face feature vectors is gradually reduced and the accuracy of the clustering result is gradually improved.

S104: Label the face region images corresponding to the face feature vectors belonging to the positive class.

Specifically, this embodiment uses a deep neural network to extract face feature vectors. Based on the principle that feature vectors of faces with the same identity are more similar, a statistical learning method performs feature clustering within a sliding window; the clustering result identifies face images that do not belong to the same identity, so that face images of a given identity can be screened quickly and accurately out of a large number of face images while other noise data are removed.

Compared with the prior art, this embodiment of the present invention performs feature extraction on the plurality of face region images to obtain a plurality of face feature vectors characterizing person identity; that is, the face region images, which are difficult to use directly in computation, are digitized to facilitate the subsequent steps. By performing feature clustering on the plurality of face feature vectors, the category to which each face feature vector belongs is obtained, and whether the person identity corresponding to a face feature vector is the target person can be judged from the clustering result, so that noise data and valid data in the face images can be identified quickly and accurately. Finally, the face region images corresponding to the face feature vectors belonging to the positive class are labeled, completing the labeling of the face image data. This improves labeling efficiency and effectively guarantees labeling accuracy while reducing the time and labor costs of manual labeling, providing support for the rapid construction of large-scale face recognition applications.

The second embodiment of the present invention relates to a face image labeling method and is a further improvement on the first embodiment. The specific improvement is that, in the second embodiment, the face feature vectors belonging to the negative class are deleted, and whether any face feature vector belonging to the negative class remains is judged repeatedly, until no face feature vector belonging to the negative class exists in the resulting face feature vectors. This further improves labeling accuracy and ensures labeling quality.

The specific process of this embodiment is shown in Figure 5 and includes:

S201: Acquire a plurality of face region images from original images of a person.

S202: Perform feature extraction on the plurality of face region images to obtain a plurality of face feature vectors characterizing person identity.

S203: Perform feature clustering on the plurality of face feature vectors to obtain the category to which each of the face feature vectors belongs.

S204: Delete the face feature vectors belonging to the negative class, and perform feature clustering again on the face feature vectors belonging to the positive class.

S205: Judge whether any face feature vector belonging to the negative class exists among the face feature vectors clustered again; if yes, execute step S204; if not, execute step S206.

Specifically, in this embodiment, when it is judged for the first time that no face feature vector belonging to the negative class remains, the feature clustering may be performed once more on the current face feature vectors and the judgment repeated, multiple times if necessary, until several consecutive judgments all find no face feature vector belonging to the negative class. In this way, the accuracy of the face image labeling method can be further improved.
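The delete-and-recluster loop of steps S204–S205 might be sketched as follows. The simple neighbor-count rule standing in for the feature clustering of step S203, along with the distance and count thresholds, is an illustrative assumption, not the patent's actual clustering procedure:

```python
import numpy as np

def cluster_labels(feats, dist_thresh=0.95, count_thresh=2):
    """Toy stand-in for step S203: a vector is positive if at least
    count_thresh other vectors lie within dist_thresh of it."""
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    near = (d < dist_thresh).sum(axis=1) - 1   # exclude self (diagonal)
    return near >= count_thresh

def iterate_until_clean(feats):
    """Steps S204-S205: drop negatives and recluster until none remain."""
    while True:
        pos = cluster_labels(feats)
        if pos.all():
            return feats                        # no negatives left -> S206
        feats = feats[pos]                      # S204: delete the negative class

rng = np.random.default_rng(1)
target = rng.normal(0, 0.1, size=(20, 8))       # tight cluster: one identity
noise = rng.normal(5, 0.1, size=(1, 8))         # a lone outlier (noise data)
kept = iterate_until_clean(np.vstack([target, noise]))
print(len(kept))   # prints 20 - the outlier is removed before labeling
```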

S206: Label the face region images corresponding to the face feature vectors belonging to the positive class.

Compared with the prior art, this embodiment of the present invention performs feature extraction on the plurality of face region images to obtain a plurality of face feature vectors characterizing person identity; that is, the face region images, which are difficult to use directly in computation, are digitized to facilitate the subsequent steps. By performing feature clustering on the plurality of face feature vectors, the category to which each face feature vector belongs is obtained, and whether the person identity corresponding to a face feature vector is the target person can be judged from the clustering result, so that noise data and valid data in the face images can be identified quickly and accurately. Finally, the face region images corresponding to the face feature vectors belonging to the positive class are labeled, completing the labeling of the face image data. This improves labeling efficiency and effectively guarantees labeling accuracy while reducing the time and labor costs of manual labeling, providing support for the rapid construction of large-scale face recognition applications.

The third embodiment of the present invention relates to a face image labeling method and is an illustration of the first embodiment, specifically describing the process in the first embodiment of performing feature clustering on the plurality of face feature vectors to obtain the category to which each of the face feature vectors belongs.

Specifically, as shown in Figure 6, this embodiment includes steps S301 to S310, where steps S301 to S302 are substantially the same as steps S101 to S102 of the first embodiment and are not repeated here. The main differences are described below:

Execute steps S301 to S302.

S303: Take the i-th face feature vector of the N face feature vectors as the cluster center.

S304: Calculate the metric distances from the other N-1 face feature vectors of the N face feature vectors to the cluster center.

S305: Judge whether any metric distance smaller than a preset threshold exists among the metric distances; if so, execute step S306; if not, judge that the i-th face feature vector belongs to the negative class.

S306: Judge whether the number of metric distances smaller than the preset threshold is greater than a preset number; if so, judge that the i-th face feature vector belongs to the positive class; if not, judge that the i-th face feature vector belongs to the negative class. Then judge whether i is smaller than N; if i < N, set i = i + 1 and execute step S303; otherwise the process ends.
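The per-center decision of steps S303–S306 can be sketched as follows; the preset threshold (0.95) and preset number (3) are illustrative assumptions, as the patent leaves these values open:

```python
import numpy as np

def classify_centers(feats, dist_thresh=0.95, count_thresh=3):
    """S303-S306: take each vector in turn as the cluster center and
    label it positive/negative from its neighbor count."""
    labels = []
    for i, center in enumerate(feats):                    # S303
        others = np.delete(feats, i, axis=0)
        dists = np.linalg.norm(others - center, axis=1)   # S304
        near = int((dists < dist_thresh).sum())           # S305
        labels.append(near > count_thresh)                # S306: count > preset number
    return np.array(labels)

rng = np.random.default_rng(2)
same_id = rng.normal(0, 0.1, size=(10, 16))   # one identity, tightly clustered
outliers = rng.normal(3, 0.1, size=(2, 16))   # noise from other identities
labels = classify_centers(np.vstack([same_id, outliers]))
print(labels.sum())   # prints 10: only the same-identity vectors are positive
```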

It is worth mentioning that, when feature clustering is performed directly on the N face feature vectors, face feature vectors corresponding to multiple consecutive non-target person images may affect the clustering result and lower the accuracy of the face image labeling method. Therefore, this embodiment may also adopt a sliding-window iterative clustering method based on image-list randomization; randomizing the image list reduces the impact of consecutively occurring noise data on clustering accuracy. The sliding-window feature clustering method, by adjusting the sliding stride and the window size, gradually reduces the average intra-class distance of the face feature vectors and gradually improves the accuracy of the clustering results.

That is to say, before each of the N face feature vectors is taken as a cluster center, the method further includes setting a sliding window size and a sliding stride. Taking each of the N face feature vectors as a cluster center then specifically includes: establishing a plurality of sliding windows according to the sliding window size, the sliding stride and the N face feature vectors, where the number of face feature vectors in each sliding window equals the sliding window size; and sequentially taking each face feature vector in each sliding window as the cluster center.
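Window construction from the window size and stride might look like the following sketch. How a trailing partial window is handled is not specified by the patent, so this sketch simply stops once a full window no longer fits:

```python
def sliding_windows(n_feats, window_size, stride):
    """Index ranges of the sliding windows over a feature list of
    length n_feats; each window holds window_size feature indices."""
    windows = []
    start = 0
    while start + window_size <= n_feats:
        windows.append(list(range(start, start + window_size)))
        start += stride
    return windows

ws = sliding_windows(10, 4, 2)
print(len(ws))   # four windows, starting at indices 0, 2, 4, 6
```

With overlapping windows (stride smaller than window size), each feature vector is evaluated as a cluster center in more than one local neighborhood, which is what lets repeated rounds tighten the average intra-class distance.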

For ease of understanding, the sliding-window-based face feature clustering of this embodiment is described in detail below:

First, the principle of feature clustering in this embodiment is briefly introduced. The clustering process uses a sliding window to perform local clustering within the window: inside the window, each face feature vector fi ∈ {f1, f2, ..., fk} is taken in turn as the cluster center, the metric distances between the other feature vectors fj in the window and the cluster center fi are calculated, and the category (positive or negative) of the cluster center fi is judged according to the threshold K.

The clustering principle of the K-nearest-neighbor algorithm is shown in Figure 7. Each triangle and square in the figure represents a feature vector, with triangles and squares denoting the categories to which the feature vectors belong. Given the distance metric and the threshold K, the number of triangles near the circle feature is larger, so the circle feature is classified into the triangle category.

The three core elements of the K-nearest-neighbor algorithm used in this embodiment — the distance metric, the value of K, and the classification decision rule — are set as follows:

(1) Distance metric


In this proposal, the distance metric uses the L2 norm, i.e., the Euclidean distance. The Euclidean distance between feature vectors fi and fj is expressed as follows:

d(fi, fj) = ||fi − fj||2 = sqrt( Σd (fi,d − fj,d)² )
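A quick check of the Euclidean distance on two toy 3-dimensional feature vectors (the values are arbitrary illustrations, not patent data):

```python
import numpy as np

fi = np.array([0.1, 0.2, 0.2])
fj = np.array([0.4, 0.6, 0.2])
d = np.linalg.norm(fi - fj)     # L2 norm of the difference vector
print(round(d, 6))              # 0.5  (sqrt(0.09 + 0.16 + 0))
```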

(2) Selection of the K value

The correspondence between the threshold K, the sliding window size and the sliding stride is as follows:

Table 1: correspondence between the threshold K, the sliding window size and the sliding stride (reproduced only as an image in the original; values not recoverable here).

(3) Classification decision rule

For the clustering result of a cluster center fi ∈ {f1, f2, ..., fk}, under the set sliding window size, sliding stride and threshold K: if the number of features close to fi in the clustering result is smaller than the threshold K, the cluster center fi is marked as the negative class, i.e., noise data; if the number of features close to fi is greater than the threshold K, fi is marked as the positive class, i.e., valid data.

Based on the above principle, the sliding-window face feature clustering of this embodiment proceeds as follows:

1. Randomize the list of all face images belonging to the "same identity" to reduce the impact of consecutively occurring noise images on the clustering result.

2. Perform the sliding-window computation according to the sliding stride (slide stride) and the sliding window size (window size).

3. Feature clustering within a sliding window can be divided into the following sub-steps:

Step A: Vectorize each image in the window into a 512-dimensional feature vector.

Step B: Use the K-nearest-neighbor algorithm to perform feature clustering based on the minimum-distance principle, cyclically computing the metric distance between each feature vector and the remaining feature vectors and keeping the top N results sorted from small to large. The clustering computation proceeds as follows:

1) Arbitrarily select a feature vector as the cluster center, denoted fi ∈ {f1, f2, ..., fk}, and let fi belong to category P.

2) Compute the metric distance dist from the next feature vector fj to fi; if dist < 0.95, fj is assigned to category P; otherwise fi remains in category P and fj is assigned to category N.

3) Take each feature vector fi ∈ {f1, f2, ..., fk} in turn as the cluster center, compute the distances from the remaining feature vectors fj to each center fi, and repeat this step until all cluster centers have been traversed.

4) For the clustering result of each cluster center fi ∈ {f1, f2, ..., fk}, according to the classification decision rule: if the size of the category-P set is smaller than the threshold K, the cluster center fi is judged to be the negative class; otherwise fi is judged to be the positive class. The positive class is valid data and the negative class is noise data.

Step C: One round of sliding-window computation represents one feature clustering iteration. Repeat Steps A and B until the iteration ends; the convergence condition is that, in the clustering results of 3 consecutive rounds of iteration, the set judged to be the negative class is empty.

4. Select different sliding strides, sliding window sizes and clustering thresholds K in turn, perform feature clustering following steps 1, 2 and 3, and, based on the classification decision results of step 3, delete all noise data identified as the negative class while retaining the valid data identified as the positive class.

The sliding-window clustering process is described in pseudocode as follows:

(The pseudocode appears only as images in the original and is not reproduced here.)
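The sliding-window clustering loop of steps 1–4 might be sketched as follows. The concrete thresholds and window parameters are illustrative assumptions, and the patent's convergence condition of three consecutive empty rounds is simplified here to a single empty round:

```python
import random
import numpy as np

def window_starts(n, window_size, stride):
    """Start indices of sliding windows covering all n items."""
    starts = list(range(0, max(n - window_size, 0) + 1, stride))
    if starts[-1] + window_size < n:        # let the last window reach the tail
        starts.append(n - window_size)
    return starts

def window_cluster(feats, window_size, stride, k_thresh, dist_thresh=0.95):
    """One clustering round: indices whose category-P set is smaller
    than the threshold K are returned as the negative class (noise)."""
    negatives = set()
    n = len(feats)
    for start in window_starts(n, window_size, stride):
        idx = range(start, min(start + window_size, n))
        for i in idx:                                         # each center f_i
            p_size = sum(np.linalg.norm(feats[i] - feats[j]) < dist_thresh
                         for j in idx if j != i)
            if p_size < k_thresh:
                negatives.add(i)
    return negatives

def label_identity(feats, window_size=8, stride=4, k_thresh=2):
    """Steps 1-4: randomize the image list, then iterate sliding-window
    clustering, deleting negatives, until a round finds none."""
    kept = list(feats)
    random.Random(0).shuffle(kept)                            # step 1
    while True:
        neg = window_cluster(kept, window_size, stride, k_thresh)
        if not neg:                                           # convergence
            return kept
        kept = [f for i, f in enumerate(kept) if i not in neg]

rng = np.random.default_rng(3)
valid = rng.normal(0, 0.05, size=(30, 8))   # faces of the target identity
noise = rng.normal(4, 0.05, size=(2, 8))    # faces of other identities
kept = label_identity([*valid, *noise])
print(len(kept))   # prints 30: noise removed, valid faces kept for labeling
```

A vector failing the threshold in any window is treated as noise for that round; because the windows overlap (stride smaller than window size), each vector is tested against more than one local neighborhood, which is what drives the average intra-class distance down over rounds.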

Compared with the prior art, this embodiment of the present invention performs feature extraction on the plurality of face region images to obtain a plurality of face feature vectors characterizing person identity; that is, the face region images, which are difficult to use directly in computation, are digitized to facilitate the subsequent steps. By performing feature clustering on the plurality of face feature vectors, the category to which each face feature vector belongs is obtained, and whether the person identity corresponding to a face feature vector is the target person can be judged from the clustering result, so that noise data and valid data in the face images can be identified quickly and accurately. Finally, the face region images corresponding to the face feature vectors belonging to the positive class are labeled, completing the labeling of the face image data. This improves labeling efficiency and effectively guarantees labeling accuracy while reducing the time and labor costs of manual labeling, providing support for the rapid construction of large-scale face recognition applications.

The fourth embodiment of the present invention relates to a face image labeling apparatus, as shown in Figure 8, including:

at least one processor 401; and a memory 402 communicatively connected to the at least one processor 401, wherein the memory 402 stores instructions executable by the at least one processor 401, and the instructions are executed by the at least one processor 401 to enable the at least one processor 401 to execute the face image labeling method described above.

The memory 402 and the processor 401 are connected by a bus. The bus may include any number of interconnected buses and bridges, linking one or more processors 401 with the various circuits of the memory 402. The bus may also connect various other circuits, such as peripherals, voltage regulators and power management circuits, which are well known in the art and therefore are not described further herein. A bus interface provides the interface between the bus and the transceiver. The transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 401 is transmitted over a wireless medium through an antenna; further, the antenna also receives data and transmits the data to the processor 401.

The processor 401 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfacing, voltage regulation, power management and other control functions. The memory 402 may be used to store data used by the processor 401 when performing operations.

A fifth embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The above method embodiments are implemented when the computer program is executed by a processor.

That is, those skilled in the art can understand that all or part of the steps in the methods of the above embodiments can be completed by instructing the relevant hardware through a program. The program is stored in a storage medium and includes several instructions to cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

Those of ordinary skill in the art can understand that the above embodiments are specific examples for realizing the present invention, and that, in practical applications, various changes in form and detail can be made to them without departing from the spirit and scope of the present invention.

Claims (10)

1. A method for labeling a face image is characterized by comprising the following steps:
acquiring a plurality of face area images of a figure original image;
performing feature extraction on the plurality of face region images to obtain a plurality of face feature vectors for representing the identity of a person, wherein one face region image corresponds to one face feature vector;
performing feature clustering on the plurality of face feature vectors to obtain a category to which each face feature vector in the plurality of face feature vectors belongs, wherein the category comprises a positive category for representing that a person identity corresponding to the face feature vector is a target person and a negative category for representing that a person identity corresponding to the face feature vector is a non-target person;
and labeling the face region image corresponding to the face feature vector belonging to the positive class.
2. The method for labeling a face image according to claim 1, before labeling the face region image corresponding to the face feature vector belonging to the positive class, further comprising:
deleting the face feature vectors belonging to the negative class, performing feature clustering again on the face feature vectors belonging to the positive class, and judging whether the face feature vectors belonging to the negative class exist in the face feature vectors subjected to feature clustering again;
if any such face feature vector exists, the above steps are repeated until no face feature vector belonging to the negative class exists among the face feature vectors subjected to the feature clustering.
3. The method for labeling a face image according to claim 1 or 2, wherein the performing feature clustering on the plurality of face feature vectors specifically comprises:
respectively taking each face feature vector of the N face feature vectors as a clustering center, and, when the i-th face feature vector is taken as the clustering center, calculating the metric distances from the other N-1 face feature vectors of the N face feature vectors to the clustering center, wherein N is an integer greater than 1, and i is an integer less than or equal to N;
judging whether a measurement distance smaller than a preset threshold value exists in the measurement distances or not;
if not, judging that the i-th face feature vector belongs to the negative class;
if yes, judging whether the number of the metric distances smaller than the preset threshold is greater than or equal to a preset number, and if yes, judging that the i-th face feature vector belongs to the positive class; if not, judging that the i-th face feature vector belongs to the negative class.
4. The method for labeling a human face image according to claim 3, wherein before said respectively using each of the N human face feature vectors as a cluster center, the method further comprises:
setting the size of a sliding window and the sliding step length;
the respectively using each face feature vector in the N face feature vectors as a clustering center specifically includes:
establishing a plurality of sliding windows according to the size of the sliding window, the sliding step length and the N face feature vectors, wherein the number of the face feature vectors in each sliding window is equal to the size of the sliding window;
and sequentially taking each face feature vector in each sliding window as the clustering center.
5. The method for labeling human face images according to claim 4, before establishing a plurality of sliding windows according to the sliding window size, the sliding step size and the N human face feature vectors, further comprising:
and performing randomization processing on the N face feature vectors.
6. The method for labeling a human face image according to claim 1, wherein the extracting features of the plurality of human face region images to obtain a plurality of human face feature vectors for representing the identity of a person specifically comprises:
and sequentially inputting the plurality of face region images into a preset neural network model to obtain the face feature vector.
7. The method for labeling a human face image according to claim 6, wherein the preset neural network model comprises a first-stage neural network and a second-stage neural network, and the face feature vector is calculated in the following way:
inputting the face region image into the first-stage neural network to obtain an initial vector;
inputting the initial vector into the second-stage neural network, and training the initial vector through a weight vector in the second-stage neural network and a preset characteristic parameter to obtain the face feature vector, wherein the characteristic parameter is a constant larger than 0.
8. The method for labeling a human face image according to claim 1, before extracting features of the plurality of human face region images, further comprising:
carrying out face image data preprocessing on the plurality of face area images to obtain face area images with resolution meeting preset requirements;
the feature extraction of the plurality of face region images specifically includes:
and carrying out feature extraction on the face region image with the resolution meeting the preset requirement.
9. A face image labeling device is characterized by comprising:
at least one processor; and the number of the first and second groups,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of labeling a face image according to any one of claims 1 to 8.
10. A computer-readable storage medium storing a computer program, wherein the computer program is executed by a processor to implement the method for labeling a human face image according to any one of claims 1 to 8.
CN202010155962.XA 2020-03-09 2020-03-09 Face image labeling method and device and computer readable storage medium Active CN111626091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010155962.XA CN111626091B (en) 2020-03-09 2020-03-09 Face image labeling method and device and computer readable storage medium


Publications (2)

Publication Number Publication Date
CN111626091A true CN111626091A (en) 2020-09-04
CN111626091B CN111626091B (en) 2023-09-22

Family

ID=72271816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010155962.XA Active CN111626091B (en) 2020-03-09 2020-03-09 Face image labeling method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111626091B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103279768A (en) * 2013-05-31 2013-09-04 北京航空航天大学 A Video Face Recognition Method Based on Incremental Learning of Face Segmented Visual Representation
US9176987B1 (en) * 2014-08-26 2015-11-03 TCL Research America Inc. Automatic face annotation method and system
CN108491786A (en) * 2018-03-20 2018-09-04 南京邮电大学 A kind of method for detecting human face based on hierarchical network and Cluster merging
GB201902067D0 (en) * 2019-02-14 2019-04-03 Facesoft Ltd 3D Face reconstruction system and method
CN109886186A (en) * 2019-02-18 2019-06-14 上海骏聿数码科技有限公司 A kind of face identification method and device
CN109919093A (en) * 2019-03-07 2019-06-21 苏州科达科技股份有限公司 A kind of face identification method, device, equipment and readable storage medium storing program for executing
CN109993125A (en) * 2019-04-03 2019-07-09 腾讯科技(深圳)有限公司 Model training method, face recognition method, device, equipment and storage medium
CN110728234A (en) * 2019-10-12 2020-01-24 爱驰汽车有限公司 Driver face recognition method, system, device and medium
CN110766645A (en) * 2019-10-24 2020-02-07 西安电子科技大学 Target person reproduction graph generation method based on person identification and segmentation

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JIANKANG DENG et al.: "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", arXiv *
李会永 et al.: "Research on Face Recognition Technology for Specific Foreign Personnel", Modern Information Technology *
王荣生: "Research and Implementation of Key Technologies for Face Detection and Recognition Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology *
董飞艳: "Application of Distributed Training of Convolutional Neural Networks in Facial Expression Recognition", Software *

Also Published As

Publication number Publication date
CN111626091B (en) 2023-09-22

Similar Documents

Publication Publication Date Title
CN111898547B (en) Training method, device, equipment and storage medium of face recognition model
CN107944020B (en) Face image searching method and device, computer device and storage medium
WO2020177432A1 (en) Multi-tag object detection method and system based on target detection network, and apparatuses
CN111091075B (en) Face recognition method, device, electronic device and storage medium
CN110458078B (en) Face image data clustering method, system and equipment
CN110930297B (en) Style migration method and device for face image, electronic equipment and storage medium
CN107341447A (en) A face verification mechanism based on deep convolutional neural networks and evidential k-nearest neighbors
CN113971644B (en) Image recognition method and device based on data enhancement strategy selection
WO2023124278A1 (en) Image processing model training method and apparatus, and image classification method and apparatus
CN110263603A (en) Face recognition method and device based on center loss and residual visual-simulation network
CN111723687A (en) Human action recognition method and device based on neural network
CN111667425B (en) Method for repairing occluded facial expression images based on a prior algorithm
CN112016454B (en) A detection method for face alignment
CN107423306A (en) An image search method and device
CN114398350A (en) Cleaning method and device for training data set and server
CN112102149A (en) Method, device, equipment and medium for replacing a person's hairstyle based on a neural network
CN114550212A (en) Goat face detection and identification method based on lightweight model
CN114596618A (en) Face recognition training method and device for mask wearing, electronic equipment and storage medium
CN110879985A (en) A face recognition model training method for anti-noise data
CN118537600A (en) Data acquisition and reading method based on computer vision image
CN111523406B (en) Method for correcting deflected faces based on an improved generative adversarial network structure
CN110795995B (en) Data processing method, device and computer readable storage medium
TW202422375A (en) Method and system for increasing face images
US8879804B1 (en) System and method for automatic detection and recognition of facial features
CN113255472B (en) Face quality evaluation method and system based on random embedding stability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant