CN100585617C

CN100585617C - Face Recognition System and Method Based on Classifier Ensemble

Info

Publication number: CN100585617C
Application number: CN200810150268A
Authority: CN
Inventors: 张莉; 周伟达; 霍婕婷; 刁丹丹; 焦李成
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2008-07-04
Filing date: 2008-07-04
Publication date: 2010-01-27
Anticipated expiration: 2028-07-04
Also published as: CN101303730A

Abstract

The invention discloses a face recognition system and method based on a classifier ensemble. Its aim is to combine the outputs of multiple sub-classifiers, by solving for a weighting coefficient for each sub-classifier, so that the face recognition system achieves a higher recognition rate. The system has two parts: a training system and a classification system. The training system extracts features from the face images, selects multiple sub-classifiers with posterior-probability outputs, trains each sub-classifier on a different set of training samples, obtains the posterior probabilities of the original training samples, and solves a linear program to obtain the weighting coefficient of each sub-classifier. The classification system feeds the feature-extracted sample to be classified into the trained sub-classifiers, obtains its posterior probabilities, builds a classification rule from those posterior probabilities and the sub-classifier weighting coefficients, and outputs the classification result. The invention offers a high recognition rate and can be used for face recognition in the fields of machine learning and pattern recognition.

Description

Face Recognition System and Method Based on Classifier Ensemble

Technical field

The invention belongs to the technical field of image processing and relates in particular to face recognition. It can be used for monitoring and protection in public security, information security and financial security.

Background art

Face recognition is a non-intrusive means of identification that people accept relatively readily, and it has therefore become a much-studied problem in computer vision and pattern recognition. The purpose of face recognition technology is to give computers the ability to identify a person from his or her face. As a scientific problem, face recognition is a typical computational problem of image pattern analysis, understanding and classification, and it draws on pattern recognition, computer vision, intelligent human-computer interaction, graphics, cognitive science and other disciplines. As one of the key biometric technologies, face recognition has potential applications in public security, information security, finance and other fields. In face recognition, a high-accuracy core recognition algorithm is the crux of the problem, and the ultimate goal of designing a recognition system is to obtain a good recognition rate. The traditional approach is to design different classifiers for the task. The role of the classifier in a recognition system is to assign a class label to a test sample according to the feature vector produced by the feature extractor. Because the samples misclassified by different classifiers are not necessarily the same, classifiers can be fused to obtain better performance. A large body of research shows that combining multiple sub-classifiers is an effective way to improve the recognition rate, and face recognition can be carried out in this way.

Existing methods for combining classifier outputs include the following. J. Kittler et al. (United Kingdom) summarized methods for combining classifier outputs in a 1998 paper. They showed that if the output of each individual classifier can be expressed as a posterior probability, the results of multiple sub-classifiers can be combined with the product rule, sum rule, max rule, min rule or median rule. These rules are nonlinear combination schemes and are relatively complicated to apply. In practice, linear combination is the most common, and simple voting is one of the most widely used linear schemes. G. Fumera et al. (Italy) compared simple voting and weighted voting in 2005 and pointed out that simple average voting is the optimal rule if the individual classifiers have the same performance and the same correlation between their estimation errors; otherwise, weighted voting outperforms simple average voting. As for how to find the weight coefficients, J. A. Benediktsson et al. (1997) and M. P. Perrone et al. (1993) proposed solving for them by regression estimation, but these methods are not suited to classification problems. In 2000, N. Ueda designed a linear weighting method for classification problems based on the minimum classification error principle. The objective function of that method is nonlinear, so in theory it has local extrema, and the gradient descent procedure used to solve it depends heavily on the choice of initial weights. A poor choice of initial weights lowers the recognition rate of the classifier and degrades the performance of the face recognition system.
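As an illustration of the fixed combination rules surveyed above, the sketch below applies each rule to the posterior probabilities of a single sample. It is a minimal Python example, not code from the cited papers: the function name `combine`, the list-of-lists layout and the toy posteriors are assumptions, classes are indexed from 0 here, and the class priors are assumed equal so that each rule reduces to applying the named operation to the posteriors.

```python
from math import prod
from statistics import median


def combine(posteriors, rule="sum"):
    """Fuse sub-classifier posteriors for one sample.

    posteriors[j][k] is sub-classifier j's posterior for class k
    (0-based class index, an assumption of this sketch).
    Returns the index of the winning class under the chosen rule.
    """
    rules = {"product": prod, "sum": sum, "max": max, "min": min, "median": median}
    fuse = rules[rule]
    n_classes = len(posteriors[0])
    # Fuse the N posteriors of each class, then pick the best-scoring class.
    scores = [fuse([p[k] for p in posteriors]) for k in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)


# Toy posteriors from three hypothetical sub-classifiers over two classes.
toy = [[0.6, 0.4], [0.1, 0.9], [0.6, 0.4]]
decisions = {rule: combine(toy, rule) for rule in ("product", "sum", "max", "min", "median")}
```

With these toy values the median rule picks class 0 while the other four rules pick class 1, which shows that the choice of rule can change the decision. Simple voting with equal weights gives the same decision as the sum rule, since scaling every score by 1/N does not change which class scores highest.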

Summary of the invention

The purpose of the invention is to overcome the above deficiencies of the prior art by providing a face recognition system and method based on a classifier ensemble, so as to raise the recognition rate of the ensemble classifier and improve the performance of the face recognition system.

To achieve this purpose, the technical solution of the invention is as follows.

1. A face recognition system based on a classifier ensemble, comprising:

an original face image feature extraction module, which extracts features from the original face images input into the computer and obtains a labeled original training sample set covering $c$ classes;

a to-be-classified face image feature extraction module, which extracts features from the face image to be classified that is input into the computer and obtains the sample to be classified $x \in R^n$;

a training module, which selects $N$ sub-classifiers with posterior-probability outputs, trains them on the original training sample set, and obtains the posterior probabilities $P_j(\omega_k | x_i)$ on the original training sample set, where $x_i$ denotes the $i$-th training sample and $\omega_k$ denotes the $k$-th class;

a sub-classifier weighting coefficient calculation module, which uses the posterior probabilities $P_j(\omega_k | x_i)$ to solve for the weighting coefficient $\alpha_j$ of each sub-classifier by linear programming and outputs the coefficients to the integration module;

a sub-classifier classification module, which inputs the sample to be classified into the $N$ sub-classifiers trained during the training process and obtains the posterior probabilities $P_j(\omega_k | x)$ of the sample to be classified;

an integration module, which builds a classification rule from the weighting coefficients $\alpha_j$ obtained during training and the posterior probabilities $P_j(\omega_k | x)$ of the sample to be classified, and obtains the classification result according to that rule;

a classification result output module, which outputs the classification result of the sample to be classified as a class label and displays it on the computer display screen.

2. A face recognition training method based on a classifier ensemble, comprising the following steps:

Extract features from the original face images input into the computer and obtain a labeled original training sample set covering $c$ classes: $\{(x_i, y_i) \mid x_i \in R^n,\ y_i \in \{1, 2, \ldots, c\},\ i = 1, \ldots, l\}$, where $x_i$ is a sample in $n$-dimensional real space, $y_i$ is its label and takes a positive integer value between 1 and $c$, $y_i = k$ means that $x_i$ belongs to class $\omega_k$, and $l$ is the number of samples;

Select $N$ sub-classifiers with posterior-probability outputs and train them on the original training sample set to obtain the posterior probabilities $P_j(\omega_k | x_i)$, i.e. the posterior probability, according to the $j$-th sub-classifier, that sample $x_i$ belongs to class $\omega_k$, where $j = 1, \ldots, N$, $k = 1, \ldots, c$, $i = 1, \ldots, l$;

Using the posterior probabilities $P_j(\omega_k | x_i)$, solve for the weighting coefficient $\alpha_j$ of each sub-classifier by linear programming. The program is:

$\min_{\alpha, \xi} \sum_{j=1}^{N} \alpha_j + C \sum_{q=1}^{l(c-1)} \xi_q$

subject to $\left[\sum_{j=1}^{N} \alpha_j \left(P_j(\omega_m | x_i) - P_j(\omega_k | x_i)\right)\right] 1(x_i \in \omega_m) \ge -\xi_q$,

$\xi_q \ge 0$, $\alpha_j \ge 0$, $i = 1, \ldots, l$, $j = 1, \ldots, N$, $k \ne m$, $k = 1, \ldots, c$, $q = 1, 2, \ldots, l(c-1)$,

where $1(\cdot)$ is the indicator function (1 when its argument holds, 0 otherwise), $C$ is the trade-off coefficient, $\xi_q$ are the slack variables, $\sum_{j=1}^{N} \alpha_j$ is the capacity control term, and $\sum_{q=1}^{l(c-1)} \xi_q$ is the empirical risk term;

Output the weighting coefficient $\alpha_j$ of each sub-classifier to the classification system.
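For readers who want to hand this program to a general-purpose LP solver, it can be written in matrix form. The following is a sketch only: the stacked variable $z$, the matrix $D$ and the index map $q(i,k)$ are notation introduced here for illustration and do not appear in the source.

$z = (\alpha_1, \ldots, \alpha_N, \xi_1, \ldots, \xi_{l(c-1)})^{T}, \quad z \ge 0$

$\min_{z} \ (1, \ldots, 1, C, \ldots, C)\, z$

$D_{q(i,k),\, j} = P_j(\omega_{y_i} | x_i) - P_j(\omega_k | x_i), \quad k \ne y_i$

$-D\alpha - \xi \le 0$

Because the indicator $1(x_i \in \omega_m)$ is nonzero only when $m = y_i$, each training sample $x_i$ contributes $c-1$ rows to $D$, one for every class $k$ other than its own class $\omega_{y_i}$, which is why $q$ runs from 1 to $l(c-1)$. With the embodiment's values $l = 100$, $c = 20$ and $N = 100$, $D$ has 1900 rows and 100 columns, and $z$ has 2000 entries.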

3. A face recognition classification method based on a classifier ensemble, comprising the following steps:

Extract features from the face image to be classified that is input into the computer and obtain the sample to be classified $x \in R^n$;

Input the sample to be classified into the $N$ sub-classifiers trained during the training process and obtain its posterior probabilities $P_j(\omega_k | x)$, $j = 1, \ldots, N$, $k = 1, \ldots, c$;

Build the classification rule from the weighting coefficients $\alpha_j$ obtained during training and the posterior probabilities $P_j(\omega_k | x)$ of the sample to be classified. The rule is:

if $\sum_{j=1}^{N} \alpha_j P_j(\omega_m | x) = \max_{k = 1, 2, \ldots, c} \sum_{j=1}^{N} \alpha_j P_j(\omega_k | x)$, then $x$ belongs to class $\omega_m$, where $m \in \{1, 2, \ldots, c\}$ indexes the classes, and its class label is $y = m$.

Output the classification result of the sample to be classified as a class label and display it on the computer display screen.

Because the sub-classifier weighting coefficients computed by the weighting coefficient calculation module of the training system are globally optimal, the invention ensures a relatively high recognition rate of the whole classification system on the face images to be classified. Simulation results show that, when training and classification are repeated 30 times on 360 images, the average recognition rate of the invention is 3.54% higher than that of the existing Ueda linear weighting method.

Description of the drawings

Fig. 1 is a block diagram of the virtual system of the invention;

Fig. 2 is a flow chart of the training process of the invention;

Fig. 3 is a flow chart of the classification process of the invention.

Detailed description of the embodiments

Referring to Fig. 1, the face recognition system of the invention comprises a training system and a classification system. The training system consists of the original face image feature extraction module, the training module, the sub-classifier weighting coefficient calculation module and a training result output module. The classification system consists of the to-be-classified face image feature extraction module, the sub-classifier classification module, the integration module and the classification result output module. The system works as follows.

The original face image feature extraction module extracts features from the original face images input into the computer and obtains a labeled original training sample set covering $c$ classes: $\{(x_i, y_i) \mid x_i \in R^n,\ y_i \in \{1, 2, \ldots, c\},\ i = 1, \ldots, l\}$, where $x_i$ is a sample in $n$-dimensional real space, $y_i$ is its label and takes a positive integer value between 1 and $c$, $y_i = k$ means that $x_i$ belongs to class $\omega_k$, and $l$ is the number of samples. This set is passed to the training module. The training module first selects $N$ sub-classifiers with posterior-probability outputs and then trains them on the original training sample set, obtaining the posterior probabilities $P_j(\omega_k | x_i)$, where $x_i$ denotes the $i$-th training sample, $\omega_k$ denotes the $k$-th class, $j = 1, \ldots, N$, $k = 1, \ldots, c$, $i = 1, \ldots, l$. The training samples of each sub-classifier are generated by randomly sampling the features of the original training sample set, and every sub-classifier uses the same number of features. The sub-classifier weighting coefficient calculation module uses the posterior probabilities $P_j(\omega_k | x_i)$ to solve for the weighting coefficient $\alpha_j$ of each sub-classifier by linear programming. The program is:

$\min_{\alpha, \xi} \sum_{j=1}^{N} \alpha_j + C \sum_{q=1}^{l(c-1)} \xi_q$

subject to $\left[\sum_{j=1}^{N} \alpha_j \left(P_j(\omega_m | x_i) - P_j(\omega_k | x_i)\right)\right] 1(x_i \in \omega_m) \ge -\xi_q$,

$\xi_q \ge 0$, $\alpha_j \ge 0$, $i = 1, \ldots, l$, $j = 1, \ldots, N$, $k \ne m$, $k = 1, \ldots, c$, $q = 1, 2, \ldots, l(c-1)$,

where $C$ is the trade-off coefficient, $\xi_q$ are the slack variables, $\sum_{j=1}^{N} \alpha_j$ is the capacity control term, and $\sum_{q=1}^{l(c-1)} \xi_q$ is the empirical risk term. The weighting coefficients $\alpha_j$ can be computed with an existing toolkit; for example, calling the linear programming function in Matlab yields the optimal solution $\alpha_j$ of the linear program. Once obtained, the weighting coefficients $\alpha_j$ are passed to the integration module of the classification system as an input parameter for subsequent classification.

The to-be-classified face image feature extraction module of the classification system extracts features from the face image to be classified that is input into the computer, obtains the sample to be classified $x \in R^n$, and passes it to the sub-classifier classification module. The sample to be classified must not appear in the training sample set. The sub-classifier classification module inputs the sample into the $N$ sub-classifiers already trained by the training module, obtains its posterior probabilities $P_j(\omega_k | x)$, and passes them to the integration module. The integration module builds the classification rule from the weighting coefficients $\alpha_j$ obtained by the training system and the posterior probabilities $P_j(\omega_k | x)$, and obtains the classification result. The rule set by this module is: if $\sum_{j=1}^{N} \alpha_j P_j(\omega_m | x) = \max_{k = 1, 2, \ldots, c} \sum_{j=1}^{N} \alpha_j P_j(\omega_k | x)$, then $x$ belongs to class $\omega_m$, where $m \in \{1, 2, \ldots, c\}$ indexes the classes. The classification result is represented by the class label $y = m$ and is displayed on the computer display screen by the classification result output module.

Every module of the face recognition system described above is implemented as a computer program, and together they carry out the recognition of face images.

Referring to Fig. 2, the training process by which the invention performs face recognition is described in detail below.

This embodiment is implemented on the basis of the technical solution of the invention, and a detailed implementation and specific operating procedure are given, but the scope of protection of the invention is not limited to the following embodiment.

The embodiment uses a public face database, the UMIST database. The database contains 574 face images of 20 different people. It is a multi-view database with faces in poses ranging from profile to frontal. The training process is:

Step 1. Extract features from the original face images and obtain a labeled original training sample set covering $c$ classes: $\{(x_i, y_i) \mid x_i \in R^n,\ y_i \in \{1, 2, \ldots, c\},\ i = 1, \ldots, l\}$, where $x_i$ is a sample in $n$-dimensional real space, $y_i$ is its label and takes a positive integer value between 1 and $c$, $y_i = k$ means that $x_i$ belongs to class $\omega_k$, and $l$ is the number of samples. Feature extraction here means applying principal component analysis, downsampling, or some other transformation to the image.

The original face images are all 112×92 pixels. Feature extraction here uses downsampling by a factor of four, after which each image has size $n = 28 \times 23$.
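The arithmetic of this step can be checked with a short sketch. It assumes that downsampling by four means keeping every fourth pixel along each axis; the source does not say whether pixels are decimated or averaged, so this is one possible reading. The function name `downsample` and the synthetic image are hypothetical.

```python
def downsample(image, factor=4):
    """Keep every `factor`-th pixel along both axes and flatten row by row.

    Plain decimation is an assumption; block averaging would give the
    same output size.
    """
    return [pixel for row in image[::factor] for pixel in row[::factor]]


# A synthetic 112 x 92 image (rows x columns) stands in for a real face image.
image = [[(r * 92 + c) % 256 for c in range(92)] for r in range(112)]
features = downsample(image)
n = len(features)  # 28 rows * 23 columns
```

The resulting feature vector has 28 × 23 = 644 entries, which matches the value of n used for the training samples.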

The number of samples per class varies, and the smallest class has 19 images. Eighteen samples are drawn at random from each class, giving a data set of 360 images. For the simulation, 5 of the 18 samples of each face class are selected at random as training samples and the remaining 13 serve as test samples, which forms one training-test sample set. The training-sample parameters are therefore $n = 644$, $c = 20$ and $l = 100$.
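The sampling protocol above can be sketched as follows. The function name `make_split`, the seed handling and the placeholder image identifiers are hypothetical; only the counts (18 per class, 5 for training, 13 for testing, 20 classes) come from the text.

```python
import random


def make_split(samples_by_class, per_class=18, n_train=5, seed=0):
    """Draw `per_class` samples from each class, then split them train/test."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        chosen = rng.sample(samples, per_class)   # random order, no repeats
        train += [(s, label) for s in chosen[:n_train]]
        test += [(s, label) for s in chosen[n_train:]]
    return train, test


# Placeholder identifiers for 20 classes; class sizes here are made up
# and only need to be at least 18.
samples_by_class = {k: [f"class{k}_img{i}" for i in range(19 + k)] for k in range(1, 21)}
train, test = make_split(samples_by_class)
```

Calling `make_split` with a different seed for each repetition would give independent random training-test sample sets of the same sizes, 100 training and 260 test samples.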

Step 2. Select a classifier with posterior-probability output, set the number of sub-classifiers $N$, and design the sub-classifiers. The classifier may be a neural network, a support vector machine, a K-nearest-neighbor classifier, a discriminant method, a decision tree or a Bayesian decision classifier. Designing the sub-classifiers means giving each sub-classifier a different training set, either by randomly sampling the original training set directly or by randomly sampling its features, so that the sub-classifiers are diverse.

In this embodiment the K-nearest-neighbor classifier, a non-parametric learning model, is chosen. Because a K-nearest-neighbor classifier does not itself produce probabilistic output, its output must be post-processed into probabilistic form. Specifically, for a training sample or sample to be classified $x'$, its $K$ nearest neighbors are found among the training samples. If $K_k$ of these neighbors belong to class $\omega_k$, then $K = \sum_{k=1}^{c} K_k$, and $P(\omega_k | x') = K_k / K$ is used as the posterior probability of the K-nearest-neighbor classifier for $x'$. The number of classifiers is set to $N = 100$, and the number of neighbors $K$ is treated as a variable parameter that ranges over $\{3, 4, 5\}$ in the experiments. The training samples of each sub-classifier are obtained by randomly sampling the features of the original training set, with the same number of features for every sub-classifier. The number of features is also treated as a variable parameter, ranging over $\{2^3, 2^4, 2^5, 2^6\}$.
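A minimal sketch of this post-processing is given below. It assumes squared Euclidean distance on the sampled features, which the source does not specify, and it does not remove a training sample from its own neighbor list, a point the source also leaves open. The names `knn_posterior` and `random_subspaces` are hypothetical.

```python
import random
from collections import Counter


def knn_posterior(x, train_x, train_y, n_classes, k, feature_idx):
    """Return [P(w_1|x), ..., P(w_c|x)] with P(w_k|x) = K_k / K."""
    def dist2(a, b):
        # Squared Euclidean distance restricted to the sampled features
        # (the distance measure is an assumption of this sketch).
        return sum((a[f] - b[f]) ** 2 for f in feature_idx)

    order = sorted(range(len(train_x)), key=lambda i: dist2(x, train_x[i]))
    counts = Counter(train_y[i] for i in order[:k])   # K_k for each class
    return [counts[c] / k for c in range(1, n_classes + 1)]


def random_subspaces(n_features, subspace_size, n_classifiers, seed=0):
    """One random feature subset of equal size for each sub-classifier."""
    rng = random.Random(seed)
    return [rng.sample(range(n_features), subspace_size) for _ in range(n_classifiers)]


# Toy 2-D data with labels 1 and 2.
train_x = [[0, 0], [0, 1], [5, 5], [6, 5]]
train_y = [1, 1, 2, 2]
posterior = knn_posterior([0.2, 0.1], train_x, train_y, n_classes=2, k=3, feature_idx=[0, 1])
subspaces = random_subspaces(n_features=644, subspace_size=2 ** 3, n_classifiers=100)
```

In the toy call, two of the three nearest neighbors carry label 1, so the posterior is [2/3, 1/3]. Each entry of `subspaces` would serve as the `feature_idx` of one sub-classifier.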

Step 3. Train the $N$ sub-classifiers and obtain the posterior probabilities $P_j(\omega_k | x_i)$ of each sub-classifier on the original training samples, where $x_i$ denotes the $i$-th sample, $\omega_k$ denotes the $k$-th class, $j = 1, \ldots, N$, $k = 1, \ldots, c$, $i = 1, \ldots, l$.

Step 4. Solve for the weighting coefficients $\alpha_j$, $j = 1, \ldots, N$, of the sub-classifiers by linear programming.

The linear program has the form:

$\min_{\alpha, \xi} \sum_{j=1}^{N} \alpha_j + C \sum_{q=1}^{l(c-1)} \xi_q$

subject to $\left[\sum_{j=1}^{N} \alpha_j \left(P_j(\omega_m | x_i) - P_j(\omega_k | x_i)\right)\right] 1(x_i \in \omega_m) \ge -\xi_q$,

$\xi_q \ge 0$, $\alpha_j \ge 0$, $i = 1, \ldots, l$, $j = 1, \ldots, N$, $k \ne m$, $k = 1, \ldots, c$, $q = 1, 2, \ldots, l(c-1)$,

where $C$ is the trade-off coefficient, $\xi_q$ are the slack variables and $\alpha_j$ is the weighting coefficient of the $j$-th classifier.

In the objective function $\sum_{j=1}^{N} \alpha_j + C \sum_{q=1}^{l(c-1)} \xi_q$, the first term is the capacity control term and the second is the empirical risk term; minimizing this objective embodies the structural risk minimization principle. The linear program can be solved with an existing toolkit; for example, calling the linear programming function in Matlab yields the optimal solution $\alpha_j$ of the linear program.
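The text mentions Matlab's linear programming function. As a sketch of the same step in Python, the example below substitutes SciPy's `scipy.optimize.linprog`. The function name `solve_weights`, the array layout and the toy data are assumptions. One caveat: with the right-hand side $-\xi_q$ exactly as written here, $\alpha = 0$, $\xi = 0$ satisfies every constraint at zero cost, so the sketch also exposes a hypothetical `margin` parameter, which is not part of the source, to show a non-trivial solution.

```python
import numpy as np
from scipy.optimize import linprog


def solve_weights(P, y, C=1.0, margin=0.0):
    """Solve the weighting LP.

    P[j, i, k] is sub-classifier j's posterior that sample i belongs to
    class k+1; y[i] is the label of sample i in 1..c.
    `margin` is hypothetical: 0.0 reproduces the right-hand side -xi_q
    as written in the text.
    """
    P = np.asarray(P, dtype=float)
    N, l, c = P.shape
    rows = []
    for i in range(l):
        m = y[i] - 1                       # index of the true class of x_i
        for k in range(c):
            if k != m:                     # one constraint per wrong class
                rows.append(P[:, i, m] - P[:, i, k])
    D = np.array(rows)                     # shape (l*(c-1), N)
    n_slack = D.shape[0]
    cost = np.concatenate([np.ones(N), C * np.ones(n_slack)])
    # D @ alpha + xi >= margin   <=>   -D @ alpha - xi <= -margin
    A_ub = np.hstack([-D, -np.eye(n_slack)])
    b_ub = np.full(n_slack, -margin)
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:N], res.x[N:]


# Toy data: 2 sub-classifiers, 2 samples, 2 classes.
P = [[[0.9, 0.1], [0.2, 0.8]],   # sub-classifier 1 favours the true classes
     [[0.4, 0.6], [0.7, 0.3]]]   # sub-classifier 2 favours the wrong ones
y = [1, 2]
alpha_literal, _ = solve_weights(P, y, C=10.0)              # margin 0, as written
alpha_margin, _ = solve_weights(P, y, C=10.0, margin=1.0)   # assumed margin
```

On the toy data the literal formulation returns all-zero weights, while the assumed margin of 1 puts all weight on the first sub-classifier and none on the second. Because the problem is a linear program, any solution the solver returns is a global optimum, which is the property the description relies on.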

Referring to Fig. 3, the classification process by which the invention performs face recognition is described in detail below.

Step A. Extract features from the face image to be classified, obtain the sample to be classified $x \in R^n$, and input it into the $N$ sub-classifiers trained during the training process. The feature extraction here is the same as that used in training.

In this embodiment the samples to be classified are also called test samples, and they must not appear in the training sample set.

Step B. Obtain the posterior probabilities $P_j(\omega_k | x)$ of the sample to be classified, $j = 1, \ldots, N$, $k = 1, \ldots, c$, using the same method as in training.

Step C. Output the classification result of the sample to be classified. The classification rule is:

if $\sum_{j=1}^{N} \alpha_j P_j(\omega_m | x) = \max_{k = 1, 2, \ldots, c} \sum_{j=1}^{N} \alpha_j P_j(\omega_k | x)$, then $x$ belongs to class $\omega_m$, i.e. its label is $y = m$.
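The rule of Step C amounts to a weighted sum followed by an arg-max, which can be sketched as below. The function name `classify` and the toy numbers are hypothetical. Ties are broken in favor of the smallest label here, a choice the source does not address.

```python
def classify(alpha, posteriors):
    """Return the label y = m that maximizes sum_j alpha_j * P_j(w_m | x).

    alpha[j] is the weight of sub-classifier j;
    posteriors[j][k] is P_j(w_{k+1} | x) for the sample being classified.
    """
    n_classes = len(posteriors[0])
    scores = [sum(a * p[k] for a, p in zip(alpha, posteriors)) for k in range(n_classes)]
    best = max(range(n_classes), key=scores.__getitem__)  # first maximum wins ties
    return best + 1                                       # labels run from 1 to c


# Three hypothetical sub-classifiers, two classes.
posteriors = [[0.7, 0.3], [0.9, 0.1], [0.2, 0.8]]
weighted_label = classify([0.5, 0.0, 2.0], posteriors)
equal_label = classify([1 / 3, 1 / 3, 1 / 3], posteriors)
```

With the toy weights the third sub-classifier dominates and the sample is assigned label 2, whereas equal weights, which correspond to simple voting, assign label 1.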

The effect of the invention is further illustrated by the following simulation data.

1. Simulation conditions and content

For the UMIST data set, training-classification sample sets are generated in the manner described in the embodiment above: 30 random training-classification sample sets are generated, the training and classification process is repeated 30 times, and the average recognition rate is computed. All methods are simulated in the same experimental environment.

2. Simulation results

The recognition rate of each of the 30 runs is recorded and the average is computed, as shown in Table 1. In Table 1, "K-nearest-neighbor method" is the classifier without an ensemble. The ensemble methods all use 100 sub-classifiers: "simple voting" sets every weight to 1/100, and "Ueda linear weighting" is simulated with the optimal linear weighting method that N. Ueda designed for classification problems in 2000.

Table 1. Comparison of recognition performance

The simulation results in Table 1 show that the ensemble methods achieve higher recognition rates than the method without an ensemble, and that among the ensemble methods the average recognition rate of the invention is higher than those of simple voting and the optimal linear weighting method of Ueda et al.

Claims

1. A face recognition system based on a classifier ensemble, comprising:

an original face image feature extraction module, configured to extract features from the original face images input into the computer and obtain a labeled original training sample set covering $c$ classes;

a to-be-classified face image feature extraction module, configured to extract features from the face image to be classified that is input into the computer and obtain the sample $x$ to be classified in $n$-dimensional real space;

a training module, configured to select $N$ sub-classifiers with posterior-probability outputs and train them on the original training sample set, the original training sample set being expressed as $\{(x_i, y_i) \mid x_i \in R^n,\ y_i \in \{1, 2, \ldots, c\},\ i = 1, \ldots, l\}$, where $x_i$ is the $i$-th sample in $n$-dimensional real space, $y_i$ is its label and takes a positive integer value between 1 and $c$, $y_i = k$ means that $x_i$ belongs to class $\omega_k$, $\omega_k$ denotes the $k$-th class, and $l$ is the number of samples; training yields the posterior probabilities $P_j(\omega_k | x_i)$ on the original training sample set, i.e. the posterior probability, according to the $j$-th sub-classifier, that sample $x_i$ belongs to class $\omega_k$, where $j = 1, \ldots, N$, $k = 1, \ldots, c$, $i = 1, \ldots, l$;

a sub-classifier weighting coefficient calculation module, configured to solve for the weighting coefficient $\alpha_j$ of each sub-classifier by linear programming from the posterior probabilities $P_j(\omega_k | x_i)$ and output the coefficients to the integration module, the weighting coefficients being obtained from:

$\min_{\alpha, \xi} \sum_{j=1}^{N} \alpha_j + C \sum_{q=1}^{l(c-1)} \xi_q$

subject to $\left[\sum_{j=1}^{N} \alpha_j \left(P_j(\omega_m | x_i) - P_j(\omega_k | x_i)\right)\right] 1(x_i \in \omega_m) \ge -\xi_q$,

$\xi_q \ge 0$, $\alpha_j \ge 0$, $i = 1, \ldots, l$, $j = 1, \ldots, N$, $k \ne m$, $k = 1, \ldots, c$, $q = 1, 2, \ldots, l(c-1)$,

where $C$ is the trade-off coefficient, $\xi_q$ are the slack variables, $\alpha_j$ is the weighting coefficient of the $j$-th classifier, $\sum_{j=1}^{N} \alpha_j$ is the capacity control term, and $\sum_{q=1}^{l(c-1)} \xi_q$ is the empirical risk term;

a sub-classifier classification module, configured to input the sample to be classified into the $N$ sub-classifiers trained during the training process and obtain the posterior probabilities $P_j(\omega_k | x)$ of the sample to be classified;

an integration module, configured to build a classification rule from the weighting coefficients $\alpha_j$ obtained during training and the posterior probabilities $P_j(\omega_k | x)$ of the sample to be classified, and to obtain the classification result according to that rule, the classification rule being:

if $\sum_{j=1}^{N} \alpha_j P_j(\omega_m | x) = \max_{k = 1, 2, \ldots, c} \sum_{j=1}^{N} \alpha_j P_j(\omega_k | x)$, then $x$ belongs to class $\omega_m$, where $m \in \{1, 2, \ldots, c\}$ indexes the classes, and the classification result is represented by the class label $y = m$;

a classification result output module, configured to output the classification result of the sample to be classified as a class label and display it on the computer display screen.

2. A face recognition method based on a classifier ensemble, comprising:

(1) a face recognition training process:

extracting features from the original face images input into the computer and obtaining a labeled original training sample set covering $c$ classes: $\{(x_i, y_i) \mid x_i \in R^n,\ y_i \in \{1, 2, \ldots, c\},\ i = 1, \ldots, l\}$, where $x_i$ is the $i$-th sample in $n$-dimensional real space, $y_i$ is its label and takes a positive integer value between 1 and $c$, $y_i = k$ means that $x_i$ belongs to class $\omega_k$, $\omega_k$ denotes the $k$-th class, and $l$ is the number of samples;

selecting $N$ sub-classifiers with posterior-probability outputs and training them on the original training sample set to obtain the posterior probabilities $P_j(\omega_k | x_i)$, i.e. the posterior probability, according to the $j$-th sub-classifier, that sample $x_i$ belongs to class $\omega_k$, where $j = 1, \ldots, N$, $k = 1, \ldots, c$, $i = 1, \ldots, l$;

solving for the weighting coefficient $\alpha_j$ of each sub-classifier by linear programming from the posterior probabilities $P_j(\omega_k | x_i)$, the program being:

$\min_{\alpha, \xi} \sum_{j=1}^{N} \alpha_j + C \sum_{q=1}^{l(c-1)} \xi_q$

subject to $\left[\sum_{j=1}^{N} \alpha_j \left(P_j(\omega_m | x_i) - P_j(\omega_k | x_i)\right)\right] 1(x_i \in \omega_m) \ge -\xi_q$,

$\xi_q \ge 0$, $\alpha_j \ge 0$, $i = 1, \ldots, l$, $j = 1, \ldots, N$, $k \ne m$, $k = 1, \ldots, c$, $q = 1, 2, \ldots, l(c-1)$,

where $C$ is the trade-off coefficient, $\xi_q$ are the slack variables, $\sum_{j=1}^{N} \alpha_j$ is the capacity control term, and $\sum_{q=1}^{l(c-1)} \xi_q$ is the empirical risk term;

outputting the weighting coefficient $\alpha_j$ of each sub-classifier to the classification system;

(2) a face recognition classification process:

extracting features from the face image to be classified that is input into the computer and obtaining the sample to be classified $x \in R^n$;

inputting the sample to be classified into the $N$ sub-classifiers trained during the training process and obtaining its posterior probabilities $P_j(\omega_k | x)$, $j = 1, \ldots, N$, $k = 1, \ldots, c$;

building the classification rule from the weighting coefficients $\alpha_j$ obtained during training and the posterior probabilities $P_j(\omega_k | x)$ of the sample to be classified, the rule being:

if $\sum_{j=1}^{N} \alpha_j P_j(\omega_m | x) = \max_{k = 1, 2, \ldots, c} \sum_{j=1}^{N} \alpha_j P_j(\omega_k | x)$, then $x$ belongs to class $\omega_m$, where $m \in \{1, 2, \ldots, c\}$ indexes the classes, and its class label is $y = m$;

outputting the classification result of the sample to be classified as a class label and displaying it on the computer display screen.