
CN103761532B - Label space dimensionality reducing method and system based on feature-related implicit coding - Google Patents


Info

  • Publication number: CN103761532B (grant); application published as CN103761532A
  • Application number: CN201410024964.XA
  • Authority: CN (China)
  • Legal status: Active (granted)
  • Prior art keywords: matrix, dimensionality reduction, optimal, feature, space
  • Other languages: Chinese (zh)
  • Inventors: 丁贵广, 林梓佳, 林运祯
  • Assignee (original and current): Tsinghua University
  • Application filed by Tsinghua University

Landscapes

  • Machine Translation (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The present invention proposes a label space dimensionality reduction method based on feature-related implicit coding, comprising the following steps: providing a training data set; constructing a feature matrix and an annotation matrix from the training data set; deriving, from the feature matrix, the optimal correlation function between the dimensionality reduction matrix and the feature matrix, and deriving, from the annotation matrix, the optimal restoration error function between the dimensionality reduction matrix and the annotation matrix; constructing an objective function from the optimal correlation function and the optimal restoration error function; optimizing the dimensionality reduction matrix with the objective function, and solving for the decoding matrix from the optimized dimensionality reduction matrix; learning and training with the optimized dimensionality reduction matrix to obtain prediction models; extracting the features of a test instance and using the prediction models to predict its representation in the latent semantic space; and decoding the test instance's representation in the latent semantic space with the decoding matrix to obtain its classification result in the original label space. The method achieves a high compression rate, good stability, and strong generality.

Description

Label space dimensionality reduction method and system based on feature-related implicit coding

Technical Field

The present invention relates to computer software technology, and in particular to a label space dimensionality reduction method and system based on feature-related implicit coding.

Background Art

Multi-label classification assigns an instance to one or more categories, describing its characteristics more completely and in more detail; each category an instance belongs to is also called one of its labels. Multi-label classification has a very wide range of real-world applications, such as multi-label text classification, image semantic annotation, and audio sentiment analysis. In recent years, with the emergence and rapid development of web applications, multi-label classification applications have come to face many challenges and difficulties brought about by the explosion of data volume, including the rapid growth of the label space. For example, when uploading pictures to the photo-sharing website Flickr, a user can choose words describing a picture's content from a vocabulary of millions of entries or more. For multi-label classification applications built on Flickr data, such as web image semantic annotation, these words are treated as distinct labels, and such an enormous number of labels greatly increases the cost of the learning process underlying these applications.

For multi-label classification, the basic idea behind a large number of current methods is still to decompose the task into multiple binary classification problems, i.e., to train a corresponding predictive model for each label that judges whether an instance belongs to that label; all labels to which the instance belongs then serve as its multiple descriptions. When the label space expands rapidly, i.e., when the number of labels is very large, the number of predictive models these methods must train also grows rapidly, causing their training cost to rise sharply.

The emergence of label space dimensionality reduction points out a feasible direction for multi-label classification with a very large number of labels and provides technical support for it; in recent years it has gradually become a hot topic in the research community, and several excellent dimensionality reduction methods have appeared. For example, exploiting the sparsity of the original label space, compressed sensing can be used to reduce the dimensionality of the label space, with the corresponding decoding algorithm recovering the original label space from the latent semantic space. Building on this scheme, some researchers further unified the dimensionality reduction process and the learning of the prediction models under a single probabilistic framework, improving classification performance by optimizing both processes jointly. In addition, some studies apply principal component analysis to label space dimensionality reduction, known as the principal label space transformation method.

Furthermore, some researchers took the correlation between the feature space and the latent semantic space into account and proposed the feature-aware conditional principal label space transformation method, achieving a clear performance improvement. Other researchers proposed mapping the original label space with linear Gaussian random projections, keeping the signs of the mapped values as the dimensionality reduction result, with decoding carried out through a series of hypothesis tests based on Kullback-Leibler (KL) divergence. Still others obtain the dimensionality reduction matrix and the decoding matrix directly by applying Boolean matrix decomposition to the annotation matrix of the training data, where the dimensionality reduction matrix is the dimensionality reduction result and the decoding matrix is a linear map restoring the latent semantic space to the original label space.

In current research, the main approach is to presuppose an explicit encoding function, usually taken to be linear. However, due to the complexity of high-dimensional space structure, an explicit encoding function may fail to accurately describe the mapping from the original label space to the optimal latent semantic space, thereby degrading the final dimensionality reduction result. In addition, although a few works learn the dimensionality reduction result directly without assuming an explicit encoding function, they do not take the correlation between the latent semantic space and the feature space into account, which may make the resulting dimensionality reduction difficult to describe with prediction models learned from the feature space, leading to poor final classification performance.

Summary of the Invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, one object of the present invention is to propose a label space dimensionality reduction method based on feature-related implicit coding that considers information sufficiently, preserves classification performance well, achieves a high label space compression rate, and offers good stability and strong generality.

Another object of the present invention is to propose a label space dimensionality reduction system based on feature-related implicit coding.

An embodiment of the first aspect of the present invention proposes a label space dimensionality reduction method based on feature-related implicit coding, comprising the following steps: providing a training data set; constructing a feature matrix and an annotation matrix from the training data set; deriving, from the feature matrix, the optimal correlation function between the dimensionality reduction matrix and the feature matrix, and deriving, from the annotation matrix, the optimal restoration error function between the dimensionality reduction matrix and the annotation matrix; constructing an objective function from the optimal correlation function and the optimal restoration error function; optimizing the dimensionality reduction matrix with the objective function, and solving for the decoding matrix from the optimized dimensionality reduction matrix; learning and training with the optimized dimensionality reduction matrix to obtain prediction models; extracting the features of a test instance and using the prediction models to predict its representation in the latent semantic space; and decoding the test instance's representation in the latent semantic space with the decoding matrix to obtain its classification result in the original label space.

According to the label space dimensionality reduction method based on feature-related implicit coding of the embodiments of the present invention, the process of learning the dimensionality reduction result fully accounts for both its restoration error with respect to the annotation matrix and its correlation with the feature space. The optimization process ensures that the dimensionality reduction result can be well restored to the annotation matrix while also being describable by prediction models learned on the feature space, so that good multi-label classification performance can be achieved at a low training cost.

In some examples, the dimensions of the latent semantic space are mutually orthogonal.

In some examples, the classification result of the test instance in the original label space is binarized.

In some examples, the dimensionality of the latent semantic space is smaller than that of the original label space.

An embodiment of the second aspect of the present invention proposes a label space dimensionality reduction system based on feature-related implicit coding, comprising: a training module for learning and training on a training data set to obtain prediction models; and a prediction module for obtaining the classification result of a test instance in the original label space from the prediction models.

According to the label space dimensionality reduction system based on feature-related implicit coding of the embodiments of the present invention, the process of learning the dimensionality reduction result fully accounts for both its restoration error with respect to the annotation matrix and its correlation with the feature space. The optimization process ensures that the dimensionality reduction result can be well restored to the annotation matrix while also being describable by prediction models learned on the feature space, so that good multi-label classification performance can be achieved at a low training cost.

In some examples, the training module specifically comprises: a construction module for constructing a feature matrix and an annotation matrix from the training data; an optimization module for deriving, from the feature matrix, the optimal correlation function between the dimensionality reduction matrix and the feature matrix, and deriving, from the annotation matrix, the optimal restoration error function between the dimensionality reduction matrix and the annotation matrix; a modeling module for constructing an objective function from the optimal correlation function and the optimal restoration error function, optimizing the dimensionality reduction matrix with the objective function, and then solving for the decoding matrix with the optimized dimensionality reduction matrix; and a learning module for learning and training with the optimized dimensionality reduction matrix to obtain prediction models.

In some examples, the dimensions of the latent semantic space are mutually orthogonal.

In some examples, the classification result of the test instance in the original label space is binarized.

In some examples, the dimensionality of the latent semantic space is smaller than that of the original label space.

Additional aspects and advantages of the invention will be set forth in part in the following description, and in part will become apparent from the description or may be learned by practice of the invention.

Brief Description of the Drawings

Fig. 1 is a flow chart of a label space dimensionality reduction method based on feature-related implicit coding according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the principle of a label space dimensionality reduction method based on feature-related implicit coding according to one embodiment of the present invention;

Fig. 3 is a structural block diagram of a label space dimensionality reduction system based on feature-related implicit coding according to an embodiment of the present invention; and

Fig. 4 is a structural block diagram of a training module according to one embodiment of the present invention.

Detailed Description

Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, where the same or similar reference numerals designate the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the figures are exemplary; they are intended to explain the present invention and should not be construed as limiting it.

In fact, the main purpose of label space dimensionality reduction is to compress the high-dimensional original label space by encoding it into a low-dimensional latent semantic space while maintaining acceptable algorithm performance. The original model training process, learning prediction models from the feature space to the original label space, is thereby decomposed into learning prediction models from the feature space to the latent semantic space plus a decoding process from the latent semantic space back to the original label space. Through dimensionality reduction, the number of prediction models required from the feature space to the latent semantic space is greatly reduced compared with the number required before reduction. Moreover, if the prediction models are sufficiently accurate and the decoding process from the latent semantic space to the original label space is also sufficiently accurate and efficient, the final multi-label classification performance should in theory remain acceptable, while the training cost is greatly reduced.

An embodiment of one aspect of the present invention proposes a label space dimensionality reduction method based on feature-related implicit coding, comprising the following steps: providing a training data set; constructing a feature matrix and an annotation matrix from the training data set; deriving, from the feature matrix, the optimal correlation function between the dimensionality reduction matrix and the feature matrix, and deriving, from the annotation matrix, the optimal restoration error function between the dimensionality reduction matrix and the annotation matrix; constructing an objective function from the optimal correlation function and the optimal restoration error function; optimizing the dimensionality reduction matrix with the objective function, and solving for the decoding matrix from the optimized dimensionality reduction matrix; learning and training with the optimized dimensionality reduction matrix to obtain prediction models; extracting the features of a test instance and using the prediction models to predict its representation in the latent semantic space; and decoding the test instance's representation in the latent semantic space with the decoding matrix to obtain its classification result in the original label space.

Fig. 1 is a flow chart of the label space dimensionality reduction method based on feature-related implicit coding according to an embodiment of the present invention, and Fig. 2 is a schematic framework diagram of the method according to one embodiment. The method is described in detail below with reference to Fig. 1 and Fig. 2.

Step S101: provide a training data set.

As shown in the framework diagram of Fig. 2, the method of the present invention includes a training process and a prediction process. For the training process, a certain number of training instances must be given.

Step S102: construct a feature matrix and an annotation matrix from the training data set.

Specifically, for a given training data set containing m instances, an appropriate feature type is selected according to the properties of the data itself, and for each instance a corresponding feature vector x = [x_1, x_2, ..., x_d] is extracted, where x_i is the i-th dimension of x. After the feature vectors of all instances are obtained, they can be concatenated row by row, in any order, into the required feature matrix X, an m×d matrix, where d is the dimensionality of the feature vectors.

At the same time, for the training data set containing m instances, the number k of distinct labels appearing in it is counted, and for each instance a corresponding label vector y = [y_1, y_2, ..., y_k] is constructed according to its label membership, where y_j indicates whether the instance belongs to the j-th label: 1 if it does, and 0 otherwise. Likewise, after the label vectors of all instances are obtained, they can be concatenated row by row, in the same order as the feature matrix, into the annotation matrix Y, an m×k matrix.
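As a concrete illustration of steps S101 and S102, the construction of X and Y can be sketched as follows (a minimal sketch with hypothetical toy data; the actual feature types and label vocabulary depend on the application):

```python
import numpy as np

# Hypothetical toy training set: m = 4 instances, d = 3 features, k = 5 labels.
# Each row of X is one instance's feature vector x = [x_1, ..., x_d];
# each row of Y (same row order as X) is its binary label vector y = [y_1, ..., y_k],
# with y_j = 1 if the instance carries the j-th label and 0 otherwise.
X = np.array([[0.2, 1.0, 0.5],
              [0.9, 0.1, 0.3],
              [0.4, 0.8, 0.7],
              [0.6, 0.2, 0.9]])          # m x d feature matrix
Y = np.array([[1, 0, 1, 0, 0],
              [0, 1, 0, 0, 1],
              [1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0]])          # m x k annotation matrix

m, d = X.shape
k = Y.shape[1]
assert Y.shape[0] == m                   # rows aligned instance-by-instance
```

The only requirement carried forward is that the rows of X and Y refer to the same instances in the same order.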

Step S103: derive the optimal correlation function between the dimensionality reduction matrix and the feature matrix from the feature matrix, and derive the optimal restoration error function between the dimensionality reduction matrix and the annotation matrix from the annotation matrix.

Specifically, on the one hand, the optimal correlation function between the dimensionality reduction matrix and the feature matrix is derived from the feature matrix.

In practice, following the implicit coding approach, assume that a dimensionality reduction matrix C exists. The correlation between C and the feature matrix X can be decomposed into the sum of the correlations between each column of C and X. For any column c of C, its correlation with X can be described by the cosine correlation, expressed in functional form as follows:

ρ = (Xr)^T c / [((Xr)^T (Xr))^{1/2} (c^T c)^{1/2}]

where r is a linear map of X, used to project X into the space where c lies.

At the same time, to reduce redundant information in the dimensionality reduction result, the columns of C are assumed to be mutually orthogonal, i.e., the dimensions of the latent semantic space described by C are mutually orthogonal; the corresponding mathematical expression is C^T C = E. From C^T C = E it follows that c^T c = 1, and since linearly rescaling r does not affect the value of the cosine correlation ρ, the following optimization problem can be constructed to solve for the optimal linear map r, and hence the optimal correlation between c and X:

ρ* = max_r (Xr)^T c

subject to: (Xr)^T (Xr) = 1

By the method of Lagrange multipliers, the optimal linear map is r* ∝ (X^T X)^{-1} X^T c; substituting it back into the optimization problem yields the optimal correlation ρ* = (c^T Δ c)^{1/2}, where Δ = X (X^T X)^{-1} X^T.

Therefore, the optimal correlation between the dimensionality reduction matrix C and the feature matrix X can be expressed as the following function:

P* = Σ_{i=1}^{l} (C_{.,i}^T Δ C_{.,i})^{1/2}

where C_{.,i} denotes the i-th column of C.
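The quantities Δ and P* can be computed directly; the sketch below (hypothetical helper names, random data) also checks that Δ is a projection matrix, which is what justifies substituting r* back into the objective:

```python
import numpy as np

def hat_matrix(X):
    """Delta = X (X^T X)^{-1} X^T: orthogonal projection onto the column space of X."""
    return X @ np.linalg.inv(X.T @ X) @ X.T

def optimal_correlation(C, X):
    """P* = sum_i sqrt(C_{.,i}^T Delta C_{.,i}) over the columns of C."""
    Delta = hat_matrix(X)
    return sum(np.sqrt(C[:, i] @ Delta @ C[:, i]) for i in range(C.shape[1]))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
Delta = hat_matrix(X)
assert np.allclose(Delta, Delta.T)        # symmetric
assert np.allclose(Delta @ Delta, Delta)  # idempotent, hence a projection

# A unit-norm column lying in the column space of X attains the maximal correlation 1.
c = X[:, :1] / np.linalg.norm(X[:, 0])
assert np.isclose(optimal_correlation(c, X), 1.0)
```

Because Δ projects onto the column space of X, each per-column correlation lies in [0, 1] for unit-norm columns, reaching 1 exactly when the latent dimension is fully predictable from the features.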

On the other hand, the optimal restoration error function between the dimensionality reduction matrix and the annotation matrix is derived from the annotation matrix.

Following the implicit coding approach, assume that a dimensionality reduction matrix C exists, where C is an m×l matrix and l is the dimensionality after reduction, so l << k.

Under the assumption that the dimensionality reduction matrix C exists, the error of restoring the annotation matrix from C can be expressed as the following function:

ε = ||Y - C D||_F^2

where D is a linear decoding matrix introduced to guarantee decoding efficiency, and ||·||_F^2 denotes the squared Frobenius norm of a matrix. For a given C, the optimal restoration error is the minimal ε, so minimizing ε yields both the optimal decoding matrix D and the optimal restoration error. Constructing the optimization problem

min_D ||Y - C D||_F^2

one can solve for the optimal decoding matrix D* = (C^T C)^{-1} C^T Y. Since C^T C = E is the identity matrix, D* simplifies to D* = C^T Y; substituting it back into the optimization problem gives the optimal restoration error function ε* = Tr[Y^T Y - Y^T C C^T Y], where Tr[·] denotes the trace of a matrix, i.e., the sum of its diagonal elements.
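The closed-form decoder and the two forms of the restoration error can be checked numerically; the following sketch uses an arbitrary orthonormal C (obtained via QR purely for illustration) and random binary Y:

```python
import numpy as np

rng = np.random.default_rng(1)
m, k, l = 6, 5, 2
Y = rng.integers(0, 2, size=(m, k)).astype(float)

# Any C with orthonormal columns satisfies the constraint C^T C = E.
C, _ = np.linalg.qr(rng.normal(size=(m, l)))
assert np.allclose(C.T @ C, np.eye(l))

D = C.T @ Y                                    # optimal decoder D* = C^T Y
eps_direct = np.linalg.norm(Y - C @ D, 'fro') ** 2
eps_trace = np.trace(Y.T @ Y - Y.T @ C @ C.T @ Y)
assert np.isclose(eps_direct, eps_trace)       # epsilon* = Tr[Y^T Y - Y^T C C^T Y]
```

The agreement follows from expanding ||Y - C C^T Y||_F^2 and using C^T C = E, which cancels the quadratic cross term.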

Step S104: construct the objective function from the optimal correlation function and the optimal restoration error function.

Step S103 yields the optimal restoration error function ε* = Tr[Y^T Y - Y^T C C^T Y] for restoring the annotation matrix Y from the dimensionality reduction matrix C, as well as the optimal correlation function P* = Σ_i (C_{.,i}^T Δ C_{.,i})^{1/2} between C and the feature matrix X. The optimal dimensionality reduction matrix should therefore simultaneously minimize the optimal restoration error function and maximize the optimal correlation function.

By the properties of the matrix trace, ε* = Tr[Y^T Y - Y^T C C^T Y] = Tr[Y^T Y] - Tr[Y^T C C^T Y]; since Tr[Y^T Y] is a constant, minimizing ε* is equivalent to maximizing Tr[Y^T C C^T Y]. Moreover, for the optimal correlation part, since maximizing (C_{.,i}^T Δ C_{.,i})^{1/2} is equivalent to maximizing C_{.,i}^T Δ C_{.,i}, maximizing P* is equivalent to maximizing Σ_i C_{.,i}^T Δ C_{.,i} = Tr[C^T Δ C]. Therefore, to simultaneously minimize ε* and maximize P*, the following equivalent objective function can be constructed to solve for the optimal dimensionality reduction matrix C.

Ω = max_C Tr[Y^T C C^T Y] + α Tr[C^T Δ C]

subject to: C^T C = E

where α is a weight parameter used to adjust the trade-off between the optimal restoration error and the optimal correlation. By the properties of the matrix trace, the objective function can be rewritten as follows:

Ω = max_C Tr[C^T (Y Y^T + α Δ) C]

subject to: C^T C = E

Step S105: optimize the dimensionality reduction matrix with the objective function, and solve for the decoding matrix from the optimized dimensionality reduction matrix.

Specifically, the optimal dimensionality reduction matrix C can be obtained from the objective function Ω of step S104. Solving for C can be decomposed into per-column optimization problems. For the i-th column C_{.,i} of C, the following optimization sub-problem can be constructed:

Ω_i = max C_{.,i}^T (Y Y^T + α Δ) C_{.,i}

subject to: C_{.,i}^T C_{.,i} = 1

Using the method of Lagrange multipliers, the optimal C_{.,i} must satisfy the following optimality condition:

(Y Y^T + α Δ) C_{.,i} = λ_i C_{.,i}

where λ_i is the introduced Lagrange multiplier; substituting this condition into the optimization sub-problem shows that the optimal Ω_i equals λ_i.

From the optimality condition above it can be observed that the optimal C_{.,i} is a unit eigenvector of the matrix (Y Y^T + α Δ), and because eigenvectors of a symmetric matrix are orthogonal, the subsequent orthogonality constraint is satisfied automatically. It follows that the dimensionality reduction matrix C is in fact formed by concatenating, column by column, the unit eigenvectors corresponding to the l largest eigenvalues of (Y Y^T + α Δ). Solving for C is therefore an eigenvalue decomposition of (Y Y^T + α Δ), which yields all eigenvalues of the matrix and the unit eigenvector corresponding to each. Since the complexity of eigenvalue decomposition is no greater than O(m^3), and since (Y Y^T + α Δ) is symmetric while, in one embodiment of the present invention, only the unit eigenvectors of the l largest eigenvalues are needed, the complexity of solving for C can be below O(m^3), ensuring that the process is sufficiently efficient.

In addition, after the dimensionality reduction matrix C has been obtained by eigenvalue decomposition, the optimal decoding matrix D can be found by minimizing the optimal recovery error from C back to the original label matrix Y. From the computation in step S103, the optimal decoding matrix is D* = C^T Y.
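The two results above (C as the top-l unit eigenvectors of YY^T + αΔ, then D* = C^T Y) can be sketched numerically. A minimal NumPy illustration on synthetic data; the sizes, the value of α, and all variable names are assumptions for the example, not part of the patent:

```python
import numpy as np

# Hypothetical small training set: m instances, d features, k labels, l latent dims.
m, d, k, l = 100, 20, 50, 5
rng = np.random.default_rng(0)
X = rng.standard_normal((m, d))               # feature matrix, m x d
Y = (rng.random((m, k)) < 0.1).astype(float)  # binary label matrix, m x k
alpha = 1.0                                   # weight between recovery error and correlation

# Delta = X (X^T X)^{-1} X^T  (projection onto the column space of X)
Delta = X @ np.linalg.inv(X.T @ X) @ X.T

# Symmetric matrix whose top-l unit eigenvectors form the columns of C
M = Y @ Y.T + alpha * Delta
eigvals, eigvecs = np.linalg.eigh(M)          # eigenvalues in ascending order
C = eigvecs[:, -l:][:, ::-1]                  # l largest, reordered largest-first; m x l

# Optimal linear decoding matrix D* = C^T Y  (l x k)
D = C.T @ Y
```

Because `eigh` returns orthonormal eigenvectors, C^T C = E holds by construction.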

Step S106: use the optimized dimensionality reduction matrix for learning and training to obtain the prediction models.

Given the optimal dimensionality reduction matrix C obtained in step S105, a prediction model is trained for each dimension of the latent semantic space that C describes. Specifically, for the i-th dimension (1 ≤ i ≤ l), the value of the j-th training instance on this dimension is C_{j,i}, so the vector formed by the values of all training instances on this dimension is the i-th column C.,i of C. From the feature matrix X of the training data and the value vector C.,i on the i-th latent dimension, a prediction model h_i: X → C.,i can be learned for predicting the value of any instance on that dimension; its input is the feature vector of an instance, and its output is the value of the instance on the i-th dimension.

In practice, the type of prediction model can be chosen according to the application; linear regression is a common choice. After label space dimensionality reduction, the dimension l of the latent semantic space is typically much smaller than the dimension k of the original label space, so the number of prediction models required drops substantially, effectively reducing the training cost.
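As a sketch of step S106 under the linear-regression choice mentioned above, one least-squares model can be fit per latent dimension; X and C here are synthetic placeholders standing in for the training features and the optimized dimensionality reduction matrix:

```python
import numpy as np

# Assumed shapes: X is the m x d feature matrix, C the m x l dimensionality
# reduction matrix (values synthetic for the sketch).
m, d, l = 100, 20, 5
rng = np.random.default_rng(1)
X = rng.standard_normal((m, d))
C = rng.standard_normal((m, l))

# One linear-regression model per latent dimension: h_i(x) = x @ W[:, i].
# Solving the l least-squares problems jointly gives a d x l weight matrix.
W, *_ = np.linalg.lstsq(X, C, rcond=None)

# Predicting the latent representation of a new instance from its features
x_new = rng.standard_normal(d)
c_hat = x_new @ W            # l-dimensional value vector in the latent space
```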

Step S107: extract the features of the test instance, and use the prediction models to predict the representation of the test instance in the latent semantic space.

Specifically, given a test instance to be classified, the same features as in the training process are extracted from it, yielding the d-dimensional feature vector x̂ of the test instance.

After the feature vector x̂ of the test instance is obtained, the l prediction models learned in step S106 are used to predict the value of the test instance on each dimension of the latent semantic space, yielding its l-dimensional representation vector ĉ in that space.

Step S108: use the decoding matrix to decode the representation of the test instance in the latent semantic space, obtaining the classification result of the test instance in the original label space.

Using the optimal decoding matrix D* obtained in step S105, the l-dimensional representation vector ĉ of the test instance in the latent semantic space is decoded back into a k-dimensional representation vector ŷ in the original label space, that is, ŷ = ĉD*.

The vector ŷ obtained at this point is real-valued and must be binarized against a threshold (typically 0.5). Specifically, the value on each dimension is set to 1 if it exceeds the threshold and to 0 otherwise, which expresses the label membership of the test instance in the original label space; that is, the labels corresponding to the dimensions with value 1 constitute the multi-label classification result of the test instance to be classified.
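Steps S107–S108 (predict the latent representation, decode with D*, then binarize at the threshold) can be sketched as follows; ĉ and D* are stood in for by synthetic values:

```python
import numpy as np

# Hypothetical stand-ins for earlier outputs: c_hat is the predicted
# l-dimensional latent representation of a test instance, D is the optimal
# decoding matrix D* = C^T Y (l x k).
l, k = 5, 50
rng = np.random.default_rng(2)
c_hat = rng.standard_normal(l)
D = rng.standard_normal((l, k))

# Decode back to the original k-dimensional label space: y_hat = c_hat @ D
y_hat = c_hat @ D

# Binarize with the threshold (0.5 in the text) to get the multi-label result
threshold = 0.5
labels = (y_hat > threshold).astype(int)   # 1 = instance carries that label
predicted_label_ids = np.flatnonzero(labels)
```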

In one embodiment of the present invention, the dimensions of the latent semantic space are mutually orthogonal, which minimizes redundant information in the dimensionality reduction result. The method can therefore encode more of the information in the original label space with fewer dimensions, guaranteeing a higher compression rate of the original label space during dimensionality reduction.

In one embodiment of the present invention, the dimensionality of the latent semantic space is smaller than that of the original label space, and no explicit encoding function needs to be set in advance. This ensures that the number of prediction models required by the method of the embodiment drops substantially, effectively reducing the training cost, and that the optimal dimensionality reduction result can be learned adaptively in different data scenarios, giving better stability.

For example, experiments on delicious, a standard data set in the field of text classification, verify the effectiveness of the label space dimensionality reduction method based on feature-related implicit coding of the embodiment of the present invention. Specifically, the label space dimension of the delicious data set was reduced to 10%, 20%, 30%, 40%, and 50% of the original, and the classification performance achieved by the method at each ratio was measured by the label-based average F1 score and the instance-based average precision (higher is better for both). Table 1 reports the classification performance of the method of the present invention, together with the performance achieved without label space dimensionality reduction by a linear support vector machine, a strong baseline, to allow a before-and-after comparison. As the results in the table show, while retaining only 10% of the original label space dimension, the method of the embodiment keeps 68% of the pre-reduction average F1 score and 85% of the pre-reduction average precision. The method of the embodiment of the present invention can therefore effectively reduce the dimension of the original label space and maintain acceptable classification performance while greatly reducing the training cost.

Table 1. Experimental results of the method of the embodiment of the present invention on the delicious data set

| Dimension ratio after reduction | 10% | 20% | 30% | 40% | 50% | Before reduction |
| --- | --- | --- | --- | --- | --- | --- |
| Label-based average F1 | 0.054 | 0.059 | 0.060 | 0.060 | 0.059 | 0.079 |
| Instance-based average precision | 0.120 | 0.121 | 0.120 | 0.120 | 0.112 | 0.142 |

With the label space dimensionality reduction method based on feature-related implicit coding of the embodiment of the present invention, the process of learning the dimensionality reduction result fully accounts for both its recovery error with respect to the label matrix and its correlation with the feature space. The optimization process guarantees that the dimensionality reduction result can be recovered well into the label matrix while also being describable by prediction models learned on the feature space, so that good multi-label classification performance can be achieved at a lower training cost.

An embodiment of another aspect of the present invention provides a label space dimensionality reduction system based on feature-related implicit coding, comprising a training module 100 and a prediction module 200, as shown in FIG. 3.

The training module 100 is configured to learn and train on the training data set to obtain the prediction models. The prediction module 200 is configured to obtain the classification result of a test instance in the original label space from the prediction models obtained by the training module 100.

Specifically, as shown in FIG. 4, the training module 100 of the embodiment of the present invention comprises a construction module 10, an optimization module 20, a modeling module 30, and a learning module 40.

The construction module 10 is configured to construct the feature matrix and the label matrix from the training data set.

Specifically, for a given training data set containing m instances, a suitable feature type is selected according to the attributes of the data itself, and a corresponding feature vector x = [x_1, x_2, ..., x_d] is extracted for each instance, where x_i is the i-th dimension of the feature vector x. After the feature vectors of all instances are obtained, they can be concatenated row by row, in any order, into the required feature matrix X, an m×d matrix, where d is the dimension of the feature vectors.

At the same time, for the training data set containing m instances, the number k of distinct labels appearing in it is counted, and a corresponding label vector y = [y_1, y_2, ..., y_k] is constructed for each instance according to its label membership, where y_j indicates whether the instance belongs to the j-th label: 1 if it does, 0 otherwise. Likewise, after the label vectors of all instances are obtained, they can be concatenated row by row, in the same order as in the feature matrix, into the label matrix Y, an m×k matrix.

The optimization module 20 is configured to obtain the optimal correlation function between the dimensionality reduction matrix and the feature matrix from the feature matrix, and to obtain the optimal recovery error function between the dimensionality reduction matrix and the label matrix from the label matrix.

Specifically, on the one hand, the optimal correlation function between the dimensionality reduction matrix and the feature matrix is obtained from the feature matrix.

In practice, following the implicit coding approach, a dimensionality reduction matrix C is assumed to exist. The correlation between C and the feature matrix X can be decomposed into the sum of the correlations between the individual columns of C and X. For any column c of the dimensionality reduction matrix C, its correlation with X can be described by the cosine correlation, expressed in functional form as:

ρ(c, Xr) = (Xr)^T c / (sqrt((Xr)^T(Xr)) · sqrt(c^T c))

where r is a linear map of X, used to project X into the space of c.

At the same time, to reduce redundant information in the dimensionality reduction result, the columns of C are assumed to be mutually orthogonal; that is, the dimensions of the latent semantic space described by C are mutually orthogonal, which corresponds to the mathematical expression C^T C = E. From C^T C = E it follows that c^T c = 1, and linearly rescaling r does not change the value of the cosine correlation ρ. The following optimization function can therefore be constructed to solve for the optimal linear map r, and hence for the optimal correlation between c and X:

ρ* = max_r (Xr)^T c

subject to: (Xr)^T (Xr) = 1

By the method of Lagrange multipliers, the optimal linear map is r* = (X^T X)^{-1} X^T c (up to the scaling fixed by the constraint); substituting it back into the optimization function gives the optimal correlation ρ* = sqrt(c^T Δ c), where Δ = X(X^T X)^{-1} X^T.

The optimal correlation between the dimensionality reduction matrix C and the feature matrix X can therefore be expressed as the function:

P* = Σ_{i=1}^{l} sqrt(C.,i^T Δ C.,i)

where C.,i denotes the i-th column of the dimensionality reduction matrix C.
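The closed form above can be checked numerically: for a unit-norm column c, the optimal linear map r* = (X^T X)^{-1} X^T c yields a cosine correlation equal to sqrt(c^T Δ c). A small NumPy verification on synthetic data (shapes and seeds are arbitrary assumptions for the sketch):

```python
import numpy as np

m, d = 50, 8
rng = np.random.default_rng(3)
X = rng.standard_normal((m, d))
c = rng.standard_normal(m)
c /= np.linalg.norm(c)                    # c^T c = 1, as C^T C = E requires

# Delta = X (X^T X)^{-1} X^T and the optimal linear map r*
Delta = X @ np.linalg.inv(X.T @ X) @ X.T
r_star = np.linalg.inv(X.T @ X) @ X.T @ c

# Cosine correlation at the optimum (||c|| = 1, so only ||Xr|| normalizes)
Xr = X @ r_star
rho_star = (Xr @ c) / np.linalg.norm(Xr)

# Should match the closed form sqrt(c^T Delta c)
closed_form = np.sqrt(c @ Delta @ c)
```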

On the other hand, the optimal recovery error function between the dimensionality reduction matrix and the label matrix is obtained from the label matrix.

Following the implicit coding approach, a dimensionality reduction matrix C is assumed to exist, where C is an m×l matrix and l is the dimension after reduction, so that l << k.

Under the assumption that the dimensionality reduction matrix C exists, the error of recovering the label matrix from C can be expressed as the function:

ε = ||Y − CD||_F^2

where D is a linear decoding matrix introduced to guarantee decoding efficiency, and ||·||_F^2 denotes the squared Frobenius norm of a matrix. For a given dimensionality reduction matrix C, the optimal recovery error is the smallest ε, so the optimal decoding matrix D and the optimal recovery error can be obtained by minimizing ε, through the following optimization function:

ε* = min_D ||Y − CD||_F^2

Solving it yields the optimal decoding matrix D* = (C^T C)^{-1} C^T Y; since C^T C = E is the identity matrix, D* simplifies further to D* = C^T Y. Substituting this back into the optimization function gives the optimal recovery error function ε* = Tr[Y^T Y − Y^T CC^T Y], where Tr[·] denotes the trace of a matrix, i.e., the sum of its diagonal elements.
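The optimality of D* = C^T Y under C^T C = E, and the trace form of ε*, can likewise be verified numerically on synthetic matrices (a sketch, not part of the patent's disclosure):

```python
import numpy as np

# Shapes as in the text: C is m x l with orthonormal columns, Y is m x k,
# D is l x k; all values here are synthetic.
m, l, k = 60, 4, 30
rng = np.random.default_rng(4)
C, _ = np.linalg.qr(rng.standard_normal((m, l)))   # orthonormal columns: C^T C = E
Y = rng.standard_normal((m, k))

# Optimal decoder and its recovery error
D_star = C.T @ Y
err_star = np.linalg.norm(Y - C @ D_star, 'fro') ** 2

# Any perturbed decoder does strictly worse, since D* is the least-squares minimum
D_other = D_star + 0.1 * rng.standard_normal((l, k))
err_other = np.linalg.norm(Y - C @ D_other, 'fro') ** 2
```

err_star also equals the trace expression Tr[Y^T Y − Y^T CC^T Y] from the text.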

The modeling module 30 is configured to construct the objective function from the optimal correlation function and the optimal recovery error function, to optimize the dimensionality reduction matrix with that objective function, and then to solve for the decoding matrix from the optimized dimensionality reduction matrix.

Specifically, the optimization module 20 yields the optimal recovery error function ε* = Tr[Y^T Y − Y^T CC^T Y] for recovering the label matrix Y from the dimensionality reduction matrix C, as well as the optimal correlation function P* = Σ_{i=1}^{l} sqrt(C.,i^T Δ C.,i) between C and the feature matrix X. The optimal dimensionality reduction matrix should therefore simultaneously minimize the optimal recovery error function and maximize the optimal correlation function.

From the properties of the matrix trace, ε* = Tr[Y^T Y − Y^T CC^T Y] = Tr[Y^T Y] − Tr[Y^T CC^T Y]; since Tr[Y^T Y] is a constant, minimizing ε* is equivalent to maximizing Tr[Y^T CC^T Y]. Moreover, for the optimal correlation part, since maximizing sqrt(C.,i^T Δ C.,i) is equivalent to maximizing C.,i^T Δ C.,i, maximizing P* is equivalent to maximizing Σ_{i=1}^{l} C.,i^T Δ C.,i = Tr[C^T ΔC]. Therefore, to minimize ε* and maximize P* at the same time, the following equivalent objective function can be constructed to solve for the optimal dimensionality reduction matrix C:

Ω = max_C Tr[Y^T CC^T Y] + αTr[C^T ΔC]

subject to: C^T C = E

where α is a weight parameter used to adjust the trade-off between the optimal recovery error and the optimal correlation. By the properties of the matrix trace, the objective function can be rewritten as:

Ω = max_C Tr[C^T (YY^T + αΔ)C]

subject to: C^T C = E

From the objective function Ω, the optimal dimensionality reduction matrix C can be obtained; solving for it decomposes into a separate optimization problem for each column. For the i-th column C.,i of the dimensionality reduction matrix C, the following optimization sub-problem can be constructed:

Ω_i = max C.,i^T (YY^T + αΔ) C.,i

subject to: C.,i^T C.,i = 1, and C.,j^T C.,i = 0 (j ≠ i)


Using the method of Lagrange multipliers, the optimal C.,i must satisfy the following optimality condition:

(YY^T + αΔ)C.,i = λ_i C.,i

where λ_i is the introduced Lagrange multiplier; substituting this condition back into the sub-problem shows that the optimal value of Ω_i is exactly λ_i.

From this optimality condition it can be observed that the optimal C.,i is a unit eigenvector of the matrix (YY^T + αΔ), and by the orthogonality of eigenvectors the remaining orthogonality constraints are satisfied automatically. It follows that the dimensionality reduction matrix C is formed by concatenating, column by column, the unit eigenvectors corresponding to the l largest eigenvalues of (YY^T + αΔ). Solving for C is therefore an eigenvalue decomposition of (YY^T + αΔ), which yields all eigenvalues of the matrix together with the unit eigenvector of each eigenvalue. The complexity of a full eigenvalue decomposition of this m×m matrix is no greater than O(m^3); moreover, since (YY^T + αΔ) is symmetric and, in one embodiment of the present invention, only the unit eigenvectors of the l largest eigenvalues are needed, the cost of computing C can be lower than O(m^3). This ensures that solving for the dimensionality reduction matrix C is sufficiently efficient.

In addition, after the dimensionality reduction matrix C has been obtained by eigenvalue decomposition, the optimal decoding matrix D can be found by minimizing the optimal recovery error from C back to the original label matrix Y. From the computation performed by the optimization module 20, the optimal decoding matrix is D* = C^T Y.

The learning module 40 is configured to learn and train on the optimized dimensionality reduction matrix to obtain the prediction models.

Given the optimal dimensionality reduction matrix C obtained by the modeling module 30, a prediction model is trained for each dimension of the latent semantic space that C describes. Specifically, for the i-th dimension (1 ≤ i ≤ l), the value of the j-th training instance on this dimension is C_{j,i}, so the vector formed by the values of all training instances on this dimension is the i-th column C.,i of C. From the feature matrix X of the training data and the value vector C.,i on the i-th latent dimension, a prediction model h_i: X → C.,i can be learned for predicting the value of any instance on that dimension; its input is the feature vector of an instance, and its output is the value of the instance on the i-th dimension.

In practice, the type of prediction model can be chosen according to the application; linear regression is a common choice. After label space dimensionality reduction, the dimension l of the latent semantic space is typically much smaller than the dimension k of the original label space, so the number of prediction models required drops substantially, effectively reducing the training cost.

In addition, the prediction module 200 of the embodiment of the present invention specifically operates as follows:

(1) Given a test instance to be classified, the same features as in the training process are extracted from it, yielding the d-dimensional feature vector x̂ of the test instance.

(2) After the feature vector x̂ of the test instance is obtained, the l prediction models learned in the training module 100 are used to predict the value of the test instance on each dimension of the latent semantic space, yielding its l-dimensional representation vector ĉ in that space.

(3) The decoding matrix is used to decode the representation of the test instance in the latent semantic space, obtaining the classification result of the test instance in the original label space.

Using the optimal decoding matrix D* obtained by the modeling module 30, the l-dimensional representation vector ĉ of the test instance in the latent semantic space is decoded back into a k-dimensional representation vector ŷ in the original label space, that is, ŷ = ĉD*.

The vector ŷ obtained at this point is real-valued and must be binarized against a threshold (typically 0.5). Specifically, the value on each dimension is set to 1 if it exceeds the threshold and to 0 otherwise, which expresses the label membership of the test instance in the original label space; that is, the labels corresponding to the dimensions with value 1 constitute the multi-label classification result of the test instance to be classified.

In one embodiment of the present invention, the dimensions of the latent semantic space are mutually orthogonal, which minimizes redundant information in the dimensionality reduction result. The method can therefore encode more of the information in the original label space with fewer dimensions, guaranteeing a higher compression rate of the original label space during dimensionality reduction.

In one embodiment of the present invention, the dimensionality of the latent semantic space is smaller than that of the original label space, and no explicit encoding function needs to be set in advance, ensuring that the method of the embodiment of the present invention can adaptively learn the optimal dimensionality reduction result in different data scenarios, with better stability.

With the label space dimensionality reduction system based on feature-related implicit coding of the embodiment of the present invention, the process of learning the dimensionality reduction result fully accounts for both its recovery error with respect to the label matrix and its correlation with the feature space. The optimization process guarantees that the dimensionality reduction result can be recovered well into the label matrix while also being describable by prediction models learned on the feature space, so that good multi-label classification performance can be achieved at a lower training cost.

In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, where no contradiction arises, those skilled in the art may combine different embodiments or examples described in this specification, and features of different embodiments or examples.

Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (8)

1. A label space dimensionality reduction method based on feature-related implicit coding, characterized by comprising the following steps:
providing a training data set;
constructing a feature matrix and a label matrix from the training data set;
obtaining the optimal correlation function between a dimensionality reduction matrix and the feature matrix from the feature matrix, and obtaining the optimal recovery error function between the dimensionality reduction matrix and the label matrix from the label matrix, wherein obtaining the optimal correlation function between the dimensionality reduction matrix and the feature matrix from the feature matrix specifically comprises:
combining the label matrix with an implicit coding method to obtain the dimensionality reduction matrix;
decomposing the correlation between the dimensionality reduction matrix and the feature matrix into a sum of correlations, expressed in the form of the cosine correlation function:
ρ(c, Xr) = (Xr)^T c / (sqrt((Xr)^T(Xr)) · sqrt(c^T c))
where r is a linear map of the feature matrix X, used to project X into the space of any column c of the dimensionality reduction matrix C;
obtaining the optimal linear map r from the cosine correlation function, and obtaining the optimal correlation between any column c of the dimensionality reduction matrix C and the feature matrix X;
obtaining the optimal linear map r* by the method of Lagrange multipliers, and obtaining the optimal correlation function from the optimal linear map r*:
P* = Σ_{i=1}^{l} sqrt(C.,i^T Δ C.,i)
where C.,i denotes the i-th column of the dimensionality reduction matrix C;
wherein obtaining the optimal recovery error function between the dimensionality reduction matrix and the label matrix from the label matrix specifically comprises:
combining the label matrix with implicit coding to obtain the dimensionality reduction matrix;
expressing the error of recovering the label matrix from the dimensionality reduction matrix as the error function:
ε = ||Y − CD||_F^2
where the recovery error is smallest when ε is smallest, D is a linear decoding matrix introduced to guarantee decoding efficiency, and ||·||_F^2 denotes the squared Frobenius norm of a matrix;
obtaining the optimal recovery error function by minimizing ε:
ε* = Tr[Y^T Y − Y^T CC^T Y];
constructing an objective function from the optimal correlation function and the optimal recovery error function;
optimizing the dimensionality reduction matrix with the objective function, and solving for a decoding matrix from the optimized dimensionality reduction matrix;
learning and training on the optimized dimensionality reduction matrix to obtain prediction models;
extracting features of a test instance, and using the prediction models to predict the representation of the test instance in a latent semantic space; and
using the decoding matrix to decode the representation of the test instance in the latent semantic space to obtain a classification result of the test instance in the original label space.
2.
The method according to claim 1, wherein the dimensions of the latent semantic space are mutually orthogonal. 3.根据权利要求1所述的方法,其特征在于,对所述测试实例在原始标签空间的分类结果进行二值化处理。3. The method according to claim 1, wherein the classification result of the test instance in the original label space is binarized. 4.根据权利要求1所述的方法,其特征在于,所述潜语义空间的维数小于所述原始标签空间的维数。4. The method according to claim 1, wherein the dimensionality of the latent semantic space is smaller than the dimensionality of the original label space. 5.一种基于特征相关隐式编码的标签空间降维系统,其特征在于,包括:5. A label space dimensionality reduction system based on feature-related implicit coding, characterized in that it comprises: 训练模块,用于根据训练数据集进行学习训练以获取预测模型,其中,所述训练模块具体包括:A training module, configured to perform learning and training according to the training data set to obtain a prediction model, wherein the training module specifically includes: 构造模块,用于根据训练数据构造特征矩阵和标注矩阵;A construction module for constructing a feature matrix and a label matrix according to the training data; 优化模块,用于根据所述特征矩阵得到降维矩阵与所述特征矩阵间的最优相关函数,并且根据所述标注矩阵得到降维矩阵与所述标注矩阵间的最优恢复误差函数,其中,所述根据所述特征矩阵得到降维矩阵与所述特征矩阵的最优相关函数具体包括:The optimization module is used to obtain the optimal correlation function between the dimensionality reduction matrix and the feature matrix according to the feature matrix, and obtain the optimal restoration error function between the dimensionality reduction matrix and the labeling matrix according to the labeling matrix, wherein , the optimal correlation function between the dimensionality reduction matrix and the feature matrix obtained according to the feature matrix specifically includes: 所述标注矩阵结合隐式编码方法得到降维矩阵;The dimensionality reduction matrix is obtained by combining the labeling matrix with an implicit coding method; 将所述降维矩阵与所述特征矩阵之间的相关性分解成相关性之和,并通过余弦相关性函数的形式表达如下:The correlation between the dimensionality reduction matrix and the feature matrix is decomposed into a sum of correlations, and expressed in the form of a cosine correlation function as follows: 
其中,r是特征矩阵X的一个线性映射,用于将特征矩阵X投影到降维矩阵C中任一个列c所在的空间; Among them, r is a linear mapping of the feature matrix X, which is used to project the feature matrix X to the space where any column c in the dimensionality reduction matrix C is located; 根据所述余弦相关性函数获得最优线性映射r,并得到所述降维矩阵C中任一个列c与特征矩阵X的最优相关性;Obtain the optimal linear mapping r according to the cosine correlation function, and obtain the optimal correlation between any column c in the dimensionality reduction matrix C and the feature matrix X; 通过拉格朗日乘子法得到最优的线性映射r*,并根据所述最优线性映射r*得到最优相关函数:其中,C.,i表示降维矩阵C的第i列;The optimal linear mapping r * is obtained by the Lagrange multiplier method, and the optimal correlation function is obtained according to the optimal linear mapping r * : Among them, C ., i represents the i-th column of the dimensionality reduction matrix C; 其中,所述根据所述标注矩阵得到所述降维矩阵与所述标注矩阵的最优恢复误差函数具体包括:Wherein, the optimal restoration error function obtained from the dimensionality reduction matrix and the labeling matrix according to the labeling matrix specifically includes: 所述标注矩阵结合隐式编码得到所述降维矩阵;The annotation matrix is combined with implicit coding to obtain the dimensionality reduction matrix; 将所述降维矩阵恢复到所述标注矩阵的误差函数表达式如下: The error function expression of restoring the dimensionality reduction matrix to the labeling matrix is as follows: 其中,当ε最小时,恢复误差最小,D是为保证解码效率而引入的线性解码矩阵,表示的是矩阵的Frobenius范式的平方;Among them, when ε is the smallest, the recovery error is the smallest, D is the linear decoding matrix introduced to ensure the decoding efficiency, Represents the square of the Frobenius normal form of the matrix; 通过最小化ε得到最优恢复误差函数,表达式如下: The optimal recovery error function is obtained by minimizing ε, and the expression is as follows: 建模模块,用于根据所述最优相关函数和所述最优恢复误差函数构造目标函数,并应用所述目标函数优化所述降维矩阵后,利用优化后的降维矩阵求解出解码矩阵;A modeling module, configured to construct an objective function according to the optimal correlation function and the optimal restoration error function, and after applying the objective function to optimize the dimensionality 
reduction matrix, use the optimized dimensionality reduction matrix to solve the decoding matrix ; 学习模块,用于利用所述优化后的降维矩阵学习训练以获取预测模型;A learning module, configured to use the optimized dimensionality reduction matrix to learn and train to obtain a prediction model; 预测模块,用于根据所述预测模型获取测试实例在原始标签空间的分类结果。The prediction module is used to obtain the classification result of the test instance in the original label space according to the prediction model. 6.根据权利要求5所述的系统,其特征在于,潜语义空间的各个维度相互正交。6. The system according to claim 5, wherein the dimensions of the latent semantic space are mutually orthogonal. 7.根据权利要求5所述的系统,其特征在于,对所述测试实例在原始标签空间的分类结果进行二值化处理。7. The system according to claim 5, wherein the classification result of the test instance in the original label space is binarized. 8.根据权利要求5所述的系统,其特征在于,潜语义空间的维数小于所述原始标签空间的维数。8. The system according to claim 5, wherein the dimensionality of the latent semantic space is smaller than the dimensionality of the original label space.
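The formula images referenced in the claims are not reproduced in this text. As a hedged reconstruction only, using the conventions suggested by the claim language (Y the n×L label matrix, X the n×d feature matrix, C the n×m dimensionality reduction matrix with columns C·,i, D the linear decoding matrix), the omitted expressions plausibly take the following form; the symbols Φ, Ψ and H are introduced here for exposition and do not appear in the source:

```latex
% Cosine correlation between a latent column c = C_{\cdot,i} and the
% feature matrix X through a linear mapping r (to be maximized over r):
\operatorname{corr}(c, X) \;=\; \max_{r}\; \frac{c^{\top} X r}{\lVert c \rVert \,\lVert X r \rVert}

% Solving for the optimal mapping r^{*} by Lagrange multipliers and summing
% over the columns of C gives the optimal correlation function, where H is
% the hat (projection) matrix of X:
\Phi(C) \;=\; \sum_{i} \frac{C_{\cdot,i}^{\top}\, H\, C_{\cdot,i}}{C_{\cdot,i}^{\top} C_{\cdot,i}},
\qquad H \;=\; X\,(X^{\top}X)^{-1}X^{\top}

% Error of recovering the label matrix Y from C through the linear
% decoding matrix D (squared Frobenius norm):
\varepsilon \;=\; \lVert Y - C D \rVert_{F}^{2}

% Optimal recovery error function, obtained by minimizing over D:
\Psi(C) \;=\; \min_{D}\; \lVert Y - C D \rVert_{F}^{2}
```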
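The optimization described in claim 1 (maximize feature correlation while minimizing label recovery error, with mutually orthogonal latent dimensions per claim 2) can be sketched as a single eigendecomposition when the two criteria are combined in trace form. This is a minimal sketch, not the patent's implementation: the function name `faie_encode`, the trade-off weight `alpha`, and the specific combined objective are assumptions.

```python
import numpy as np

def faie_encode(X, Y, m, alpha=1.0):
    """Sketch of feature-correlated implicit encoding (names hypothetical).

    X: (n, d) feature matrix; Y: (n, L) binary label matrix;
    m: target latent dimensionality (m < L); alpha: assumed trade-off weight.
    Returns C (n, m), the optimized dimensionality reduction matrix, and
    D (m, L), the linear decoding matrix solved from it.
    """
    # Hat matrix H = X (X^T X)^{-1} X^T scores how well a latent column
    # can be linearly predicted from the features (the correlation term).
    H = X @ np.linalg.pinv(X.T @ X) @ X.T
    # Combined objective in trace form (assumed):
    #   max_C  tr(C^T (Y Y^T + alpha * H) C)   s.t.  C^T C = I
    # The recoverability term Y Y^T rewards codes that preserve the labels.
    M = Y @ Y.T + alpha * H
    M = (M + M.T) / 2.0          # symmetrize for numerical stability
    eigvals, eigvecs = np.linalg.eigh(M)
    C = eigvecs[:, -m:]          # top-m eigenvectors = orthonormal latent codes
    D = C.T @ Y                  # least-squares decoder; equals pinv(C) @ Y since C^T C = I
    return C, D
```

The orthonormality constraint C^T C = I is what makes the latent dimensions mutually orthogonal, and it reduces the decoder solve to a single matrix product.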
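The remaining claim steps — learn a prediction model from features to the latent space, predict a test instance's latent representation, decode with D, and binarize (claim 3) — could be sketched with a ridge regressor per latent dimension. The regressor choice, the regularization weight `lam`, the 0.5 threshold, and all names below are assumptions, not taken from the patent.

```python
import numpy as np

def predict_labels(X_train, C, D, X_test, lam=1e-3):
    """Sketch of the prediction and decoding stage (names hypothetical).

    Fits an assumed ridge regression from training features to the latent
    codes C, maps test features into the latent semantic space, decodes
    with D, and binarizes to recover labels in the original label space.
    """
    d = X_train.shape[1]
    # Ridge weights W: (d, m), one linear regressor per latent dimension.
    W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(d), X_train.T @ C)
    Z_test = X_test @ W           # latent-space representation of test instances
    scores = Z_test @ D           # decode back to the original label space
    return (scores >= 0.5).astype(int)   # assumed binarization threshold
```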
CN201410024964.XA 2014-01-20 2014-01-20 Label space dimensionality reducing method and system based on feature-related implicit coding Active CN103761532B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410024964.XA CN103761532B (en) 2014-01-20 2014-01-20 Label space dimensionality reducing method and system based on feature-related implicit coding


Publications (2)

Publication Number Publication Date
CN103761532A CN103761532A (en) 2014-04-30
CN103761532B true CN103761532B (en) 2017-04-19

Family

ID=50528767

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410024964.XA Active CN103761532B (en) 2014-01-20 2014-01-20 Label space dimensionality reducing method and system based on feature-related implicit coding

Country Status (1)

Country Link
CN (1) CN103761532B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106952293B (en) * 2016-12-26 2020-02-28 北京影谱科技股份有限公司 Target tracking method based on nonparametric online clustering
CN109036553B (en) * 2018-08-01 2022-03-29 北京理工大学 Disease prediction method based on automatic extraction of medical expert knowledge
CN111967501B (en) * 2020-07-22 2023-11-17 中国科学院国家空间科学中心 Method and system for judging load state driven by telemetering original data
CN114510518B (en) * 2022-04-15 2022-07-12 北京快立方科技有限公司 Self-adaptive aggregation method and system for massive structured data and electronic equipment
CN115063349B (en) * 2022-05-18 2024-07-26 北京理工大学 Method and device for predicting brain age based on sMRI multidimensional tensor morphological characteristics
US20230409678A1 (en) * 2022-06-21 2023-12-21 Lemon Inc. Sample processing based on label mapping

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8189905B2 (en) * 2007-07-11 2012-05-29 Behavioral Recognition Systems, Inc. Cognitive model for a machine-learning engine in a video analysis system
CN102004774B (en) * 2010-11-16 2012-11-14 清华大学 Personalized user tag modeling and recommendation method based on unified probability model
CN102982344B (en) * 2012-11-12 2015-12-16 浙江大学 Based on the support vector machine classification method merging various visual angles feature and many label informations simultaneously
CN103176961B (en) * 2013-03-05 2017-02-08 哈尔滨工程大学 Transfer learning method based on latent semantic analysis
CN103514456B (en) * 2013-06-30 2017-04-12 安科智慧城市技术(中国)有限公司 Image classification method and device based on compressed sensing multi-core learning

Also Published As

Publication number Publication date
CN103761532A (en) 2014-04-30

Similar Documents

Publication Publication Date Title
Yang et al. Multiview spectral clustering with bipartite graph
Di Wang et al. Semantic topic multimodal hashing for cross-media retrieval
Zhu et al. Unsupervised visual hashing with semantic assistant for content-based image retrieval
Deng et al. Low-rank structure learning via nonconvex heuristic recovery
Qi et al. A principled and flexible framework for finding alternative clusterings
Wu et al. A fused CP factorization method for incomplete tensors
CN111461157B (en) A cross-modal hash retrieval method based on self-learning
CN103761532B (en) Label space dimensionality reducing method and system based on feature-related implicit coding
Kumar et al. Extraction of informative regions of a face for facial expression recognition
Kang et al. Logdet rank minimization with application to subspace clustering
WO2022105117A1 (en) Method and device for image quality assessment, computer device, and storage medium
CN109858015B (en) A Semantic Similarity Calculation Method and Device Based on CTW and KM Algorithms
CN109948735B (en) Multi-label classification method, system, device and storage medium
Shi et al. Personalized pca: Decoupling shared and unique features
CN112163114B (en) Image retrieval method based on feature fusion
US20240144664A1 (en) Multimodal data processing
Xie et al. Deep determinantal point process for large-scale multi-label classification
CN104376051A (en) Random structure conformal Hash information retrieval method
CN105469063A (en) Robust human face image principal component feature extraction method and identification apparatus
Huai et al. Zerobn: Learning compact neural networks for latency-critical edge systems
CN103914527A (en) Graphic image recognition and matching method based on genetic programming algorithms of novel coding modes
US10394777B2 (en) Fast orthogonal projection
Dan et al. PF‐ViT: Parallel and Fast Vision Transformer for Offline Handwritten Chinese Character Recognition
CN115080880A (en) Cross-modal retrieval method and system based on robust similarity maintenance
Sun et al. Task-oriented scene graph-based semantic communications with adaptive channel coding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant