CN111539306A

CN111539306A - A method for building recognition in remote sensing images based on the replaceability of activation expressions

Info

Publication number: CN111539306A
Application number: CN202010314628.4A
Authority: CN
Inventors: 陈力; 李海峰; 彭剑; 朱佳玮; 黄浩哲; 崔振琦
Original assignee: Central South University
Current assignee: Central South University
Priority date: 2020-04-21
Filing date: 2020-04-21
Publication date: 2020-08-14
Anticipated expiration: 2040-04-21
Also published as: CN111539306B; AU2021101713A4

Abstract

The invention discloses a method for recognizing buildings in remote sensing images based on activation expression replaceability, comprising the following steps: acquiring a remote sensing image building data set; training an ordinary deep neural network model; computing the independent maximum response map of each convolution kernel in the model; computing the activation expression replaceability of each convolution kernel; pruning the model's convolution kernels according to their activation expression replaceability, retaining the kernels with low replaceability; and recognizing buildings in remote sensing images with the pruned deep neural network model. The method provides an activated expression replaceability index for deep-learning-based building recognition models. The index quantifies how replaceable each convolution kernel's expression in feature space is by other kernels in the same layer: the lower a kernel's value, the less replaceable it is. Selectively pruning kernels on this basis effectively improves the recognition accuracy of the remote sensing image building recognition model.

Description

A method for building recognition in remote sensing images based on the replaceability of activation expressions

Technical Field

The invention belongs to the technical field of remote sensing image recognition and relates to a method for recognizing buildings in remote sensing images based on activation expression replaceability.

Background

In recent years, the launch of a large number of remote sensing satellites has produced a large volume of remote sensing imagery. The volume of image data grows daily and covers many kinds of imagery with different spectral and spatial resolutions. These images carry great economic value. Rapidly extracting buildings and other targets from remote sensing images supports applications such as urban planning, infrastructure construction, and detection of illegal buildings. Many deep-learning-based building recognition algorithms have emerged, but because the generalization of these recognition models is poorly understood, their overall recognition accuracy often falls short of practical requirements.

There are currently two main approaches to measuring the generalization ability of deep learning models. The first relies on traditional statistical learning theory, such as the VC dimension and Rademacher complexity, to explore the relationship between a model's robustness, complexity, and generalization. These theories hold that a model with a large number of parameters tends to overfit the training data, which reduces its generalization on test data. This conclusion contradicts the observed behavior of deep learning models, so traditional statistical learning theory cannot reasonably explain their generalization ability. The second approach interprets and evaluates generalization through changes in the parameter space during optimization. Schmidhuber argued that generalization is related to the flatness of the minimum and the flatness of the Bayesian bound. Dinh, however, pointed out that deep learning models at non-flat minima can also generalize well. Wang linked the smoothness of the solution to generalization within a Bayesian framework and proved theoretically that generalization depends not only on the Hessian spectrum but also on the smoothness of the solution, the scale of the parameters, and the number of training samples. In addition, the stochastic gradient descent algorithm used to train deep learning models can itself improve generalization. Many conclusions of the second approach also contradict one another.

In practice, a rich and well-distributed test set of remote sensing building images is usually used to evaluate generalization. Recht, however, pointed out that good performance on a specific test set may not reflect the model's true generalization ability: test-set accuracy is fragile and easily changed by small shifts in the data distribution. Evaluation on a test set can therefore also be inaccurate, and no reasonable algorithm currently exists that correctly measures the generalization of remote sensing image recognition models.

Summary of the Invention

In view of this, the object of the invention is to provide a method for recognizing buildings in remote sensing images based on activation expression replaceability. The invention proposes an effective index for evaluating the generalization of a remote sensing image recognition model and prunes the model according to this index, thereby improving the model's accuracy and recognition ability.

The object of the invention is achieved by a method for recognizing buildings in remote sensing images based on activation expression replaceability, comprising the following steps:

Step 1: acquire a remote sensing image building data set;

Step 2: train an ordinary deep neural network model;

Step 3: compute the independent maximum response map of each convolution kernel in the model;

Step 4: compute the activation expression replaceability of each convolution kernel;

Step 5: prune the model's convolution kernels according to their activation expression replaceability, retaining the kernels with low replaceability;

Step 6: recognize buildings in remote sensing images using the pruned deep neural network model.
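Step 5 can be illustrated with a minimal sketch. It assumes the per-kernel replaceability scores of one layer have already been computed, and it keeps the kernels with the lowest scores. The function names, the `keep_ratio` hyperparameter, and the `(out_channels, in_channels, kH, kW)` weight layout are assumptions made for illustration; the patent does not state how many kernels are retained.

```python
import numpy as np

def select_kernels_to_keep(ars_scores, keep_ratio=0.75):
    """Return sorted indices of the kernels to retain in one layer.

    ars_scores: 1-D array, one replaceability score per kernel
                (lower score = less replaceable = more worth keeping).
    keep_ratio: hypothetical hyperparameter, fraction of kernels retained.
    """
    scores = np.asarray(ars_scores, dtype=float)
    n_keep = max(1, int(round(keep_ratio * scores.size)))
    # Stable sort so that ties are broken by kernel index.
    order = np.argsort(scores, kind="stable")
    return np.sort(order[:n_keep])

def prune_layer(weights, ars_scores, keep_ratio=0.75):
    """Drop the most replaceable kernels from a conv weight tensor.

    weights: array of shape (out_channels, in_channels, kH, kW) (assumed layout).
    Returns the pruned weights and the indices of the kept kernels.
    """
    keep = select_kernels_to_keep(ars_scores, keep_ratio)
    return weights[keep], keep
```

In a full network, the input channels of the following layer would also have to be sliced with the same `keep` indices; that bookkeeping is omitted here.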

Specifically, in training the deep neural network model in step 2, the training set is denoted D = {X_n, y_n}, where X_n is the n-th remote sensing image and y_n is the building label of the n-th image. Θ = {W_l, b_l} denotes the weights of the deep neural network to be trained, where W_l is the weight of the l-th convolutional layer and b_l is the bias of the l-th layer. By defining a loss function for the recognition task and applying the backpropagation (BP) algorithm, the trained weights Θ^* are obtained, such that the model achieves high recognition accuracy on the data set D while keeping the error small.
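The training step can be written as a standard empirical risk minimization. The text does not specify the loss, so the symbols N (number of training samples), F (the network's forward function), and the per-sample loss \mathcal{L} below are assumptions introduced only to make the statement concrete:

```latex
\Theta^{*} = \arg\min_{\Theta} \; \frac{1}{N} \sum_{n=1}^{N}
  \mathcal{L}\bigl(F(X_n; \Theta),\, y_n\bigr)
```

Backpropagation supplies the gradient of this average loss with respect to Θ = {W_l, b_l}, and the minimizer found by training plays the role of Θ^* in the later steps.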

Further, in computing the independent maximum response map in step 3, the objective function is

where the initial input X is random noise, Θ^* denotes the trained weights, J is the number of convolution kernels in layer l, h_{l,i}(X, Θ^*) is the output of the i-th convolution kernel in layer l, h_{l,-i}(X, Θ^*) denotes the feature maps of all kernels in layer l other than the target kernel i, arg max(·) returns the input that maximizes the objective, and X^* is the resulting independent maximum response map;

with the weights fixed, X^* is updated iteratively by gradient ascent so that it maximizes the output activation of the i-th kernel in layer l. The X^* obtained from the objective function makes the target kernel's output as large as possible while keeping the overall output of the other kernels small; the final X^* is the independent maximum response map of the corresponding kernel in feature space.
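The objective function itself appears only as an image in the original publication and is not reproduced in this text. One form that is consistent with the symbol definitions given here (reward the target kernel, penalize the overall output of the others) is sketched below. Averaging over the other J − 1 kernels is an assumption; the published formula may normalize or weight the penalty differently.

```latex
X^{*} = \arg\max_{X} \Bigl( h_{l,i}(X, \Theta^{*})
  - \frac{1}{J-1} \sum_{j \neq i} h_{l,j}(X, \Theta^{*}) \Bigr)
```

Under this reading, the sum over j ≠ i corresponds to the feature maps written h_{l,-i}(X, Θ^*) in the text.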

Further, computing the activation expression replaceability comprises the following steps:

Step 401: compute the expression replaceability (RS), given by the following formula

RS(l, i) expresses the degree to which the disentangled feature of the i-th convolution kernel in layer l can be replaced by other kernels in the same layer, where |{x_l}| is the total number of kernels in layer l, IAM(l, i) is the feature representation of the independent maximum response map generated for the i-th kernel in layer l, and f(IAM(l, i)) is the activation value of the i-th kernel in layer l obtained by forward-propagating that representation; |{x_{l,j} : x_{l,j} > x_{l,i}}| is the number of kernels in layer l whose activation value exceeds that of the i-th kernel. Expression replaceability quantifies how replaceable a kernel's expression is within its layer, and its value lies in [0, 1];
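The formula for RS is an image in the original and is missing from this text. Read directly from the definitions of its two set cardinalities, a consistent form would be the following, where x_l is taken to be the vector of layer-l activations produced by forward-propagating IAM(l, i) (this reading is an assumption, not a quotation of the published formula):

```latex
x_{l} = f\bigl(\mathrm{IAM}(l, i)\bigr), \qquad
\mathrm{RS}(l, i) =
  \frac{\bigl|\{\, x_{l,j} : x_{l,j} > x_{l,i} \,\}\bigr|}{\bigl|\{ x_{l} \}\bigr|}
```

For example, if a layer has four kernels with activations (0.9, 0.2, 0.5, 0.7) and the target kernel is the one with activation 0.5, two kernels exceed it, so RS = 2/4 = 0.5. A target kernel with the largest activation in its layer has RS = 0.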

Step 402: compute the activated expression replaceability (ARS), defined as:

AR(l, i) is the proportion of non-zero values among the activation values of the corresponding convolution kernel. The activated expression replaceability expresses how replaceable, in feature space, the effective activation values output by the target kernel are.
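The quantity AR(l, i) is simple to compute once a kernel's output feature map is available. The sketch below assumes the feature map is given as a NumPy array (for example, a post-ReLU activation map); the function name and the optional tolerance `eps` are illustrative additions. How AR is combined with RS to form ARS is given by a formula that is not reproduced in this text, so that step is deliberately left out.

```python
import numpy as np

def activation_ratio(feature_map, eps=0.0):
    """Fraction of entries in a kernel's output feature map that are non-zero.

    feature_map: array of any shape holding one kernel's activations.
    eps: hypothetical tolerance; entries with |value| <= eps count as zero.
    """
    fm = np.asarray(feature_map, dtype=float)
    if fm.size == 0:
        raise ValueError("feature_map must not be empty")
    return float(np.count_nonzero(np.abs(fm) > eps)) / fm.size
```

A kernel whose feature map is entirely zero has an activation ratio of 0, which matches the text's description of a kernel that has learned no features.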

For current remote sensing image building recognition models, the generalization upper bounds derived from traditional theory are of limited use for model evaluation, because they cannot concretely quantify the generalization ability of a deep learning model. Evaluating on a held-out test set requires extra care to keep the data distribution balanced, and it rarely reflects the model's true performance on unseen data. An index is therefore needed that quantifies generalization and allows models to be compared directly without a test set, so that building recognition can be improved in a targeted way. To this end, the convolution kernels, the main structural element of a building recognition model, are used to quantify generalization directly. Each convolution kernel extracts features, and the final recognizer classifies the extracted features to perform inference on unseen data, which is where generalization shows. Many analyses in parameter space have also confirmed the importance of convolution kernels for generalization. The behavior of the kernels in feature space is likewise an expression of the model's generalization ability. Naturally, when the kernels present richer feature expressions in feature space, those features are more useful for predicting new, unseen data. Measuring the richness of the features a model acquires is therefore important for evaluating generalization. By measuring the richness of the features acquired by the convolution kernels, the method of the invention quantifies the model's generalization ability, prunes the model accordingly, and effectively improves its accuracy in recognizing buildings in remote sensing images.

Brief Description of the Drawings

Fig. 1 is a flow chart of the method of the invention;

Fig. 2 is a schematic diagram of activated expression replaceability in an embodiment of the method.

Detailed Description

The invention is further described below with reference to an embodiment and the drawings, without limiting it in any way; any variation or substitution based on the teaching of the invention falls within its scope of protection.

As shown in Fig. 1, the method for recognizing buildings in remote sensing images based on activation expression replaceability proceeds as follows.

In a remote sensing image building recognition task, let the training set be D = {X_n, y_n}, where X_n is the n-th remote sensing image and y_n is the building label of the n-th image, and let Θ = {W_l, b_l} be the weights of the deep neural network to be trained, where W_l is the weight of the l-th convolutional layer and b_l is the bias of the l-th layer. By defining a loss function for the recognition task and applying the backpropagation (BP) algorithm, the trained weights Θ^* are obtained, such that the model achieves high recognition accuracy on D while keeping the error small. Once the model has converged, each new input image is converted by the convolution kernels into a corresponding feature vector, and the recognizer at the end of the model can then correctly recognize the image.

Each convolution kernel responds differently to different features. Let the output of each kernel be h_{l,i}(X, Θ), the output of the i-th convolution kernel in layer l. The maximum response map algorithm proposed by Erhan et al. seeks the feature that produces the largest output response of a kernel, and is defined as

$$X^* = \arg\max_X \, h_{l,i}(X, \Theta^*)$$

The initial input X is random noise. With the trained weights Θ^* fixed, X^* is updated iteratively by gradient ascent so that it maximizes the output activation of the i-th kernel in layer l. The resulting X^* is the visual representation that elicits the maximum response from that kernel.

However, the maximum response map algorithm only maximizes the activation of the target kernel. We found that the resulting visual representation also elicits high responses from other kernels in the same layer; that is, it is entangled in feature terms with the expressions of other kernels, so the generated image does not represent the target kernel's expression in feature space well. To obtain a disentangled feature for each kernel, this embodiment modifies the objective of the maximum response map algorithm so that the target kernel's output activation is maximized while the outputs of the other kernels are kept as small as possible.

In the modified objective, the initial input X is random noise, Θ^* denotes the trained weights, J is the number of convolution kernels in layer l, h_{l,i}(X, Θ^*) is the output of the i-th convolution kernel in layer l, h_{l,-i}(X, Θ^*) denotes the feature maps of all kernels in layer l other than the target kernel i, arg max(·) returns the input that maximizes the objective, and X^* is the resulting independent maximum response map;
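The gradient-ascent procedure can be illustrated on a toy layer. The sketch below is not the patent's implementation: it replaces the trained CNN with a bank of linear filters (h_j(x) = w_j · x), penalizes the mean response of the other filters with a weight `lam`, and renormalizes x to unit length after every step so the linear objective stays bounded. All of these choices, and every name in the code, are assumptions made to keep the example self-contained. It requires at least two filters.

```python
import numpy as np

def independent_max_response(filters, target, steps=200, lr=0.1, lam=1.0, seed=0):
    """Gradient ascent on the input of a toy linear layer.

    filters: (J, D) array; row j stands in for kernel j, h_j(x) = filters[j] @ x.
    target:  index i of the kernel whose response is maximized.
    lam:     weight of the penalty on the other kernels' mean response
             (lam=0 gives plain activation maximization).
    """
    W = np.asarray(filters, dtype=float)
    others = np.delete(np.arange(W.shape[0]), target)
    rng = np.random.default_rng(seed)
    x = rng.normal(size=W.shape[1])      # start from random noise
    x /= np.linalg.norm(x)
    # For linear kernels the gradient of the objective w.r.t. x is constant.
    grad = W[target] - lam * W[others].mean(axis=0)
    for _ in range(steps):
        x = x + lr * grad                # ascent step, weights stay fixed
        x /= np.linalg.norm(x)           # project back onto the unit sphere
    return x

def objective(filters, target, x, lam=1.0):
    """Target response minus lam times the mean response of the other kernels."""
    h = np.asarray(filters, dtype=float) @ x
    return h[target] - lam * np.delete(h, target).mean()
```

With a real network the constant `grad` would be replaced by the gradient obtained through automatic differentiation at each step; the loop structure (fixed weights, iterative update of the input) stays the same.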

For building recognition in remote sensing images, the model relies on many kinds of features to predict new data. Good predictions on new remote sensing images indicate strong generalization. Measuring the properties of the model's expression in feature space therefore helps measure its generalization ability.

The independent maximum response map algorithm yields a disentangled feature for each convolution kernel. Different kernels may nevertheless produce redundant expressions, in which case other kernels also respond strongly to that disentangled feature. The method of the invention proposes expression replaceability, Representational Substitution (RS), to quantify how far a kernel's expression is duplicated by other kernels, that is, how replaceable the kernel's expression in feature space is.

The disentangled feature of the target kernel obtained by the independent maximum response map algorithm is forward-propagated to the layer containing the target kernel. If the target kernel has the largest activation in that layer, its feature cannot be replaced by other kernels. Conversely, if other kernels in the same layer have larger activations than the target kernel, the target kernel's feature can be replaced. If a replaceable feature is lost, other kernels still express it, so the kernel is unimportant.

Further, computing the activation expression replaceability comprises steps 401 and 402.

RS(l, i) expresses the degree to which the disentangled feature of the i-th convolution kernel in layer l can be replaced by other kernels in the same layer, where |{x_l}| is the total number of kernels in layer l, IAM(l, i) is the feature representation of the independent maximum response map generated for the i-th kernel in layer l, and f(IAM(l, i)) is the activation value of the i-th kernel in layer l obtained by forward-propagating that representation; |{x_{l,j} : x_{l,j} > x_{l,i}}| is the number of kernels in layer l whose activation value exceeds that of the i-th kernel. Expression replaceability quantifies how replaceable a kernel's expression is within its layer, and its value lies in [0, 1].
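The counting rule described for RS can be sketched in a few lines. The example assumes that each kernel's feature map has already been reduced to one scalar activation (for instance by averaging), which the text does not specify; the function name is likewise illustrative.

```python
import numpy as np

def representational_substitution(layer_activations, target):
    """Share of kernels in a layer whose activation exceeds the target kernel's.

    layer_activations: 1-D array with one scalar activation per kernel,
        obtained by forward-propagating the target kernel's independent
        maximum response map to its layer (scalar reduction is assumed).
    target: index i of the target kernel.
    """
    x = np.asarray(layer_activations, dtype=float)
    # Count kernels j with x[j] > x[target], divide by the number of kernels.
    return float(np.count_nonzero(x > x[target])) / x.size
```

For the activations (0.9, 0.2, 0.5, 0.7), the kernel with activation 0.9 scores 0.0 (nothing exceeds it), the kernel with 0.5 scores 0.5, and the kernel with 0.2 scores 0.75. With a strict inequality the score never reaches 1 exactly; its maximum is (J − 1)/J for a layer of J kernels.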

A kernel with low expression replaceability has an expression that is not easily replaced. High expression replaceability has two possible meanings. One is that the kernel's expression is easily replaced by other kernels. The other is that the kernel has learned no features: when the target kernel responds to no feature, the independent maximum response map formula gives f(x) close to 0, and the optimization reduces to minimizing the average response of the other kernels, so such kernels all produce nearly the same result. A kernel that has learned no features still elicits higher responses from other kernels, which raises its expression replaceability value. High expression replaceability can therefore also indicate that the kernel has learned no features.

The invention further proposes the activated expression replaceability, Activated Representational Substitution (ARS), which represents the replaceability of a kernel's expression in a unified way. According to the independent maximum response map formula, the output of a target kernel that has learned no features is 0. The activation state of a kernel is therefore represented by the proportion of non-zero values in its output feature map.

Step 402: combining the kernel's expression replaceability with its activation response, compute the activated expression replaceability (ARS), defined as:

A kernel with low ARS has an activated expression that is hard to replace within its layer, and it is important for the model's generalization. As ARS increases, the kernel moves from redundant expression toward meaningless expression. Kernels with high ARS are unimportant for generalization.

Activated Representational Substitution (ARS) works by measuring the feature richness of a remote sensing image building recognition model, thereby measuring the model's generalization and improving its recognition results in a targeted way. The method can also substantially improve the recognition accuracy of such models. The principle of ARS is illustrated in Fig. 2, where grayscale and shape represent the different features learned by a deep learning model for remote sensing images. Because the number of convolution kernels in a model is fixed, the model as a whole holds richer features when each kernel's expression is not easily replaced by the expressions of other kernels. The activated expression replaceability of the kernels is therefore strongly related to the model's generalization.

As shown in Fig. 2, the input images are buttons and faces composed of different features. In model 1, all three kernels have expressions that the other kernels cannot easily replace, that is, each kernel's activated expression replaceability is low, so the model recognizes every input image correctly. In model 2, the kernels have similar expressions, some kernels have high activated expression replaceability, and the model fails to recognize the face.

As the summary and the embodiment show, the method of the invention provides an activated expression replaceability index for deep-learning-based building recognition models that quantifies how replaceable each kernel's expression in feature space is within its layer. The more important a kernel, the lower its activated expression replaceability, meaning that it is irreplaceable in feature space. The method prunes kernels selectively on this basis and thereby effectively improves the recognition accuracy of remote sensing image building recognition models.

Claims

1. A method for recognizing buildings in remote sensing images based on activation expression replaceability, characterized by comprising the following steps:

step 1, acquiring a remote sensing image building data set;

step 2, training an ordinary deep neural network model;

step 3, computing the independent maximum response map of each convolution kernel in the model;

step 4, computing the activation expression replaceability of each convolution kernel;

step 5, pruning the model's convolution kernels according to the activation expression replaceability of each convolution kernel, retaining the kernels with low activation expression replaceability;

step 6, recognizing buildings in remote sensing images using the pruned deep neural network model.

2. The method for recognizing buildings in remote sensing images according to claim 1, characterized in that, in training the deep neural network model in step 2, the training set is denoted D = {X_n, y_n}, where X_n is the n-th remote sensing image and y_n is the building label of the n-th remote sensing image; Θ = {W_l, b_l} denotes the weights of the deep neural network to be trained, where W_l is the weight of the l-th convolutional layer and b_l is the bias of the l-th layer; and Θ^* is obtained by defining a loss function for the recognition task and applying the BP algorithm, such that the model achieves high recognition accuracy on the data set D while keeping the error small.

3. The method for recognizing buildings in remote sensing images according to claim 2, characterized in that the objective function of the independent maximum response map in step 3 is

where the initial input X is random noise, Θ^* denotes the trained weights, J is the number of convolution kernels in layer l, h_{l,i}(X, Θ^*) is the output of the i-th convolution kernel in layer l, h_{l,-i}(X, Θ^*) denotes the feature maps of all kernels in layer l other than the target kernel i, arg max(·) returns the input that maximizes the objective, and X^* is the resulting independent maximum response map;

with the weights fixed, X^* is updated iteratively by gradient ascent so that it maximizes the output activation of the i-th kernel in layer l; the X^* obtained from the objective function makes the target kernel's output as large as possible while keeping the overall output of the other kernels small; and the final X^* is the independent maximum response map of the corresponding kernel in feature space.

4. The method for recognizing buildings in remote sensing images according to claim 3, characterized in that computing the activation expression replaceability comprises the following steps:

step 401, computing the expression replaceability, given by the following formula

RS(l, i) expresses the degree to which the disentangled feature of the i-th convolution kernel in layer l can be replaced by other kernels in the same layer, where |{x_l}| is the total number of kernels in layer l, IAM(l, i) is the feature representation of the independent maximum response map generated for the i-th kernel in layer l, and f(IAM(l, i)) is the activation value of the i-th kernel in layer l obtained by forward-propagating that representation; |{x_{l,j} : x_{l,j} > x_{l,i}}| is the number of kernels in layer l whose activation value exceeds that of the i-th kernel; expression replaceability quantifies how replaceable a kernel's expression is within its layer, and its value lies in [0, 1];

step 402, computing the activated expression replaceability, defined as:

AR(l, i) is the proportion of non-zero values among the activation values of the corresponding convolution kernel, and the activated expression replaceability expresses how replaceable, in feature space, the effective activation values output by the target kernel are.