CN107992874B - Method and system for image salient target region extraction based on iterative sparse representation - Google Patents
Method and system for image salient target region extraction based on iterative sparse representation
- Publication number
- CN107992874B CN107992874B CN201711387624.3A CN201711387624A CN107992874B CN 107992874 B CN107992874 B CN 107992874B CN 201711387624 A CN201711387624 A CN 201711387624A CN 107992874 B CN107992874 B CN 107992874B
- Authority
- CN
- China
- Prior art keywords
- sal
- image
- saliency
- scale
- pixel
- Prior art date
- Legal status: Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Image Analysis (AREA)
- Magnetic Resonance Imaging Apparatus (AREA)
Abstract
The invention provides a method and system for extracting salient target regions from images based on iterative sparse representation. First, the original image is segmented into superpixels by several SLIC runs with different superpixel-count parameters, producing a set of segmented images whose superpixel regions differ in size. Then, for the segmentation result at each scale, a classical visual-attention detection result serves as the initial saliency map that constrains the selection of foreground and background sample regions; the reconstruction residual of each superpixel region under sparse representation is computed as its saliency factor, and recursive iteration refines the saliency detection result at the single scale. Finally, multi-scale saliency map fusion yields the final salient target and detection result. The invention effectively remedies shortcomings of traditional methods, including inconsistent saliency evaluation within a single target, difficulty detecting salient targets at the image border, and incomplete extraction of multiple salient targets.
Description
Technical Field
The invention belongs to the field of computer vision and image processing, and relates to a technique for extracting salient target regions from images based on iterative sparse representation.
Background Art
Image visual saliency analysis is a fundamental research topic of great importance in computer vision, psychology, neuroscience, and related fields; it is the technical embodiment of the human eye's ability to quickly and accurately capture the visually most conspicuous target regions in a scene. Through image saliency analysis, the target regions people are interested in can be effectively extracted, enabling data compression and efficient data management and use, and it is a basic step in many image processing problems.
Since the first computer-based automatic saliency analysis of images in 1998, novel automatic salient-target detection algorithms have emerged continually as application prospects keep being explored. In terms of approach, existing salient-target extraction algorithms fall roughly into two categories: data-driven bottom-up detection methods and task-driven top-down detection methods. The former automatically process and recognize the input image on the basis of empirical knowledge, performing saliency analysis in the traditional cognitive sense, typically as unsupervised automatic extraction algorithms; the latter analyze the image with a specific target task in mind and extract target objects that meet particular application requirements, typically as recognition algorithms under supervised learning. In terms of output, traditional methods can also be divided into visual-attention-based saliency analysis algorithms, which generate pixel-level saliency prediction maps, and salient-target extraction methods, whose final goal is extracting complete salient target regions.
In bottom-up unsupervised methods, the lack of high-level cognitive information usually requires introducing hypothetical constraints to complete the detection task. Empirical analysis suggests that targets near the center of the image attract more visual attention, whereas regions close to the image border are usually less salient; likewise, local regions with high contrast exhibit stronger visual saliency. Saliency detection methods combining image center/border constraints with contrast analysis have therefore developed rapidly and shown outstanding detection performance. As research and applications have deepened, however, the dependence of these methods on their hypothetical conditions has become increasingly apparent: 1) when a salient target lies close to the image border, it usually cannot be detected correctly; 2) methods based on local contrast analysis extract incomplete salient target regions, with inconsistent saliency evaluation inside a target; 3) methods based on global contrast analysis often miss targets when several salient targets are present at the same time. How to overcome these shortcomings, weaken the dependence on hypothetical constraints in the absence of high-level cognitive information, improve the uniformity and completeness of salient-target extraction, and strengthen the adaptability of the algorithm therefore remains a technical problem requiring further study.
Summary of the Invention
The object of the present invention is to provide a technical solution for consistently extracting salient target regions from images with natural backgrounds. It makes full use of the overall differences between foreground and background in the image, integrates the inherent relations among salient targets, weakens the dependence of the saliency analysis process on traditional hypothetical constraints, achieves consistent extraction of multiple salient targets, and preserves both the internal uniformity of a single salient target and the completeness of multi-target extraction.
To achieve the above object, the technical solution provided by the present invention is a method for extracting image salient target regions based on iterative sparse representation, comprising the following steps:
Step 1, data preprocessing: set different SLIC superpixel counts and perform multi-scale superpixel segmentation of the original image; run saliency detection based on classical visual attention and take the detection result as the initial saliency map SAL0;
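For illustration, the preprocessing of step 1 might be sketched in Python as follows, assuming scikit-image's SLIC implementation; the scale list, compactness value, and input path are illustrative choices rather than values fixed by the patent, and the initial saliency map SAL0 is assumed to come from any classical visual-attention detector (e.g. the Itti model) rather than being re-implemented here.

```python
import numpy as np
from skimage.io import imread
from skimage.segmentation import slic

def multiscale_superpixels(image, scales=(100, 200, 300, 400)):
    """One SLIC segmentation per superpixel-count parameter, yielding a set
    of label maps whose superpixel regions differ in size."""
    return [slic(image, n_segments=n, compactness=10, start_label=0)
            for n in scales]

image = imread("input.jpg")                  # hypothetical input path
label_maps = multiscale_superpixels(image)   # one (H, W) label map per scale
# SAL0: (H, W) initial saliency map from a classical visual-attention
# detector, assumed normalized to [0, 1]; not re-implemented in this sketch.
```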
Step 2, extract the pixel-level salient features of the original image, including color features, position features, and gradient features;

Step 3, obtain the superpixel region features of each single-scale segmentation by averaging the features of all original pixels within each superpixel region;

Step 4, for the segmentation result at a single scale, compute the saliency map by recursive sparse representation, comprising the following sub-steps:
Step 4.1, initial superpixel saliency: compute the initial saliency level of each superpixel as the mean of SAL0 over that superpixel;

Step 4.2, foreground sample extraction: sort the superpixels by initial saliency in descending order and take the top p1% as the foreground samples Df;

Step 4.3, background sample extraction: sort the superpixels by initial saliency in ascending order and take the first p2% as candidate background samples Db1; extract the superpixels touching the image border as candidate background samples Db2; the background samples are computed as:
Db = Db1 + Db2 - Df (1)
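A minimal sketch of this sample selection (steps 4.2-4.3), assuming the per-superpixel initial saliency values have already been computed; p1 = p2 = 20 are the values used later in the embodiment, and the final set arithmetic realizes formula (1).

```python
import numpy as np

def select_samples(sal0_sp, labels, p1=20.0, p2=20.0):
    """Steps 4.2-4.3: choose foreground/background sample superpixels.
    sal0_sp: (N,) mean initial saliency of each superpixel;
    labels:  (H, W) superpixel label map with values 0..N-1."""
    n = sal0_sp.size
    order = np.argsort(sal0_sp)                          # ascending saliency
    Df = set(order[::-1][:int(np.ceil(n * p1 / 100))])   # top p1%: foreground Df
    Db1 = set(order[:int(np.ceil(n * p2 / 100))])        # bottom p2%: candidates Db1
    border = np.concatenate([labels[0, :], labels[-1, :],
                             labels[:, 0], labels[:, -1]])
    Db2 = set(np.unique(border).tolist())                # regions touching the border
    Db = (Db1 | Db2) - Df                                # formula (1): Db = Db1 + Db2 - Df
    return sorted(Df), sorted(Db)
```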
Step 4.4, double sparse representation and sparse residual computation: using the foreground samples and the background samples as dictionaries, sparsely represent all superpixels and compute the reconstruction residuals:

αfi = argmin_α ||Fi - Df α||^2 + λf ||α||_1 (2)

εfi = ||Fi - Df αfi||^2 (3)

αbi = argmin_α ||Fi - Db α||^2 + λb ||α||_1 (4)

εbi = ||Fi - Db αbi||^2 (5)

where i is the superpixel index; Fi is the feature vector of superpixel region i; λb and λf are regularization parameters; αbi and αfi are the sparse representation coefficients over the background dictionary Db and the foreground dictionary Df, respectively; εbi and εfi are the corresponding background and foreground sparse reconstruction residuals;
Step 4.5, saliency factor computation: fuse εbi and εfi according to formula (6) and assign each superpixel's fused value to all original image pixels it contains, yielding the saliency factor map SALi,

SALi = εbi/(εfi + σ2) (6)

where σ2 is a non-negative tuning parameter;
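Steps 4.4-4.5 can be sketched with scikit-learn's Lasso as the L1-regularized solver; note this is an approximation, since Lasso rescales the data-fit term by the number of rows, so its alpha is not numerically identical to the patent's λ. The λ and σ2 defaults below are the values given later in the embodiment.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_residual(Fi, D, lam=0.01):
    """L1-regularized reconstruction of one feature vector Fi (shape (d,))
    over dictionary D (shape (d, m)); returns the squared residual."""
    model = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
    model.fit(D, Fi)
    return float(np.sum((Fi - D @ model.coef_) ** 2))

def saliency_factors(F, Df, Db, lam=0.01, sigma2=0.1):
    """F: (N, d) superpixel features; Df/Db: foreground/background index
    lists. Fuses the two residuals per formula (6)."""
    Dfm, Dbm = F[Df].T, F[Db].T                 # one dictionary atom per column
    sal = np.empty(F.shape[0])
    for i, Fi in enumerate(F):
        eps_f = sparse_residual(Fi, Dfm, lam)   # foreground reconstruction residual
        eps_b = sparse_residual(Fi, Dbm, lam)   # background reconstruction residual
        sal[i] = eps_b / (eps_f + sigma2)       # formula (6): high for salient regions
    return sal
```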
Step 4.6, recursive processing: compute the correlation coefficient rela between the saliency factor map SALi and the initial saliency map SAL0 according to formula (7); if rela < K, set SAL0 = SALi and repeat the whole of step 4; otherwise the recursion ends and the current SALi is output as the saliency detection result at this scale, where K is the similarity threshold,

rela = corr2(A, B) (7)

where corr2() is the correlation coefficient function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B: the larger its value, the more similar A and B are, and the smaller it is, the greater their difference;
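The recursion of step 4.6 then ties the pieces together; corr2 below reproduces MATLAB's corr2() with NumPy, select_samples and saliency_factors are the sketches above, and the max_iter cap is an added safeguard the patent does not specify.

```python
import numpy as np

def corr2(A, B):
    """Pearson correlation coefficient of two equally sized maps,
    a NumPy stand-in for MATLAB's corr2()."""
    a, b = A.ravel() - A.mean(), B.ravel() - B.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def single_scale_saliency(sal0, labels, F, K=0.99, max_iter=20):
    """Step 4 at one scale: iterate until SALi agrees with SAL0 (rela >= K)."""
    n = F.shape[0]
    for _ in range(max_iter):
        sal0_sp = np.array([sal0[labels == r].mean() for r in range(n)])  # step 4.1
        Df, Db = select_samples(sal0_sp, labels)                          # steps 4.2-4.3
        sal_i = saliency_factors(F, Df, Db)[labels]                       # steps 4.4-4.5
        if corr2(sal_i, sal0) >= K:                                       # step 4.6
            return sal_i
        sal0 = sal_i                                                      # SAL0 := SALi, repeat
    return sal_i
```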
Step 5, multi-scale saliency detection result fusion: linearly combine the saliency results of the single scales with equal weights to compute the final saliency detection result.
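The fusion of step 5 then reduces to an equal-weight average over the scales; the per-map min-max normalization in this sketch is an assumption, since the patent specifies only an equal-weight linear combination.

```python
import numpy as np

def fuse_scales(sal_maps):
    """Step 5: equal-weight linear combination of the per-scale saliency maps."""
    norm = [(m - m.min()) / (m.max() - m.min() + 1e-12) for m in sal_maps]
    return np.mean(norm, axis=0)
```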
Further, the salient features in step 2 comprise 13 dimensions in total: RGB, Lab, x, y, and first- and second-order gradients, written as {R, G, B, L, a, b, x, y, fx, fy, fxx, fyy, fxy}. The six dimensions R, G, B, L, a, b are color features, giving the RGB and Lab color information; x and y are position features, the row and column coordinates of the pixel in the image; fx, fy, fxx, fyy, fxy are gradient features, the first- and second-order differences of the pixel in the X and Y directions, computed as follows:
fx = (f(i+1,j) - f(i-1,j))/2

fy = (f(i,j+1) - f(i,j-1))/2

fxx = (fx(i+1,j) - fx(i-1,j))/2 (8)

fyy = (fy(i,j+1) - fy(i,j-1))/2

fxy = (fx(i,j+1) - fx(i,j-1))/2
where f(i,j) is the image matrix and i, j are the pixel row and column indices.
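A sketch of this 13-dimensional feature extraction; np.gradient applies the same central differences as formula (8) at interior pixels, and the grey image f(i,j) on which the gradients are taken is assumed here to be the mean of the RGB channels, a choice the patent does not fix.

```python
import numpy as np
from skimage.color import rgb2lab

def pixel_features(image):
    """Step 2: per-pixel {R,G,B,L,a,b,x,y,fx,fy,fxx,fyy,fxy} features."""
    h, w = image.shape[:2]
    rgb = image.astype(np.float64) / 255.0
    lab = rgb2lab(rgb)
    x, y = np.mgrid[0:h, 0:w]           # row and column coordinates
    f = rgb.mean(axis=2)                # grey image f(i,j), an assumed choice
    fx = np.gradient(f, axis=0)         # (f(i+1,j) - f(i-1,j)) / 2 in the interior
    fy = np.gradient(f, axis=1)         # (f(i,j+1) - f(i,j-1)) / 2
    fxx = np.gradient(fx, axis=0)       # second-order differences per formula (8)
    fyy = np.gradient(fy, axis=1)
    fxy = np.gradient(fx, axis=1)
    return np.dstack([rgb, lab, x, y, fx, fy, fxx, fyy, fxy])  # (H, W, 13)
```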
The present invention correspondingly provides a system for extracting image salient target regions based on iterative sparse representation, comprising the following modules:

a preprocessing module for data preprocessing: setting different SLIC superpixel counts, performing multi-scale superpixel segmentation of the original image, running saliency detection based on classical visual attention, and taking the detection result as the initial saliency map SAL0;

a salient feature extraction module for extracting the pixel-level salient features of the original image, including color features, position features, and gradient features;

a superpixel region feature module for obtaining the superpixel region features of each single-scale segmentation by averaging the features of all original pixels within each superpixel region;

a sparse representation module for computing the saliency map of a single-scale segmentation result by recursive sparse representation, comprising the following sub-modules:

a first sub-module for initial superpixel saliency, computing the initial saliency level of each superpixel as the mean of SAL0 over that superpixel;

a second sub-module for foreground sample extraction, sorting the superpixels by initial saliency in descending order and taking the top p1% as the foreground samples Df;

a third sub-module for background sample extraction, sorting the superpixels by initial saliency in ascending order, taking the first p2% as candidate background samples Db1, extracting the superpixels touching the image border as candidate background samples Db2, and computing the background samples as:

Db = Db1 + Db2 - Df (1)
a fourth sub-module for double sparse representation and sparse residual computation, sparsely representing all superpixels with the foreground samples and the background samples as dictionaries and computing the reconstruction residuals:

αfi = argmin_α ||Fi - Df α||^2 + λf ||α||_1 (2)

εfi = ||Fi - Df αfi||^2 (3)

αbi = argmin_α ||Fi - Db α||^2 + λb ||α||_1 (4)

εbi = ||Fi - Db αbi||^2 (5)

where i is the superpixel index; Fi is the feature vector of superpixel region i; λb and λf are regularization parameters; αbi and αfi are the sparse representation coefficients over the background dictionary Db and the foreground dictionary Df, respectively; εbi and εfi are the corresponding background and foreground sparse reconstruction residuals;
a fifth sub-module for saliency factor computation, fusing εbi and εfi according to formula (6), assigning each superpixel's fused value to all original image pixels it contains, and obtaining the saliency factor map SALi,

SALi = εbi/(εfi + σ2) (6)

where σ2 is a non-negative tuning parameter;

a sixth sub-module for recursive processing, computing the correlation coefficient rela between the saliency factor map SALi and the initial saliency map SAL0 according to formula (7); if rela < K, setting SAL0 = SALi and repeating the whole process of the sparse representation module; otherwise ending the recursion and outputting the current SALi as the saliency detection result at this scale, where K is the similarity threshold,

rela = corr2(A, B) (7)

where corr2() is the correlation coefficient function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B: the larger its value, the more similar A and B are, and the smaller it is, the greater their difference;

a detection result fusion module for multi-scale saliency detection result fusion, linearly combining the saliency results of the single scales with equal weights to compute the final saliency detection result.
Further, the salient features in the salient feature extraction module comprise 13 dimensions in total: RGB, Lab, x, y, and first- and second-order gradients, written as {R, G, B, L, a, b, x, y, fx, fy, fxx, fyy, fxy}. The six dimensions R, G, B, L, a, b are color features, giving the RGB and Lab color information; x and y are position features, the row and column coordinates of the pixel in the image; fx, fy, fxx, fyy, fxy are gradient features, the first- and second-order differences of the pixel in the X and Y directions, computed as follows:

fx = (f(i+1,j) - f(i-1,j))/2

fy = (f(i,j+1) - f(i,j-1))/2

fxx = (fx(i+1,j) - fx(i-1,j))/2 (8)

fyy = (fy(i,j+1) - fy(i,j-1))/2

fxy = (fx(i,j+1) - fx(i,j-1))/2

where f(i,j) is the image matrix and i, j are the pixel row and column indices.
The method of the invention first applies SLIC segmentation with several different superpixel-count parameters to the original image, producing a set of segmented images whose superpixel regions differ in size and establishing multi-scale source data. Then, for the segmentation result at each scale, a classical visual-attention detection result serves as the initial saliency map constraining the selection of foreground and background sample regions; the reconstruction residual of each superpixel region under sparse representation is computed as its saliency factor, and recursive iteration refines the saliency detection result at the single scale. Finally, multi-scale saliency map fusion yields the final salient target and detection result. The technical solution of the invention has the following advantages:

1) Multiple SLIC segmenters divide the image into superpixel images at several scales. The SLIC method effectively preserves image contour information, helping the saliency detection process keep the interior of a target region consistent; multi-scale segmentation also makes the algorithm adaptable and robust when detecting targets of different sizes.

2) Pixel (region) saliency is computed through a double sparse representation over a foreground dictionary and a background dictionary. Taking the reconstruction residual as the saliency indicator judges the visual similarity of pixels from a global perspective, which, unlike traditional methods based on contrast and image border constraints, effectively alleviates incomplete target detection; the double sparse representation also analyzes the attributes of each element more comprehensively when judging its saliency level, further improving the robustness of the algorithm.

3) The recursive optimization process weakens, to a certain extent, the algorithm's dependence on the initial saliency map generated by the classical visual attention model, helping improve reliability.
Description of the Drawings
Figure 1 compares the salient-target detection results of the method of the invention with traditional methods in an embodiment: (a) input image; (b) salient-target ground truth; (c) result of a traditional local-contrast-based method; (d) result of a global-contrast-based method; (e) result of an image-border-constraint method; (f) result of the method of the invention.

Figure 2 is a flowchart of an embodiment of the invention.
Detailed Description of the Embodiments
The specific technical solution of the present invention is described below with reference to the drawings and an embodiment.
The invention proposes a method for extracting image salient target regions based on iterative sparse representation. By performing saliency analysis on the image, the method extracts the target regions that most attract human visual attention, serving effective data selection and data compression, a basic step in many image processing problems. Research shows that traditional salient-target extraction methods based on local contrast, global contrast, or image border constraints usually depend strongly on their respective constraints and tend to produce inconsistent saliency within a single target, incomplete multi-target detection, and ambiguous extraction of salient targets at the image border, as shown in Figures 1(c)-(e). The present method computes pixel saliency levels from double sparse representation and reconstruction residuals, refines the detection result by recursive iteration, and improves applicability by fusing multi-scale results. As shown in Figure 1(f), single-target saliency in the detection results is more consistent and multi-target detection is more complete, while the traditional methods' tendency to miss salient targets close to the image border is also alleviated to some extent. The embodiment fully confirms that the method outperforms traditional salient-target extraction methods. As shown in Figure 2, the implementation provided by the embodiment comprises the following steps:
Step 1, data acquisition: download the original images and salient-target ground-truth data of an open-source saliency detection dataset.

Step 2, data preprocessing: perform multi-scale segmentation of the original image and run saliency detection with a classical visual attention model to generate the initial saliency map.

Step 3, extract the 13-dimensional features of the original image pixels, namely {R, G, B, L, a, b, x, y, fx, fy, fxx, fyy, fxy}.
Step 4, extract the superpixel region features of a single-scale segmentation by averaging, i.e. F = {mR, mG, mB, mL, ma, mb, mx, my, mfx, mfy, mfxx, mfyy, mfxy}, where F is the region feature vector and mX (X = R, G, B, L, a, b, x, y, fx, fy, fxx, fyy, fxy) is the mean of attribute X over all original image pixels within the superpixel region.
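A vectorized sketch of this per-region averaging, where np.add.at accumulates the feature sums of every superpixel in a single pass over the pixels:

```python
import numpy as np

def region_features(feat, labels):
    """Step 4 of the embodiment: F as the mean of the 13-D pixel features
    over each superpixel; returns one row per region."""
    n, d = labels.max() + 1, feat.shape[2]
    sums = np.zeros((n, d))
    np.add.at(sums, labels.ravel(), feat.reshape(-1, d))   # per-region feature sums
    counts = np.bincount(labels.ravel(), minlength=n)[:, None]
    return sums / counts                                   # (N, 13) matrix of means
```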
Step 5, for the segmentation result at a single scale, compute the saliency map by recursive sparse representation, comprising the following sub-steps:

Step 5.1, obtain the initial superpixel saliency map by averaging.

Step 5.2, extract the superpixel regions with higher initial saliency as foreground samples: sort the superpixels by initial saliency in descending order and take the top p1% as the foreground samples Df; in this embodiment p1 is 20, and those skilled in the art can choose a suitable value as needed.

Step 5.3, extract background samples by combining the initial saliency map with the image border constraint: sort the superpixels by initial saliency in ascending order and take the first p2% as candidate background samples Db1 (in this embodiment p2 is 20, and those skilled in the art can choose a suitable value as needed); extract the superpixels touching the image border as candidate background samples Db2; the background samples are computed as:

Db = Db1 + Db2 - Df (1)
Step 5.4, using the foreground samples and the background samples, compute two sparse representations of all superpixels of the image and the corresponding reconstruction residuals:

αfi = argmin_α ||Fi - Df α||^2 + λf ||α||_1 (2)

εfi = ||Fi - Df αfi||^2 (3)

αbi = argmin_α ||Fi - Db α||^2 + λb ||α||_1 (4)

εbi = ||Fi - Db αbi||^2 (5)

where i is the superpixel index; Fi is the feature vector of superpixel region i; λb and λf are regularization parameters, both set to 0.01 in this embodiment; αbi and αfi are the sparse representation coefficients over the background and foreground dictionaries, respectively; εbi and εfi are the corresponding background and foreground sparse reconstruction residuals;
Step 5.5, fuse the two reconstruction residuals into superpixel saliency factors and obtain the saliency factor map of the original image under the rule that all original image pixels within a superpixel region share the same saliency: fuse εbi and εfi according to formula (6) and assign each superpixel's fused value to all original image pixels it contains, yielding the saliency factor map SALi,

SALi = εbi/(εfi + σ2) (6)

where σ2 is a non-negative tuning parameter, set to 0.1 in this embodiment; those skilled in the art can choose a suitable value as needed;
Step 5.6, compare the saliency factor map from step 5.5 with the initial saliency map from step 5.1 and execute the recursion; when the recursion ends, the output is the saliency detection result at the current scale. Compute the correlation coefficient rela between the saliency factor map SALi and the initial saliency map SAL0 according to formula (7); if rela < K, set SAL0 = SALi and repeat the whole of step 5; otherwise the recursion ends and the current SALi is output as the saliency detection result at this scale. K is the similarity threshold, set to 0.99 in this embodiment; those skilled in the art can choose a suitable value as needed;

rela = corr2(A, B) (7)

where corr2() is the correlation coefficient function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B: the larger its value, the more similar A and B are, and the smaller it is, the greater their difference;
Step 6, perform mean fusion of the saliency detection results at the multiple scales to generate the final target extraction result.
Analyzed theoretically, the implementation of the whole technical solution achieves overall consistent extraction of salient target regions from natural-background images with the support of the sparse representation principle. Unlike traditional salient-target detection methods based on contrast and image border constraints, the invention makes full use of the overall differences between foreground and background in the image, integrates the inherent relations within a salient target and among multiple salient targets, and seeks to avoid the hypothetical-condition dependence faced by contrast-constrained and border-constrained detection methods. Sparse representation serves as the means of analyzing image pixel consistency, the sparse reconstruction residual serves as the pixel difference indicator, and the reconstruction residuals over the image foreground and background dictionaries serve as the saliency factor, thereby achieving consistent extraction of multiple salient targets and preserving both the internal uniformity of a single salient target and the completeness of multi-target extraction.
In specific implementation, the technical solution of the invention can run as an automatic process based on computer software, or the corresponding system can be implemented in modular form. An embodiment of the invention provides a system for extracting image salient target regions based on iterative sparse representation, comprising the following modules:
a preprocessing module for data preprocessing: setting different SLIC superpixel counts, performing multi-scale superpixel segmentation of the original image, running saliency detection based on classical visual attention, and taking the detection result as the initial saliency map SAL0;

a salient feature extraction module for extracting the pixel-level salient features of the original image, including color features, position features, and gradient features;

a superpixel region feature module for obtaining the superpixel region features of each single-scale segmentation by averaging the features of all original pixels within each superpixel region;

a sparse representation module for computing the saliency map of a single-scale segmentation result by recursive sparse representation, comprising the following sub-modules:

a first sub-module for initial superpixel saliency, computing the initial saliency level of each superpixel as the mean of SAL0 over that superpixel;

a second sub-module for foreground sample extraction, sorting the superpixels by initial saliency in descending order and taking the top p1% as the foreground samples Df;

a third sub-module for background sample extraction, sorting the superpixels by initial saliency in ascending order, taking the first p2% as candidate background samples Db1, extracting the superpixels touching the image border as candidate background samples Db2, and computing the background samples as:

Db = Db1 + Db2 - Df (1)
a fourth sub-module for double sparse representation and sparse residual computation, sparsely representing all superpixels with the foreground samples and the background samples as dictionaries and computing the reconstruction residuals:

αfi = argmin_α ||Fi - Df α||^2 + λf ||α||_1 (2)

εfi = ||Fi - Df αfi||^2 (3)

αbi = argmin_α ||Fi - Db α||^2 + λb ||α||_1 (4)

εbi = ||Fi - Db αbi||^2 (5)

where i is the superpixel index; Fi is the feature vector of superpixel region i; λb and λf are regularization parameters; αbi and αfi are the sparse representation coefficients over the background dictionary Db and the foreground dictionary Df, respectively; εbi and εfi are the corresponding background and foreground sparse reconstruction residuals;
a fifth sub-module for saliency factor computation, fusing εbi and εfi according to formula (6), assigning each superpixel's fused value to all original image pixels it contains, and obtaining the saliency factor map SALi,

SALi = εbi/(εfi + σ2) (6)

where σ2 is a non-negative tuning parameter;

a sixth sub-module for recursive processing, computing the correlation coefficient rela between the saliency factor map SALi and the initial saliency map SAL0 according to formula (7); if rela < K, setting SAL0 = SALi and repeating the whole process of the sparse representation module; otherwise ending the recursion and outputting the current SALi as the saliency detection result at this scale, where K is the similarity threshold,

rela = corr2(A, B) (7)

where corr2() is the correlation coefficient function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B: the larger its value, the more similar A and B are, and the smaller it is, the greater their difference;

a detection result fusion module for multi-scale saliency detection result fusion, linearly combining the saliency results of the single scales with equal weights to compute the final saliency detection result.
The salient features in the salient feature extraction module comprise 13 dimensions in total: RGB, Lab, x, y, and first- and second-order gradients, written as F = {R, G, B, L, a, b, x, y, fx, fy, fxx, fyy, fxy}. The six dimensions R, G, B, L, a, b are color features, giving the RGB and Lab color information; x and y are position features, the row and column coordinates of the pixel in the image; fx, fy, fxx, fyy, fxy are gradient features, the first- and second-order differences of the pixel in the X and Y directions, computed as follows:

fx = (f(i+1,j) - f(i-1,j))/2

fy = (f(i,j+1) - f(i,j-1))/2

fxx = (fx(i+1,j) - fx(i-1,j))/2 (8)

fyy = (fy(i,j+1) - fy(i,j-1))/2

fxy = (fx(i,j+1) - fx(i,j-1))/2

where f(i,j) is the image matrix and i, j are the pixel row and column indices.
For the specific implementation of each module, refer to the corresponding steps; details are not repeated here.

The above description of the embodiment only illustrates the basic technical solution of the present invention and is not limited to that embodiment. Those skilled in the art may make simple modifications, additions, equivalent changes, or refinements to the described specific embodiment without departing from the basic spirit of the present invention or exceeding the scope defined by the claims.
Claims (4)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711387624.3A CN107992874B (en) | 2017-12-20 | 2017-12-20 | Method and system for image salient target region extraction based on iterative sparse representation |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711387624.3A CN107992874B (en) | 2017-12-20 | 2017-12-20 | Method and system for image salient target region extraction based on iterative sparse representation |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN107992874A CN107992874A (en) | 2018-05-04 |
| CN107992874B true CN107992874B (en) | 2020-01-07 |
Family
ID=62039459
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201711387624.3A Active CN107992874B (en) | 2017-12-20 | 2017-12-20 | Method and system for image salient target region extraction based on iterative sparse representation |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN107992874B (en) |
Families Citing this family (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109117852B (en) * | 2018-07-10 | 2021-08-17 | 武汉大学 | Automatic extraction method and system of UAV image adaptation area based on sparse representation |
| CN109102465A (en) * | 2018-08-22 | 2018-12-28 | 周泽奇 | A kind of calculation method of the content erotic image auto zoom of conspicuousness depth of field feature |
| CN109886267A (en) * | 2019-01-29 | 2019-06-14 | 杭州电子科技大学 | A saliency detection method for low-contrast images based on optimal feature selection |
| CN110490204B (en) * | 2019-07-11 | 2022-07-15 | 深圳怡化电脑股份有限公司 | Image processing method, image processing device and terminal |
| CN111191650B (en) * | 2019-12-30 | 2023-07-21 | 北京市新技术应用研究所 | Article positioning method and system based on RGB-D image visual saliency |
| CN111274964B (en) * | 2020-01-20 | 2023-04-07 | 中国地质大学(武汉) | Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle |
| CN111242941B (en) * | 2020-01-20 | 2023-05-30 | 南方科技大学 | Salient region detection method and device based on visual attention |
| CN112766032A (en) * | 2020-11-26 | 2021-05-07 | 电子科技大学 | SAR image saliency map generation method based on multi-scale and super-pixel segmentation |
| CN112700438B (en) * | 2021-01-14 | 2024-06-21 | 成都铁安科技有限责任公司 | Ultrasonic flaw judgment method and ultrasonic flaw judgment system for imbedded part of train axle |
| CN114332572B (en) * | 2021-12-15 | 2024-03-26 | 南方医科大学 | Method for extracting breast lesion ultrasonic image multi-scale fusion characteristic parameters based on saliency map-guided hierarchical dense characteristic fusion network |
| CN115424037A (en) * | 2022-10-12 | 2022-12-02 | 武汉大学 | Salient target region extraction method based on multi-scale sparse representation |
| CN115690418B (en) * | 2022-10-31 | 2024-03-12 | 武汉大学 | Unsupervised automatic detection method for image waypoints |
| CN119540939B (en) * | 2024-11-11 | 2025-09-23 | 西安电子科技大学 | A 3D object saliency detection method using integrated imaging in multi-target scenes |
Citations (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7526123B2 (en) * | 2004-02-12 | 2009-04-28 | Nec Laboratories America, Inc. | Estimating facial pose from a sparse representation |
| CN101556690A (en) * | 2009-05-14 | 2009-10-14 | 复旦大学 | Image super-resolution method based on overcomplete dictionary learning and sparse representation |
| CN101980284A (en) * | 2010-10-26 | 2011-02-23 | 北京理工大学 | Color image denoising method based on two-scale sparse representation |
| CN104240256A (en) * | 2014-09-25 | 2014-12-24 | 西安电子科技大学 | Image salient detecting method based on layering sparse modeling |
| CN105930812A (en) * | 2016-04-27 | 2016-09-07 | 东南大学 | Vehicle brand type identification method based on fusion feature sparse coding model |
| CN106203430A (en) * | 2016-07-07 | 2016-12-07 | 北京航空航天大学 | A kind of significance object detecting method based on foreground focused degree and background priori |
| CN106530271A (en) * | 2016-09-30 | 2017-03-22 | 河海大学 | Infrared image significance detection method |
| CN106815842A (en) * | 2017-01-23 | 2017-06-09 | 河海大学 | A kind of improved image significance detection method based on super-pixel |
| CN107067037A (en) * | 2017-04-21 | 2017-08-18 | 河南科技大学 | A kind of method that use LLC criterions position display foreground |
| CN107229917A (en) * | 2017-05-31 | 2017-10-03 | 北京师范大学 | A kind of several remote sensing image general character well-marked target detection methods clustered based on iteration |
| CN107301643A (en) * | 2017-06-06 | 2017-10-27 | 西安电子科技大学 | Well-marked target detection method based on robust rarefaction representation Yu Laplce's regular terms |
- 2017-12-20 CN CN201711387624.3A patent/CN107992874B/en active Active
Patent Citations (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7526123B2 (en) * | 2004-02-12 | 2009-04-28 | Nec Laboratories America, Inc. | Estimating facial pose from a sparse representation |
| CN101556690A (en) * | 2009-05-14 | 2009-10-14 | 复旦大学 | Image super-resolution method based on overcomplete dictionary learning and sparse representation |
| CN101980284A (en) * | 2010-10-26 | 2011-02-23 | 北京理工大学 | Color image denoising method based on two-scale sparse representation |
| CN104240256A (en) * | 2014-09-25 | 2014-12-24 | 西安电子科技大学 | Image salient detecting method based on layering sparse modeling |
| CN105930812A (en) * | 2016-04-27 | 2016-09-07 | 东南大学 | Vehicle brand type identification method based on fusion feature sparse coding model |
| CN106203430A (en) * | 2016-07-07 | 2016-12-07 | 北京航空航天大学 | A kind of significance object detecting method based on foreground focused degree and background priori |
| CN106530271A (en) * | 2016-09-30 | 2017-03-22 | 河海大学 | Infrared image significance detection method |
| CN106815842A (en) * | 2017-01-23 | 2017-06-09 | 河海大学 | A kind of improved image significance detection method based on super-pixel |
| CN107067037A (en) * | 2017-04-21 | 2017-08-18 | 河南科技大学 | A kind of method that use LLC criterions position display foreground |
| CN107229917A (en) * | 2017-05-31 | 2017-10-03 | 北京师范大学 | A kind of several remote sensing image general character well-marked target detection methods clustered based on iteration |
| CN107301643A (en) * | 2017-06-06 | 2017-10-27 | 西安电子科技大学 | Well-marked target detection method based on robust rarefaction representation Yu Laplce's regular terms |
Non-Patent Citations (5)
| Title |
|---|
| Approximate Correction of Length Distortion for Direct Georeferencing in Map Projection Frame; Yongjun Zhang et al.; IEEE; Nov. 2013; vol. 10, no. 6; full text * |
| Construction of Manifolds via Compatible Sparse Representations; Ruimin Wang et al.; ACM; 2016; full text * |
| Salient object detection via contrast information and object vision organization cues; Shengxiang Qi et al.; Elsevier; Apr. 2015; full text * |
| Super-Resolution of Single Text Image by Sparse Representation; Rim Walha et al.; ACM; 2012; full text * |
| Infrared vehicle detection based on visual saliency and target confidence; Nannan Qi et al.; CNKI; Jun. 2017; vol. 46, no. 6; full text * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN107992874A (en) | 2018-05-04 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |