CN102004786A

CN102004786A - Acceleration method in image retrieval system

Info

Publication number: CN102004786A
Application number: CN 201010573237
Authority: CN
Inventors: 冯德瀛; 杨杰; 杨程; 刘从新
Original assignee: Shanghai Jiao Tong University
Current assignee: Shanghai Jiao Tong University
Priority date: 2010-12-02
Filing date: 2010-12-02
Publication date: 2011-04-06
Anticipated expiration: 2030-12-02
Also published as: CN102004786B

Abstract

An acceleration method for an image retrieval system, in the field of computer information processing. Feature descriptors are extracted from the standard images and from the image to be retrieved, and a visual codebook is created; random kd-trees are then built from the set of seed points and used to classify the feature descriptors; the images are vectorized and the inverted index is optimized; finally, the vector of the image to be retrieved is searched for similar vectors in the optimized inverted index, which accelerates the image retrieval system. The invention addresses the heavy computation and long running time of the clustering step in the prior art, and the optimized inverted index improves the real-time performance of similarity search while maintaining retrieval accuracy.

Description

Acceleration Method in Image Retrieval System

Technical Field

The invention relates to a method in the field of computer information processing, specifically an acceleration method for an image retrieval system.

Background Art

With the widespread adoption of the Internet and digital capture devices, image data is now widely used in everyday life. A growing share of commercial activity, transactions, and information presentation involves large amounts of image data. How to organize and search such data effectively in large-scale image databases has become a topic of considerable interest.

Image retrieval means searching a standard image library, according to the content of a query image or to specified query criteria, and returning the images that satisfy the query. Image retrieval is generally divided into text-based and content-based techniques. Text-based image retrieval is currently the more widely deployed. It reuses traditional text retrieval and avoids analyzing low-level image features: images are described by name, size, compression type, author, date, and similar attributes, and are found either by keyword query or by browsing a hierarchical directory. Content-based image retrieval, given a query image, describes images by global features such as color, shape, and texture and by local invariant features, vectorizes those features, and runs a similarity search in the standard image library to find images with similar content.

Early content-based image retrieval mostly relied on global features such as color, texture, and shape for similarity search. Because these features are not robust to illumination changes, occlusion, or geometric deformation, they were gradually replaced by local invariant feature detectors such as DOG, MSER, and Harris. Current content-based image retrieval typically detects image features, builds feature descriptors, clusters the descriptors to create a visual codebook, vectorizes each image, and finally searches a high-dimensional index structure for similar image vectors and returns the results.

A search of the prior-art literature found the following techniques related to acceleration in image retrieval systems. In the patent "Object Retrieval" (US 2005/0225678A1, published December 13, 2005), Andrew Zisserman et al. describe a method in which the user defines a target object in an image and searches for it. That method classifies feature descriptors with K-Means clustering and uses a traditional inverted index for the similarity query between the vector of the image to be retrieved and the standard image vectors. When K-Means is applied to all feature descriptors of a large-scale image library, the library contains a massive number of descriptors, the number of cluster centers is large, and clustering needs many iterations, so the clustering step is slow and computationally expensive. When a traditional inverted index is used for similarity queries over the standard image vectors, their high dimensionality (more than 100,000 dimensions) likewise leads to poor real-time query performance.

Further searching found that David Nister et al., in the patent "Scalable Object Recognition Using Hierarchical Quantization with a Vocabulary Tree" (US7725484B2, published May 25, 2010), describe a vocabulary tree (codebook tree) that adds a hierarchy on top of K-Means clustering. Compared with traditional K-Means, clustering time is reduced, but because the standard image library contains a massive number of descriptors, the clustering step is still computationally heavy and too slow. In addition, because of the hierarchical scheme, descriptors that belong to the same category are often assigned to different categories, which degrades quantization quality. The similarity query between the vector of the image to be retrieved and the standard image vectors again uses a traditional inverted index. Since the dimensionality of the image vectors is not reduced and quantization quality is poor, retrieval accuracy is low and real-time performance is poor.

Summary of the Invention

To address the above deficiencies of the prior art, the invention provides an acceleration method for an image retrieval system. It works by creating the visual codebook through random sampling and by building an optimized inverted index from the standard image vectors. This avoids the heavy computation and long running time of the clustering step in the prior art, and the optimized inverted index improves the real-time performance of similarity search while maintaining retrieval accuracy.

The invention is realized by the following technical solution. Feature descriptors are extracted from the standard images and from the image to be retrieved, and a visual codebook is generated; random kd-trees are then created from the set of seed points and used to classify the feature descriptors; the inverted index is then optimized through vectorization; finally, the vector of the image to be retrieved is searched for similar vectors in the optimized inverted index, which accelerates the image retrieval system.

Extracting feature descriptors from the standard images and the image to be retrieved means: feature points are first detected in the standard images and in the image to be retrieved with the Difference of Gaussian (DOG) operator, and each detected feature point is then described with a Scale-Invariant Feature Transform (SIFT) descriptor.

Description by the scale-invariant descriptor comprises two steps, offline processing and real-time processing:

In offline processing, each image $I_i$ ($i = 1, 2, \ldots, N$) in the standard image library $C = (I_1, I_2, \ldots, I_N)$ is represented by its SIFT descriptors as $X_i = (x_i^1, x_i^2, \ldots, x_i^{n_i})$, where $x_i^k$ ($k = 1, 2, \ldots, n_i$) is a single 128-dimensional descriptor in image $I_i$ and $n_i$ is the number of SIFT descriptors in $I_i$. The set of all SIFT descriptors in the standard image library is $S = (X_1, X_2, \ldots, X_N)$, and the total number of SIFT descriptors in $S$ is $n = \sum_{i=1}^{N} n_i$.

In real-time processing, the image to be retrieved $Q$ is represented by its SIFT descriptors as $T = (q^1, q^2, \ldots, q^m)$, where $q^k$ ($k = 1, 2, \ldots, m$) is a single 128-dimensional descriptor in image $Q$ and $m$ is the number of SIFT descriptors in $Q$.

Creating the visual codebook means: the feature descriptors in the standard image library are randomly sampled to form the codebook. Specifically, the SIFT descriptor set $S$ is randomly sampled and a subset of the descriptors is taken as the seed point set $D = (y_1, y_2, \ldots, y_z)$, where $z$ is the number of seed points in $D$ and each seed point is $y_j$ ($j = 1, 2, \ldots, z$). The SIFT descriptor set $S$ is then classified: each seed point $y_j$ defines a category, and the SIFT descriptors similar to $y_j$ are assigned to that category. The number of seed points $z$ is therefore the number of categories, and the seed point set $D$ is the visual codebook needed to quantize the standard images $I_i$.
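The sampling step can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function names `sample_codebook` and `assign_to_seeds`, the `fraction` parameter, and the use of NumPy are assumptions, and the brute-force nearest-seed assignment merely stands in for the kd-tree search that the method actually uses.

```python
import numpy as np

def sample_codebook(descriptors, fraction, seed=0):
    """Pick a random subset of descriptor rows as seed points (the codebook D)."""
    rng = np.random.default_rng(seed)
    n = descriptors.shape[0]
    z = max(1, int(round(fraction * n)))          # number of seed points
    idx = rng.choice(n, size=z, replace=False)    # sample without replacement
    return descriptors[idx]

def assign_to_seeds(descriptors, codebook):
    """Label each descriptor with the index of its nearest seed point.

    Brute force, for illustration only (assumption); the method itself
    performs this assignment with random kd-trees.
    """
    diff = descriptors[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)

if __name__ == "__main__":
    # Random 128-dimensional rows stand in for real SIFT descriptors.
    S = np.random.default_rng(1).random((200, 128))
    D = sample_codebook(S, fraction=0.2)
    labels = assign_to_seeds(S, D)
    print(D.shape, labels.shape)
```

No iteration over the data is needed to build the codebook, which is where the time saving over K-Means-style clustering comes from.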

Creating a random kd-tree means: nodes are created by a top-down iterative process. At each iteration, the splitting dimension of a node is chosen at random from among the several dimensions with the largest variance, and the split threshold of the node is chosen at random from among the elements close to the median in that dimension.

Classifying the feature descriptors means: the node thresholds partition the seed points $y_j$ of set $D$ into different regions of space. Specifically, each SIFT descriptor is searched in multiple random kd-trees using a single best-first query; the most similar seed points found so far are kept in a single best-first queue shared by all trees, and the search stops once the number of query paths reaches a set limit. The category of the seed point that is found is the category to which the SIFT descriptor is assigned.
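A compact sketch of the tree construction and the multi-tree search may help make these two steps concrete. It is written under stated assumptions: `top_dims=5` and `window=0.1` are illustrative values for "several dimensions with the largest variance" and "close to the median", the dictionary node layout is invented for this example, and `max_checks` counts visited leaves as a stand-in for the limit on query paths.

```python
import heapq
import numpy as np

def choose_split(points, rng, top_dims=5, window=0.1):
    """Random split: one of the highest-variance dimensions, threshold near the median."""
    k = min(top_dims, points.shape[1])
    candidates = np.argsort(points.var(axis=0))[-k:]   # k largest-variance dims
    dim = int(rng.choice(candidates))
    values = np.sort(points[:, dim])
    mid, half = len(values) // 2, max(1, int(window * len(values)))
    threshold = float(rng.choice(values[max(0, mid - half):mid + half + 1]))
    return dim, threshold

def build_tree(points, indices, rng):
    """Split `indices` top-down until a node holds one point or cannot be split."""
    if len(indices) <= 1:
        return {"leaf": indices}
    dim, thr = choose_split(points[indices], rng)
    mask = points[indices, dim] < thr
    if mask.all() or not mask.any():
        return {"leaf": indices}
    return {"dim": dim, "thr": thr,
            "left": build_tree(points, indices[mask], rng),
            "right": build_tree(points, indices[~mask], rng)}

def search_forest(trees, points, query, max_checks=100):
    """Best-first search over all trees with one shared priority queue."""
    heap, counter = [], 0
    best_idx, best_d2, checks = -1, np.inf, 0

    def descend(node):
        nonlocal counter, best_idx, best_d2, checks
        while "leaf" not in node:
            diff = query[node["dim"]] - node["thr"]
            near, far = ((node["left"], node["right"]) if diff < 0
                         else (node["right"], node["left"]))
            # diff**2 is a lower bound on the distance to anything in `far`.
            heapq.heappush(heap, (diff * diff, counter, far))
            counter += 1
            node = near
        for i in node["leaf"]:
            d2 = float(((points[i] - query) ** 2).sum())
            if d2 < best_d2:
                best_idx, best_d2 = int(i), d2
        checks += 1                       # one more query path explored

    for tree in trees:
        descend(tree)
    while heap and checks < max_checks:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_d2:
            break                         # no unexplored branch can do better
        descend(node)
    return best_idx, best_d2
```

With the seed points as `points`, calling `build_tree(points, np.arange(len(points)), rng)` several times gives the forest, and `search_forest(trees, points, descriptor)` returns the index of the seed point whose category the descriptor joins. A small `max_checks` trades accuracy for speed.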

Vectorization means: the standard images and the image to be retrieved are each vectorized using the seed frequency-inverse image frequency method (the analogue of term frequency-inverse document frequency, tf-idf), and the position index sum of the non-zero elements is then computed for each standard image vector and for the vector of the image to be retrieved.

Image vectorization comprises offline processing and real-time processing:

The offline processing steps for image vectorization are:

1) Count the number of occurrences $n_{ij}$ of seed point $y_j$ in standard image $I_i$ and the total number of SIFT descriptors $n_i$; the seed frequency in standard image $I_i$ is then $f_{ij} = n_{ij} / n_i$. Also count the number $M_j$ of standard images in the library $C$ that contain seed point $y_j$.

2) Apply the stop-word method commonly used in text retrieval to $M_j$, with decision threshold $T$: when $M_j > T$, delete the corresponding seed point $y_j$; when $M_j \le T$, keep it and relabel $M_j$ as $M_r$. After all $M_j$ have been tested, the number of seed points is reduced from $z$ to $z'$, the inverse image frequency takes the usual tf-idf form $idf_r = \log(N / M_r)$, and the seed frequency changes from $f_{ij}$ to $f_{ir} = n_{ir} / n_i$, with $r = 1, 2, \ldots, z'$.

3) Let $V_i$ be the image vector of standard image $I_i$. Then $V_i = (c_1, c_2, \ldots, c_{z'})$, where $c_r = f_{ir} \times idf_r$. This completes the vectorization of the standard images in offline processing.

The real-time processing steps for image vectorization are:

a) Count the number of occurrences $m_r$ of seed point $y_r$ in the image to be retrieved $Q$ and the number $m$ of SIFT descriptors; the seed frequency in $Q$ is then $f_{qr} = m_r / m$.

b) For the inverse image frequency $idf_{qr}$ of the image to be retrieved $Q$, the value computed offline is reused, that is, $idf_{qr} = idf_r$. The vector of the image to be retrieved is $V_q = (d_1, d_2, \ldots, d_{z'})$, where $d_r = f_{qr} \times idf_{qr}$. This completes the vectorization of the image to be retrieved in real-time processing.
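The offline and real-time vectorization steps can be sketched together as follows. This is an illustration under assumptions: the inverse image frequency is taken in the usual tf-idf form log(N / M), the stop-word threshold is expressed as a ratio of the largest image count (0.6 is the value the embodiment uses), seed points that occur in no image are dropped to avoid division by zero, and the names `fit_idf` and `vectorize` are hypothetical.

```python
import numpy as np

def fit_idf(counts, stop_ratio=0.6):
    """Offline step: choose the surviving seed points and their idf weights.

    counts: (N images, z seed points) matrix of occurrence counts n_ij.
    Returns a boolean mask over the z seed points and the idf of the kept ones.
    """
    N = counts.shape[0]
    M = (counts > 0).sum(axis=0)          # M_j: images containing seed point j
    T = stop_ratio * M.max()              # stop-word threshold
    keep = (M > 0) & (M <= T)             # M > 0 is an added guard (assumption)
    idf = np.log(N / M[keep])             # assumed standard form log(N / M_r)
    return keep, idf

def vectorize(count_row, n_desc, keep, idf):
    """tf-idf vector of one image; used for library images and for the query.

    count_row: occurrences of each of the z seed points in the image.
    n_desc: total number of SIFT descriptors in the image (n_i or m).
    """
    tf = count_row[keep] / n_desc
    return tf * idf

if __name__ == "__main__":
    counts = np.array([[2, 0, 1],
                       [1, 1, 0],
                       [3, 0, 0]])
    keep, idf = fit_idf(counts)
    print(keep, vectorize(counts[0], counts[0].sum(), keep, idf))
```

The query image is vectorized with the same `keep` mask and `idf` weights that were fitted offline, matching step b).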

Computing the position index sum of the non-zero elements of the standard image vectors and of the vector of the image to be retrieved means the following. In offline processing, the vector $V_i$ is binarized to $V_i' = (p_1, p_2, \ldots, p_{z'})$, where $p_r = 1$ if $c_r > 0$ and $p_r = 0$ otherwise, so the position index sum of the non-zero elements of the standard image vector $V_i$ is $s_i = \sum_{r=1}^{z'} r \cdot p_r$.

In real-time processing, the vector $V_q$ is binarized to $V_q' = (w_1, w_2, \ldots, w_{z'})$, where $w_r = 1$ if $d_r > 0$ and $w_r = 0$ otherwise, so the position index sum of the non-zero elements of the vector of the image to be retrieved $V_q$ is $s_q = \sum_{r=1}^{z'} r \cdot w_r$.
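As a small worked sketch of this quantity, assuming positions are counted from 1 so that the sum is simply the total of the indices at which the vector is non-zero (the function name is hypothetical):

```python
import numpy as np

def position_index_sum(vec):
    """Sum of the 1-based positions of the non-zero (positive) elements of vec."""
    binary = (np.asarray(vec) > 0).astype(np.int64)   # binarized vector
    positions = np.arange(1, binary.size + 1)         # 1-based positions (assumption)
    return int((positions * binary).sum())

if __name__ == "__main__":
    # Non-zero entries sit at positions 2 and 4.
    print(position_index_sum([0.0, 0.3, 0.0, 0.1]))
```

Each image is thereby summarized by a single integer that can be compared cheaply during the index scan.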

Creating the optimized inverted index means: in offline processing, the seed points $y_r$ serve as the index keys and the standard image vectors $V_i$ as the indexed items, so that each seed point $y_r$ has a corresponding inverted index list $L_r$. For element $u$ of a standard image vector $V_i$, when $c_u > 0$, the name $I_i$ of the image and its position index sum $s_i$ are recorded in list $L_u$, written $L_u = \{y_u \mid (I_i, s_i)\}$. The standard image vectors $V_i$ are processed one after another in this way, each being recorded in the index lists $L_r$ that correspond to the positions of its non-zero elements, which yields the inverted index $L = \{L_1, L_2, \ldots, L_{z'}\}$. The index lists $L_r$, and the standard images $I_i$ within each list, are then sorted. The lists differ in how many standard images they record, and the standard image vectors differ in their position index sums. The index lists $L_r$ are first sorted by the number of standard images they record, from largest to smallest; within each list $L_r$, the standard images $I_i$ are then sorted by position index sum $s_i$, from largest to smallest. The result of these two sorts is the optimized inverted index $L'$, which is used for similarity search in real-time processing.
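The construction and the two sorts can be sketched with plain dictionaries and lists. The data layout is an assumption made for illustration: image vectors are passed as a mapping from image name to a list of weights, seed points are identified by their 1-based position, and the function name `build_optimized_index` is hypothetical.

```python
def build_optimized_index(vectors):
    """Build the sorted inverted index from {image_name: weight list}.

    Returns {seed position r: [(image_name, s_i), ...]}. The lists appear in
    order of decreasing length, and the entries of each list in order of
    decreasing position index sum s_i.
    """
    lists = {}
    for name, vec in vectors.items():
        nonzero = [r for r, c in enumerate(vec, start=1) if c > 0]
        s_i = sum(nonzero)                       # position index sum of this image
        for r in nonzero:
            lists.setdefault(r, []).append((name, s_i))
    for postings in lists.values():
        postings.sort(key=lambda entry: entry[1], reverse=True)
    ordered = sorted(lists.items(), key=lambda kv: len(kv[1]), reverse=True)
    return dict(ordered)                         # dicts keep insertion order

if __name__ == "__main__":
    vectors = {"a": [0.5, 0.0, 0.2], "b": [0.1, 0.3, 0.0], "c": [0.4, 0.0, 0.0]}
    for r, postings in build_optimized_index(vectors).items():
        print(r, postings)
```

Sorting each list by $s_i$ is what later allows a scan to stop early once the threshold comparison fails.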

The similarity search comprises the following steps:

i) Query first the index lists that contain the larger numbers of standard images. Within each list, use the position index sum $s_q$ of the non-zero elements of the image to be retrieved as a threshold and compare it with the position index sum $s_i$ of each standard image in the list. A standard image whose position index sum is smaller than the threshold $s_q$ is excluded, together with the subsequent standard images in the list, whose position index sums are smaller still.

ii) During similarity search in the optimized inverted index $L'$, an accumulator $A$ records the number of times $a_i$ that each standard image $I_i$ is encountered. Every standard image has its own accumulator $a_i$, so $A = (a_1, a_2, \ldots, a_N)$. Each time standard image $I_i$ is found in an inverted index list, its accumulator is incremented by 1, that is, $a_i = a_i + 1$. Finally, the accumulators $A$ are sorted; the standard images with the largest accumulator values are the candidate results for the vector of the image to be retrieved $V_q$. This completes the search of the optimized inverted index.

iii) Measure the similarity between the vector of the image to be retrieved $V_q$ and each candidate standard image vector $V_i$ using the cosine of the angle between the two vectors, $\cos(V_q, V_i) = \dfrac{V_q \cdot V_i}{\|V_q\| \, \|V_i\|}$, where $V_q \cdot V_i = \sum_{r=1}^{z'} d_r c_r$, $\|V_q\| = \sqrt{\sum_{r=1}^{z'} d_r^2}$ and $\|V_i\| = \sqrt{\sum_{r=1}^{z'} c_r^2}$. After the cosine values $\cos(V_q, V_i)$ have been computed, they are sorted from largest to smallest; the standard image $I_i$ with the largest cosine value is the final result for the image to be retrieved $Q$.
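The three search steps can be sketched as one function. Several points here are assumptions rather than the patent's implementation: the index layout is `{seed position: [(image name, s_i), ...]}` with each list sorted by decreasing `s_i`; the exclusion rule follows step i) as stated above (an entry with `s_i` below `s_q` ends the scan of that list), whereas the embodiment phrases the comparison differently, so the direction of the test should be treated as an assumption; `top_k=5` is the candidate count used in the embodiment; all function names are hypothetical.

```python
import math

def position_index_sum(vec):
    """Sum of 1-based positions of the positive elements of vec."""
    return sum(r for r, value in enumerate(vec, start=1) if value > 0)

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def search(query_vec, index, vectors, top_k=5):
    """Vote over the inverted lists, then re-rank the top candidates by cosine."""
    s_q = position_index_sum(query_vec)
    votes = {}                                    # accumulator a_i per image
    active = [r for r, d in enumerate(query_vec, start=1) if d > 0 and r in index]
    active.sort(key=lambda r: len(index[r]), reverse=True)   # longest lists first
    for r in active:
        for name, s_i in index[r]:
            if s_i < s_q:
                break     # list is sorted by s_i, so the rest is smaller too
            votes[name] = votes.get(name, 0) + 1
    candidates = sorted(votes, key=votes.get, reverse=True)[:top_k]
    return sorted(candidates,
                  key=lambda name: cosine(query_vec, vectors[name]),
                  reverse=True)

if __name__ == "__main__":
    vectors = {"a": [0.5, 0.0, 0.2], "b": [0.1, 0.3, 0.0], "c": [0.4, 0.0, 0.0]}
    index = {1: [("a", 4), ("b", 3), ("c", 1)], 3: [("a", 4)], 2: [("b", 3)]}
    print(search([0.4, 0.0, 0.0], index, vectors))
```

The full cosine computation is only performed for the few candidates that survive voting, which keeps the expensive step independent of the library size.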

The beneficial effects of the invention are as follows. Compared with creating a visual codebook by traditional K-Means clustering, the random visual codebook of the invention only requires random sampling from the SIFT descriptor set; no repeated iterations are needed, so the computation is light and fast. Compared with a traditional inverted index, the optimized inverted index of the invention can quickly exclude irrelevant standard images based on the position index sum of the non-zero elements of the vector of the image to be retrieved, which speeds up similarity search in large-scale image libraries. Compared with the prior art, the invention reduces the amount of computation while improving the real-time performance of retrieval.

Brief Description of the Drawings

Figure 1 is a flowchart of the method.

Figure 2 shows the overall retrieval time in real-time processing and the time taken by the individual steps.

Figure 3 compares the query time of the traditional inverted index with that of the optimized inverted index.

Detailed Description of the Embodiments

An embodiment of the invention is described in detail below. The embodiment is implemented on the basis of the technical solution of the invention, and a detailed implementation and a specific operating procedure are given, but the scope of protection of the invention is not limited to the following embodiment.

As shown in Figure 1, this embodiment applies the acceleration method for an image retrieval system to the retrieval of images taken with a mobile phone. The implementation steps are as follows:

1. Extract feature descriptors from the standard images and from the image to be retrieved.

In offline processing, SIFT descriptors are extracted from the images in the standard image library $C = (I_1, I_2, \ldots, I_N)$. With $n_i$ SIFT descriptors in image $I_i$, the total number of SIFT descriptors in the standard image library is $n = \sum_{i=1}^{N} n_i$.

In real-time processing, SIFT descriptors are extracted from the image to be retrieved $Q$; the number of SIFT descriptors in $Q$ is $m$.

2. Randomly sample the feature descriptors in the standard image library to create the visual codebook.

In offline processing, the $n$ SIFT descriptors of the standard image library are randomly sampled, and $z$ of them are taken as seed points to create the visual codebook, where $z = 20\% \times n$.

3. Create random kd-trees from the set of seed points and classify the feature descriptors of the standard images and of the image to be retrieved.

In offline processing, 8 independent random kd-trees are created from the $z$ seed points. Each SIFT descriptor of the standard images is searched for its approximate nearest neighbor across the 8 random kd-trees, with the maximum number of query paths set to 100. The SIFT descriptor is then assigned to the category of the seed point found, and the seed point category of every SIFT descriptor is recorded.

In real-time processing, the SIFT descriptors of the image to be retrieved $Q$ are searched for their approximate nearest neighbors in the 8 random kd-trees created offline, with the maximum number of query paths again set to 100. Each SIFT descriptor is then assigned to the category of the seed point found, and the seed point category of every SIFT descriptor is recorded.

4. Vectorize the standard images and the image to be retrieved using the seed frequency-inverse image frequency method.

In offline processing, the stop-word method is applied to the number $M_j$ of standard images in the library $C$ that contain seed point $y_j$, with stop-word threshold $T = 0.6 \times \max(M_j)$.

In real-time processing, only the seed points that survived the stop-word filtering in offline processing are considered, and the inverse image frequency computed offline is reused.

5. Compute the position index sum of the non-zero elements of the standard image vectors and of the vector of the image to be retrieved.

In offline processing, the standard image vector $V_i = (c_1, c_2, \ldots, c_{z'})$ is binarized to the vector $V_i' = (p_1, p_2, \ldots, p_{z'})$, where $p_r = 1$ if $c_r > 0$ and $p_r = 0$ otherwise. The position index sum of the non-zero elements of $V_i$ is then $s_i = \sum_{r=1}^{z'} r \cdot p_r$.

In real-time processing, the vector of the image to be retrieved $V_q = (d_1, d_2, \ldots, d_{z'})$ is binarized to the vector $V_q' = (w_1, w_2, \ldots, w_{z'})$, where $w_r = 1$ if $d_r > 0$ and $w_r = 0$ otherwise. The position index sum of the non-zero elements of $V_q$ is then $s_q = \sum_{r=1}^{z'} r \cdot w_r$.

6. Create the optimized inverted index.

In offline processing, the seed points $y_r$ serve as the index keys and the standard image vectors $V_i$ as the indexed items, and the non-zero elements of each standard image vector $V_i$ are tallied. When element $u$ of vector $V_i$ is non-zero, the standard image name $I_i$ and its position index sum $s_i$ are recorded in the index list of seed point $y_u$. After all non-zero elements of the standard image vectors have been tallied, the index lists are sorted by the number of standard images they record, from largest to smallest, and the standard images within each list are sorted by position index sum $s_i$, from largest to smallest, giving the optimized inverted index.

7. Search the optimized inverted index for vectors similar to the vector of the image to be retrieved, and measure similarity by cosine value.

In real-time processing, among all index lists that correspond to non-zero elements of the vector of the image to be retrieved $V_q$, the list containing the most standard images is queried first. Within that list, the position index sum $s_q$ of $V_q$ is compared with the position index sum $s_i$ of each standard image vector $V_i$: when $s_q \ge s_i$, the accumulator of standard image $I_i$ is incremented, $a_i = a_i + 1$; when $s_q < s_i$, the standard image $I_i$ is excluded, together with the subsequent standard images with smaller position index sums. After all non-zero elements of $V_q$ have been queried in turn, the accumulators $A$ of the standard images are sorted and the 5 largest values are taken. The corresponding standard image vectors $V_i$ are compared with $V_q$ by cosine value, the 5 cosine values are sorted from largest to smallest, and the standard image with the largest cosine value is the query result for the image to be retrieved.

The method was evaluated in the following simulation experiment: with a library of 7,655 standard images, retrieval was tested on 284 images to be retrieved. Figure 2 shows the overall retrieval time for the 284 images in real-time processing and the time taken by the individual steps. In Figure 2, the curve marked with crosses is the query time in the optimized inverted index plus the time for the similarity measurement between the vector of the image to be retrieved and the standard image vectors; it is relatively stable, with an average of 0.0055 s. The curve marked with diamonds is the sum of the SIFT descriptor assignment time, the image vectorization time, and the time to compute the position index sum of the non-zero elements; it is longer than the cross curve but varies little, with an average of 0.2047 s. The curve marked with squares is the SIFT descriptor extraction time for the image to be retrieved; it varies widely, mainly with the size of the image, and averages 0.2686 s. The black curve is the overall retrieval time, which averages 0.4788 s and meets the real-time requirement.
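As a quick consistency check on the reported averages, the three per-step times add up to the reported overall time:

```latex
0.0055\,\mathrm{s} + 0.2047\,\mathrm{s} + 0.2686\,\mathrm{s} = 0.4788\,\mathrm{s}
```

Descriptor extraction and descriptor assignment together account for almost all of the overall time, while the index query and similarity measurement contribute only about 1%.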

For codebook creation time, the method of the invention was compared with the AKM (Approximate K-Means) algorithm and the HKM (Hierarchical K-Means) algorithm. From the 7,655 standard images, 1,999,620 SIFT descriptors were extracted. The number of cluster centers for AKM and HKM was set to 19962 with 40 iterations, and the number of seed points of the random visual codebook was likewise 19962. Table 1 gives the time each of the three algorithms took to create the visual codebook. As Table 1 shows, the random visual codebook is created far faster than the AKM and HKM codebooks.

For inverted index query time, the method of the invention was compared with the traditional inverted index, using the same 284 images to be retrieved. Figure 3 shows the query times of the traditional and the optimized inverted index. In Figure 3, the curve marked with diamonds is the query time of the traditional inverted index, which averages 0.0205 s. The curve marked with squares is the query time of the optimized inverted index; compared with the diamond curve it is shorter and fluctuates less, with an average of 0.0028 s. The optimized inverted index therefore speeds up queries for the image to be retrieved in the standard image library.
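Expressed as a ratio of the two reported average query times, the speed-up works out to roughly a factor of seven:

```latex
\frac{0.0205\,\mathrm{s}}{0.0028\,\mathrm{s}} \approx 7.3
```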

All of the above algorithms were run in Matlab 7.6.

          AKM      HKM      Random visual codebook
Time      2.5 h    2 h      183 s

Table 1. Time to create the visual codebook: random visual codebook compared with AKM and HKM
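To put the Table 1 figures on a common scale, the following sketch converts the reported times to seconds and computes the speedup of the random visual codebook over each clustering method. The variable names are illustrative only.

```python
# Codebook creation times from Table 1, converted to seconds.
akm_seconds = 2.5 * 3600
hkm_seconds = 2 * 3600
random_codebook_seconds = 183

akm_speedup = akm_seconds / random_codebook_seconds
hkm_speedup = hkm_seconds / random_codebook_seconds
print(f"speedup over AKM: {akm_speedup:.1f}x")
print(f"speedup over HKM: {hkm_speedup:.1f}x")
```

On these figures the random visual codebook is built roughly 49 times faster than the AKM codebook and roughly 39 times faster than the HKM codebook.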

Claims

1. An acceleration method in an image retrieval system, characterized in that feature points are detected in the standard images and in the image to be retrieved using the difference-of-Gaussians operator; each detected feature point is then described by a scale-invariant descriptor and a visual codebook is created; random kd-trees are then created from the seed point set and the feature descriptors are classified; vectorization is then performed and the inverted index is optimized; finally, a similarity search for the vector of the image to be retrieved is carried out in the optimized inverted index, thereby accelerating the image retrieval system.

2. The acceleration method in an image retrieval system according to claim 1, characterized in that the description by scale-invariant descriptors comprises two steps, offline processing and real-time processing, wherein:

In offline processing, each image $I_i$ in the standard image library $C = (I_1, I_2, \ldots, I_N)$ is represented by its SIFT descriptors as $X_i = (x_i^1, x_i^2, \ldots, x_i^{n_i})$, wherein $x_i^k$ $(k = 1, 2, \ldots, n_i)$ is a single descriptor in image $I_i$ with a dimension of 128, and $n_i$ is the number of SIFT descriptors in image $I_i$. The set of all SIFT descriptors in the standard image library is $S = (X_1, X_2, \ldots, X_N)$, and the total number of SIFT descriptors in set $S$ is $\sum_{i=1}^{N} n_i$.

In real-time processing, the image to be retrieved $Q$ is represented by its SIFT descriptors as $T = (q^1, q^2, \ldots, q^m)$, wherein $q^k$ $(k = 1, 2, \ldots, m)$ is a single descriptor in image $Q$ with a dimension of 128, and $m$ is the number of SIFT descriptors in image $Q$.

3. The acceleration method in an image retrieval system according to claim 1, characterized in that creating the visual codebook means randomly sampling the feature descriptors in the standard image library to create the visual codebook, with the following steps: the SIFT descriptor set $S$ is randomly sampled and part of the SIFT descriptors are extracted as the seed point set $D = (y_1, y_2, \ldots, y_z)$, wherein $z$ is the number of seed points in set $D$ and each seed point is $y_j$ $(j = 1, 2, \ldots, z)$; the SIFT descriptor set $S$ is then classified, each SIFT descriptor being assigned to the class of the seed point $y_j$ to which it is most similar; the number of seed points $z$ is the number of classes, and the seed point set $D$ is the visual codebook needed to quantize the standard images $I_i$.
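The codebook creation of claim 3 can be illustrated with a minimal sketch, under stated assumptions: descriptors are plain lists of floats, similarity is taken to be squared Euclidean distance, and the nearest seed point is found by brute force rather than with the kd-trees of the later claims. The function names `sample_codebook` and `nearest_seed` are hypothetical.

```python
import random


def sample_codebook(descriptors, z, seed=0):
    """Draw z descriptors at random, without replacement, as seed points."""
    rng = random.Random(seed)
    return rng.sample(descriptors, z)


def nearest_seed(descriptor, codebook):
    """Index of the seed point closest to the descriptor (assumed metric:
    squared Euclidean distance, brute-force scan)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)),
               key=lambda j: squared_distance(descriptor, codebook[j]))


# Toy data: 200 four-dimensional "descriptors" instead of 128-dimensional SIFT.
rng = random.Random(1)
S = [[rng.random() for _ in range(4)] for _ in range(200)]
D = sample_codebook(S, z=10)                 # seed point set = visual codebook
labels = [nearest_seed(x, D) for x in S]     # class index of every descriptor
```

No iterative clustering is involved, which is why this step is so much cheaper than k-means style codebook training.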

4. The acceleration method in an image retrieval system according to claim 1, characterized in that creating the random kd-trees means: in a top-down recursive process, each node is created by randomly selecting its split dimension from among several dimensions with large variance, and by randomly selecting its split threshold from among the elements close to the median in that dimension.

5. The acceleration method in an image retrieval system according to claim 1, characterized in that classifying the feature descriptors means: the seed points $y_j$ in set $D$ are partitioned into different regions of space according to the node thresholds, with the following steps: a single priority queue is used to search the multiple random kd-trees for the seed point $y_j$ corresponding to a SIFT descriptor; the most similar seed points found are kept in the single priority queue, and the search stops when the number of query paths reaches a set limit; the class of the seed point found is the class to which the SIFT descriptor is assigned.
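Claims 4 and 5 together describe a forest of randomized kd-trees searched through one shared priority queue. The sketch below is one possible reading, not the patented implementation: the number of candidate dimensions (`top_dims`), the width of the window around the median, the stopping rule (`max_checks` distinct seed points examined) and the squared Euclidean metric are all assumptions, and the names `build_tree` and `nearest_seed` are hypothetical.

```python
import heapq
import random
import statistics


def build_tree(points, ids, rng, top_dims=3):
    """Recursively build one randomized kd-tree over the points listed in ids."""
    if len(ids) == 1:
        return {"leaf": ids[0]}
    n_dims = len(points[0])
    variance = [statistics.pvariance([points[i][d] for i in ids]) for d in range(n_dims)]
    # Split dimension: random choice among the top_dims highest-variance dimensions.
    candidates = sorted(range(n_dims), key=lambda d: variance[d], reverse=True)[:top_dims]
    dim = rng.choice(candidates)
    ordered = sorted(ids, key=lambda i: points[i][dim])
    # Split position: random choice among the elements next to the median.
    mid = len(ordered) // 2
    cut = rng.randint(max(1, mid - 1), min(len(ordered) - 1, mid + 1))
    return {
        "dim": dim,
        "thr": points[ordered[cut]][dim],
        "left": build_tree(points, ordered[:cut], rng, top_dims),   # values <= thr
        "right": build_tree(points, ordered[cut:], rng, top_dims),  # values >= thr
    }


def nearest_seed(trees, points, query, max_checks=16):
    """Approximate nearest seed point using one priority queue shared by all trees."""
    heap = [(0.0, n, tree) for n, tree in enumerate(trees)]  # (bound, tie-breaker, node)
    ticket = len(heap)
    best_id, best_dist, seen = None, float("inf"), set()
    while heap and len(seen) < max_checks:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_dist:
            continue  # this branch cannot contain a closer seed point
        while "leaf" not in node:
            diff = query[node["dim"]] - node["thr"]
            near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
            # Queue the branch not taken, keyed by a lower bound on its distance.
            heapq.heappush(heap, (max(bound, diff * diff), ticket, far))
            ticket += 1
            node = near
        i = node["leaf"]
        if i not in seen:
            seen.add(i)
            dist = sum((a - b) ** 2 for a, b in zip(query, points[i]))
            if dist < best_dist:
                best_id, best_dist = i, dist
    return best_id


rng = random.Random(0)
seeds = [[rng.random() for _ in range(8)] for _ in range(64)]   # toy seed points
trees = [build_tree(seeds, list(range(len(seeds))), rng) for _ in range(4)]
nearest = nearest_seed(trees, seeds, seeds[10], max_checks=64)
```

A small `max_checks` trades accuracy for speed; when it is at least the number of seed points, this sketch returns the exact nearest seed point because the pruning bound is a true lower bound.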

6. The acceleration method in an image retrieval system according to claim 1, characterized in that the vectorization means: the standard images and the image to be retrieved are each vectorized using the seed frequency-inverse image frequency method, and the position index sum of the non-zero elements is then computed for the standard image vectors and for the vector of the image to be retrieved.

7. The acceleration method in an image retrieval system according to claim 1, characterized in that the image vectorization comprises offline processing and real-time processing, wherein:

The offline processing steps of image vectorization comprise:

1) The number of occurrences $n_{ij}$ of seed point $y_j$ in standard image $I_i$ and the total number of SIFT descriptors $n_i$ are counted to give the seed frequency of standard image $I_i$, $f_{ij} = n_{ij} / n_i$; the number $M_j$ of standard images in the standard image library $C$ that contain seed point $y_j$ is also counted;

2) Using the stop-word method common in text retrieval, the size of $M_j$ is tested against a decision threshold $T$: when $M_j > T$, the corresponding seed point $y_j$ is deleted; when $M_j \le T$, $M_j$ is kept and denoted $M_j = M_r$. After all $M_j$ have been tested, the number of seed points is reduced from $z$ to $z'$, and the inverse image frequency is then $idf_r = \log(N / M_r)$ $(r = 1, 2, \ldots, z')$;

The seed frequency changes from $f_{ij}$ to $f_{ir}$;

3) The image vector corresponding to standard image $I_i$ is $V_i$, and the standard image vector is expressed as $V_i = (c_1, c_2, \ldots, c_{z'})$, wherein $c_r = f_{ir} \cdot idf_r$;

This completes the vectorization of the standard images in offline processing;

The real-time processing steps of image vectorization comprise:

a) The number of occurrences $m_r$ of seed point $y_r$ in the image to be retrieved $Q$ and the number of SIFT descriptors $m$ are counted; the seed frequency in the image to be retrieved $Q$ is then $f_{Qr} = m_r / m$;

b) For the inverse image frequency $idf_{Qr}$ of the image to be retrieved $Q$, the inverse image frequency $idf_r$ from offline processing is used, namely $idf_{Qr} = idf_r$. The vector of the image to be retrieved is expressed as $V_q = (d_1, d_2, \ldots, d_{z'})$, wherein $d_r = f_{Qr} \cdot idf_{Qr}$;

This completes the vectorization of the image to be retrieved in real-time processing.
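The vectorization of claim 7 follows the tf-idf pattern from text retrieval, and can be sketched as below under several assumptions: the inverse image frequency is taken in the usual logarithmic form log(N / M_r), the descriptor count in the seed frequency is left unchanged after stop-word removal, and seed points that occur in no standard image are simply dropped. The names `vectorize` and `build_offline` are hypothetical, and each image is given as the list of seed indices assigned to its descriptors.

```python
import math
from collections import Counter


def vectorize(labels, kept, idf):
    """Image vector: seed frequency times inverse image frequency per kept seed."""
    counts = Counter(labels)
    total = len(labels)  # number of SIFT descriptors in the image
    return [counts[r] / total * idf[r] for r in kept]


def build_offline(image_labels, stop_threshold):
    """image_labels[i] lists the seed index assigned to each descriptor of image i."""
    n_images = len(image_labels)
    images_with_seed = Counter()  # M_j: number of images containing seed j
    for labels in image_labels:
        images_with_seed.update(set(labels))
    # Stop-word step: drop seeds that occur in more than stop_threshold images.
    kept = sorted(j for j, m in images_with_seed.items() if m <= stop_threshold)
    # Assumed form of the inverse image frequency: log(N / M_r).
    idf = {r: math.log(n_images / images_with_seed[r]) for r in kept}
    vectors = [vectorize(labels, kept, idf) for labels in image_labels]
    return kept, idf, vectors


library = [[0, 0, 1], [1, 2], [1, 3, 3, 3]]          # three toy standard images
kept, idf, vectors = build_offline(library, stop_threshold=2)
query_vector = vectorize([0, 3], kept, idf)           # reuses the offline idf
```

In this toy library seed 1 occurs in all three images and is removed as a stop word, so every vector has three elements, one for each of the seeds 0, 2 and 3.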

8. The acceleration method in an image retrieval system according to claim 1, characterized in that computing the position index sum of the non-zero elements of the standard image vectors and of the vector of the image to be retrieved means: in offline processing, the vector $V_i$ is binarized; let the binarized image vector be $V'_i = (p_1, p_2, \ldots, p_{z'})$, wherein $p_r = 1$ if $c_r > 0$ and $p_r = 0$ otherwise; the position index sum $s_i$ of the non-zero elements of the standard image vector $V_i$ is then expressed as $s_i = \sum_{r=1}^{z'} r \cdot p_r$;

In real-time processing, the vector $V_q$ is binarized; let the binarized image vector be $V'_q = (w_1, w_2, \ldots, w_{z'})$, wherein $w_r = 1$ if $d_r > 0$ and $w_r = 0$ otherwise;

The position index sum $s_q$ of the non-zero elements of the vector of the image to be retrieved $V_q$ is then expressed as $s_q = \sum_{r=1}^{z'} r \cdot w_r$.
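The position index sum of claim 8 reduces to a one-line computation. The sketch below assumes that positions are counted from 1 and that an element is "non-zero" when it is strictly positive; the function name is hypothetical.

```python
def position_index_sum(vector):
    """Sum of the 1-based positions of the non-zero (positive) elements."""
    return sum(r for r, value in enumerate(vector, start=1) if value > 0)


example_sum = position_index_sum([0.0, 0.3, 0.0, 0.1])  # positions 2 and 4
```

The same function serves for both the standard image vectors (offline) and the vector of the image to be retrieved (real time), since only the pattern of non-zero elements matters.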

9. The acceleration method in an image retrieval system according to claim 1, characterized in that creating the optimized inverted index means: in offline processing, the seed points $y_r$ serve as the index and the standard image vectors $V_i$ as the index targets; each seed point $y_r$ has a corresponding inverted index list $L_r$; for element $u$ of standard image vector $V_i$, when $c_u > 0$, the name $I_i$ of the image vector $V_i$ and the position index sum $s_i$ of its non-zero elements are recorded in list $L_u$, written $L_u = \{y_u \mid (I_i, s_i)\}$; the standard image vectors $V_i$ are then processed in turn, each being recorded in the corresponding lists $L_r$ according to the positions of its non-zero elements, creating the inverted index $L = \{L_1, L_2, \ldots, L_{z'}\}$; the inverted index lists $L_r$ and the standard images $I_i$ within each list are then sorted: because the lists $L_r$ record different numbers of standard images, and the position index sums of the non-zero elements of the standard image vectors also differ, the lists $L_r$ are first sorted in descending order of the number of standard images they record, and within each list $L_r$ the standard images $I_i$ are then sorted in descending order of the position index sum $s_i$; after this sorting, the optimized inverted index $L'$ is created and is used for similarity search in real-time processing.
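The index construction of claim 9 can be sketched as follows. This is an illustration under assumptions: images are identified by their position in the input list, list keys are 1-based element positions, and the optimized index is returned as an ordered list of `(position, postings)` pairs. The function names are hypothetical.

```python
def position_index_sum(vector):
    """Sum of the 1-based positions of the non-zero (positive) elements."""
    return sum(r for r, value in enumerate(vector, start=1) if value > 0)


def build_optimized_index(vectors):
    """Return [(r, postings)], where postings is a list of (image id, s_i)."""
    sums = [position_index_sum(v) for v in vectors]
    lists = {}
    for i, vector in enumerate(vectors):
        for r, value in enumerate(vector, start=1):
            if value > 0:
                lists.setdefault(r, []).append((i, sums[i]))
    # Within each list, images with larger position index sums come first.
    for postings in lists.values():
        postings.sort(key=lambda entry: entry[1], reverse=True)
    # Lists that record more images come first.
    return sorted(lists.items(), key=lambda item: len(item[1]), reverse=True)


vectors = [[0.5, 0.0, 0.5], [0.0, 0.7, 0.3], [0.2, 0.4, 0.4]]
index = build_optimized_index(vectors)
```

Both sort orders are fixed offline, so a query can scan the longest lists first and stop reading a list as soon as the position index sums fall below its own.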

10. The acceleration method in an image retrieval system according to claim 1, characterized in that the similarity search means:

i) The index lists that contain more standard images are queried first; within such a list, the position index sum $s_q$ of the non-zero elements of the image to be retrieved serves as a threshold and is compared with the position index sums $s_i$ of the standard images in the list; a standard image whose $s_i$ is less than the threshold $s_q$, together with the subsequent standard images with still smaller position index sums, is excluded;

ii) When the similarity search is carried out in the optimized inverted index $L'$, an accumulator $A$ records the number of times $a_i$ that each standard image $I_i$ occurs; each standard image has its own accumulator $a_i$, so that $A = (a_1, a_2, \ldots, a_N)$; each time standard image $I_i$ is encountered in an inverted index list, its accumulator $a_i$ is incremented by 1, i.e. $a_i = a_i + 1$; finally, the accumulators $A$ of the standard images are sorted, and the standard images with the larger accumulator values are the candidate query results for the vector of the image to be retrieved $V_q$, which completes the optimized inverted index search;

iii) The similarity between the vector of the image to be retrieved $V_q$ and each candidate standard image vector $V_i$ is measured using the cosine between the two vectors,

$\cos(V_q, V_i) = \dfrac{V_q \cdot V_i}{\lVert V_q \rVert \, \lVert V_i \rVert}$

wherein $V_q \cdot V_i = \sum_{r=1}^{z'} d_r c_r$, $\lVert V_q \rVert = \sqrt{\sum_{r=1}^{z'} d_r^2}$ and $\lVert V_i \rVert = \sqrt{\sum_{r=1}^{z'} c_r^2}$.

After the cosine values $\cos(V_q, V_i)$ have been computed, they are sorted in descending order; the standard image $I_i$ corresponding to the largest cosine value $\cos(V_q, V_i)$ is the final query result for the image to be retrieved $Q$.
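The three search steps of claim 10 can be put together in a short sketch. It is a simplified reading, with assumptions stated here and in the comments: the optimized index is a list of `(1-based position, postings)` pairs already sorted as in claim 9, only lists whose position is non-zero in the query vector are scanned, and the number of candidates kept for re-ranking (`n_candidates`) is a hypothetical parameter.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def search(query, index, vectors, n_candidates=2):
    """Rank candidate standard images for a query vector."""
    s_q = sum(r for r, value in enumerate(query, start=1) if value > 0)
    votes = Counter()                      # accumulator a_i per standard image
    for r, postings in index:              # longest lists first
        if query[r - 1] <= 0:
            continue                       # assumption: skip seeds absent from the query
        for image_id, s_i in postings:     # sorted by descending s_i
            if s_i < s_q:
                break                      # this entry and all later ones are excluded
            votes[image_id] += 1
    candidates = [i for i, _ in votes.most_common(n_candidates)]
    # Final ranking by cosine similarity, largest first.
    return sorted(candidates, key=lambda i: cosine(query, vectors[i]), reverse=True)


# Toy library of three standard image vectors and its optimized index.
vectors = [[0.5, 0.0, 0.5], [0.0, 0.7, 0.3], [0.2, 0.4, 0.4]]
index = [(3, [(2, 6), (1, 5), (0, 4)]), (1, [(2, 6), (0, 4)]), (2, [(2, 6), (1, 5)])]
ranked = search([0.6, 0.0, 0.4], index, vectors)
```

The early `break` is what the sorted lists buy: once one entry falls below the threshold, the rest of that list is skipped without being read, and the cosine is computed only for the few images that survive the vote.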