CN102004786A - Acceleration method in image retrieval system - Google Patents

Info

Publication number
CN102004786A
Authority
CN
China
Prior art keywords
image
vector
standard picture
index
retrieved
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 201010573237
Other languages
Chinese (zh)
Other versions
CN102004786B (en)
Inventor
冯德瀛
杨杰
杨程
刘从新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiao Tong University
Original Assignee
Shanghai Jiao Tong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiao Tong University
Priority to CN2010105732370A
Publication of CN102004786A
Application granted
Publication of CN102004786B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An acceleration method for an image retrieval system, in the field of computer information processing. Feature descriptors are extracted from the standard images and from the image to be retrieved, and a visual codebook is created; randomized kd-trees are then built from the set of seed points and used to classify the feature descriptors; the images are vectorized and the inverted index is optimized; finally, the vector of the image to be retrieved is matched against the optimized inverted index by similarity search, which accelerates the image retrieval system. The invention addresses the heavy computation and long running time of the clustering step in the prior art, and the optimized inverted index improves the real-time performance of similarity search while maintaining retrieval accuracy.

Figure 201010573237

Description

Acceleration Method in an Image Retrieval System

Technical Field

The invention relates to a method in the field of computer information processing, specifically an acceleration method in an image retrieval system.

Background

With the widespread adoption of the Internet and digital capture devices, image data has come into wide use in everyday life. A growing share of commercial activity, transactions, and information presentation involves large amounts of image data. How to organize and search this data effectively and on demand in large-scale image databases has become a topic of considerable interest.

Image retrieval refers to searching a standard image library, based on the content of a query image or on specified query criteria, and returning the images that satisfy the query. It is generally divided into text-based and content-based image retrieval. Text-based image retrieval, which is currently in wide use, carries over traditional text retrieval techniques and avoids analyzing low-level image features: images are described by attributes such as name, size, compression type, author, and date, and are found through keyword queries or by browsing a hierarchical directory. Content-based image retrieval, given a query image, describes images by global features such as color, shape, and texture as well as by local invariant features; it vectorizes these features and performs a similarity search in the standard image library to find images with similar content.

Early content-based image retrieval mostly relied on global features such as color, texture, and shape for similarity search. Because these features are not robust to illumination changes, occlusion, and geometric deformation, they were gradually replaced by local invariant feature detectors such as DOG, MSER, and Harris. Current content-based systems typically detect image features, build feature descriptors, cluster the descriptors to create a visual codebook, vectorize each image, and finally run a similarity search over the image vectors in a high-dimensional index structure to return relevant results.

A search of the prior-art literature found the following techniques related to acceleration in image retrieval systems. In the patent "Object Retrieval" (US 2005/0225678 A1, published December 13, 2005), Andrew Zisserman et al. provide a method in which the user defines a target object within an image and retrieves it. That method uses K-Means clustering to classify the feature descriptors and a traditional inverted index for similarity queries between the vector of the image to be retrieved and the standard image vectors. When K-Means is applied to all feature descriptors of a large-scale image library, the massive number of descriptors, the large number of cluster centers, and the many iterations needed to complete clustering make the clustering step slow and computationally expensive. When a traditional inverted index is used for similarity queries over the standard image vectors, the high dimensionality of those vectors (more than 100,000 dimensions) likewise leads to poor real-time query performance.

A further search found that David Nister et al., in the patent "Scalable Object Recognition Using Hierarchical Quantization with a Vocabulary Tree" (US 7725484 B2, published May 25, 2010), provide a vocabulary tree that introduces a hierarchy on top of K-Means clustering. Compared with traditional K-Means, clustering time is reduced, but because of the massive number of descriptors in the standard image library the computation remains heavy and clustering still takes too long. Moreover, because of the hierarchical approach, descriptors that belong to the same category are often assigned to different categories, which degrades quantization quality. Similarity queries between the vector of the image to be retrieved and the standard image vectors again use a traditional inverted index; since the dimensionality of the image vectors is not reduced and quantization quality is poor, retrieval accuracy is low and real-time performance is poor.

Summary of the Invention

To address the above shortcomings of the prior art, the invention provides an acceleration method in an image retrieval system. It creates the visual codebook by random sampling and builds an optimized inverted index from the standard image vectors. This addresses the heavy computation and long running time of clustering in the prior art, and the optimized inverted index improves the real-time performance of similarity search while maintaining retrieval accuracy.

The invention is realized through the following technical solution. Feature descriptors are extracted from the standard images and from the image to be retrieved, and a visual codebook is generated; randomized kd-trees are built from the set of seed points and used to classify the feature descriptors; the images are vectorized and the inverted index is optimized; finally, the vector of the image to be retrieved is matched against the optimized inverted index by similarity search, which accelerates the image retrieval system.

Extracting feature descriptors from the standard images and the image to be retrieved means: feature points are first detected in each image with the Difference of Gaussians (DOG) operator, and each detected feature point is then described with a Scale-Invariant Feature Transform (SIFT) descriptor.

The SIFT description comprises two stages, offline processing and real-time processing:

In offline processing, each image I_i (i = 1, 2, …, N) in the standard image library C = (I_1, I_2, …, I_N) is represented by its SIFT descriptors as X_i = (x_{i,1}, x_{i,2}, …, x_{i,n_i}), where x_{i,k} is a single 128-dimensional descriptor in image I_i and n_i is the number of SIFT descriptors in image I_i. The set of all SIFT descriptors in the standard image library is S = (X_1, X_2, …, X_N), and the total number of SIFT descriptors in S is n = \sum_{i=1}^{N} n_i.

In real-time processing, the image to be retrieved, Q, is represented by its SIFT descriptors as T = (q_1, q_2, …, q_m), where q_k (k = 1, 2, …, m) is a single 128-dimensional descriptor in Q and m is the number of SIFT descriptors in Q.

Creating the visual codebook means randomly sampling the feature descriptors of the standard image library. Specifically, the SIFT descriptor set S is randomly sampled and a subset of the descriptors is taken as the seed point set D = (y_1, y_2, …, y_z), where z is the number of seed points and each seed point is y_j (j = 1, 2, …, z). The descriptor set S is then classified: each SIFT descriptor is assigned to the category of the seed point y_j to which it is most similar. The number of seed points z is therefore the number of categories, and the seed point set D is the visual codebook used to quantize each standard image I_i.
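The random-sampling step can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function name `build_random_codebook`, the use of NumPy, uniform sampling without replacement, and the toy data are all assumptions.

```python
import numpy as np

def build_random_codebook(descriptors, z, seed=0):
    """Pick z descriptors uniformly at random (without replacement) as seed points."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(descriptors.shape[0], size=z, replace=False)
    return descriptors[idx]

# Toy stand-in for the descriptor set S: 1000 random 128-dimensional vectors.
S = np.random.default_rng(1).random((1000, 128)).astype(np.float32)
D = build_random_codebook(S, z=50)   # seed point set D, i.e. the visual codebook
print(D.shape)
```

No clustering iterations are involved: the cost is a single pass that copies z rows, which is where the claimed saving over K-Means comes from.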

Creating a randomized kd-tree means: nodes are created in a top-down iterative process. At each node, the splitting dimension is chosen at random from among the several dimensions with the largest variance, and the splitting threshold is chosen at random from among the elements close to the median in that dimension.

Classifying the feature descriptors means: the node thresholds partition the seed points y_j of the set D into different regions of space, and each SIFT descriptor is then assigned as follows. The descriptor is searched across multiple randomized kd-trees with a single best-first (priority) queue shared by all trees to find the most similar seed point y_j; the search stops once the number of explored query paths reaches a set limit. The category of the seed point found is the category to which the SIFT descriptor is assigned.
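The tree construction and the shared-queue search described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's code: `TOP_DIMS`, `NEAR_MEDIAN`, the tuple-based node layout, and counting one "query path" per visited leaf are all choices made here for the example.

```python
import heapq
import numpy as np

TOP_DIMS = 5       # assumed: number of highest-variance dimensions to pick from
NEAR_MEDIAN = 0.1  # assumed: fraction of points around the median eligible as the cut

def build_tree(points, idx, rng):
    """Build one randomized kd-tree over the rows of `points` listed in `idx`."""
    if len(idx) == 1:
        return ("leaf", int(idx[0]))
    sub = points[idx]
    top = np.argsort(sub.var(axis=0))[-TOP_DIMS:]   # dimensions with the largest variance
    dim = int(rng.choice(top))                       # random choice among them
    order = idx[np.argsort(sub[:, dim], kind="stable")]
    mid, spread = len(order) // 2, int(NEAR_MEDIAN * len(order))
    lo, hi = max(1, mid - spread), min(len(order) - 1, mid + spread)
    cut = int(rng.integers(lo, hi + 1))              # random element near the median
    thr = float(points[order[cut], dim])
    return ("node", dim, thr,
            build_tree(points, order[:cut], rng),
            build_tree(points, order[cut:], rng))

def build_forest(points, n_trees=8, seed=0):
    rng = np.random.default_rng(seed)
    idx = np.arange(len(points))
    return [build_tree(points, idx, rng) for _ in range(n_trees)]

def nearest_seed(trees, points, q, max_checks=100):
    """Best-first search over all trees with one shared priority queue."""
    heap = [(0.0, t, tree) for t, tree in enumerate(trees)]  # (bound, tie-breaker, node)
    heapq.heapify(heap)
    tick = len(trees)
    best_idx, best_d2, checks = -1, np.inf, 0
    while heap and checks < max_checks:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_d2:
            break                                    # no remaining branch can do better
        while node[0] == "node":
            _, dim, thr, left, right = node
            diff = q[dim] - thr
            near, far = (left, right) if diff < 0 else (right, left)
            heapq.heappush(heap, (max(bound, diff * diff), tick, far))
            tick += 1
            node = near
        d2 = float(((points[node[1]] - q) ** 2).sum())  # squared distance to the leaf's seed
        checks += 1
        if d2 < best_d2:
            best_idx, best_d2 = node[1], d2
    return best_idx

seeds = np.random.default_rng(0).random((500, 128))
forest = build_forest(seeds, n_trees=8)
category = nearest_seed(forest, seeds, seeds[7] + 1e-3, max_checks=100)
print(category)
```

With a small `max_checks` the result is approximate; raising the limit trades speed for accuracy, and with an unlimited budget the pruning test makes the search exact.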

Vectorization means: the standard images and the image to be retrieved are each vectorized with the seed frequency-inverse image frequency method (the analogue of term frequency-inverse document frequency, tf-idf), and the position index sum of the non-zero elements is then computed for each standard image vector and for the vector of the image to be retrieved.

Image vectorization comprises offline processing and real-time processing.

The offline steps of image vectorization are:

1) For each standard image I_i, count the number of occurrences n_{ij} of seed point y_j and the total number of SIFT descriptors n_i. The seed frequency in image I_i is f_{ij} = n_{ij} / n_i. Also count M_j, the number of standard images in library C that contain seed point y_j.

2) Apply the stop-word method commonly used in text retrieval to M_j, with decision threshold T: when M_j > T, delete the corresponding seed point y_j; when M_j ≤ T, keep it and relabel M_j as M_r. After all M_j have been tested, the number of seed points is reduced from z to z′. The inverse image frequency is then idf_r = \log(N / M_r), and the seed frequency f_{ij} becomes f_{ir} = n_{ir} / n_i.

3) Let V_i be the image vector of standard image I_i. Then V_i = (c_1, c_2, …, c_{z′}), where c_r = f_{ir} \cdot idf_r (r = 1, 2, …, z′). This completes vectorization of the standard images in offline processing.
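The three offline steps can be sketched as below. This is a hedged illustration: the function name `vectorize_library`, the count-matrix input, the natural logarithm, and dropping seed points that occur in no image are assumptions; the 0.6 factor for the threshold T is the value the embodiment reports. Every image is assumed to have at least one descriptor.

```python
import numpy as np

def vectorize_library(counts, stop_ratio=0.6):
    """counts[i, j] = n_ij, occurrences of seed point y_j in standard image I_i."""
    counts = np.asarray(counts, dtype=float)
    N = counts.shape[0]
    n_i = counts.sum(axis=1, keepdims=True)   # total descriptors per image
    M = (counts > 0).sum(axis=0)              # M_j: images containing seed y_j
    T = stop_ratio * M.max()                  # stop-word threshold
    keep = (M > 0) & (M <= T)                 # assumed: unused seeds are dropped too
    tf = counts[:, keep] / n_i                # seed frequency f_ir
    idf = np.log(N / M[keep])                 # inverse image frequency idf_r
    return tf * idf, idf, keep

counts = [[2, 1, 0, 1],
          [1, 0, 3, 0],
          [1, 0, 0, 1],
          [4, 2, 0, 0]]
V, idf, keep = vectorize_library(counts)
print(keep)      # the first seed occurs in every image and is filtered out
print(V.shape)
```

Seed points that appear in most images carry little discriminative weight, so removing them both shortens the vectors from z to z′ and removes the longest posting lists from the index.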

The real-time steps of image vectorization are:

a) For the image to be retrieved, Q, count the number of occurrences m_r of seed point y_r and the number of SIFT descriptors m. The seed frequency in Q is f_{qr} = m_r / m.

b) For the inverse image frequency idf_{qr} of Q, reuse the inverse image frequency idf_r computed offline, i.e. idf_{qr} = idf_r = \log(N / M_r). The vector of the image to be retrieved is V_q = (d_1, d_2, …, d_{z′}), where d_r = f_{qr} \cdot idf_{qr} (r = 1, 2, …, z′). This completes vectorization of the image to be retrieved in real-time processing.
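The query side reuses the offline statistics, which the following sketch makes concrete. The function name `vectorize_query` and the example values of `keep` and `idf` are hypothetical; they stand in for whatever the offline stage produced.

```python
import numpy as np

def vectorize_query(query_counts, idf, keep):
    """query_counts[j] = occurrences of seed point y_j among the query's descriptors."""
    query_counts = np.asarray(query_counts, dtype=float)
    m = query_counts.sum()               # m: number of SIFT descriptors in Q
    tf = query_counts[keep] / m          # seed frequency over the surviving seeds only
    return tf * idf                      # idf is reused from offline processing

# Hypothetical offline results: 4 seeds, the first removed by stop-word filtering.
keep = np.array([False, True, True, True])
idf = np.log([2.0, 4.0, 2.0])
V_q = vectorize_query([5, 1, 0, 2], idf, keep)
print(V_q)
```

Because nothing about the library is recomputed at query time, the only per-query work is counting seed assignments and one element-wise multiplication.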

Computing the position index sum of the non-zero elements of the standard image vectors and of the vector of the image to be retrieved means the following. In offline processing, the vector V_i is binarized; let the binarized vector be V_i′ = (p_1, p_2, …, p_{z′}), where p_r = 1 if c_r > 0 and p_r = 0 otherwise. The position index sum of the non-zero elements of the standard image vector V_i is then s_i = \sum_{r=1}^{z′} r \cdot p_r. In real-time processing, the vector V_q is binarized in the same way; let the binarized vector be V_q′ = (w_1, w_2, …, w_{z′}), where w_r = 1 if d_r > 0 and w_r = 0 otherwise. The position index sum of the non-zero elements of the vector V_q is then s_q = \sum_{r=1}^{z′} r \cdot w_r.
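As a small worked example, the sketch below computes a position index sum. It assumes that the "position index sum" is the sum of the 1-based positions of the non-zero elements; the original equation images are not available in this text, so that reading is an assumption, as is the function name.

```python
import numpy as np

def position_index_sum(vec):
    """Binarize the vector, then add up the 1-based positions of its non-zero entries."""
    p = (np.asarray(vec) > 0).astype(int)     # binarized vector (p_1, ..., p_z')
    positions = np.arange(1, len(p) + 1)      # r = 1, ..., z'
    return int((positions * p).sum())

print(position_index_sum([0.0, 0.3, 0.0, 0.1]))   # positions 2 and 4 are non-zero -> 6
```

The value is a single integer per image, so it can be stored next to the image name in each posting and compared in constant time during search.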

Creating the optimized inverted index means the following. In offline processing, the seed points y_r serve as index keys and the standard image vectors V_i as index targets; each seed point y_r has a corresponding inverted index list L_r. For element u of a standard image vector V_i, when c_u > 0 the image name I_i and the position index sum s_i of its non-zero elements are recorded in list L_u, written L_u = {y_u | (I_i, s_i)}. The standard image vectors V_i are processed in turn and recorded in the index lists L_r that correspond to the positions of their non-zero elements, giving the inverted index L = {L_1, L_2, …, L_{z′}}. The index lists L_r, and the standard images I_i within each list, are then sorted; the index lists record different numbers of standard images, and the standard image vectors have different position index sums. First the index lists L_r are sorted by the number of standard images they record, from largest to smallest; then, within each list L_r, the standard images I_i are sorted by position index sum s_i, from largest to smallest. The result is the optimized inverted index L′, which is used for similarity search in real-time processing.
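The index construction and the two sorts can be sketched as follows. This is an illustration under assumptions: the dictionary layout, 0-based seed keys, the function name `build_optimized_index`, and computing s_i as the sum of 1-based non-zero positions are choices made for the example.

```python
import numpy as np

def build_optimized_index(vectors):
    """vectors: dict mapping image name -> tf-idf vector (V_i)."""
    lists = {}
    for name, vec in vectors.items():
        nz = np.flatnonzero(np.asarray(vec) > 0)   # 0-based positions of non-zero elements
        s_i = int((nz + 1).sum())                  # assumed: sum of 1-based positions
        for r in nz:
            lists.setdefault(int(r), []).append((name, s_i))
    for postings in lists.values():
        postings.sort(key=lambda e: e[1], reverse=True)   # images by s_i, largest first
    # Index lists ordered by how many images they record, largest first.
    return dict(sorted(lists.items(), key=lambda kv: len(kv[1]), reverse=True))

vectors = {"I1": [0.2, 0.0, 0.1],
           "I2": [0.0, 0.5, 0.3],
           "I3": [0.1, 0.4, 0.2]}
index = build_optimized_index(vectors)
for seed, postings in index.items():
    print(seed, postings)
```

Sorting each posting list by s_i is what later allows a scan to stop early: once one entry falls below the query's threshold, every entry after it does too.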

The similarity search comprises the following steps:

i) Index lists that contain more standard images are queried first. Within a list, the position index sum s_q of the non-zero elements of the image to be retrieved serves as a threshold and is compared with the position index sum s_i of each standard image in the list; a standard image whose s_i is below the threshold s_q is excluded, together with all subsequent standard images in the list, whose position index sums are smaller still.

ii) During similarity search in the optimized inverted index L′, an accumulator A records the number of times a_i that each standard image I_i is encountered. Each standard image has its own counter a_i, so A = (a_1, a_2, …, a_N). Each time standard image I_i is hit in an inverted index list, its counter is incremented: a_i = a_i + 1. Finally the counters in A are sorted; the standard images with the largest counts are the candidate results for the vector V_q, which completes the search of the optimized inverted index.

iii) The similarity between the vector V_q and each candidate standard image vector V_i is measured by the cosine of the angle between the two vectors:

cos(V_q, V_i) = \frac{\sum_{r=1}^{z′} d_r c_r}{\|V_q\| \, \|V_i\|}, where \|V_q\| = \sqrt{\sum_{r=1}^{z′} d_r^2} and \|V_i\| = \sqrt{\sum_{r=1}^{z′} c_r^2}.

After the cosine values cos(V_q, V_i) have been computed, they are sorted from largest to smallest; the standard image I_i with the largest cosine value is the final result for the image to be retrieved, Q.
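Steps i) to iii) can be put together in one short sketch. It follows the reading of step i) in which entries with s_i below s_q end the scan of a list; the function name `search`, the dictionary-based index, the toy vectors, and `top_k=5` (the value used in the embodiment) are assumptions for illustration.

```python
import numpy as np

def search(index, vectors, v_q, top_k=5):
    """index: seed -> [(image, s_i), ...] sorted by s_i descending; vectors: image -> V_i."""
    v_q = np.asarray(v_q, dtype=float)
    nz = [int(r) for r in np.flatnonzero(v_q > 0) if int(r) in index]
    s_q = int((np.flatnonzero(v_q > 0) + 1).sum())    # query position index sum
    acc = {}                                          # accumulator A: image -> a_i
    for r in sorted(nz, key=lambda r: len(index[r]), reverse=True):  # longest lists first
        for name, s_i in index[r]:
            if s_i < s_q:
                break                                 # this and all later entries are excluded
            acc[name] = acc.get(name, 0) + 1
    candidates = sorted(acc, key=acc.get, reverse=True)[:top_k]

    def cosine(a, b):
        b = np.asarray(b, dtype=float)
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    return sorted(((cosine(v_q, vectors[c]), c) for c in candidates), reverse=True)

vectors = {"I1": [0.2, 0.0, 0.1], "I2": [0.0, 0.5, 0.3], "I3": [0.1, 0.4, 0.2]}
index = {2: [("I3", 6), ("I2", 5), ("I1", 4)],
         0: [("I3", 6), ("I1", 4)],
         1: [("I3", 6), ("I2", 5)]}
results = search(index, vectors, [0.0, 0.5, 0.3])
print(results[0])   # best match first
```

The accumulator pass touches only image names and integers; the more expensive cosine computation is reserved for the handful of candidates that survive it.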

The beneficial effects of the invention are as follows. Compared with creating a visual codebook by traditional K-Means clustering, the random visual codebook only requires random sampling of the SIFT descriptor set; no repeated iterations are needed, so the computation is light and the running time is short. Compared with a traditional inverted index, the optimized inverted index can quickly exclude irrelevant standard images using the position index sum of the non-zero elements of the vector of the image to be retrieved, which speeds up similarity search in large-scale image libraries. Compared with the prior art, the invention reduces computation while improving the real-time performance of retrieval.

Description of the Drawings

Figure 1 is a flowchart of the method.

Figure 2 shows the overall retrieval time in real-time processing and the time taken by the individual steps.

Figure 3 compares query times for the traditional inverted index and the optimized inverted index.

Detailed Description

An embodiment of the invention is described in detail below. The embodiment is implemented on the basis of the technical solution of the invention, and a detailed implementation and specific operating procedure are given, but the scope of protection of the invention is not limited to the following embodiment.

As shown in Figure 1, this embodiment applies the acceleration method to retrieve images captured with a mobile phone. The steps are as follows:

1. Extract feature descriptors from the standard images and from the image to be retrieved.

In offline processing, SIFT descriptors are extracted from the images in the standard image library C = (I_1, I_2, …, I_N). With n_i SIFT descriptors in image I_i, the total number of SIFT descriptors in the standard image library is n = \sum_{i=1}^{N} n_i.

In real-time processing, SIFT descriptors are extracted from the image to be retrieved, Q; the number of SIFT descriptors in Q is m.

2. Randomly sample the feature descriptors of the standard image library to create the visual codebook.

In offline processing, the n SIFT descriptors of the standard image library are randomly sampled, and z of them are taken as seed points to create the visual codebook, where z = 20% × n.

3. Build randomized kd-trees from the seed point set and classify the feature descriptors of the standard images and of the image to be retrieved.

In offline processing, 8 independent randomized kd-trees are built from the z seed points. Each SIFT descriptor of the standard images is run through an approximate nearest-neighbor search across the 8 trees, with the maximum number of query paths set to 100, and is assigned to the category of the seed point found; the seed point category of every SIFT descriptor is recorded.

In real-time processing, the SIFT descriptors of the image to be retrieved, Q, are run through an approximate nearest-neighbor search in the 8 randomized kd-trees built offline, again with the maximum number of query paths set to 100, and are assigned to the categories of the seed points found; the seed point category of every SIFT descriptor is recorded.

4. Vectorize the standard images and the image to be retrieved with the seed frequency-inverse image frequency method.

In offline processing, the stop-word method is applied to M_j, the number of standard images in library C that contain seed point y_j, with the stop-word threshold T = 0.6 × max(M_j).

In real-time processing, only the seed points that survived stop-word filtering in offline processing are considered, and the inverse image frequency computed offline is reused.

5. Compute the position index sum of the non-zero elements of the standard image vectors and of the vector of the image to be retrieved.

In offline processing, the standard image vector V_i = (c_1, c_2, …, c_{z′}) is binarized into the vector V_i′ = (p_1, p_2, …, p_{z′}), where p_r = 1 if c_r > 0 and p_r = 0 otherwise. The position index sum of the non-zero elements of V_i is then s_i = \sum_{r=1}^{z′} r \cdot p_r.

In real-time processing, the vector of the image to be retrieved, V_q = (d_1, d_2, …, d_{z′}), is binarized into the vector V_q′ = (w_1, w_2, …, w_{z′}), where w_r = 1 if d_r > 0 and w_r = 0 otherwise. The position index sum of the non-zero elements of V_q is then s_q = \sum_{r=1}^{z′} r \cdot w_r.

6. Create the optimized inverted index.

In offline processing, the seed points y_r serve as index keys and the standard image vectors V_i as index targets, and the non-zero elements of each standard image vector V_i are tallied. When element u of vector V_i is non-zero, the standard image name I_i and the position index sum s_i of its non-zero elements are recorded in the index list of seed point y_u. Once all non-zero elements of the standard image vectors have been tallied, the index lists are sorted by the number of standard images they record, from largest to smallest, and the standard images within each list are sorted by position index sum s_i, from largest to smallest, which yields the optimized inverted index.

7. Run the similarity search for the vector of the image to be retrieved in the optimized inverted index and measure cosine similarity.

In real-time processing, among all index lists that correspond to non-zero elements of the vector V_q, the list containing the most standard images is queried first. Within that list, the position index sum s_q of V_q is compared with the position index sum s_i of each standard image vector V_i: when s_q ≥ s_i, the counter of standard image I_i is incremented, a_i = a_i + 1; when s_q < s_i, the standard image I_i is excluded, together with the subsequent standard images with smaller position index sums. After all non-zero elements of V_q have been queried in turn, the counters in accumulator A are sorted and the 5 largest values are taken. The corresponding standard image vectors V_i are compared with V_q by cosine similarity, the 5 cosine values are sorted from largest to smallest, and the standard image with the largest cosine value is the result for the image to be retrieved.

The method was evaluated in simulation as follows: 284 images to be retrieved were tested against a library of 7,655 standard images. Figure 2 shows the overall retrieval time for the 284 images in real-time processing and the time taken by the individual steps. In Figure 2, the curve marked with crosses is the query time of the optimized inverted index plus the time for measuring similarity between the vector of the image to be retrieved and the standard image vectors; it is relatively stable and averages 0.0055 s. The curve marked with diamonds is the sum of three times: assignment of the SIFT descriptors of the image to be retrieved, image vectorization, and computation of the position index sum of the non-zero elements. It is longer than the cross curve but varies little, averaging 0.2047 s. The curve marked with squares is the SIFT descriptor extraction time for the image to be retrieved; it varies widely, mainly with the size of the image, and averages 0.2686 s. The black curve is the overall retrieval time, which averages 0.4788 s and meets the real-time requirement.

For the time needed to create the visual codebook, the method of the present invention was compared with the AKM (Approximate K-Means) and HKM (Hierarchical K-Means) algorithms. From the 7,655 standard images, 1,999,620 SIFT descriptors were extracted. AKM and HKM were each configured with 19,962 cluster centers and 40 iterations, and the random visual codebook likewise used 19,962 seed points. Table 1 lists the time each of the three algorithms took to create the visual codebook. As Table 1 shows, the random visual codebook is created far faster than the AKM and HKM codebooks.

For inverted index query time, the method of the present invention was compared with a traditional inverted index, using the same 284 query images. Figure 3 shows the query times of the traditional and the optimized inverted index. The diamond-marked curve is the query time of the traditional inverted index, which averages 0.0205 s. The square-marked curve is the query time of the optimized inverted index; compared with the diamond-marked curve it is shorter and fluctuates less, with an average of 0.0028 s. The optimized inverted index therefore speeds up the lookup of a query image in the standard image library, cutting the average query time roughly sevenfold.

All of the above algorithms were run in Matlab 7.6.

          AKM       HKM       Random visual codebook
  Time    2.5 h     2 h       183 s

Table 1. Time to create the visual codebook: random visual codebook compared with AKM and HKM
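To put the Table 1 entries on a common scale, the times can be converted to seconds and expressed relative to the random visual codebook. The snippet uses only the three values in the table; the variable names are illustrative.

```python
# Codebook creation times from Table 1, converted to seconds.
seconds = {
    "AKM": 2.5 * 3600,
    "HKM": 2 * 3600,
    "random visual codebook": 183,
}

baseline = seconds["random visual codebook"]
speedup = {name: t / baseline for name, t in seconds.items()}

for name, factor in speedup.items():
    print(f"{name}: {seconds[name]:.0f} s, {factor:.1f}x the random codebook time")
```

On these figures AKM takes roughly 49 times and HKM roughly 39 times as long as random sampling of seed points.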

Claims (10)

1. An acceleration method for an image retrieval system, characterized in that: feature points are detected in the standard images and in the query image (the image to be retrieved) using the difference-of-Gaussians operator; each detected feature point is described by a scale-invariant descriptor and a visual codebook is created; randomized kd-trees are then built from the set of seed points and the feature descriptors are classified; vectorization is then performed and the inverted index is optimized; finally, a similarity search for the query image vector is carried out in the optimized inverted index, thereby accelerating the image retrieval system.
2. The acceleration method for an image retrieval system according to claim 1, characterized in that the description by scale-invariant descriptors comprises two steps, offline processing and real-time processing, wherein:
In offline processing, each image $I_i$ in the standard image library $C = (I_1, I_2, \ldots, I_N)$ is represented by its SIFT descriptors as
$$X_i = (x_{i1}, x_{i2}, \ldots, x_{i n_i}),$$
where $x_{ik}$ $(k = 1, 2, \ldots, n_i)$ is a single descriptor of image $I_i$ with 128 dimensions and $n_i$ is the number of SIFT descriptors in image $I_i$. The set of all SIFT descriptors in the standard image library is $S = (X_1, X_2, \ldots, X_N)$, and the total number of SIFT descriptors in the set S is
$$\sum_{i=1}^{N} n_i.$$
In real-time processing, the query image Q is represented by its SIFT descriptors as $T = (q_1, q_2, \ldots, q_m)$, where $q_k$ $(k = 1, 2, \ldots, m)$ is a single descriptor of image Q with 128 dimensions and m is the number of SIFT descriptors in image Q.
3. The acceleration method for an image retrieval system according to claim 1, characterized in that creating the visual codebook means: the feature descriptors in the standard image library are randomly sampled to create the visual codebook. Specifically, the SIFT descriptor set S is randomly sampled and part of the SIFT descriptors are extracted as the seed point set D, $D = (y_1, y_2, \ldots, y_z)$, where z is the number of seed points in D and each seed point is $y_j$ $(j = 1, 2, \ldots, z)$. The SIFT descriptor set S is then classified: each seed point $y_j$ determines a class into which the SIFT descriptors similar to it are placed, so the number of seed points z is the number of classes, and the seed point set D is the visual codebook needed to quantize the standard images $I_i$.
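The random-sampling idea in claim 3 can be illustrated with a short sketch. It assumes small integer tuples in place of 128-dimensional SIFT descriptors and uses a brute-force nearest-seed search for the classification step; the function names `build_random_codebook` and `assign_to_class` are hypothetical, and the claims themselves use randomized kd-trees for the search.

```python
import random

def build_random_codebook(descriptors, z, seed=0):
    """Draw z descriptors at random (without replacement) to serve as seed points."""
    rng = random.Random(seed)
    return rng.sample(descriptors, z)

def assign_to_class(descriptor, codebook):
    """Index of the seed point closest to the descriptor (squared Euclidean distance)."""
    def squared_distance(j):
        return sum((a - b) ** 2 for a, b in zip(descriptor, codebook[j]))
    return min(range(len(codebook)), key=squared_distance)

# Toy stand-in for the descriptor set S: 50 distinct 4-dimensional tuples.
descriptors = [(i, (2 * i) % 7, (3 * i) % 5, i) for i in range(50)]

codebook = build_random_codebook(descriptors, z=8)
labels = [assign_to_class(d, codebook) for d in descriptors]
print(len(codebook), labels[:10])
```

No clustering iterations are involved: building the codebook is a single sampling call, which is why its cost is small compared with k-means variants.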
4. The acceleration method for an image retrieval system according to claim 1, characterized in that creating the randomized kd-trees means: the trees are built by a top-down iterative process in which, at each iteration, the splitting dimension of a node is chosen at random from among the several dimensions with the largest variance, and the splitting threshold of the node is chosen at random from among the elements of that dimension that lie close to the median.
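A minimal sketch of the construction rule in claim 4 follows. The parameters `n_top_dims` (how many high-variance dimensions to choose from), `leaf_size`, and the one-element window around the median are assumptions made for the example; the claim does not specify these values.

```python
import random
import statistics

def build_random_kd_tree(points, indices, rng, n_top_dims=5, leaf_size=2):
    """Recursively split `indices` (positions into `points`) into a randomized kd-tree."""
    if len(indices) <= max(leaf_size, 1):
        return {"leaf": list(indices)}
    n_dims = len(points[0])
    variance = [statistics.pvariance([points[i][d] for i in indices]) for d in range(n_dims)]
    # Choose the split dimension at random among the highest-variance dimensions.
    top = sorted(range(n_dims), key=lambda d: variance[d], reverse=True)[:n_top_dims]
    dim = rng.choice(top)
    # Choose the threshold at random among the values adjacent to the median.
    values = sorted(points[i][dim] for i in indices)
    mid = len(values) // 2
    low, high = max(1, mid - 1), min(len(values) - 1, mid + 1)
    threshold = values[rng.randint(low, high)]
    left = [i for i in indices if points[i][dim] < threshold]
    right = [i for i in indices if points[i][dim] >= threshold]
    if not left:  # all values equal on this dimension: stop splitting
        return {"leaf": list(indices)}
    return {
        "dim": dim,
        "threshold": threshold,
        "left": build_random_kd_tree(points, left, rng, n_top_dims, leaf_size),
        "right": build_random_kd_tree(points, right, rng, n_top_dims, leaf_size),
    }

def leaf_indices(node):
    """Collect every point index stored under a node."""
    if "leaf" in node:
        return list(node["leaf"])
    return leaf_indices(node["left"]) + leaf_indices(node["right"])

rng = random.Random(0)
seeds = [[rng.random() for _ in range(8)] for _ in range(64)]
forest = [build_random_kd_tree(seeds, list(range(len(seeds))), rng) for _ in range(4)]
print(len(forest), len(leaf_indices(forest[0])))
```

Because both the dimension and the threshold are drawn at random, each tree in `forest` partitions the same seed points differently, which is what makes searching several trees together useful.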
5. The acceleration method for an image retrieval system according to claim 1, characterized in that classifying the feature descriptors means: the node thresholds partition the seed points $y_j$ of the seed point set D into different regions of space. Specifically, each SIFT descriptor is searched across multiple randomized kd-trees using a single priority queue to find its corresponding seed point $y_j$; the most similar seed points found are kept in that single priority queue; the search stops when the number of query paths reaches a set limit; and the class of the seed point found is the class to which the SIFT descriptor is assigned.
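The search described in claim 5 (several trees, one shared priority queue, a cap on the amount of searching) can be sketched as below. To stay self-contained the example builds simplified trees with a random split dimension and a median threshold, which is an assumption and not the construction of claim 4; `assign_to_seed`, `max_checks`, and the toy data are likewise illustrative.

```python
import heapq
import random

def build_tree(points, indices, rng):
    """Simplified randomized kd-tree: random split dimension, median threshold."""
    if len(indices) == 1:
        return ("leaf", indices[0])
    dim = rng.randrange(len(points[0]))
    ordered = sorted(indices, key=lambda i: points[i][dim])
    mid = len(ordered) // 2
    threshold = points[ordered[mid]][dim]
    return ("node", dim, threshold,
            build_tree(points, ordered[:mid], rng),
            build_tree(points, ordered[mid:], rng))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_to_seed(trees, seeds, descriptor, max_checks):
    """Return the index of the (approximately) nearest seed point."""
    queue = []        # one priority queue shared by every tree
    tiebreak = 0      # keeps heap entries comparable when distances tie
    checked = set()
    best_index, best_distance = None, float("inf")

    def descend(node):
        nonlocal tiebreak, best_index, best_distance
        while node[0] == "node":
            _, dim, threshold, left, right = node
            gap = descriptor[dim] - threshold
            near, far = (left, right) if gap < 0 else (right, left)
            tiebreak += 1
            heapq.heappush(queue, (gap * gap, tiebreak, far))  # revisit later if budget allows
            node = near
        index = node[1]
        if index not in checked:
            checked.add(index)
            distance = squared_distance(seeds[index], descriptor)
            if distance < best_distance:
                best_index, best_distance = index, distance

    for tree in trees:
        descend(tree)
    while queue and len(checked) < max_checks:
        _, _, branch = heapq.heappop(queue)
        descend(branch)
    return best_index

rng = random.Random(1)
seeds = [[rng.random() for _ in range(8)] for _ in range(50)]
trees = [build_tree(seeds, list(range(len(seeds))), rng) for _ in range(4)]
descriptor = [rng.random() for _ in range(8)]
label = assign_to_seed(trees, seeds, descriptor, max_checks=len(seeds))
print(label)
```

A small `max_checks` trades accuracy for speed; setting it to the number of seed points, as in the usage line, makes the search exhaustive and therefore exact.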
6. The acceleration method for an image retrieval system according to claim 1, characterized in that the vectorization means: the standard images and the query image are each vectorized using the seed frequency-inverse image frequency method, and the position index sums of the non-zero elements of the standard image vectors and of the query image vector are then computed.
7. The acceleration method for an image retrieval system according to claim 1, characterized in that the image vectorization comprises offline processing and real-time processing, wherein:
The offline processing steps of image vectorization comprise:
1) for standard image $I_i$, count the number of occurrences $n_{ij}$ of seed point $y_j$ and the total number of SIFT descriptors $n_i$; the seed frequency in standard image $I_i$ is then $f_{ij} = n_{ij} / n_i$; also count the number $M_j$ of standard images in the standard image library C that contain seed point $y_j$;
2) apply the stop-word method commonly used in text retrieval: compare $M_j$ with a decision threshold T; when $M_j > T$, delete the corresponding seed point $y_j$; when $M_j \le T$, keep $M_j$ and relabel it $M_r$. After all $M_j$ have been tested, the number of seed points falls from z to z', and the inverse image frequency is
$$\mathrm{idf}_r = \log \frac{N}{M_r},$$
while the seed frequency changes from $f_{ij}$ to $f_{ir}$,
$$f_{ir} = \frac{n_{ir}}{n_i};$$
3) the image vector corresponding to standard image $I_i$ is $V_i$, expressed as $V_i = (c_1, c_2, \ldots, c_{z'})$, where
$$c_r = f_{ir} \cdot \mathrm{idf}_r,$$
which completes the vectorization of the standard images in offline processing;
The real-time processing steps of image vectorization comprise:
a) for the query image Q, count the number of occurrences $m_r$ of seed point $y_r$ and the number of SIFT descriptors m; the seed frequency in the query image Q is then
$$f_{Qr} = \frac{m_r}{m};$$
b) for the inverse image frequency $\mathrm{idf}_{Qr}$ of the query image Q, the inverse image frequency $\mathrm{idf}_r$ from offline processing is used, that is, $\mathrm{idf}_{Qr} = \mathrm{idf}_r$. The query image vector $V_q$ is expressed as $V_q = (d_1, d_2, \ldots, d_{z'})$, where
$$d_r = f_{Qr} \cdot \mathrm{idf}_r,$$
which completes the vectorization of the query image in real-time processing.
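The vectorization in claim 7 follows the familiar term frequency-inverse document frequency pattern from text retrieval, with seed points in the role of words. The sketch below is one possible reading under stated assumptions: the inverse image frequency is taken as the logarithm of N divided by the image count, the seed frequency keeps the full descriptor count of the image as its denominator after stop-word removal, and each image is given as a list of seed indices. The function names are hypothetical.

```python
import math
from collections import Counter

def build_idf(image_words, threshold):
    """Drop seeds that occur in more than `threshold` images; return kept seeds and their idf."""
    n_images = len(image_words)
    image_count = Counter()
    for words in image_words:
        image_count.update(set(words))  # count each seed once per image
    kept = sorted(j for j, m in image_count.items() if m <= threshold)
    idf = {j: math.log(n_images / image_count[j]) for j in kept}
    return kept, idf

def vectorize(words, kept, idf):
    """Seed frequency times inverse image frequency, one element per kept seed."""
    counts = Counter(words)
    total = len(words)  # assumption: the denominator is the full descriptor count
    return [counts[j] / total * idf[j] for j in kept]

# Toy library: each standard image is a list of seed indices, one per descriptor.
library = [[0, 0, 1], [0, 2], [0, 1, 1, 3]]
kept, idf = build_idf(library, threshold=2)      # seed 0 occurs in all 3 images and is removed
standard_vectors = [vectorize(words, kept, idf) for words in library]
query_vector = vectorize([1, 3, 3], kept, idf)   # real-time step reuses the offline idf
print(kept, standard_vectors[0])
```

The query vector is built with the `idf` table computed offline, mirroring step b) of the real-time processing.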
8. The acceleration method for an image retrieval system according to claim 1, characterized in that computing the position index sums of the non-zero elements of the standard image vectors and of the query image vector means: in offline processing, the vector $V_i$ is binarized, giving the binarized image vector $V'_i = (p_1, p_2, \ldots, p_{z'})$, where $p_r = 1$ if $c_r > 0$ and $p_r = 0$ otherwise, so that the position index sum $s_i$ of the non-zero elements of the standard image vector $V_i$ is
$$s_i = \sum_{r=1}^{z'} r \, p_r.$$
In real-time processing, the vector $V_q$ is binarized, giving the binarized image vector $V'_q = (w_1, w_2, \ldots, w_{z'})$, where
$$w_r = \begin{cases} 1, & d_r > 0 \\ 0, & d_r = 0 \end{cases}$$
so that the position index sum $s_q$ of the non-zero elements of the query image vector $V_q$ is
$$s_q = \sum_{r=1}^{z'} r \, w_r.$$
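A position index sum can be computed in two lines once the vector is binarized. The sketch assumes positions are numbered from 1 and that the sum is taken over the positions of the non-zero elements; `position_index_sum` is a name chosen for the example.

```python
def position_index_sum(vector):
    """Binarize the vector, then add up the 1-based positions of its non-zero elements."""
    binary = [1 if value > 0 else 0 for value in vector]
    return sum(position * bit for position, bit in enumerate(binary, start=1))

s_i = position_index_sum([0.2, 0.0, 0.1, 0.0])   # non-zero at positions 1 and 3
s_q = position_index_sum([0.0, 0.3, 0.0, 0.4])   # non-zero at positions 2 and 4
print(s_i, s_q)
```

The result depends only on which seed points occur in an image, not on their weights, so it can be stored once per standard image during offline processing.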
9. The acceleration method for an image retrieval system according to claim 1, characterized in that creating the optimized inverted index means: in offline processing, the seed points $y_r$ serve as index keys and the standard image vectors $V_i$ as index targets, so that each seed point $y_r$ has a corresponding inverted index list $L_r$. For element u of standard image vector $V_i$, when $c_u > 0$, the name $I_i$ of the image vector $V_i$ and the position index sum $s_i$ of its non-zero elements are recorded in list $L_u$, written $L_u = \{y_u \mid (I_i, s_i)\}$. The standard image vectors $V_i$ are processed in turn and recorded in the corresponding lists $L_r$ according to the positions of their non-zero elements, creating the inverted index $L = \{L_1, L_2, \ldots, L_{z'}\}$. The inverted index lists $L_r$, and the standard images $I_i$ within each list, are then sorted. The lists $L_r$ record different numbers of standard images, and the position index sums of the non-zero elements of the standard image vectors also differ: first the lists $L_r$ are sorted from largest to smallest by the number of standard images they record, and then within each list $L_r$ the standard images $I_i$ are sorted from largest to smallest by their position index sum $s_i$. After both sorts, the optimized inverted index $L'$ is obtained and is used for similarity search in real-time processing.
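The two sorts in claim 9 can be sketched as follows. The data layout (a dictionary from image name to weight vector, and an output list of `(seed position, postings)` pairs) and the name `build_optimized_index` are assumptions for illustration; positions are numbered from 1.

```python
def build_optimized_index(vectors):
    """Build inverted lists of (image name, position index sum), then apply both sorts."""
    lists = {}
    for image_id, vector in vectors.items():
        nonzero = [r for r, weight in enumerate(vector, start=1) if weight > 0]
        s_i = sum(nonzero)  # position index sum of the non-zero elements
        for r in nonzero:
            lists.setdefault(r, []).append((image_id, s_i))
    # Within each list: largest position index sum first.
    for postings in lists.values():
        postings.sort(key=lambda entry: entry[1], reverse=True)
    # Across lists: the list recording the most standard images first.
    return sorted(lists.items(), key=lambda item: len(item[1]), reverse=True)

# Toy standard image vectors over a three-seed codebook.
vectors = {
    "I1": [0.3, 0.0, 0.2],
    "I2": [0.1, 0.4, 0.0],
    "I3": [0.0, 0.0, 0.5],
}
index = build_optimized_index(vectors)
for seed, postings in index:
    print(seed, postings)
```

Both sorts happen once, offline, so the ordering costs nothing at query time.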
10. The acceleration method for an image retrieval system according to claim 1, characterized in that the similarity search means:
i) the index lists containing the most standard images are queried first; within each such list, the position index sum $s_q$ of the non-zero elements of the query image is used as a threshold and compared with the position index sum $s_i$ of the non-zero elements of each standard image in the list; a standard image whose sum is below the threshold $s_q$, together with the subsequent standard images whose sums are smaller still, is excluded;
ii) when the similarity search is carried out in the optimized inverted index $L'$, an accumulator A records the number of times $a_i$ that each standard image $I_i$ appears; every standard image has its own accumulator $a_i$, so $A = (a_1, a_2, \ldots, a_N)$. Each time standard image $I_i$ is found in an inverted index list, its accumulator is incremented by 1, that is, $a_i = a_i + 1$. Finally the accumulators A of the standard images are sorted, and the standard images with the larger accumulator values are the candidate query results for the query image vector $V_q$, which completes the search of the optimized inverted index;
iii) the similarity between the query image vector $V_q$ and each candidate standard image vector $V_i$ is measured by the cosine value between the two vectors,
$$\cos(V_q, V_i) = \frac{\sum_{r=1}^{z'} d_r c_r}{\lVert V_q \rVert \, \lVert V_i \rVert},$$
where
$$\lVert V_q \rVert = \sqrt{\sum_{r=1}^{z'} d_r^2},$$
$$\lVert V_i \rVert = \sqrt{\sum_{r=1}^{z'} c_r^2}.$$
After the cosine values $\cos(V_q, V_i)$ have been computed, they are sorted from largest to smallest, and the standard image $I_i$ with the largest cosine value $\cos(V_q, V_i)$ is the final query result for the query image Q.
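Steps i) and ii) of claim 10 can be sketched as a scan over sorted lists that stops early. The sketch follows the wording of step i) literally, which is an assumption about the intended behaviour: within a list sorted by descending position index sum, entries are counted until the first one whose sum falls below $s_q$, and the rest of that list is skipped. The index layout and the name `search_optimized_index` are hypothetical.

```python
from collections import Counter

def search_optimized_index(index, query_vector):
    """Vote for standard images using sorted inverted lists with early termination."""
    query_seeds = {r for r, weight in enumerate(query_vector, start=1) if weight > 0}
    s_q = sum(query_seeds)  # position index sum of the query, used as the threshold
    accumulators = Counter()
    for seed, postings in index:          # lists arrive ordered longest first
        if seed not in query_seeds:
            continue
        for image_id, s_i in postings:    # entries arrive ordered by descending s_i
            if s_i < s_q:
                break                     # every later entry is smaller too
            accumulators[image_id] += 1
    return accumulators.most_common()     # candidates, largest accumulator first

# Hypothetical optimized index over three seeds: (seed position, [(image, s_i), ...]).
index = [
    (1, [("I1", 4), ("I2", 3)]),
    (3, [("I1", 4), ("I3", 3)]),
    (2, [("I2", 3)]),
]
candidates_a = search_optimized_index(index, [0.5, 0.5, 0.0])  # seeds 1 and 2, s_q = 3
candidates_b = search_optimized_index(index, [0.5, 0.0, 0.5])  # seeds 1 and 3, s_q = 4
print(candidates_a, candidates_b)
```

For the second query the scan of each list stops after its first entry, which is where the saving over a traditional inverted index comes from; the surviving candidates would then be ranked by cosine value as in step iii).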
CN2010105732370A 2010-12-02 2010-12-02 Acceleration method in image retrieval system Expired - Fee Related CN102004786B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010105732370A CN102004786B (en) 2010-12-02 2010-12-02 Acceleration method in image retrieval system

Publications (2)

Publication Number Publication Date
CN102004786A true CN102004786A (en) 2011-04-06
CN102004786B CN102004786B (en) 2012-11-28

Family

ID=43812148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010105732370A Expired - Fee Related CN102004786B (en) 2010-12-02 2010-12-02 Acceleration method in image retrieval system

Country Status (1)

Country Link
CN (1) CN102004786B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6522782B2 (en) * 2000-12-15 2003-02-18 America Online, Inc. Image and text searching techniques
CN101567051A (en) * 2009-06-03 2009-10-28 复旦大学 Image matching method based on characteristic points

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
IEEE ICCV 2009 Workshops, 2009-10-04, Mohamed Aly et al., "Scaling Object Recognition: Benchmark of Current State of the Art Techniques", pp. 2117-2124, 1, 2 *
IEEE ICIS 2009, 2009-11-22, Mei Mei et al., "Rapid Search Scheme for Video Copy Detection in Large Databases", pp. 448-452, 1, 2 *

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102201001B (en) * 2011-04-29 2012-11-28 西安交通大学 Fast retrieval method based on inverted technology
CN102201001A (en) * 2011-04-29 2011-09-28 西安交通大学 A Fast Retrieval Method Based on Inversion Technology
CN102254015A (en) * 2011-07-21 2011-11-23 上海交通大学 Image retrieval method based on visual phrases
CN102254015B (en) * 2011-07-21 2013-11-20 上海交通大学 Image retrieval method based on visual phrases
CN102902826B (en) * 2012-11-08 2016-07-06 公安部第三研究所 A kind of image method for quickly retrieving based on reference picture index
CN102902826A (en) * 2012-11-08 2013-01-30 公安部第三研究所 Quick image retrieval method based on reference image indexes
CN103092935A (en) * 2013-01-08 2013-05-08 杭州电子科技大学 Approximate copy image detection method based on scale invariant feature transform (SIFT) quantization
CN103390063B (en) * 2013-07-31 2016-08-10 南京大学 A kind of based on ant group algorithm with the search method of related feedback images of probability hypergraph
CN103390063A (en) * 2013-07-31 2013-11-13 南京大学 Search method for relevance feedback images based on ant colony algorithm and probability hypergraph
CN104424226B (en) * 2013-08-26 2018-08-24 阿里巴巴集团控股有限公司 A kind of method and device obtaining visual word dictionary, image retrieval
CN104424226A (en) * 2013-08-26 2015-03-18 阿里巴巴集团控股有限公司 Method and device for acquiring visual word dictionary and retrieving image
CN104199842A (en) * 2014-08-07 2014-12-10 同济大学 Similar image retrieval method based on local feature neighborhood information
CN104199842B (en) * 2014-08-07 2017-10-24 同济大学 A kind of similar pictures search method based on local feature neighborhood information
CN104217006A (en) * 2014-09-15 2014-12-17 无锡天脉聚源传媒科技有限公司 Method and device for searching image
CN105760503B (en) * 2016-02-23 2019-02-05 清华大学 A Fast Method to Calculate the Similarity of Graph Nodes
CN105760503A (en) * 2016-02-23 2016-07-13 清华大学 Method for quickly calculating graph node similarity
CN110019879A (en) * 2017-07-31 2019-07-16 清华大学 Satellite remote-sensing image searching method and device
CN113536019A (en) * 2017-09-27 2021-10-22 深圳市商汤科技有限公司 A kind of image retrieval method, apparatus and computer readable storage medium
CN110019907A (en) * 2017-12-01 2019-07-16 北京搜狗科技发展有限公司 A kind of image search method and device
CN110019907B (en) * 2017-12-01 2021-07-16 北京搜狗科技发展有限公司 Image retrieval method and device
CN108959650A (en) * 2018-08-02 2018-12-07 聊城大学 Image search method based on symbiosis SURF feature
CN111190893A (en) * 2018-11-15 2020-05-22 华为技术有限公司 Method and apparatus for establishing feature index
CN111190893B (en) * 2018-11-15 2023-05-16 华为技术有限公司 Method and device for building feature index
CN111797260A (en) * 2020-07-10 2020-10-20 宁夏中科启创知识产权咨询有限公司 Trademark retrieval method and system based on image recognition

Also Published As

Publication number Publication date
CN102004786B (en) 2012-11-28

Similar Documents

Publication Publication Date Title
CN102004786B (en) Acceleration method in image retrieval system
CN111198959B (en) Two-stage image retrieval method based on convolutional neural network
JP5294342B2 (en) Object recognition image database creation method, processing apparatus, and processing program
CN104834693B (en) Visual pattern search method and system based on deep search
CN102254015B (en) Image retrieval method based on visual phrases
CN111859004B (en) Retrieval image acquisition method, retrieval image acquisition device, retrieval image acquisition equipment and readable storage medium
CN106503223B (en) An online housing search method and device combining location and keyword information
CN102033949B (en) Correction-based K nearest neighbor text classification method
CN114048318A (en) Clustering method, system, device and storage medium based on density radius
JPWO2010143573A1 (en) Object recognition image database creation method, creation apparatus, and creation processing program
CN103390165A (en) Picture clustering method and device
CN107832456A (en) A kind of parallel KNN file classification methods based on the division of critical Value Data
CN112182264B (en) Method, device and equipment for determining landmark information and readable storage medium
CN111325276A (en) Image classification method and apparatus, electronic device, and computer-readable storage medium
CN113222109A (en) Internet of things edge algorithm based on multi-source heterogeneous data aggregation technology
CN108319959A (en) A kind of corps diseases image-recognizing method compressed based on characteristics of image with retrieval
CN112966072A (en) Case prediction method and device, electronic device and storage medium
CN105678244A (en) Approximate video retrieval method based on improvement of editing distance
CN108121806A (en) One kind is based on the matched image search method of local feature and system
CN113761242A (en) A big data image recognition system and method based on artificial intelligence
CN106503146A (en) Computer text feature selection method, classification feature selection method and system
CN117493998A (en) A method and system for intelligent classification management of questionnaire survey events based on big data
CN111723223B (en) A multi-label image retrieval method based on subject inference
CN114860929A (en) News text classification method based on improved TextCNN
CN110674334B (en) A near-duplicate image retrieval method based on deep learning features of consistent regions

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121128

Termination date: 20171202
