JP2014225168A

JP2014225168A - Program, device, and method for calculating similarity between images represented by feature point set

Info

Publication number: JP2014225168A
Application number: JP2013104582A
Authority: JP
Inventors: 祐介内田; Yusuke Uchida; 茂之酒澤; Shigeyuki Sakasawa
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2013-05-16
Filing date: 2013-05-16
Publication date: 2014-12-04

Abstract

【課題】リファレンス画像に基づくマーカが平面上に配置されており、そのリファレンス画像がクエリ画像中に存在する場合に、クエリ画像とリファレンス画像との間の類似度を、できる限り正確に算出するプログラム、装置及び方法を提供する。【解決手段】クエリ画像の特徴点ｐ'及びリファレンス画像の特徴点ｐと、リファレンス画像からクエリ画像へ射影するHomography行列Ｈとを入力し、リファレンス画像から、クエリ画像に対する平面画像への回転並進行列Ｔを算出し、リファレンス画像の他の複数の特徴点について、Homography行列Ｈ及び回転並進行列Ｔを用いて射影変換した点と、クエリ画像の特徴点との類似度を算出する類似度算出手段を有する。【選択図】図１A program for calculating a similarity between a query image and a reference image as accurately as possible when markers based on the reference image are arranged on a plane and the reference image is present in the query image. An apparatus and method are provided. A feature point p ′ of a query image, a feature point p of a reference image, and a Homography matrix H that projects from the reference image to the query image are input, and a rotation parallel progression sequence from the reference image to the planar image with respect to the query image A similarity calculation means for calculating T and calculating the similarity between a point obtained by projective transformation using the Homography matrix H and the rotation parallel progression T for a plurality of other feature points of the reference image and a feature point of the query image Have. [Selection] Figure 1

Description

本発明は、特徴点集合で表される画像間の類似度を算出する技術に関する。また、画像間の類似度を算出することによって、特徴点集合で表されるリファレンス画像（検索対象の画像）の集合から、同じく特徴点集合で表されるクエリ画像（検索キーとなる画像）に類似したリファレンス画像を高精度に検索する技術に関する。 The present invention relates to a technique for calculating the similarity between images represented by feature point sets. Further, by calculating the similarity between images, a reference image (image to be searched) represented by a set of feature points is changed to a query image (image serving as a search key) also represented by a set of feature points. The present invention relates to a technique for retrieving a similar reference image with high accuracy.

近年、局所特徴点（特徴ベクトル）に基づいた画像認識や検索技術が注目されている。特に、ＳＩＦＴ(Scale-Invariant Feature Transform)やＳＵＲＦ(Speeded Up Robust Features)のようなアルゴリズムが、回転やスケールの変化にロバストであって、広く利用されている。 In recent years, image recognition and search techniques based on local feature points (feature vectors) have attracted attention. In particular, algorithms such as SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features) are robust to changes in rotation and scale, and are widely used.

従来、局所特徴量を用いた類似画像検索の枠組みは、「Bag-of-Visual Words」（又はBag-of-Features、Bag-of-Keypoints）と称される（例えば特許文献１及び非特許文献１参照）。この技術によれば、Bag-of-Wordsモデル及び転置インデックスを用いた文章の検索方法を、類似画像の検索に適用したものである。Bag-of-Wordsは、文章を１つの単語の頻度により定義される特徴ベクトルで表現し、文章集合に基づいて予め導出されたＩＤＦ(Inverse Document Frequency)を単語の重みとして文章間の類似度を導出する枠組みである。これに対し、Bag-of-Visual Wordsは、画像の局所特徴量を量子化し、量子化後の局所特徴量を単語と見立て、同様に頻度により定義される１つの特徴ベクトルとして表現し、ＩＤＦを用いた重み付けを利用して同一の類推方法を適用することができる。 Conventionally, a similar image retrieval framework using local features is called “Bag-of-Visual Words” (or Bag-of-Features, Bag-of-Keypoints) (for example, Patent Document 1 and Non-Patent Documents). 1). According to this technique, a sentence retrieval method using a Bag-of-Words model and a transposed index is applied to retrieval of similar images. Bag-of-Words expresses a sentence as a feature vector defined by the frequency of one word, and uses IDF (Inverse Document Frequency) derived in advance based on the sentence set to determine the similarity between sentences. It is a framework to derive. On the other hand, Bag-of-Visual Words quantizes the local feature quantity of an image, regards the local feature quantity after quantization as a word, and expresses it as one feature vector similarly defined by the frequency. The same analogy method can be applied using the weighting used.

これら従来技術に対して、クエリ画像に対して類似度の高い順に並べられたリファレンス画像のリストについて、更なる正確度を高めたい場合もある。また、クエリ画像とリファレンス画像とに、同一の物体が含まれているか否かを、類似度に基づいて閾値で判定したい場合もある。これらの場合に対して、最初の検索結果で得られた上位Ｎ件のリファレンス画像について、更に正確なスコアを計算するリランキング技術がある（例えば非特許文献２及び３参照）。非特許文献２に記載の技術によれば、上位Ｎ件の各リファレンス画像について、クエリ画像との近傍点対応に基づいて、リファレンス画像とクエリ画像間のHomography行列を推定する。そのHomography行列により定義される拘束条件を満たす点対応 (inlier) の数を新たなスコアとして利用することで、スコアの高精度化を実現している。 In some cases, it may be desirable to further improve the accuracy of the list of reference images arranged in descending order of similarity to the query image. Further, there is a case where it is desired to determine whether or not the same object is included in the query image and the reference image using a threshold value based on the similarity. For these cases, there is a reranking technique that calculates a more accurate score for the top N reference images obtained from the first search result (see, for example, Non-Patent Documents 2 and 3). According to the technique described in Non-Patent Document 2, the Homography matrix between the reference image and the query image is estimated for each of the top N reference images based on the neighborhood point correspondence with the query image. By using the number of point correspondences (inliers) that satisfy the constraint conditions defined by the Homography matrix as a new score, high accuracy of the score is achieved.

一方で、サービスの視点によれば、従来、例えばトレーディングカードゲーム（Trading Card Game、略称「トレカ」）について、拡張現実感（Augmented Reality, AR）の技術を用いたものがある。例えば、カードに、ＡＲマーカが記載されたものがある（例えば非特許文献４参照）。これは、所定アプリをダウンロードしたスマートフォンで、そのカードに記載されたＡＲマーカを読み込み。これによって、そのスマートフォンのディスプレイに、キャラクタが浮かび上がるように出現し、ゲームを進行させるものである。 On the other hand, from the viewpoint of services, conventionally, for example, trading card games (abbreviated as “trading cards”) have used augmented reality (AR) technology. For example, there is a card in which an AR marker is described (see, for example, Non-Patent Document 4). This is a smartphone that has downloaded a predetermined application, and reads the AR marker written on the card. As a result, the character appears on the display of the smartphone so as to emerge, and the game proceeds.

特開２０１０−２８２５８１号公報JP 2010-282581 A

J. Sivic et al., "Video Google: A Text Retrieval Approach toObject Matching in Videos," in Proc. ICCV, 2003.J. Sivic et al., "Video Google: A Text Retrieval Approach to Object Matching in Videos," in Proc. ICCV, 2003. J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman,"Object Retrieval with Large Vocabularies and Fast Spatial Matching,"in Proc of CVPR, 2007.J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Object Retrieval with Large Vocabularies and Fast Spatial Matching," in Proc of CVPR, 2007. O. Chum and J. Matas, "Matching with PROSAC - ProgressiveSample Consensus," in Proc. of CVPR, 2005.O. Chum and J. Matas, "Matching with PROSAC-ProgressiveSample Consensus," in Proc. Of CVPR, 2005. バンダイ、「ＡＲコンテンツを搭載したカードダスを7月30日に発売」、[online]、［平成２５年４月１９日検索］、インターネット＜URL:http://gamebiz.jp/?p=15875＞Bandai, “Carddass equipped with AR content will be released on July 30,” [online], [April 19, 2013 search], Internet <URL: http://gamebiz.jp/?p=15875> D. G. Lowe, "Distinctive Image Features from Scale-InvariantKeypoints," International Journal of Computer Vision, vol. 60, no. 2, pp.91-110, 2004.D. G. Lowe, "Distinctive Image Features from Scale-InvariantKeypoints," International Journal of Computer Vision, vol. 60, no. 2, pp.91-110, 2004. Y. Uchida, K. Takagi, and S. Sakazawa, "An Alternative to IDF:Effective Scoring for Accurate Image Retrieval with Non-Parametric DensityRatio Estimation," in Proc. of ICPR, 2012.Y. Uchida, K. Takagi, and S. Sakazawa, "An Alternative to IDF: Effective Scoring for Accurate Image Retrieval with Non-Parametric DensityRatio Estimation," in Proc. Of ICPR, 2012. H. Jegou, M. Douze, and C. Schmid, "Product quantization fornearest neighbor search," in IEEE Trans. on PAMI, vol. 33, no. 1, pp117-128, 2011.H. Jegou, M. Douze, and C. Schmid, "Product quantization fornearest neighbor search," in IEEE Trans. On PAMI, vol. 33, no. 1, pp117-128, 2011. H. Jegou, M. Douze, and C. Schmid, "Improving bag-offeaturesfor large scale image search," in IJCV, vol.87, no.3, pp.316-336, 2010.H. Jegou, M. Douze, and C. Schmid, "Improving bag-offeatures for large scale image search," in IJCV, vol.87, no.3, pp.316-336, 2010.

しかしながら、既存のリランキング技術によれば、上位Ｎ件の画像に対し、個別にHomography行列を推定し且つスコアを算出している。そのために、以下のような問題が生じる。
（１）Homography行列の推定には、最低４組の点対応が必要である。そのために、４組全ての点対応が、inlierでなければ正解のHomography行列が得られない。４組全ての点対応がinlierである確率は、outlierの割合の増加に応じて、指数的に減少し、認識精度も低下する。
（２）Homography行列は、その算出に用いられた点対応以外の点対応を利用して、その信頼度が算出される。そのために、少なくとも４組よりも多い点対応が得られていなければ、Homography行列の検証ができない。結果的に未検出を発生させる。 However, according to the existing reranking technique, the Homography matrix is individually estimated and the score is calculated for the top N images. Therefore, the following problems arise.
(1) The estimation of the homography matrix requires at least four pairs of points. For this reason, a correct Homography matrix cannot be obtained unless all four points correspond to inliers. The probability that all four points correspond to inliers decreases exponentially as the outlier ratio increases, and the recognition accuracy also decreases.
(2) The reliability of the Homography matrix is calculated using point correspondence other than the point correspondence used for the calculation. Therefore, the verification of the Homography matrix cannot be performed unless at least four point correspondences are obtained. As a result, non-detection occurs.

このような画像の局所特徴点の技術は、カードを用いたマーカ認識技術に適用することもできる。例えばＡＲマーカやＱＲコード(Quick Response Code)（登録商標）のような人工的なマーカではなく、自然画像をマーカとして認識することができる。特に、カードのようなものは、テーブルのような平面上に配置されるのが一般的である。 Such a technique of local feature points of an image can also be applied to a marker recognition technique using a card. For example, a natural image can be recognized as a marker, not an artificial marker such as an AR marker or a QR code (Quick Response Code) (registered trademark). In particular, a card or the like is generally arranged on a plane such as a table.

そこで、本発明によれば、リファレンス画像に基づくマーカ（例えばカード）が平面上に配置された状態で写るクエリ画像について、クエリ画像とリファレンス画像との間の類似度を、できる限り正確に算出するプログラム、装置及び方法を提供する。 Therefore, according to the present invention, the similarity between the query image and the reference image is calculated as accurately as possible with respect to the query image captured in a state where a marker (for example, a card) based on the reference image is arranged on a plane. A program, an apparatus, and a method are provided.

本発明によれば、特徴点集合で表されるクエリ画像とリファレンス画像との間の類似度を算出するべく、装置に搭載されたコンピュータを機能させるプログラムであって、
クエリ画像の特徴点ｐ'及びリファレンス画像の特徴点ｐの点対応集合Ｐと、リファレンス画像からクエリ画像へ射影するHomography行列Ｈとを入力し、
２つの点対応（ｐ1,ｐ1'）（ｐ2,ｐ2'）を任意に選択する第１のステップと、
２つの点対応を用いて、リファレンス画像からクエリ画像に対する平面画像への回転並進行列Ｔを算出する第２のステップと、
Homography行列Ｈ及び回転並進行列Ｔを用いて射影変換した点と、クエリ画像の特徴点との間の誤差が、所定範囲に収まった近傍数を計数する第３のステップと
を有し、点対応集合Ｐの全ての点対応に対して第１〜第３のステップを繰り返し、
近傍数が最も多い高整合の回転並進行列Ｔを導出する第４のステップと、
高整合の回転並進行列Ｔについて所定範囲に収まった近傍点対応数を、リファレンス画像とクエリ画像との間の類似度とする第５のステップと
を実行する類似度算出手段としてコンピュータを機能させることを特徴とする。 According to the present invention, in order to calculate the similarity between the query image represented by the feature point set and the reference image, a program for causing a computer mounted on the apparatus to function,
Input a point correspondence set P of the feature point p ′ of the query image and the feature point p of the reference image, and a Homography matrix H that projects from the reference image to the query image,
A first step of arbitrarily selecting two point correspondences (p1, p1 ′) (p2, p2 ′);
A second step of calculating a rotation parallel sequence T from the reference image to the planar image for the query image using the two point correspondence;
A third step of counting the number of neighbors in which the error between the points transformed by using the Homography matrix H and the rotational parallel progression T and the feature points of the query image are within a predetermined range; Repeat steps 1 to 3 for all points in the set P;
A fourth step of deriving a highly aligned rotating parallel train T having the largest number of neighbors;
Causing the computer to function as a similarity calculation means for executing a fifth step in which the number of neighboring points corresponding to the highly aligned rotation parallel progression T is within the predetermined range is set as the similarity between the reference image and the query image. It is characterized by.

本発明のプログラムにおける他の実施形態によれば、
クエリ画像及びリファレンス画像から局所特徴点の特徴点集合を抽出する局所特徴点抽出手段と、
クエリ画像の特徴点集合とリファレンス画像の特徴点集合とをマッチングし、所定範囲で類似したクエリ画像の特徴点ｐ'及びリファレンス画像の特徴点ｐを類似度算出手段へ出力するマッチング手段と
してコンピュータを更に機能させることも好ましい。 According to another embodiment of the program of the present invention,
Local feature point extraction means for extracting a feature point set of local feature points from a query image and a reference image;
A computer is used as a matching unit that matches a feature point set of a query image with a feature point set of a reference image, and outputs a feature point p ′ of a query image that is similar in a predetermined range and a feature point p of the reference image to a similarity calculation unit. It is also preferable to make it function.

本発明のプログラムにおける他の実施形態によれば、
クエリ画像に、複数のリファレンス画像が含まれており、
Homography行列Ｈは、当該リファレンス画像とは別のリファレンス画像から、クエリ画像へ射影されたものである
ようにコンピュータを機能させることも好ましい。 According to another embodiment of the program of the present invention,
The query image contains multiple reference images,
It is also preferable to cause the computer to function so that the Homography matrix H is projected onto a query image from a reference image different from the reference image.

本発明のプログラムにおける他の実施形態によれば、
複数のリファレンス画像それぞれから抽出された局所特徴点集合を蓄積するリファレンス特徴点蓄積手段と、
リファレンス特徴点蓄積手段を用いて、クエリ画像の特徴点集合と類似する上位Ｎ件のリファレンス画像を検索する画像検索手段と、
上位Ｎ件のリファレンス画像について、類似度の上位から順に、クエリ画像との間のHomography行列Ｈを算出し、所定閾値以上で類似する近傍点対応（inliner）となったHomography行列Ｈを出力するHomography行列推定手段と
してコンピュータを更に機能させることも好ましい。 According to another embodiment of the program of the present invention,
Reference feature point storage means for storing a set of local feature points extracted from each of a plurality of reference images;
Image search means for searching for top N reference images similar to the feature point set of the query image using reference feature point storage means;
For the top N reference images, the Homography matrix H between the query images is calculated in order from the top of the similarity, and the Homography matrix H that is similar to an inliner at a predetermined threshold or more is output. It is also preferable to further cause the computer to function as matrix estimation means.

本発明のプログラムにおける他の実施形態によれば、
Homography行列推定手段は、上位Ｎ件のリファレンス画像について、類似度の上位から順に、クエリ画像との間のHomography行列Ｈを算出していく際に、最初に近傍点対応（inliner）が所定数以上となったHomography行列を出力する
ようにコンピュータを機能させることも好ましい。 According to another embodiment of the program of the present invention,
When the Homography matrix estimation means calculates the Homography matrix H between the query images for the top N reference images in order from the top of the similarity, first, the neighborhood point correspondence (inliner) is a predetermined number or more. It is also preferable to cause the computer to function so as to output the obtained Homography matrix.

本発明のプログラムにおける他の実施形態によれば、
マッチング手段は、各クエリ特徴点ｐ'に対し、最も類似したリファレンス特徴点ｐを対応付けて出力する
ようにコンピュータを機能させることも好ましい。 According to another embodiment of the program of the present invention,
The matching means preferably causes the computer to function so that the most similar reference feature point p is output in association with each query feature point p ′.

本発明のプログラムにおける他の実施形態によれば、
マッチング手段は、各クエリ特徴点ｐ'に対し、最も類似したリファレンス特徴点ｐを対応付け、リファレンス特徴点ｐに複数のクエリ特徴点ｐ'が対応付けられた場合にはリファレンス特徴点ｐに最も類似するクエリ特徴点ｐ'のみを対応付けて出力する
ようにコンピュータを機能させることも好ましい。 According to another embodiment of the program of the present invention,
The matching means associates the most similar reference feature point p with each query feature point p ′, and when a plurality of query feature points p ′ are associated with the reference feature point p, the matching feature point p is the most. It is also preferable to make the computer function so that only similar query feature points p ′ are output in association with each other.

本発明のプログラムにおける他の実施形態によれば、
回転並進行列Ｔは、１つの回転パラメータθと、２つの並進パラメータｔx,ｔyとを有し、
クエリ画像の特徴点ｐ'と、リファレンス画像の特徴点ｐと、Homography行列Ｈと、回転並進行列Ｔとは、ｐ'＝Ｈ・Ｔ・ｐの関係にある

ようにコンピュータを機能させることも好ましい。 According to another embodiment of the program of the present invention,
The rotational translation sequence T has one rotational parameter θ and two translation parameters tx, ty,
The feature point p ′ of the query image, the feature point p of the reference image, the Homography matrix H, and the rotation parallel sequence T have a relationship of p ′ = H · T · p.

It is also preferable to make the computer function.

本発明のプログラムにおける他の実施形態によれば、
回転並進行列Ｔは、２組の点対応関係から、非線形最小二乗法、レーベンバーグ・マーカート法、又は、ニュートン法を用いて推定されるようにコンピュータを機能させることも好ましい。 According to another embodiment of the program of the present invention,
It is also preferable to cause the computer to function so that the rotation parallel progression T is estimated from the two pairs of point correspondences using the nonlinear least square method, the Levenberg-Marquardt method, or the Newton method.

本発明によれば、特徴点集合で表されるクエリ画像とリファレンス画像との間の類似度を算出する装置であって、
クエリ画像の特徴点ｐ'及びリファレンス画像の特徴点ｐの点対応集合Ｐと、リファレンス画像からクエリ画像へ射影するHomography行列Ｈとを入力し、
２つの点対応（ｐ1,ｐ1'）（ｐ2,ｐ2'）を任意に選択する第１のステップと、
２つの点対応を用いて、リファレンス画像からクエリ画像に対する平面画像への回転並進行列Ｔを算出する第２のステップと、
Homography行列Ｈ及び回転並進行列Ｔを用いて射影変換した点と、クエリ画像の特徴点との間の誤差が、所定範囲に収まった近傍数を計数する第３のステップと
を有し、点対応集合Ｐの全ての点対応に対して第１〜第３のステップを繰り返し、
近傍数が最も多い高整合の回転並進行列Ｔを導出する第４のステップと、
高整合の回転並進行列Ｔについて所定範囲に収まった近傍点対応数を、リファレンス画像とクエリ画像との間の類似度とする第５のステップと
を実行する類似度算出手段を有することを特徴とする。 According to the present invention, an apparatus for calculating a similarity between a query image represented by a feature point set and a reference image,
Input a point correspondence set P of the feature point p ′ of the query image and the feature point p of the reference image, and a Homography matrix H that projects from the reference image to the query image,
A first step of arbitrarily selecting two point correspondences (p1, p1 ′) (p2, p2 ′);
A second step of calculating a rotation parallel sequence T from the reference image to the planar image for the query image using the two point correspondence;
A third step of counting the number of neighbors in which the error between the points transformed by using the Homography matrix H and the rotational parallel progression T and the feature points of the query image are within a predetermined range; Repeat steps 1 to 3 for all points in the set P;
A fourth step of deriving a highly aligned rotating parallel train T having the largest number of neighbors;
It has a similarity calculation means for executing a fifth step in which the correspondence number of neighboring points within a predetermined range for the highly aligned rotation parallel progression T is set as the similarity between the reference image and the query image. To do.

本発明によれば、装置を用いて、特徴点集合で表されるクエリ画像とリファレンス画像との間の類似度を算出する方法であって、
クエリ画像の特徴点ｐ'及びリファレンス画像の特徴点ｐの点対応集合Ｐと、リファレンス画像からクエリ画像へ射影するHomography行列Ｈとを入力し、
２つの点対応（ｐ1,ｐ1'）（ｐ2,ｐ2'）を任意に選択する第１のステップと、
２つの点対応を用いて、リファレンス画像からクエリ画像に対する平面画像への回転並進行列Ｔを算出する第２のステップと、
Homography行列Ｈ及び回転並進行列Ｔを用いて射影変換した点と、クエリ画像の特徴点との間の誤差が、所定範囲に収まった近傍数を計数する第３のステップと
を有し、点対応集合Ｐの全ての点対応に対して第１〜第３のステップを繰り返し、
近傍数が最も多い高整合の回転並進行列Ｔを導出する第４のステップと、
高整合の回転並進行列Ｔについて所定範囲に収まった近傍点対応数を、リファレンス画像とクエリ画像との間の類似度とする第５のステップと
を有することを特徴とする。 According to the present invention, using a device, a method for calculating a similarity between a query image represented by a feature point set and a reference image,
Input a point correspondence set P of the feature point p ′ of the query image and the feature point p of the reference image, and a Homography matrix H that projects from the reference image to the query image,
A first step of arbitrarily selecting two point correspondences (p1, p1 ′) (p2, p2 ′);
A second step of calculating a rotation parallel sequence T from the reference image to the planar image for the query image using the two point correspondence;
A third step of counting the number of neighbors in which the error between the points transformed by using the Homography matrix H and the rotational parallel progression T and the feature points of the query image are within a predetermined range; Repeat steps 1 to 3 for all points in the set P;
A fourth step of deriving a highly aligned rotating parallel train T having the largest number of neighbors;
The fifth step is characterized in that the number of neighboring points that fall within a predetermined range with respect to the highly aligned rotational parallel progression T is a similarity between the reference image and the query image.

本発明のプログラム、装置及び方法によれば、リファレンス画像に基づくマーカが平面上に配置された状態で写るクエリ画像について、クエリ画像とリファレンス画像との間の類似度を、できる限り正確に算出することができる。 According to the program, apparatus, and method of the present invention, the similarity between a query image and a reference image is calculated as accurately as possible for a query image that is captured in a state where markers based on the reference image are arranged on a plane. be able to.

本発明における検索装置の機能構成図である。It is a functional block diagram of the search device in this invention. Homography行列に基づくinlier及びoutlierを表す画像対応図である。It is an image correspondence figure showing inlier and outlier based on a Homography matrix. クエリ画像座標系とリファレンス画像座標系との間の変換を表す説明図である。It is explanatory drawing showing conversion between a query image coordinate system and a reference image coordinate system. クエリ画像に写る一方のカードとリファレンス画像とを対応付けた説明図である。It is explanatory drawing which matched one card | curd which appears in a query image, and the reference image. 図４に基づいて、クエリ画像に写る他方のカードとリファレンス画像とを対応付けた説明図である。It is explanatory drawing which matched the other card | curd which appears in a query image, and the reference image based on FIG. 本発明の検索装置を組み込んだスマートフォンを用いた使用形態を表す説明図である。It is explanatory drawing showing the usage pattern using the smart phone incorporating the search device of this invention.

以下では、本発明の実施の形態について、図面を用いて詳細に説明する。 Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

本発明によれば、以下のように２つの実施の形態に区分できる。
（１）類似度算出機能：特徴点集合で表されるクエリ画像とリファレンス画像との間の類似度を算出する。
（２）検索装置：類似度算出機能を用いて、大量のリファレンス画像の中から、クエリ画像に写る１つ以上のリファレンス画像を検索する。
尚、以下では、本発明を、類似度算出部を含む検索装置として説明する。 According to the present invention, it can be divided into two embodiments as follows.
(1) Similarity calculation function: calculates a similarity between a query image represented by a feature point set and a reference image.
(2) Search device: Using a similarity calculation function, one or more reference images appearing in a query image are searched from a large number of reference images.
Hereinafter, the present invention will be described as a search device including a similarity calculation unit.

図１は、本発明における検索装置の機能構成図である。 FIG. 1 is a functional configuration diagram of a search device according to the present invention.

図１によれば、検索装置１は、大量の「リファンレンス画像」を予め入力する。リファレンス画像には、同一の物体又は同一カテゴリに属する少なくとも１つのインスタンス（対象物、マーカ）が写っている。そして、検索の際、検索装置１は、検索キーとなる「クエリ画像」を入力し、そのクエリ画像に写る１枚以上のリファレンス画像を検索する。 According to FIG. 1, the search device 1 inputs a large amount of “reference images” in advance. In the reference image, at least one instance (object, marker) belonging to the same object or the same category is shown. In the search, the search device 1 inputs a “query image” serving as a search key, and searches for one or more reference images that appear in the query image.

図１の検索装置１は、局所特徴点抽出部１１と、リファレンス特徴点蓄積部１２と、画像検索部１３と、マッチング・リランキング部１４と、Homography推定部１５と、類似度算出部１６とを有する。これら機能構成部は、検索装置に搭載されたコンピュータを機能させるプログラムを実行することによって実現できる。特に、本発明の特徴部分である類似度算出部１６は、その処理の流れから、類似度算出方法としても理解できる。 1 includes a local feature point extraction unit 11, a reference feature point storage unit 12, an image search unit 13, a matching / reranking unit 14, a homography estimation unit 15, and a similarity calculation unit 16. Have These functional components can be realized by executing a program that causes a computer installed in the search device to function. In particular, the similarity calculation unit 16 which is a characteristic part of the present invention can be understood as a similarity calculation method from the processing flow.

［局所特徴点抽出部１１］
局所特徴点抽出部１１は、クエリ画像及びリファレンス画像それぞれから局所特徴点の特徴点集合を抽出する。リファレンス画像の特徴点集合は、リファレンス特徴点蓄積部１２へ出力され、クエリ画像の特徴点集合は、画像検索部１３へ出力される。 [Local feature point extraction unit 11]
The local feature point extraction unit 11 extracts a feature point set of local feature points from each of the query image and the reference image. The feature point set of the reference image is output to the reference feature point storage unit 12, and the feature point set of the query image is output to the image search unit 13.

局所特徴点の抽出アルゴリズムとしては、例えばＳＩＦＴやＳＵＲＦ、ＯＲＢ(Oriented FAST and Rotated BRIEF)が用いられる。これらの局所特徴点は、座標ｐ＝(x,y)と特徴量ベクトルfとによって記述される。尚、クエリ画像の局所特徴点と、リファレンス画像の局所特徴点とは、その特徴ベクトルについて同じ次元数である。 For example, SIFT, SURF, or ORB (Oriented FAST and Rotated BRIEF) is used as the local feature point extraction algorithm. These local feature points are described by coordinates p = (x, y) and feature quantity vector f. Note that the local feature points of the query image and the local feature points of the reference image have the same number of dimensions for the feature vector.

例えば、ＳＩＦＴの場合、１枚の画像からは１２８次元の特徴点集合が抽出される（例えば非特許文献５参照）。ＳＩＦＴとは、スケールスペースを用いて特徴的な局所領域を解析し、そのスケール変化及び回転に不変となる特徴ベクトルを記述する技術である。一方で、ＳＵＲＦの場合、ＳＩＦＴよりも高速処理が可能であって、１枚の画像から６４次元の特徴点集合が抽出される。また、ＯＲＢは、バイナリコードによる特徴記述としてＢＲＩＥＦ(Binary Robust Independent Elementary Features)を用いて、１つのコンテンツから２５６ビットのバイナリ特徴ベクトルの集合を抽出する。特に、ＯＲＢによれば、ＳＩＦＴやＳＵＲＦと比較して、同等以上の精度を保持すると共に、数百倍の高速化を実現することができる。 For example, in the case of SIFT, a 128-dimensional feature point set is extracted from one image (see, for example, Non-Patent Document 5). SIFT is a technique for analyzing a characteristic local region using a scale space and describing a feature vector that is invariant to scale change and rotation. On the other hand, in the case of SURF, higher-speed processing is possible than in SIFT, and a 64-dimensional feature point set is extracted from one image. In addition, the ORB extracts a set of 256-bit binary feature vectors from one content by using BRIEF (Binary Robust Independent Elementary Features) as a feature description by a binary code. In particular, according to the ORB, it is possible to maintain an accuracy equal to or higher than that of SIFT or SURF and realize a speed increase of several hundred times.

［リファレンス特徴点蓄積部１２］
リファレンス特徴点蓄積部１２は、リファレンス画像毎（リファレンス画像識別子毎）に、局所特徴点の特徴点集合を蓄積する。 [Reference Feature Point Accumulator 12]
The reference feature point accumulation unit 12 accumulates a feature point set of local feature points for each reference image (for each reference image identifier).

［画像検索部１３］
画像検索部１３は、リファレンス特徴点蓄積部１２を用いて、クエリ画像の特徴点集合と類似する上位Ｎ（＞１）件のリファレンス画像を検索する（例えば非特許文献６参照）。クエリ画像の特徴点と、リファレンス画像の特徴点との間の距離が短いほど、類似度が高いことを意味する。具体的には、最近傍探索(Approximate Nearest Neighbor)アルゴリズムの１つである直積量子化を用いた方法（例えば非特許文献７参照）やHamming Embeddingを用いた方法（例えば非特許文献８参照）、ＬＳＨ(Locality-Sensitive
Hashing)を用いることも好ましい。上位Ｎ件のリファレンス画像の特徴点集合は、マッチング・リランキング部１４へ出力され、リファレンス画像の類似度順位は、Homography推定部１５へ出力される。 [Image Search Unit 13]
The image search unit 13 uses the reference feature point storage unit 12 to search for top N (> 1) reference images similar to the feature point set of the query image (for example, see Non-Patent Document 6). The shorter the distance between the feature point of the query image and the feature point of the reference image, the higher the similarity. Specifically, a method using Cartesian product quantization (for example, Non-Patent Document 7) or a method using Hamming Embedding (for example, Non-Patent Document 8), which is one of the Approximate Nearest Neighbor algorithms, LSH (Locality-Sensitive
It is also preferable to use (Hashing). The feature point set of the top N reference images is output to the matching / reranking unit 14, and the similarity ranking of the reference images is output to the Homography estimation unit 15.

［マッチング・リランキング部１４］
マッチング・リランキング部１４は、マッチング機能とリランキング機能とを有する。
＜マッチング機能＞
マッチング機能は、各クエリ特徴点ｐ'に対し、最も類似したリファレンス特徴点ｐを対応付ける。また、リファレンス特徴点ｐに複数のクエリ特徴点ｐ'が対応付けられた場合にはリファレンス特徴点ｐに最も類似するクエリ特徴点ｐ'のみを対応付ける。対応付けられたリファレンス画像とクエリ画像との点対応Ｐ＝{(p, p')}集合は、Homography推定部１５及び類似度算出部１６へ出力される。 [Matching / Reranking Unit 14]
The matching / reranking unit 14 has a matching function and a reranking function.
<Matching function>
The matching function associates the most similar reference feature point p with each query feature point p ′. When a plurality of query feature points p ′ are associated with the reference feature point p, only the query feature point p ′ most similar to the reference feature point p is associated. The point correspondence P = {(p, p ′)} set between the associated reference image and query image is output to the Homography estimation unit 15 and the similarity calculation unit 16.

＜リランキング機能＞
画像検索部１３から出力されたリファレンス画像のリストを、類似度算出部１６から出力された類似度に応じて降順にリランキングする。リランキングには、例えばＩＤＦ(Inverse Document Frequency)を用いることができる。最終的に、所定閾値以上の上位のスコアを得たリファレンス画像を、検索結果として出力する。 <Reranking function>
The list of reference images output from the image search unit 13 is reranked in descending order according to the similarity output from the similarity calculation unit 16. For example, IDF (Inverse Document Frequency) can be used for the re-ranking. Finally, a reference image obtained with a higher score above a predetermined threshold is output as a search result.

［Homography推定部１５］
Homography推定部１５では、画像検索部１３からの上位Ｎ件のリファレンス画像について、類似度の上位から降順に、クエリ画像との間のHomography行列Ｈを算出する。ここで、所定閾値以上で類似する近傍点対応（inliner）となったHomography行列Ｈを検出した際、そのHomography行列Ｈを、類似度算出部１６へ出力する。 [Homography estimation unit 15]
The Homography estimation unit 15 calculates a Homography matrix H between the top N reference images from the image search unit 13 and the query images in descending order of similarity. Here, when a Homography matrix H that is similar to a neighboring point (inliner) at a predetermined threshold or more is detected, the Homography matrix H is output to the similarity calculation unit 16.

図２は、Homography行列に基づくinlier及びoutlierを表す画像対応図である。outlierは、破線で表されている。 FIG. 2 is an image correspondence diagram showing inliers and outliers based on the Homography matrix. outlier is represented by a broken line.

クエリ画像及び対象リファレンス画像は、類似度が高いほど、特徴ベクトル同士が射影幾何学的に線形となる。従って、平面射影変換行列であるHomography行列Ｈによって、座標を置き換えることができる。Homography行列Ｈは、以下のように表される。

In the query image and the target reference image, as the degree of similarity is higher, the feature vectors are projected geometrically linear. Therefore, the coordinates can be replaced by the Homography matrix H which is a planar projective transformation matrix. The Homography matrix H is expressed as follows.

Homography行列の算出には、マッチング・リランキング部１４から入力された特徴点の対応点の集合が用いられる。Homography行列Ｈの未知パラメータ数は、８個（h11〜h33、h33=1）であり、一組の対応点は２個の制約式を与える。従って、この行列Ｈは、４組以上の対応点があれば、最小二乗法によって算出することができる。 For the calculation of the Homography matrix, a set of corresponding points of feature points input from the matching / reranking unit 14 is used. The number of unknown parameters of the Homography matrix H is 8 (h11 to h33, h33 = 1), and a set of corresponding points gives two constraint equations. Therefore, this matrix H can be calculated by the least square method if there are four or more pairs of corresponding points.

そこで、Homography行列推定部１５は、入力された特徴点の対応点の集合の中で、ランダムに４組を選択し、その４組からHomography行列Ｈを算出する。その後、そのHomography行列Ｈを用いて、各クエリ特徴点を射影した際に、以下のように判定する。
（１）リファレンス特徴点に対して所定閾値以下の近くに射影されれば、inlierと判定する。
（２）逆に、所定閾値よりも遠くに射影されれば、outlierと判定する。
この処理を複数回実行した後、inlierの数が最も多かったHomography行列Ｈのみを採用する。 Therefore, the Homography matrix estimation unit 15 randomly selects four sets from the set of corresponding points of the input feature points, and calculates a Homography matrix H from the four sets. Then, when each query feature point is projected using the Homography matrix H, it is determined as follows.
(1) If the reference feature point is projected close to a predetermined threshold value or less, it is determined as inlier.
(2) Conversely, if it is projected farther than the predetermined threshold, it is determined as outlier.
After this process is executed a plurality of times, only the Homography matrix H having the largest number of inliers is employed.

［類似度算出部１６］
類似度算出部１６は、クエリ画像の特徴点ｐ'及びリファレンス画像の特徴点ｐの点対応Ｐ＝{(ｐ, ｐ')}集合と、Homography行列Ｈとを入力する。そして、以下のステップを実行する。
（Ｓ１）２つの点対応（ｐ1,ｐ1'）（ｐ2,ｐ2'）を任意（ランダム）に選択する。
（Ｓ２）２つの点対応を用いて、リファレンス画像からクエリ画像に対する平面画像への回転並進行列Ｔを算出する。
（Ｓ３）Homography行列Ｈ及び回転並進行列Ｔを用いて射影変換した点と、クエリ画像の特徴点との間の誤差が、所定範囲に収まった近傍数を計数する。その誤差は、類似度を意味し（誤差が小さいほど類似度も高い）、以下のように表される。
誤差：||ｐ'−Ｈ・Ｔ・ｐ||²
そして、その誤差が所定閾値以下の近傍点対応をinlierと判定する。最終的には、inlierの数をカウントする。 [Similarity calculation unit 16]
The similarity calculation unit 16 inputs the point correspondence P = {(p, p ′)} set of the feature point p ′ of the query image and the feature point p of the reference image, and the Homography matrix H. Then, the following steps are executed.
(S1) Two point correspondences (p1, p1 ′) (p2, p2 ′) are arbitrarily (randomly) selected.
(S2) The rotation parallel progression T from the reference image to the planar image with respect to the query image is calculated using the two point correspondence.
(S3) Count the number of neighbors in which the error between the projection transformation point using the Homography matrix H and the rotation parallel progression T and the feature point of the query image falls within a predetermined range. The error means the similarity (the smaller the error, the higher the similarity), and is expressed as follows.
Error: || p'-H · T · p || ²
Then, it is determined that the correspondence of neighboring points whose error is equal to or less than a predetermined threshold is inlier. Finally, count the number of inliers.

ここで、点対応集合Ｐの全ての点対応に対して、Ｓ１〜Ｓ３を繰り返す。 Here, S1 to S3 are repeated for all point correspondences of the point correspondence set P.

（Ｓ４）近傍数が最も多い高整合の回転並進行列Ｔを導出する。即ち、最もinlierの数が多かった回転並進行列Ｔを採用する。
（Ｓ５）高整合の回転並進行列Ｔについて所定範囲に収まった近傍点対応数を、リファレンス画像とクエリ画像との間の類似度とする。
(S4) A highly aligned rotational parallel train T having the largest number of neighbors is derived. That is, the rotation parallel progression T having the largest number of inliers is employed.
(S5) The number of neighbor points corresponding to the high-alignment rotation parallel progression T within the predetermined range is set as the similarity between the reference image and the query image.

図３は、クエリ画像座標系とリファレンス画像座標系との間の変換を表す説明図である。 FIG. 3 is an explanatory diagram illustrating conversion between the query image coordinate system and the reference image coordinate system.

リファレンス画像は、回転並進行列Ｔによって中間座標系へ変換でき、更に、Homography行列Ｈによってクエリ画像へ変換できる。回転並進行列Ｔは、１つの回転パラメータθと、２つの並進パラメータｔx,ｔyとを有する。そして、クエリ画像の特徴点ｐ'と、リファレンス画像の特徴点ｐと、Homography行列Ｈと、回転並進行列Ｔとは、以下のような関係にある。
ｐ'＝Ｈ・Ｔ・ｐ

The reference image can be converted into the intermediate coordinate system by the rotation parallel progression T, and further converted into the query image by the Homography matrix H. The rotation translation sequence T has one rotation parameter θ and two translation parameters tx, ty. The feature point p ′ of the query image, the feature point p of the reference image, the Homography matrix H, and the rotation parallel progression sequence T have the following relationship.
p '= H ・ T ・ p

回転並進行列Ｔは、３つの未知変数（ｔx,ｔy、θ）であるために、少なくとも点対応（ｐ,ｐ'）の２つの組を用いて、最小二乗法によって算出することができる。その他の方法として、回転並進行列Ｔは、２組の点対応関係から、レーベンバーグ・マーカート法やニュートン法のような非線形最小二乗法を用いて算出することもできる。 Since the rotation parallel progression T is three unknown variables (tx, ty, θ), it can be calculated by the least square method using at least two pairs of point correspondences (p, p ′). As another method, the rotational parallel progression T can be calculated from the two pairs of point correspondences using a nonlinear least square method such as the Levenberg-Marquardt method or the Newton method.

前述したHomography推定部１５によれば、８個のパラメータを推定するために、少なくとも４組の点対応が必要となる。これに対し、類似度算出部１６によれば、２組の点対応のみで、回転並進パラメータを推定することができる。これは、リファレンス画像の存在する平面が既知であることを前提とすることによって、平行移動及び回転に基づく３パラメータのみを推定すればよいからである。これによって、以下のような効果が得られる。
（１）２組の点対応の全てがinlierである確率は、通常のHomography推定で求められる４組の点対応全てがinlierである確率よりも、圧倒的に高い。そのために、正解のパラメータが推定できる可能性が高くなり、認識精度も高くなる。
（２）２組の点対応のみがパラメータ推定に用いられるために、それ以外の点対応が検証に利用することができ、未検出率が低くなる。 According to the above-described Homography estimation unit 15, in order to estimate eight parameters, at least four sets of point correspondences are required. On the other hand, according to the similarity calculation unit 16, it is possible to estimate the rotational translation parameter only by correspondence between two sets of points. This is because it is only necessary to estimate three parameters based on translation and rotation on the assumption that the plane on which the reference image exists is known. As a result, the following effects can be obtained.
(1) The probability that all the two sets of point correspondences are inliers is overwhelmingly higher than the probability that all the four sets of point correspondences obtained by normal Homography estimation are inliers. Therefore, the possibility that the correct parameter can be estimated increases and the recognition accuracy also increases.
(2) Since only two sets of point correspondences are used for parameter estimation, the other point correspondences can be used for verification, and the undetected rate becomes low.

図４は、クエリ画像に写る一方のカードとリファレンス画像とを対応付けた説明図である。 FIG. 4 is an explanatory diagram in which one card shown in the query image is associated with the reference image.

図４によれば、クエリ画像には、例えば平面上に２枚のカード（複数のリファレンス画像）が配置された状態が写っているとする。その中で、１枚のカードについて、リファレンス画像と類似することが検出されることによって、リファレンス画像からクエリ画像へ変換するためのHomography行列Ｈが算出される。 According to FIG. 4, it is assumed that the query image includes a state in which two cards (a plurality of reference images) are arranged on a plane, for example. Among them, when one card is detected to be similar to the reference image, a Homography matrix H for converting from the reference image to the query image is calculated.

図５は、図４に基づいて、クエリ画像に写る他方のカードとリファレンス画像とを対応付けた説明図である。 FIG. 5 is an explanatory diagram in which the other card shown in the query image is associated with the reference image based on FIG.

図５によれば、図４によって算出されたHomography行列Ｈを用いて、別途、回転並進行列Ｔを算出するだけで、クエリ画像の他方のカードに類似するリファレンス画像を検出することできる。即ち、図５によれば、Homography行列Ｈとして、当該リファレンス画像とは別のリファレンス画像から、クエリ画像へ射影されたものを用いる。 According to FIG. 5, a reference image similar to the other card of the query image can be detected only by separately calculating the rotation parallel progression T using the Homography matrix H calculated in FIG. 4. That is, according to FIG. 5, as the Homography matrix H, a projection image projected from a reference image different from the reference image is used.

図６は、本発明の検索装置を組み込んだスマートフォンを用いた使用形態を表す説明図である。 FIG. 6 is an explanatory diagram showing a usage pattern using a smartphone incorporating the search device of the present invention.

図６によれば、ユーザが、スマートフォンのカメラを用いて、テーブル上に配置されたカードを撮影している。このとき、スマートフォンは、複数のカードが写るクエリ画像（撮影写真）から、できる限り正確に複数のカード（リファレンス画像）を検出することができる。これによって、所定アプリをダウンロードしたスマートフォンでは、複数のカードを撮影したクエリ画像を取り込むことによって、そのスマートフォンのディスプレイに、そのキャラクタが浮かび上がるように出現させることもできる。例えば、カードＡとカードＢとが揃って撮影された場合にのみ、キャラクタＡを出現させることができる。 According to FIG. 6, the user is photographing the card | curd arrange | positioned on the table using the camera of a smart phone. At this time, the smartphone can detect a plurality of cards (reference images) as accurately as possible from a query image (photographed photograph) showing a plurality of cards. Thereby, in the smart phone which downloaded the predetermined application, the character can also be made to appear on the display of the smart phone by capturing query images obtained by photographing a plurality of cards. For example, the character A can appear only when the card A and the card B are photographed together.

以上、詳細に説明したように、本発明のプログラム、装置及び方法によれば、リファレンス画像に基づくマーカが平面上に配置された状態で写るクエリ画像について、クエリ画像とリファレンス画像との間の類似度を、できる限り正確に算出することができる。 As described above in detail, according to the program, the apparatus, and the method of the present invention, the similarity between the query image and the reference image with respect to the query image that is captured in a state in which the marker based on the reference image is arranged on a plane. The degree can be calculated as accurately as possible.

前述した本発明の種々の実施形態について、本発明の技術思想及び見地の範囲の種々の変更、修正及び省略は、当業者によれば容易に行うことができる。前述の説明はあくまで例であって、何ら制約しようとするものではない。本発明は、特許請求の範囲及びその均等物として限定するものにのみ制約される。 Various changes, modifications, and omissions of the above-described various embodiments of the present invention can be easily made by those skilled in the art. The above description is merely an example, and is not intended to be restrictive. The invention is limited only as defined in the following claims and the equivalents thereto.

１検索装置
１１局所特徴点抽出部
１２リファレンス特徴点蓄積部
１３画像検索部
１４マッチング・リランキング部
１５ Homography推定部
１６類似度算出部 DESCRIPTION OF SYMBOLS 1 Search apparatus 11 Local feature point extraction part 12 Reference feature point storage part 13 Image search part 14 Matching / reranking part 15 Homography estimation part 16 Similarity calculation part

Claims

A program for causing a computer mounted on the apparatus to function in order to calculate the similarity between a query image represented by a feature point set and a reference image,
Input a point correspondence set P of the feature points p ′ of the query image and the feature points p of the reference image, and a Homography matrix H that projects from the reference image to the query image,
A first step of arbitrarily selecting two point correspondences (p1, p1 ′) (p2, p2 ′);
A second step of calculating a rotation parallel progression T from the reference image to a planar image with respect to the query image using the two point correspondence;
A third step of counting the number of neighbors in which an error between a point transformed using the Homography matrix H and the rotation parallel progression T and a feature point of the query image falls within a predetermined range; , Repeating the first to third steps for all point correspondences of the point correspondence set P,
A fourth step of deriving a highly aligned rotating parallel train T having the largest number of neighbors;
A computer as similarity calculation means for executing a fifth step in which the number of neighboring points corresponding to the high-alignment rotation parallel progression T is within a predetermined range is set as the similarity between the reference image and the query image. A program characterized by functioning.

Local feature point extraction means for extracting a feature point set of local feature points from the query image and the reference image;
The feature point set of the query image and the feature point set of the reference image are matched, and the feature point p ′ of the query image and the feature point p of the reference image that are similar in a predetermined range are output to the similarity calculation unit. The program according to claim 1, further causing the computer to function as matching means.

The query image includes a plurality of reference images,
The computer program according to claim 2, wherein the computer functions so that the Homography matrix H is projected from the reference image different from the reference image onto the query image.

Reference feature point storage means for storing a set of local feature points extracted from each of the plurality of reference images;
Image search means for searching for top N reference images similar to the feature point set of the query image using the reference feature point storage means;
For the top N reference images, the Homography matrix H between the query images is calculated in order from the top of the similarity, and the Homography matrix H that is similar to an inliner at a predetermined threshold or more is output. The program according to claim 2 or 3, further causing a computer to function as the homography matrix estimation means.

When calculating the Homography matrix H between the query image and the query image in order from the top of the similarity with respect to the top N reference images, the Homography matrix estimation means first matches the neighborhood point (inliner). The program according to claim 4, wherein the computer is caused to function so as to output a Homography matrix having a predetermined number or more.

The said matching means makes a computer function so that the most similar reference feature point p is matched and output with respect to each query feature point p ', The any one of Claim 2 to 5 characterized by the above-mentioned. Program.

The matching means associates each query feature point p ′ with the most similar reference feature point p, and when a plurality of query feature points p ′ are associated with the reference feature point p, the matching feature point p is designated as the reference feature point p. The program according to claim 6, wherein the computer is caused to function so as to output only the most similar query feature point p ′ in association with each other.

The rotational translation sequence T has one rotational parameter θ and two translation parameters tx, ty,
The feature point p ′ of the query image, the feature point p of the reference image, the Homography matrix H, and the rotation parallel sequence T are in a relationship of p ′ = H · T · p.

The program according to any one of claims 1 to 7, wherein the computer is caused to function as described above.

9. The computer system as claimed in claim 8, wherein the rotational parallel progression T is estimated from two sets of point correspondences using a nonlinear least square method, a Levenberg-Marquett method, or a Newton method. The program described in.

An apparatus for calculating the similarity between a query image represented by a feature point set and a reference image,
Input a point correspondence set P of the feature points p ′ of the query image and the feature points p of the reference image, and a Homography matrix H that projects from the reference image to the query image,
A first step of arbitrarily selecting two point correspondences (p1, p1 ′) (p2, p2 ′);
A second step of calculating a rotation parallel progression T from the reference image to a planar image with respect to the query image using the two point correspondence;
A third step of counting the number of neighbors in which an error between a point transformed using the Homography matrix H and the rotation parallel progression T and a feature point of the query image falls within a predetermined range; , Repeating the first to third steps for all point correspondences of the point correspondence set P,
A fourth step of deriving a highly aligned rotating parallel train T having the largest number of neighbors;
It has similarity calculation means for executing a fifth step in which the corresponding number of neighboring points that fall within a predetermined range with respect to the highly aligned rotational parallel progression T is set as the similarity between the reference image and the query image. A device characterized by.

A method of calculating a similarity between a query image represented by a feature point set and a reference image using an apparatus,
Input a point correspondence set P of the feature points p ′ of the query image and the feature points p of the reference image, and a Homography matrix H that projects from the reference image to the query image,
A first step of arbitrarily selecting two point correspondences (p1, p1 ′) (p2, p2 ′);
A second step of calculating a rotation parallel progression T from the reference image to a planar image with respect to the query image using the two point correspondence;
A third step of counting the number of neighbors in which an error between a point transformed using the Homography matrix H and the rotation parallel progression T and a feature point of the query image falls within a predetermined range; , Repeating the first to third steps for all point correspondences of the point correspondence set P,
A fourth step of deriving a highly aligned rotating parallel train T having the largest number of neighbors;
5. A method comprising: a fifth step in which a correspondence number between neighboring points that fall within a predetermined range with respect to the highly aligned rotation parallel progression T is set as a similarity between the reference image and the query image.