JP2015176484A

JP2015176484A - Method and system for three-dimensional model search

Info

Publication number: JP2015176484A
Application number: JP2014053956A
Authority: JP
Inventors: Chika Sanada; Junji Tatema; Masaki Aono
Original assignee: Toyohashi University of Technology NUC
Current assignee: Toyohashi University of Technology NUC
Priority date: 2014-03-17
Filing date: 2014-03-17
Publication date: 2015-10-05
Anticipated expiration: 2034-03-17
Also published as: JP6320806B2

Abstract

PROBLEM TO BE SOLVED: To provide a method and system for high-precision, high-speed three-dimensional model search. SOLUTION: The search method retrieves three-dimensional models by comparing Local Binary Pattern (LBP) feature values of a three-dimensional model to be searched with those of known three-dimensional models. The LBP feature values are obtained by generating depth buffer images of the known three-dimensional model at multiple viewpoints, generating multiple scale images with different image scales from each depth buffer image, and extracting an LBP histogram from each subdivided area of each scale image. The comparison of the LBP feature values assesses the dissimilarity between the known three-dimensional model and the three-dimensional model to be searched, using as an index the sum of the maximum values of the LBP histograms extracted from each scale image over all combinations of the multiple viewpoints.

Description

The present invention relates to a method and a system for searching a database of three-dimensional object models.

In the manufacture of machine parts, typified by the automobile industry, creating shape models of three-dimensional objects with 3D CAD/CAM systems is routine work. In the construction industry as well, three-dimensional models of buildings, rooms, furniture, trees, and the like are widely created so that the exterior and interior of a building and its surrounding scenery can be simulated with CG before construction. 3D CG technology has also become indispensable for producing animation, films, and commercials.

However, in these application fields, creating an elaborate three-dimensional model from scratch takes far more labor and time than drawing two-dimensional figures. If three-dimensional models that have once been created, whether by hand or with aids such as 3D scanners, are stored in a three-dimensional object model database, then models of similar shape can be reused when a similar three-dimensional object has to be modeled, which is expected to reduce costs substantially.

The technique described in Patent Document 1 divides polygons into pieces of a fixed size and searches using feature values defined from the areas and normal vectors of the divided polygons. Because it is an older method that performs no pose normalization, the search depends on position, orientation, and size, so models cannot be retrieved even when their shapes are similar.

The technique described in Patent Document 2 is a three-dimensional model similarity search method that uses two-dimensional images as search keys; it computes feature values from projection images or cross-sectional images obtained from a plurality of two-dimensional images and searches on that basis. The feature values in that patent include histograms of quantized color information (RGB, HSV, Lab, and so on) computed from the texture actually attached to the object surface, shape histograms of quantized edge derivatives, and histograms of the volume, surface area, vertex distribution, and polygon distribution of the three-dimensional object. Consequently, objects with similar shapes but different colors cannot be retrieved. In addition, as in Patent Document 1, there is no pose normalization and no use of spectra in frequency space, so the results depend on the projection parameters (cross-section generation parameters), and high search accuracy is unlikely.

Non-Patent Document 1 proposes several methods, of which the Hybrid Descriptor achieves the best search accuracy. It combines three features: power spectra obtained by applying a two-dimensional Fourier transform to depth buffer images from six directions; power spectra obtained by applying a two-dimensional Fourier transform to silhouette images from three directions; and vectors cast radially from the center until they meet the object surface. This composite of spectral and geometric features achieved high performance. After DESIRE, the name of the system created by Vranic, it is often called the DESIRE feature (or the DSR472 feature, after the name of the published program). However, spectral and geometric features are quite heterogeneous, which complicates index management, and depth buffer and silhouette images cannot capture the shape of an object's interior.

Non-Patent Document 2 generates 256 × 256 silhouette images from 100 or more different directions and uses 10 Fourier coefficients and 35 Zernike moment coefficients computed from them as the feature value. When similarity is computed, parts of similar silhouette images are compared with each other. This feature is called the Light Field Descriptor (LFD). Because the model is rendered from more than 100 directions, computing the feature takes a very long time. Because the shape is approximated from silhouettes alone, fine details are missed and features inside the object cannot be captured.

Patent Document 1: JP 2002-41530 A
Patent Document 2: JP 2004-164503 A

Non-Patent Document 1: D. Vranic, 3D Model Retrieval. Ph.D. thesis, University of Leipzig, 2004.
Non-Patent Document 2: D-Y. Chen, X-P. Tian, Y-T. Shen, M. Ouhyoung, On Visual Similarity Based 3D Model Retrieval. Computer Graphics Forum (2003), 223-232.

Among feature values for three-dimensional models, methods that analyze two-dimensional images generated from the model at multiple viewpoints, as in Non-Patent Documents 1 and 2, have achieved excellent search performance. In particular, analyzing depth buffer images, which express the relief of the model's shape as image intensity, is well suited to capturing the shape of a three-dimensional model. However, Non-Patent Document 1, which handles depth buffer images, analyzes only depth buffer images of a single resolution, so models whose overall shapes are similar but whose details differ may be judged to have different shapes.

There is also a problem with how the arbitrariness of a three-dimensional model's rotation is resolved. Non-Patent Document 1 obtains principal axes by principal component analysis or singular value decomposition and uses them to fix the rotation of the model uniquely. With this approach, the principal axes can change because of noise such as missing parts, even for similar shapes. Non-Patent Document 2 mitigates the arbitrariness of rotation by comparing feature values of two-dimensional images generated from a large number of viewpoints. In the comparison, the Euclidean distance between feature values is computed for each image pair, and the minimum is taken as the distance between the three-dimensional models. The comparison is therefore effectively made from a single viewpoint: the arbitrariness of rotation is resolved, but the shapes of the three-dimensional models are not compared accurately.

The present invention has been made in view of, and in order to solve, the above problems of the prior art.

The three-dimensional model search method according to claim 1 stores in a database, in advance, Local Binary Pattern (LBP) feature values of multiscale images of depth buffer images at a plurality of viewpoints for known three-dimensional models, and searches for three-dimensional models by comparing them with the LBP feature values of a three-dimensional model to be searched. The LBP feature values are obtained by generating depth buffer images at a plurality of viewpoints from a known three-dimensional model, generating multiscale images of different image scales for each depth buffer image, and extracting LBP histograms for the small regions into which each scale image is divided. The comparison of the LBP feature values determines the degree of dissimilarity with the three-dimensional model to be searched, using as an index the sum of the maximum values of the LBP histograms extracted for each scale image over all combinations of the plurality of viewpoints.

The three-dimensional model search method according to claim 2 is characterized in that the depth buffer images are generated at a predetermined pixel size from a predetermined number of viewpoints after the known three-dimensional model has been pose-normalized by singular value decomposition.

The three-dimensional model search method according to claim 3 is characterized in that the multiscale images are generated by applying a Gaussian filter that is varied over a plurality of levels.

The three-dimensional model search method according to claim 4 is characterized in that the degree of dissimilarity is the minimum-sum dissimilarity over the combinations obtained by computing a matrix of the sums of the maximum values of the LBP histograms of each scale image for the number of viewpoints and applying the Hungarian method to that matrix.

The three-dimensional model search system according to claim 5 searches for three-dimensional models by comparing LBP feature values of multiscale images of depth buffer images at a plurality of viewpoints, stored in a database in advance for known three-dimensional models, with the LBP feature values of a three-dimensional model to be searched. The system comprises: means for generating depth buffer images at a plurality of viewpoints from the three-dimensional model to be searched; means for generating multiscale images of different image scales for each depth buffer image; means for extracting LBP histograms for the small regions into which each scale image is divided; and means for determining the degree of dissimilarity between a known three-dimensional model and the three-dimensional model to be searched, using as an index the sum of the maximum values of the LBP histograms extracted for each scale image over all combinations of the plurality of viewpoints.

These means make it possible to realize a high-precision, high-speed three-dimensional model search method and three-dimensional model search system.

FIG. 1 is an explanatory diagram showing generation of the multiscale Local Binary Pattern histogram from a depth buffer image.
FIG. 2 is a flow diagram of the three-dimensional model search method and search system according to the present invention.
FIG. 3 is an explanatory diagram of pose normalization (step 1).
FIG. 4 is an explanatory diagram of depth buffer image generation (step 2).
FIG. 5 is an explanatory diagram of generation of the multiscale representation (images) (step 3).
FIG. 6 is an explanatory diagram of determining the Local Binary Pattern feature value (step 4).
FIG. 7 is an explanatory diagram of the binary code construction (step 4).
FIG. 8 is an explanatory diagram of the multiscale Local Binary Pattern, the feature value according to the present invention.
FIG. 9 is a graph of the Recall-Precision curves of each feature value on PSB.
FIG. 10 is a graph of the Recall-Precision curves of each feature value on ESB.

Embodiments of the present invention are described below with reference to the drawings and tables. The invention is implemented on a computer and is executed by operating hardware such as a CPU and memory.

In the present technique, the feature value of a three-dimensional model (hereinafter also called a three-dimensional shape model or three-dimensional object) is obtained by analyzing depth buffer images generated from the model at multiple viewpoints. When a depth buffer image is analyzed, Gaussian filters of different sizes are applied to produce a multiscale representation, and a Local Binary Pattern (LBP) histogram is computed for each scale. Using scales with different levels of detail captures everything from the rough shape to fine details. The LBP histograms of the scales are merged by taking the maximum value for each histogram element, which captures the shape that appears most strongly across the representations of differing detail.

First, the three-dimensional model is pose-normalized with PointSVD. PointSVD normalizes the pose of a three-dimensional object using points generated at random on its surface as sample points. First, the mean of the sample points is taken as the centroid of the model, and the model is translated so that the centroid lies at the origin of three-dimensional space. Second, the principal axes of the model are obtained by singular value decomposition of the sample points, and the model is rotated so that the principal axes lie along the x, y, and z axes. Finally, the coordinates of each vertex are divided by the maximum distance between a vertex and the centroid, so that the model fits inside the unit sphere.
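The three PointSVD operations described above (translate, rotate, scale) can be sketched as follows. This is a minimal illustration that assumes NumPy is available; the function name `normalize_pose` is hypothetical, and the sketch does not resolve the sign (flip) ambiguity of the principal axes, which this text does not discuss.

```python
import numpy as np

def normalize_pose(sample_points, vertices):
    """Translate, rotate and scale a model following the PointSVD description."""
    samples = np.asarray(sample_points, dtype=float)
    verts = np.asarray(vertices, dtype=float)

    # 1) Translation: the mean of the surface samples is taken as the centroid.
    centroid = samples.mean(axis=0)
    samples_c = samples - centroid
    verts_c = verts - centroid

    # 2) Rotation: the right singular vectors of the centred samples are the
    #    principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(samples_c, full_matrices=False)
    verts_r = verts_c @ vt.T  # coordinates expressed along the principal axes

    # 3) Scale: divide by the largest vertex-to-centroid distance so that
    #    the model fits inside the unit sphere.
    radius = np.linalg.norm(verts_r, axis=1).max()
    return verts_r / radius
```

After normalization the farthest vertex lies on the unit sphere and the spread of the samples is largest along x, then y, then z.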

Next, depth buffer images of M × M pixels (for example, 256 × 256) are generated from N_v viewpoints (for example, 38 viewpoints, taking each vertex of a geodesic sphere as a viewpoint). As shown in FIG. 1, Gaussian filters whose size is varied over L levels (for example, 5 levels) are applied to each depth buffer image to obtain a multiscale representation. For each scale, an LBP histogram of D bins (for example, 64 bins) is then created for each of the small regions obtained by dividing the image into four. Creating LBP histograms from the divided regions captures positional information, that is, what kind of shape is located where. The LBP histogram of a scale is the concatenation of the histograms created for its small regions. The LBP histograms of the scales are then merged by taking the maximum value for each histogram element. The multiscale LBP histogram created for each depth buffer image therefore has 4 × D bins. Each multiscale LBP histogram is normalized by the sum of its elements.

The flow of the three-dimensional model search method and search system according to the present invention is shown in FIG. 2. It consists of six steps, which are described below in processing order. A large number of three-dimensional shape models are stored in the database, and the six steps are repeated for each shape model.

<Step 1>
Pose normalization is applied to each three-dimensional shape model in the database to remove the arbitrariness of position, rotation, and size. The pose-normalized model is placed so that its centroid lies at the center of a sphere of radius 1 (FIG. 3).

<Step 2>
A depth buffer image is generated from each of a number of viewpoints (for example, 38 viewpoints). The sphere enclosing the pose-normalized model from step 1 is approximated by a triangular mesh with 38 vertices, and for each vertex the model is projected onto the plane perpendicular to the vector from that vertex toward the center of the sphere, producing a depth buffer image (FIG. 4).

<Step 3>
A multiscale representation (set of images) is generated from each depth buffer image of step 2 by applying Gaussian filters of different scales, that is, kernel sizes (for example, five scales) (FIG. 5).
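A minimal pure-Python sketch of step 3 is given below. It blurs an image (a list of rows) with a separable Gaussian filter at several scales. The sigma values, the truncation of the kernel at three sigma, and the border clamping are assumptions made for illustration; the text only states that the kernel size is varied over a number of levels.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma and normalised to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
             for k in range(-radius, radius + 1))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

def multiscale(image, sigmas=(1.0, 2.0, 3.0, 4.0, 5.0)):
    """One blurred copy of the depth image per scale (sigmas are assumed values)."""
    return [gaussian_blur(image, s) for s in sigmas]
```

Calling `multiscale(depth_image)` returns five images of the same size as the input, from least to most smoothed.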

<Step 4>
Each image of the multiscale representation is divided into four, a Local Binary Pattern histogram is generated from each part, and these histograms are concatenated to form the feature value of the three-dimensional model under consideration (FIG. 6). As shown in FIG. 7, the Local Binary Pattern is a binary code built from the intensity differences between the pixel of interest and its eight surrounding pixels, and the frequency of occurrence of the codes is expressed as a histogram.
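The sketch below illustrates step 4 on an image stored as a list of rows. Several details are assumptions, not taken from the text: the neighbour ordering, the `>=` comparison, the 2 × 2 layout of the four regions, and the skipping of border pixels. It also keeps all 256 raw 8-bit codes; how the 64 bins of the embodiment are derived from the bit patterns is not detailed here, so that reduction is not reproduced.

```python
def lbp_code(image, y, x):
    """8-bit code: bit i is set when neighbour i is >= the centre pixel."""
    centre = image[y][x]
    # Clockwise from the top-left neighbour (this ordering is an assumption).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if image[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(image, y0, y1, x0, x1, bins=256):
    """Relative frequency of LBP codes for rows [y0, y1) and columns [x0, x1)."""
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    count = 0
    # Image border pixels are skipped because they lack a full neighbourhood.
    for y in range(max(y0, 1), min(y1, height - 1)):
        for x in range(max(x0, 1), min(x1, width - 1)):
            hist[lbp_code(image, y, x)] += 1
            count += 1
    return [h / count for h in hist] if count else hist

def quadrant_lbp(image):
    """Concatenate the LBP histograms of the four quadrants (2 x 2 split assumed)."""
    h, w = len(image), len(image[0])
    feature = []
    for y0, y1 in ((0, h // 2), (h // 2, h)):
        for x0, x1 in ((0, w // 2), (w // 2, w)):
            feature.extend(lbp_histogram(image, y0, y1, x0, x1))
    return feature
```

Because the four histograms are concatenated rather than summed, the same pattern occurring in different quadrants contributes to different parts of the feature, which is how positional information is retained.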

<Step 5>
The histograms of the four-way divided image described in step 4 are generated for each of the multiscale images obtained in step 3, and the maximum of these histograms is taken, element by element, to produce the Multi-scale Representation Local Binary Pattern (MRLBP) (FIG. 8).
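The merge in step 5 amounts to an element-wise maximum over the per-scale histograms, followed by the normalization by the sum of elements mentioned earlier in the description. A minimal sketch, with the hypothetical name `merge_scales`:

```python
def merge_scales(histograms):
    """Element-wise maximum over per-scale histograms, then division by the sum."""
    merged = [max(values) for values in zip(*histograms)]
    total = sum(merged)
    return [m / total for m in merged] if total > 0 else merged
```

For example, merging `[0.2, 0.8, 0.0]` and `[0.5, 0.5, 0.0]` first gives `[0.5, 0.8, 0.0]`, which is then divided by 1.3 so that the result sums to 1.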

<Step 6>
The MRLBP feature of step 5 is computed for every depth buffer image from the viewpoints used in step 2, and the result is taken as the shape feature of the three-dimensional model and used as the index for search. In this embodiment there are 38 viewpoints, and step 5 yields 4 × 64 dimensions of data per viewpoint (4 from the division into four regions, 64 being the number of dimensions determined by the bit patterns), so the shape feature has 9728 dimensions in total.
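As a quick check of the dimension count in step 6, the sketch below concatenates one 4 × 64 histogram per viewpoint. The function name `model_feature` is hypothetical, and the zero-filled histograms are placeholders rather than real data.

```python
def model_feature(per_view_histograms):
    """Flatten the per-viewpoint MRLBP histograms into one feature vector."""
    return [value for hist in per_view_histograms for value in hist]

views, regions, bins = 38, 4, 64
placeholder = [[0.0] * (regions * bins) for _ in range(views)]
feature = model_feature(placeholder)
print(len(feature))  # 38 * 4 * 64 = 9728
```

In practice the per-viewpoint histograms would likely be kept separate as well, since the comparison described later in the text matches viewpoints against each other.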

At search time, steps 1 to 6 are applied to the three-dimensional model used as the query to convert it into a feature value. The distance between this feature value and each feature value stored in the database is computed using (Equation 1), and the database models sorted in ascending order of distance give the ranking of similar shapes.

The numerical parameters given in this embodiment are only examples, and the flow does not depend on them.

To confirm the effectiveness of the three-dimensional model search system based on the MRLBP feature, experiments comparing it with conventional methods were carried out.

First, dissimilarities are computed for all combinations of the boundary pixel histograms of the N_v viewpoints, giving a dissimilarity matrix of size N_v × N_v. The Hungarian method is then applied to this matrix, and the minimum-sum dissimilarity over the resulting assignment is taken as the dissimilarity between the three-dimensional models. This resolves the arbitrariness of the model's rotation while still comparing the shapes of the models from multiple viewpoints.
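The viewpoint matching described above can be sketched in pure Python as follows. The `hungarian` function is a standard O(n^3) implementation of the Hungarian (Kuhn-Munkres) method for a square cost matrix. The per-viewpoint distance `l1_distance` is an assumption made for illustration, since the dissimilarity formula referred to as (Equation 1) is not reproduced in this text; the function names are likewise hypothetical.

```python
def hungarian(cost):
    """Return (minimum total cost, assignment); assignment[i] is the column for row i."""
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (n + 1)   # column potentials (1-based)
    p = [0] * (n + 1)     # p[j]: row currently matched to column j, 0 if none
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return sum(cost[i][assignment[i]] for i in range(n)), assignment

def l1_distance(a, b):
    """Assumed per-viewpoint dissimilarity (sum of absolute differences)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def model_dissimilarity(query_views, db_views):
    """Minimum-sum dissimilarity over a one-to-one matching of viewpoints."""
    cost = [[l1_distance(q, d) for d in db_views] for q in query_views]
    total, _ = hungarian(cost)
    return total
```

Because every viewpoint of one model is matched to a distinct viewpoint of the other, two models whose views differ only by a reordering obtain a dissimilarity of zero, while all viewpoints still contribute to the score.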

The present invention was applied to shape similarity search of three-dimensional models and compared experimentally with representative three-dimensional model feature values.

Six conventional methods are used: Shape Distribution D2 (D2), Spherical Harmonics Descriptor (SHD), Light Field Descriptor (LFD), Local Binary Pattern (LBP), DESIRE Descriptor (DESIRE), and Multi-Fourier Spectra Descriptor (MFSD). D2 analyzes feature values from points generated at random on the surface of the three-dimensional object. SHD converts the three-dimensional object into a voxel representation and analyzes feature values from it. Both are features based on three-dimensional representations. LFD is the representative feature based on two-dimensional images and analyzes feature values from silhouette images generated from all directions. LBP computes a histogram from a single depth buffer image. DESIRE combines feature values analyzed from depth buffer images, silhouette images, and vectors cast radially from the centroid of the three-dimensional object. MFSD combines four kinds of Fourier spectra of the three-dimensional object.

The Princeton Shape Benchmark (PSB) and the Engineering Shape Benchmark (ESB) were chosen as the three-dimensional model data sets. PSB consists of three-dimensional models from a wide range of categories, such as dogs and airplanes, and allows general search accuracy to be evaluated. PSB is divided into Training Sets used for training and Test Sets used for evaluation. ESB consists of three-dimensional models of machine parts such as gears and pipes, and allows evaluation of search accuracy for part retrieval in settings such as product design with 3D CAD.

Search accuracy was evaluated with First Tier (FT), Second Tier (ST), Nearest Neighbor (NN), Discounted Cumulative Gain (DCG), recall, and precision. FT and ST measure accuracy among the top-ranked results, and NN measures the accuracy of the first-ranked result. DCG is a value that represents the accuracy of the search result as a whole. For FT, ST, NN, and DCG, a larger value means higher search accuracy; for the recall-precision curve, the closer the curve lies to the upper right, the higher the search accuracy.
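The patent text does not give formulas for these measures. The sketch below follows the definitions commonly used with the Princeton Shape Benchmark, which is an assumption here: `relevant` is the ranked result list for one query (query excluded) as 0/1 flags, and `class_size` is the number of models in the query's class including the query itself.

```python
import math

def nearest_neighbor(relevant):
    """1.0 if the top-ranked result belongs to the query's class, else 0.0."""
    return 1.0 if relevant[0] else 0.0

def tier(relevant, class_size, k=1):
    """First Tier (k=1) or Second Tier (k=2): recall within the top k*(|C|-1) results."""
    target = class_size - 1  # other members of the query's class
    return sum(relevant[: k * target]) / target

def dcg(relevant, class_size):
    """Discounted cumulative gain, normalised by the gain of an ideal ranking."""
    def raw(flags):
        # Rank 1 counts fully; rank i > 1 is discounted by log2(i).
        return sum(g if i == 1 else g / math.log2(i)
                   for i, g in enumerate(flags, start=1))
    ideal = [1] * (class_size - 1)
    return raw(relevant) / raw(ideal)
```

For a query whose class has three members and whose ranked results are `[1, 0, 1, 0, 0, 0]`, NN is 1.0, FT is 0.5 (one of the two relevant models appears in the top two), and ST is 1.0 (both appear in the top four).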

Each evaluation measure is reported as a micro average, in which the mean of the measure over all query models is taken as the overall mean. With the macro average, in which the mean is computed per class and the mean of those class means is taken as the overall value, the measure is known to become biased when each class consists of only a few three-dimensional models. Because the classes in PSB and ESB each consist of few models, the micro average is used.
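The difference between the two averages can be illustrated with a small sketch. The function names, class names, and scores below are hypothetical and serve only to show the computation.

```python
def micro_average(scores_by_class):
    """Mean over all queries, regardless of class."""
    all_scores = [s for scores in scores_by_class.values() for s in scores]
    return sum(all_scores) / len(all_scores)

def macro_average(scores_by_class):
    """Mean of the per-class means."""
    class_means = [sum(s) / len(s) for s in scores_by_class.values()]
    return sum(class_means) / len(class_means)

# Hypothetical per-query scores: one class with four queries, one with a single query.
scores = {"gear": [1.0, 1.0, 1.0, 1.0], "pipe": [0.0]}
print(micro_average(scores))  # 0.8
print(macro_average(scores))  # 0.5
```

The single query of the small class determines half of the macro average, which is the kind of bias the text refers to when classes contain few models.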

Table 1 shows FT, ST, NN, and DCG for each feature value on PSB. The proposed technique (MRLBP) gives the best search performance in FT and NN, which indicates that it can present more relevant three-dimensional models near the top of the results than the conventional techniques. Its search accuracy is also higher than that of LFD, the representative feature based on two-dimensional images. Because MRLBP is more accurate than LBP, which represents the feature of a single-scale image, the multiscale representation is shown to be effective. MRLBP also achieves the highest search accuracy on all evaluation measures when compared with D2 and SHD, the features based on three-dimensional representations such as voxels and point clouds.

FIG. 9 shows the recall-precision curves of each feature value on PSB. The proposed technique performs on a par with the conventional MFSD. At almost every recall-precision combination, MRLBP (the present invention) exceeds the other features. This shows that MRLBP has high basic search accuracy regardless of the particular domain.

Table 2 shows FT, ST, NN, and DCG for each feature value on ESB. MRLBP (the proposed technique) has the highest search accuracy on every measure except NN. As with PSB, the high accuracy shows that the multiscale representation is also effective for three-dimensional models of machine parts.

FIG. 10 shows the recall-precision curves of each feature value on ESB. At almost every recall-precision combination, MRLBP (the present invention) exceeds the other features. This shows that MRLBP also achieves high search accuracy in shape similarity search over three-dimensional models of machine parts.

Claims

1. A three-dimensional model search method in which, for known three-dimensional models, Local Binary Pattern (LBP) feature values of multiscale images of depth buffer images at a plurality of viewpoints are stored in a database in advance, and three-dimensional models are searched for by comparing them with the LBP feature values of a three-dimensional model to be searched, characterized in that:
the LBP feature values are obtained by generating depth buffer images at a plurality of viewpoints from a known three-dimensional model, generating multiscale images of different image scales for each depth buffer image, and extracting LBP histograms for the small regions into which each scale image is divided; and
the comparison of the LBP feature values determines the degree of dissimilarity with the three-dimensional model to be searched, using as an index the sum of the maximum values of the LBP histograms extracted for each scale image over all combinations of the plurality of viewpoints.

2. The three-dimensional model search method according to claim 1, wherein the depth buffer images are generated at a predetermined pixel size from a predetermined number of viewpoints after the known three-dimensional model has been pose-normalized by singular value decomposition.

3. The three-dimensional model search method according to claim 1, wherein the multiscale images are generated by applying a Gaussian filter that is varied over a plurality of levels.

4. The three-dimensional model search method according to any one of the preceding claims, wherein the degree of dissimilarity is the minimum-sum dissimilarity over the combinations obtained by computing a matrix of the sums of the maximum values of the LBP histograms of each scale image for the number of viewpoints and applying the Hungarian method to that matrix.

5. A three-dimensional model search system that searches for three-dimensional models by comparing LBP feature values of multiscale images of depth buffer images at a plurality of viewpoints, stored in a database in advance for known three-dimensional models, with the LBP feature values of a three-dimensional model to be searched, the system comprising:
means for generating depth buffer images at a plurality of viewpoints from the three-dimensional model to be searched;
means for generating multiscale images of different image scales for each depth buffer image;
means for extracting LBP histograms for the small regions into which each scale image is divided; and
means for determining the degree of dissimilarity between a known three-dimensional model and the three-dimensional model to be searched, using as an index the sum of the maximum values of the LBP histograms extracted for each scale image over all combinations of the plurality of viewpoints.