JP2011242134A

JP2011242134A - Image processor, image processing method, program, and electronic device

Info

Publication number: JP2011242134A
Application number: JP2010111588A
Authority: JP
Inventors: Yasuhiro Shudo; 泰広周藤; Yoshiaki Iwai; 嘉昭岩井; Takayuki Ashigahara; 隆之芦ヶ原
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2010-05-14
Filing date: 2010-05-14
Publication date: 2011-12-01

Abstract

Problem to be solved: To estimate the position and orientation of a camera more easily with a simple configuration.
Solution: A composite image generation unit 103 generates a composite image based on a projection parameter that projects the position of each pixel of a reference image, which serves as the reference for estimation, onto the position of the corresponding pixel in a captured image; the composite image consists of the pixels located at the projected positions in the captured image. An evaluation unit 104 generates an evaluation function representing the correlation between the composite image and the reference image. A parameter update unit 106 updates the projection parameter based on the evaluation function and, based on the updated projection parameter, estimates at least one of the position and orientation of the imaging unit. The present invention can be applied to, for example, a computer that estimates the position and orientation of a camera and performs processing based on the estimation result.
Selected drawing: FIG. 3

Description

The present invention relates to an image processing device, an image processing method, a program, and an electronic device, and in particular to ones suitable for estimating the position and orientation of a camera based on, for example, a captured image obtained by imaging with the camera.

Estimation methods that perform calibration, that is, estimate the position and orientation of a camera based on a captured image obtained by the camera, include, for example, the following first and second estimation methods.

In the first estimation method, the camera captures a wallpaper or similar surface with a specific pattern (for example, a checkered pattern). The sum of squared differences between corresponding pixels of a reference image prepared in advance and the captured image is then calculated and used as an evaluation function for estimating the position and orientation of the camera (see, for example, Patent Document 1).
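
As a minimal sketch of the evaluation function used in the first estimation method, the sum of squared differences between corresponding pixels can be written as follows. The function name `sum_squared_diff` and the nested-list image representation are illustrative assumptions and are not taken from Patent Document 1.

```python
def sum_squared_diff(reference, captured):
    """Sum of squared differences between corresponding pixels of two
    equally sized grayscale images given as lists of rows."""
    if len(reference) != len(captured):
        raise ValueError("images must have the same height")
    total = 0
    for ref_row, cap_row in zip(reference, captured):
        if len(ref_row) != len(cap_row):
            raise ValueError("images must have the same width")
        for r, c in zip(ref_row, cap_row):
            total += (r - c) ** 2
    return total


# Hypothetical 2x2 images: a tiny checkered pattern and a copy with a small
# level difference in the top row.
reference = [[0, 255], [255, 0]]
captured = [[10, 250], [255, 0]]
ssd = sum_squared_diff(reference, captured)
```

A value of zero means the two images agree pixel for pixel; a uniform brightness offset alone already raises the value, which is the level-difference sensitivity discussed for this method.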

Here, the reference image is the image obtained when the wallpaper or similar surface with the specific pattern is captured by a camera whose position and orientation are known.

In the second estimation method, for example, eight 1-bit images are generated from an 8-bit image, that is, an image whose pixels have 8-bit pixel values. Each 1-bit image takes as its pixel values the bits at one of the eight different positions in the 8-bit bit string.
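
The decomposition of an 8-bit image into eight 1-bit images can be sketched as below. The function name `bit_planes` and the choice of index 0 for the least significant bit are assumptions made for illustration.

```python
def bit_planes(image):
    """Split an 8-bit grayscale image (list of rows of ints in 0..255) into
    eight 1-bit images, one per bit position (index 0 = least significant bit)."""
    planes = []
    for bit in range(8):
        planes.append([[(pixel >> bit) & 1 for pixel in row] for row in image])
    return planes


# Hypothetical 2x3 image.
image = [[0, 1, 2], [128, 255, 170]]
planes = bit_planes(image)

# The original image can be recovered by weighting each plane by its bit value.
restored = [
    [sum(planes[bit][r][c] << bit for bit in range(8)) for c in range(len(image[0]))]
    for r in range(len(image))
]
```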

The eight 1-bit images are then displayed in sequence on an LCD (Liquid Crystal Display), and the position and orientation of the camera are estimated based on the captured images obtained by capturing the displayed 1-bit images with the camera (see, for example, Non-Patent Document 1).

Patent Document 1: JP 2002-27507 A

Non-Patent Document 1: Yasutoshi Nakamura, Hitoshi Watanabe: High-precision stereo camera calibration based on image differences, Image Sensing Symposium SSII09, IS1-10, 2009.

However, in the first estimation method described above, to reduce the adverse effect of the level difference in pixel values (luminance values) between the captured image and the reference image when calculating the evaluation function, it was necessary to capture a wallpaper or similar surface with a specific pattern and, in addition, to binarize the reference image and the captured image by normalizing (dividing) the pixel value of each pixel by a predetermined value.

In the second estimation method described above, the number of pixels of the 1-bit images displayed on the LCD is limited by the total number of pixels the LCD can display. Moreover, capturing the eight 1-bit images displayed on the LCD requires the camera to perform imaging at least eight times.

Furthermore, because the second estimation method uses an LCD for calibration, its configuration is larger in scale than, for example, calibration performed without an LCD.

The present invention has been made in view of this situation and makes it possible to estimate the position and orientation of a camera more easily with a simple configuration.

An image processing device according to a first aspect of the present invention estimates at least one of the position and orientation of an imaging unit based on a captured image obtained by imaging with the imaging unit, and includes: composite image generation means for generating, based on a projection parameter that projects the position of each pixel of a reference image serving as the reference for estimation onto the position of the corresponding pixel in the captured image, a composite image composed of the pixels located at the positions in the captured image projected by the projection parameter; evaluation generation means for generating an evaluation function representing the correlation between the composite image and the reference image; update means for updating the projection parameter based on the evaluation function; and estimation means for estimating at least one of the position and orientation of the imaging unit based on the updated projection parameter.

The evaluation generation means can generate the evaluation function as a quadratic polynomial, and the update means can update the projection parameter to a new projection parameter calculated by the steepest descent method using the evaluation function.
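
To illustrate how the steepest descent method behaves on a quadratic evaluation function, the following sketch minimizes a hypothetical two-parameter quadratic. The function `steepest_descent`, the fixed step size, the iteration count, and the example polynomial are all assumptions; the text does not specify them.

```python
def steepest_descent(grad, p0, step=0.1, iters=200):
    """Repeatedly move the parameters against the gradient of the
    evaluation function and return the final parameters."""
    p = list(p0)
    for _ in range(iters):
        g = grad(p)
        p = [pi - step * gi for pi, gi in zip(p, g)]
    return p


# Hypothetical quadratic evaluation function
#   E(p) = (p[0] - 1)^2 + 2 * (p[1] + 3)^2
# whose minimum lies at p = (1, -3).
def grad_E(p):
    return [2.0 * (p[0] - 1.0), 4.0 * (p[1] + 3.0)]


p_min = steepest_descent(grad_E, [0.0, 0.0])
```

A fixed step keeps the sketch short; it only converges here because the step is small relative to the curvature of the example polynomial.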

The evaluation generation means can generate the evaluation function having, as a variable, a candidate projection parameter representing a candidate for the updated projection parameter, and the update means can adopt, as the new projection parameter, the candidate projection parameter at which the evaluation function is minimized.

Initial parameter generation means for generating the projection parameter based on the position of each pixel of the reference image and the corresponding position in the captured image can further be provided. The composite image generation means can then generate the composite image based on the projection parameter when the initial parameter generation means has generated it, and based on the updated projection parameter when the update means has updated it.

The reference image can be one obtained by imaging with another imaging unit, different from the imaging unit, whose position and orientation are known, and the estimation means can estimate at least one of the position and orientation of the imaging unit based on the projection parameter, which represents the difference in position and orientation of the imaging unit relative to the other imaging unit.

An image processing method according to the first aspect of the present invention is a method for an image processing device that estimates at least one of the position and orientation of an imaging unit based on a captured image obtained by imaging with the imaging unit. The image processing device includes composite image generation means, evaluation generation means, update means, and estimation means, and the method includes the steps of: the composite image generation means generating, based on a projection parameter that projects the position of each pixel of a reference image serving as the reference for estimation onto the position of the corresponding pixel in the captured image, a composite image composed of the pixels located at the positions in the captured image projected by the projection parameter; the evaluation generation means generating an evaluation function representing the correlation between the composite image and the reference image; the update means updating the projection parameter based on the evaluation function; and the estimation means estimating at least one of the position and orientation of the imaging unit based on the updated projection parameter.

A program according to the first aspect of the present invention causes a computer of an image processing device that estimates at least one of the position and orientation of an imaging unit, based on a captured image obtained by imaging with the imaging unit, to function as: composite image generation means for generating, based on a projection parameter that projects the position of each pixel of a reference image serving as the reference for estimation onto the position of the corresponding pixel in the captured image, a composite image composed of the pixels located at the positions in the captured image projected by the projection parameter; evaluation generation means for generating an evaluation function representing the correlation between the composite image and the reference image; update means for updating the projection parameter based on the evaluation function; and estimation means for estimating at least one of the position and orientation of the imaging unit based on the updated projection parameter.

According to the first aspect of the present invention, a composite image composed of the pixels located at the positions in the captured image projected by a projection parameter is generated based on that projection parameter, which projects the position of each pixel of the reference image serving as the reference for estimation onto the position of the corresponding pixel in the captured image; an evaluation function representing the correlation between the composite image and the reference image is generated; the projection parameter is updated based on the evaluation function; and at least one of the position and orientation of the imaging unit is estimated based on the updated projection parameter.

An electronic device according to a second aspect of the present invention estimates at least one of the position and orientation of an imaging unit based on a captured image obtained by imaging with the imaging unit, and includes: composite image generation means for generating, based on a projection parameter that projects the position of each pixel of a reference image serving as the reference for estimation onto the position of the corresponding pixel in the captured image, a composite image composed of the pixels located at the positions in the captured image projected by the projection parameter; evaluation generation means for generating an evaluation function representing the correlation between the composite image and the reference image; update means for updating the projection parameter based on the evaluation function; estimation means for estimating at least one of the position and orientation of the imaging unit based on the updated projection parameter; and execution means for executing predetermined processing based on the estimation result obtained by the estimation means.

According to the second aspect of the present invention, a composite image composed of the pixels located at the positions in the captured image projected by a projection parameter is generated based on that projection parameter, which projects the position of each pixel of the reference image serving as the reference for estimation onto the position of the corresponding pixel in the captured image; an evaluation function representing the correlation between the composite image and the reference image is generated; the projection parameter is updated based on the evaluation function; at least one of the position and orientation of the imaging unit is estimated based on the updated projection parameter; and predetermined processing is executed based on the estimation result.

According to the present invention, the position and orientation of a camera can be estimated more easily with a simple configuration.

A first diagram for explaining the outline of the present invention.
A second diagram for explaining the outline of the present invention.
A block diagram showing a configuration example of the image processing device according to the present embodiment.
A flowchart for explaining the parameter update process performed by the image processing device.
A flowchart for explaining the first initial matrix calculation process performed by the composite image generation unit.
A flowchart for explaining the second initial matrix calculation process performed by the composite image generation unit.
A flowchart for explaining the composite image generation process performed by the composite image generation unit.
A diagram showing an example of estimation results using the projection matrix P₀ obtained by the first and second initial matrix calculation processes.
A first diagram showing an example of estimation results that vary with noise.
A first diagram showing an example of estimation results that vary with the pitch angle.
A first diagram showing an example of estimation results that vary with the yaw angle.
A first diagram showing an example of estimation results that vary with the roll angle.
A first diagram showing an example of estimation results that vary with the partial region in the captured image.
A first diagram for explaining the region matching process.
A second diagram for explaining the region matching process.
A second diagram showing an example of estimation results that vary with noise.
A second diagram showing an example of estimation results that vary with the pitch angle.
A second diagram showing an example of estimation results that vary with the yaw angle.
A second diagram showing an example of estimation results that vary with the roll angle.
A second diagram showing an example of estimation results that vary with the partial region in the captured image.
A block diagram showing a configuration example of a computer.

Hereinafter, a mode for carrying out the invention (hereinafter referred to as the present embodiment) will be described. The description proceeds in the following order.
Present embodiment (an example in which the projection matrix is updated based on the cross-correlation between the composite image and the reference image)
Modification

[Outline of the present invention]
FIG. 1 shows an outline of the present invention.

FIG. 1A shows a three-dimensional object whose three-dimensional position (x,y,z) is known. In FIG. 1A, the three-dimensional object is placed in an XYZ coordinate system defined by the X, Y, and Z axes.

The upper part of FIG. 1B shows a reference image 41 obtained when the camera 21 captures the three-dimensional object. The three-dimensional position (x,y,z) and orientation of the camera 21 are assumed to be known.

Here, the orientation of the camera 21 is represented by, for example, its rotation angles (θ_r,θ_p,θ_y). With the center of the camera 21 taken as the origin and mutually orthogonal roll, pitch, and yaw axes defined at that origin, the rotation angles consist of the roll angle θ_r, the pitch angle θ_p, and the yaw angle θ_y, which are the angles that the imaging direction of the camera 21 forms with the roll axis, the pitch axis, and the yaw axis, respectively. The same applies to the camera 22.

The lower part of FIG. 1B shows a captured image 42 obtained when the camera 22 captures the three-dimensional object. The three-dimensional position (x,y,z) and orientation of the camera 22 are assumed to be unknown.

In the present invention, the three-dimensional position (x,y,z) and orientation of the camera 22 are estimated based on the reference image 41 obtained by the camera 21 and the captured image 42 obtained by the camera 22.

That is, for example, in the present invention, a projection matrix P₀ is calculated, based on the reference image 41 obtained by the camera 21 and the captured image 42 obtained by the camera 22, as a correction parameter for correcting the positional deviation of the captured image 42 from the reference image 41.

Then, the projection matrix P₁ is calculated by applying a steepest descent method, such as the Levenberg-Marquardt method, to the evaluation function generated based on the calculated projection matrix P₀. Likewise, the projection matrix P₂ is calculated by applying the steepest descent method to the evaluation function generated based on the calculated projection matrix P₁.

In this way, a new projection matrix P_{i+1} is calculated based on the projection matrix P_i, and P_i is updated to P_{i+1}. The three-dimensional position (x,y,z) and orientation of the camera 22 are then estimated based on the projection matrix P_i finally obtained. Here, P_i denotes the projection matrix calculated (i+1)-th, where i is a natural number.

Next, FIG. 2 shows an example of how the projection matrix P_i is updated using the steepest descent method.

In the present invention, as shown in FIG. 2, a projection matrix P₀ is calculated for correcting, to the position of the reference image 41, the positional deviation of a partial region 42a in the captured image 42 obtained by the camera 22 (the region in which the same three-dimensional object (subject) as in the reference image 41 appears). Based on the calculated projection matrix P₀, the positional deviation of the partial region 42a in the captured image 42 is corrected to the position of the reference image 41, and a composite image 61 is thereby generated.
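
The generation of a composite image can be sketched as follows, under assumptions that this part of the text does not fix: the projection matrix is taken to be a 3x4 matrix applied to the homogeneous 3D position seen at each reference pixel, images are nested lists indexed as `image[row][column]`, and sampling uses the nearest pixel. The names `project` and `make_composite` are hypothetical.

```python
def project(P, point3d):
    """Project a 3D point with a 3x4 matrix P and return the position (u, v)."""
    x, y, z = point3d
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    return h[0] / h[2], h[1] / h[2]


def make_composite(P, points3d, captured, fill=0):
    """Build a composite image with the size of the reference image.

    points3d[r][c] is the known 3D position seen at reference pixel (r, c).
    Each composite pixel takes the value of the captured pixel nearest to the
    projected position; positions outside the captured image get `fill`.
    """
    height, width = len(captured), len(captured[0])
    composite = []
    for row in points3d:
        out_row = []
        for point in row:
            u, v = project(P, point)
            col, r = int(round(u)), int(round(v))
            inside = 0 <= r < height and 0 <= col < width
            out_row.append(captured[r][col] if inside else fill)
        composite.append(out_row)
    return composite


# Hypothetical data: reference pixel (row r, column c) sees the 3D point
# (c, r, 0), and P shifts every projected position one pixel to the right.
P = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
points3d = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
            [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]]
captured = [[10, 20, 30],
            [40, 50, 60]]
composite = make_composite(P, points3d, captured)
```

With an accurate projection matrix, the composite image lines up with the reference image pixel by pixel, which is what makes a direct comparison of the two images meaningful.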

Further, in the present invention, an evaluation function representing the degree of correlation between the composite image 61 generated based on the projection matrix P₀ and the reference image 41 is calculated.
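
The exact form of the evaluation function is not given at this point in the text. As one possible sketch, assuming a cost that is small when the two images are strongly correlated, one minus the zero-mean normalized cross-correlation could be used. The name `ncc_cost` and this particular formula are assumptions, not the definition used by the invention.

```python
import math


def ncc_cost(composite, reference):
    """Return 1 - ZNCC for two equally sized grayscale images (lists of rows).

    The result is near 0 for strongly correlated images and near 2 for
    inversely correlated ones.
    """
    a = [float(p) for row in composite for p in row]
    b = [float(p) for row in reference for p in row]
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [p - mean_a for p in a]
    db = [p - mean_b for p in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant image")
    return 1.0 - sum(x * y for x, y in zip(da, db)) / denom


# Hypothetical images: a brighter, higher-contrast copy and an inverted copy.
reference = [[0, 100], [200, 50]]
brighter = [[2 * p + 10 for p in row] for row in reference]
inverted = [[255 - p for p in row] for row in reference]
cost_brighter = ncc_cost(brighter, reference)
cost_inverted = ncc_cost(inverted, reference)
```

Unlike a plain sum of squared differences, a correlation-based measure of this kind is unaffected by a uniform gain or offset in pixel values, so no normalization or binarization step is needed beforehand.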

Then, the steepest descent method is applied to the calculated evaluation function to calculate the projection matrix P₁ at which the evaluation function is minimized. The projection matrix P₀ is thereby updated to the projection matrix P₁, which corrects the positional deviation of the captured image 42 to the position of the reference image 41 with higher accuracy.

Next, a new composite image 61 is generated based on the projection matrix P₁, and an evaluation function representing the degree of correlation between the generated composite image 61 and the reference image 41 is calculated. The steepest descent method is then applied to the calculated evaluation function to calculate the projection matrix P₂ at which the evaluation function is minimized, and the projection matrix P₁ is updated to the new projection matrix P₂. Thereafter, the projection matrix P_i is updated in the same way using the steepest descent method.

In the present invention, for example, when the projection matrix P_i obtained by updating has converged, that is, when the update yields a projection matrix P_i that hardly differs from the projection matrix P_{i-1}, the three-dimensional position (x,y,z) and orientation of the camera 22 are estimated based on the projection matrix P_i.
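
The update loop and its convergence test can be sketched as below. The callable `update` stands in for one full cycle (composite image generation, evaluation function generation, and minimization) and is hypothetical, as are the largest-entry-change criterion, the tolerance, and the toy update used to exercise the loop.

```python
def refine(P0, update, tol=1e-6, max_iters=100):
    """Apply `update` (P_i -> P_{i+1}) until successive matrices barely differ."""
    P = P0
    for _ in range(max_iters):
        P_next = update(P)
        # Largest absolute change of any matrix entry in this update.
        change = max(
            abs(a - b) for row_n, row_p in zip(P_next, P) for a, b in zip(row_n, row_p)
        )
        P = P_next
        if change < tol:
            break
    return P


# Toy stand-in for the real update: move every entry halfway to a fixed target.
target = [[1.0, 0.0, 0.0, 5.0],
          [0.0, 1.0, 0.0, -2.0],
          [0.0, 0.0, 0.0, 1.0]]


def toy_update(P):
    return [[p + 0.5 * (t - p) for p, t in zip(row_p, row_t)]
            for row_p, row_t in zip(P, target)]


P_final = refine([[0.0] * 4 for _ in range(3)], toy_update)
```

The `max_iters` bound is a safeguard added for the sketch so that the loop terminates even if the updates never settle.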

As described above, the projection matrix P_i is used to correct the positional deviation, relative to the reference image 41, of the partial region 42a in the captured image 42. The projection matrix P_i therefore represents the positional deviation between the partial region 42a in the captured image 42 and the reference image 41.

The positional deviation between the partial region 42a in the captured image 42 and the reference image 41 varies with the three-dimensional position (x,y,z) of the camera 22 and with the rotation angles (θ_r,θ_p,θ_y) representing the orientation of the camera 22.

That is, the projection matrix P_i varies with the three-dimensional position (x,y,z) and rotation angles (θ_r,θ_p,θ_y) of the camera 22, and represents the difference in three-dimensional position (x,y,z) and rotation angles (θ_r,θ_p,θ_y) between the camera 21 and the camera 22.

For this reason, in the present invention, the three-dimensional position (x,y,z) of the camera 22 and its orientation, represented by the rotation angles (θ_r,θ_p,θ_y), can be estimated based on the projection matrix P_i, which varies with that position and those rotation angles.

[Configuration example of the image processing device 81]
FIG. 3 shows a configuration example of the image processing device 81 according to the present embodiment.

The image processing device 81 includes a three-dimensional information holding unit 101, a reference image holding unit 102, a composite image generation unit 103, an evaluation unit 104, an optimization calculation unit 105, and a parameter update unit 106.

The three-dimensional information holding unit 101 holds in advance the three-dimensional position (x,y,z)^T of the three-dimensional object appearing in the reference image 41. For convenience of explanation, the three-dimensional position (x,y,z) is expressed as a matrix: T denotes transposition, so the three-dimensional position (x,y,z)^T is a matrix with 3 rows and 1 column.

Specifically, for example, the three-dimensional information holding unit 101 holds, for each pixel of the reference image 41, the three-dimensional position (x,y,z)^T of the part of the three-dimensional object shown at that pixel, in association with the pixel.

The two-dimensional position (x,y) of a pixel of the reference image 41 is expressed in XY coordinates whose origin (0,0) is the position of the bottom-left pixel of the reference image 41, with the X axis defined along the horizontal direction of the captured image 42 and the Y axis along the vertical direction.

That is, in the present embodiment, the XY coordinates representing the two-dimensional position (x,y) of a pixel of the reference image 41 are assumed to coincide with the XY coordinates representing the (x,y) components of the three-dimensional position (x,y,z) of the three-dimensional object.
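
One way to picture the per-pixel association and the bottom-left origin is the sketch below. It assumes the depth values arrive as a row-major array whose row 0 is the top of the image, which is a common storage layout but is not stated in the text; `to_xy` and `build_3d_table` are hypothetical names.

```python
def to_xy(row, col, height):
    """Convert a top-left-origin array index (row, col) to the XY position
    whose origin (0, 0) is the bottom-left pixel."""
    return col, height - 1 - row


def build_3d_table(depth_rows):
    """Map each reference pixel position (x, y) to its 3D position (x, y, z).

    depth_rows[row][col] is the z value seen at that pixel, with row 0 at the
    top of the image. The (x, y) of the 3D position equals the pixel position.
    """
    height = len(depth_rows)
    table = {}
    for row, depths in enumerate(depth_rows):
        for col, z in enumerate(depths):
            x, y = to_xy(row, col, height)
            table[(x, y)] = (x, y, z)
    return table


# Hypothetical 2x2 depth values; the bottom-left pixel has z = 7.0.
table = build_3d_table([[5.0, 6.0],
                        [7.0, 8.0]])
```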

３次元情報保持部１０１は、基準画像４１を構成する各画素の２次元位置(x_j,y_j)を特徴点m_jとして、特徴点m_j毎に、特徴点m_jの特徴を表す特徴量を予め保持している。なお、特徴点m_j毎の特徴量は、基準画像４１に基づいて予め生成される。 Three-dimensional information holding unit 101, two-dimensional position of each pixel constituting the reference image 41 (x _{_j,} y _j) as the feature point m _j, for each feature point m _j, features representing the characteristics of the feature point m _j The amount is held in advance. Note that the feature amount for each feature point m _j is generated in advance based on the reference image 41.

In the present embodiment, to simplify the explanation, the two-dimensional position (x_j, y_j) of every pixel constituting the reference image 41 is used as a feature point m_j, but the positions adopted as feature points m_j are not limited to this.

For example, among the pixels constituting the reference image 41, only the two-dimensional positions (x_j, y_j) of pixels spaced a predetermined number of pixels apart (for example, 5 or 25 pixels) in the horizontal and vertical directions may be adopted as feature points m_j.
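
The sparse choice of feature points described above can be sketched as follows. This is only an illustration in Python; the function name `grid_feature_points` and its parameters are hypothetical and do not appear in the original description.

```python
def grid_feature_points(width, height, step):
    """Return (x, y) positions spaced `step` pixels apart.

    The origin (0, 0) is taken to be the lower-left pixel, as in the text.
    """
    return [(x, y)
            for y in range(0, height, step)
            for x in range(0, width, step)]


# Feature points every 5 pixels in a 20 x 10 reference image.
points = grid_feature_points(width=20, height=10, step=5)
```

With a step of 1 the function returns every pixel position, which corresponds to the simplified case used in the present embodiment.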

The reference image holding unit 102 holds in advance the reference image 41 obtained by imaging the three-dimensional object with the camera 21.

The captured image 42 is supplied from the camera 22 to the composite image generation unit 103. Based on the projection matrix P_i, the composite image generation unit 103 generates the composite image 61 from the captured image 42 by correcting the positional deviation of the captured image 42 relative to the reference image 41.

That is, for example, the composite image generation unit 103 reads, from the three-dimensional information holding unit 101, the feature points m_j on the reference image 41 and the feature amount of each feature point m_j.

Then, based on the feature amount read for each feature point m_j, the composite image generation unit 103 extracts from the captured image 42 supplied by the camera 22 the feature point n_j on the captured image 42 that corresponds to the feature point m_j (that is, the position (u_j, v_j) of a pixel constituting the captured image 42).

The position (u, v) of a pixel constituting the captured image 42 is expressed in UV coordinates whose origin (0, 0) is the position of the lower-leftmost pixel of the captured image 42, with the U axis defined in the horizontal direction and the V axis in the vertical direction of the captured image 42.

For each read feature point m_j on the reference image 41 and the corresponding feature point n_j on the captured image 42, the composite image generation unit 103 uses the three-dimensional position (x_j, y_j, z_j)^T corresponding to the feature point m_j and the two-dimensional position (u_j, v_j)^T of the feature point n_j to calculate, by the least squares method or the like, the projection matrix P₀ that converts (projects) the three-dimensional position (x_j, y_j, z_j)^T into the two-dimensional position (u_j, v_j)^T.

Based on the calculated projection matrix P₀, the composite image generation unit 103 converts the three-dimensional position (x_j, y_j, z_j)^T corresponding to each read feature point m_j into a two-dimensional position (u_j, v_j)^T on the captured image 42.

From the pixels constituting the captured image 42 supplied by the camera 22, the composite image generation unit 103 extracts the pixels located at the two-dimensional positions (u_j, v_j)^T obtained by the conversion and generates the composite image 61 composed of the extracted pixels. The composite image generation unit 103 then supplies the generated composite image 61 and the projection matrix P₀ to the evaluation unit 104.

The composite image generation unit 103 also reads the three-dimensional positions (x, y, z)^T held in advance in the three-dimensional information holding unit 101 in association with the pixels constituting the reference image 41. Based on a projection matrix P_i (≠ P₀) supplied from the parameter update unit 106, it then converts each read three-dimensional position (x, y, z)^T into a two-dimensional position (u, v)^T on the captured image 42.

From the pixels constituting the captured image 42 supplied by the camera 22, the composite image generation unit 103 extracts the pixels located at the two-dimensional positions (u_j, v_j)^T obtained by the conversion and generates the composite image 61 composed of the extracted pixels. The composite image generation unit 103 then supplies the generated composite image 61 and the projection matrix P_i from the parameter update unit 106 to the evaluation unit 104.
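
The generation of the composite image 61 from a projection matrix can be sketched as follows. This is a minimal Python illustration under stated assumptions: the projection matrix is taken to be a 3 x 4 homogeneous matrix, projected positions are rounded to the nearest pixel, positions falling outside the captured image are filled with 0, and the row index of each nested list stands in for the vertical coordinate. None of these details, nor the names `project` and `make_composite`, are given in the description.

```python
def project(P, xyz):
    """Apply an assumed 3x4 homogeneous projection matrix P to (x, y, z)."""
    x, y, z = xyz
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    return h[0] / h[2], h[1] / h[2]


def make_composite(captured, positions_3d, P):
    """Build a composite image by sampling `captured` at projected positions.

    captured[v][u] is a pixel value of the captured image;
    positions_3d[y][x] is the 3D position held for reference pixel (x, y).
    """
    height, width = len(captured), len(captured[0])
    composite = []
    for row in positions_3d:
        out_row = []
        for xyz in row:
            u, v = project(P, xyz)
            ui, vi = round(u), round(v)
            inside = 0 <= ui < width and 0 <= vi < height
            out_row.append(captured[vi][ui] if inside else 0)
        composite.append(out_row)
    return composite


# Hypothetical data: the captured image is the scene shifted by one pixel in u.
captured = [[10, 20, 30],
            [40, 50, 60]]
positions_3d = [[(0, 0, 0), (1, 0, 0)],
                [(0, 1, 0), (1, 1, 0)]]
P_shift = [[1, 0, 0, 1],
           [0, 1, 0, 0],
           [0, 0, 0, 1]]
composite = make_composite(captured, positions_3d, P_shift)
```

The composite image has the same number of pixels as the reference image, because one pixel is sampled for each three-dimensional position held for a reference pixel.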

The evaluation unit 104 reads the reference image 41 held in the reference image holding unit 102.

Based on the read reference image 41 and on the composite image 61 and projection matrix P_i supplied from the composite image generation unit 103, the evaluation unit 104 calculates the evaluation function f(x_n), which represents the correlation between the reference image 41 and the composite image 61, as shown in the following equation (1).

... (1)

In equation (1), the pixel values a_i and a_j are the pixel values of the pixels constituting the reference image 41, and the pixel values b_i and b_k are the pixel values of the pixels constituting the composite image 61. The indices i, j, and k each take values from 1 to the total number of pixels constituting the reference image 41 (or the composite image 61). The reference image 41 and the composite image 61 consist of the same number of pixels.

In equation (1), the candidate matrix x_n is a candidate for the new projection matrix P_{i+1}, with x_n = P_i + d_n. Each d_n is a matrix with the same numbers of rows and columns as the projection matrix P_i, and the matrices d_n differ from one another.

The pixel values a_i and a_j of the pixels constituting the reference image 41 are constants, whereas the pixel values b_i and b_k of the pixels constituting the composite image 61 are variables that change with the candidate matrix x_n. The evaluation function f(x_n) is therefore a function of the variables b_i and b_k.

The evaluation unit 104 supplies the calculated evaluation function f(x_n) to the optimization calculation unit 105.

The optimization calculation unit 105 applies the steepest descent method to the evaluation function f(x_n) supplied from the evaluation unit 104, calculates the candidate matrix x_n (= P_i + d_n) that minimizes f(x_n) as the new projection matrix P_{i+1}, and supplies it to the parameter update unit 106.

The parameter update unit 106 supplies the projection matrix P_{i+1} from the optimization calculation unit 105 to the composite image generation unit 103. In this case, the composite image generation unit 103 generates the composite image 61 based on the projection matrix P_{i+1} from the parameter update unit 106.

When the parameter update unit 106 determines, for example, that the projection matrix P_{i+1} from the optimization calculation unit 105 has converged, it estimates, based on that projection matrix P_{i+1}, the three-dimensional position (x, y, z) of the camera 22 and the rotation angles (θ_r, θ_p, θ_y) representing its orientation, and outputs them to the subsequent stage.

With the steepest descent method performed in the optimization calculation unit 105, a solution can be calculated more easily when the evaluation function f(x_n) is expressed as a quadratic polynomial than when it is not.

It is therefore desirable that the evaluation function f(x_n) be expressed as a quadratic polynomial. Consider, then, expressing the evaluation function f(x_n) of equation (1) as a quadratic polynomial.

First, the right side of equation (1) is squared and its reciprocal is taken, which changes the evaluation function to the one shown in the following equation (2).

... (2)

In equation (2), Σa_j² is a constant that does not change with the candidate matrix x_n, so it is removed from the evaluation function f(x_n), which changes with the candidate matrix x_n, to derive equation (3).

... (3)

In equation (3), the Σ (summation) over the variable k is moved to the outside to convert the expression into equation (4).

... (4)

The evaluation function f(x_n) of equation (4) is a quadratic polynomial in the variables b_i and b_k.

Therefore, when the steepest descent method is applied to the evaluation function f(x_n) of equation (4), the projection matrix P_{i+1} that is the solution of the steepest descent method can be calculated more easily, and hence more quickly, than when the evaluation function is not expressed as a quadratic polynomial.

In equation (4), f(B_k) denotes (b_k / Σa_i b_i).
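
The equation images for (1) to (4) are not reproduced in this text. The following LaTeX is one reconstruction that is consistent with the verbal derivation above (square and invert, drop the constant Σa_j², move the summation over k outside) and with the stated definition of f(B_k). It assumes that equation (1) is the normalized cross-correlation, which the text does not state explicitly, so it should be read as a plausible sketch rather than as the published equations.

```latex
\begin{align}
f(x_n) &= \frac{\sum_i a_i b_i}{\sqrt{\sum_j a_j^2}\,\sqrt{\sum_k b_k^2}} \tag{1} \\
f(x_n) &= \frac{\sum_j a_j^2 \,\sum_k b_k^2}{\left(\sum_i a_i b_i\right)^2} \tag{2} \\
f(x_n) &= \frac{\sum_k b_k^2}{\left(\sum_i a_i b_i\right)^2} \tag{3} \\
f(x_n) &= \sum_k \left(\frac{b_k}{\sum_i a_i b_i}\right)^2 = \sum_k f(B_k)^2 \tag{4}
\end{align}
```

Under this reading, a larger correlation in (1) corresponds to a smaller value in (2) to (4), which agrees with the optimization calculation unit 105 searching for the candidate matrix that minimizes the evaluation function.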

In the following description, the evaluation unit 104 calculates the evaluation function f(x_n) of equation (4) instead of that of equation (1) and supplies it to the optimization calculation unit 105, and the optimization calculation unit 105 calculates the new projection matrix P_{i+1} by the steepest descent method based on the evaluation function f(x_n) of equation (4) and supplies it to the composite image generation unit 103.
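
As a rough illustration of how the evaluation function of equation (4) might be computed, the following Python sketch assumes that equation (4) has the form Σ_k (b_k / Σ_i a_i b_i)², inferred from the definition of f(B_k) rather than taken from the equation image. The function name `evaluation_eq4` and the flat-list image representation are hypothetical.

```python
def evaluation_eq4(a, b):
    """Assumed form of equation (4): sum over k of (b_k / sum_i a_i*b_i)^2.

    a: pixel values of the reference image (constants)
    b: pixel values of the composite image (depend on the candidate matrix)
    """
    dot = sum(ai * bi for ai, bi in zip(a, b))
    return sum((bk / dot) ** 2 for bk in b)


reference = [1.0, 2.0, 3.0]
well_aligned = [1.0, 2.0, 3.0]    # composite image matching the reference
badly_aligned = [3.0, 2.0, 1.0]   # composite image that matches poorly

good = evaluation_eq4(reference, well_aligned)
bad = evaluation_eq4(reference, badly_aligned)
```

In this toy case the well-aligned composite image gives the smaller value, which is the behavior a minimizing search over candidate matrices would rely on.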

[Description of Operation of Image Processing Device 81]
Next, the parameter update process performed by the image processing device 81 will be described with reference to the flowchart of FIG. 4.

The parameter update process starts, for example, when the captured image 42 is supplied from the camera 22 to the composite image generation unit 103 of the image processing device 81.

In step S1, the composite image generation unit 103 performs an initial matrix calculation process that calculates the projection matrix P_i (= P₀) based on the feature points m_j on the reference image 41 stored in the three-dimensional information holding unit 101 and the feature points n_j on the captured image 42 from the camera 22. The initial matrix calculation process is described later with reference to FIGS. 5 and 6.

In step S2, the composite image generation unit 103 performs a composite image generation process that generates the composite image 61 from the captured image 42, based on the three-dimensional positions (x, y, z)^T held in advance in the three-dimensional information holding unit 101 in association with the pixels constituting the reference image 41 and on the projection matrix P_i calculated in step S1. The details of the composite image generation process are described later with reference to FIG. 7.

The composite image generation unit 103 supplies the composite image 61 generated by the composite image generation process to the evaluation unit 104.

In step S3, the evaluation unit 104 reads the reference image 41 held in advance in the reference image holding unit 102. Based on the read reference image 41 and on the composite image 61 and projection matrix P_i from the composite image generation unit 103, the evaluation unit 104 then calculates the evaluation function f(x_n) shown in equation (4) and supplies it to the optimization calculation unit 105.

In step S4, the optimization calculation unit 105 applies the steepest descent method to the evaluation function f(x_n) from the evaluation unit 104, calculates the candidate matrix x_n that minimizes f(x_n) as the new projection matrix P_{i+1}, and supplies it to the parameter update unit 106.

In step S5, the parameter update unit 106 supplies the projection matrix P_{i+1} from the optimization calculation unit 105 to the composite image generation unit 103, and the process returns to step S2.

Then, in step S2, the composite image generation unit 103 performs the composite image generation process that generates the composite image 61 from the captured image 42, based on the three-dimensional positions (x, y, z)^T held in advance in the three-dimensional information holding unit 101 in association with the pixels constituting the reference image 41 and on the new projection matrix P_{i+1} updated in the immediately preceding step S5. The same processing is repeated thereafter.

In step S5, when the parameter update unit 106 determines, for example, that the projection matrix P_{i+1} from the optimization calculation unit 105 has converged, it estimates the three-dimensional position (x, y, z) and orientation of the camera 22 based on that projection matrix P_{i+1}, outputs them to the subsequent stage, and the parameter update process ends.
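
The loop of steps S2 to S5 (evaluate, take a steepest descent step, update, test for convergence) can be sketched generically as follows. This Python illustration makes several assumptions that are not in the description: the parameters are a flat list standing in for the entries of the projection matrix, the gradient is approximated by central differences, the step size is fixed, and convergence is declared when the largest parameter change falls below a tolerance. The toy objective is a stand-in for f(x_n), not the evaluation function itself.

```python
def steepest_descent(f, x0, step=0.1, eps=1e-6, tol=1e-9, max_iter=10000):
    """Minimize f by repeated steps along the negative numerical gradient."""
    x = list(x0)
    for _ in range(max_iter):
        grad = []
        for i in range(len(x)):
            xp = x[:]
            xp[i] += eps
            xm = x[:]
            xm[i] -= eps
            grad.append((f(xp) - f(xm)) / (2 * eps))
        new_x = [xi - step * g for xi, g in zip(x, grad)]
        # Assumed convergence test: the parameters have stopped changing.
        if max(abs(n - o) for n, o in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


def toy_objective(p):
    """Quadratic stand-in for the evaluation function, minimal at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


solution = steepest_descent(toy_objective, [0.0, 0.0])
```

In the process described above, each evaluation of the objective would involve regenerating the composite image 61 with the candidate matrix, which is why a cheaply evaluated function is attractive.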

[Details of initial matrix calculation process]
Next, the first initial matrix calculation process, which the composite image generation unit 103 performs as the initial matrix calculation process in step S1 of the parameter update process, will be described with reference to the flowchart of FIG. 5.

In step S21, the composite image generation unit 103 reads the feature points m_j on the reference image 41 and the feature amounts of the feature points m_j from the three-dimensional information holding unit 101. The variable j is a natural number from 1 to J.

In step S22, the composite image generation unit 103 extracts a feature point n_j (= (u_j, v_j)^T) on the captured image 42 supplied from the camera 22 as the feature point whose feature amount is similar to that of the read feature point m_j (= (x_j, y_j)^T).

The composite image generation unit 103 thereby obtains J combinations (m_j, n_j) of a feature point m_j on the reference image 41 and the corresponding feature point n_j on the captured image 42, as combination corresponding points (m_j, n_j).

In step S23, the composite image generation unit 103 extracts Q (< J) mutually different combination corresponding points (m_j, n_j) from the J combination corresponding points (m_j, n_j) and generates a set U_k composed of the Q extracted combination corresponding points.

That is, the composite image generation unit 103 extracts Q combination corresponding points (m_j, n_j) from the J combination corresponding points (m_j, n_j) in each of the _JC_Q possible ways (the number of ways of choosing Q items from J) and generates each resulting group of Q combination corresponding points as one of _JC_Q sets U_k. Here k is a natural number from 1 to _JC_Q.
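
The enumeration of the sets U_k can be sketched with the Python standard library. The function name `candidate_sets` and the placeholder pair labels are hypothetical; only the counting (J items taken Q at a time) comes from the description.

```python
from itertools import combinations
from math import comb


def candidate_sets(pairs, q):
    """Return every set U_k of q combination corresponding points."""
    return list(combinations(pairs, q))


# J = 5 placeholder corresponding points (m_j, n_j), Q = 3.
pairs = [("m1", "n1"), ("m2", "n2"), ("m3", "n3"), ("m4", "n4"), ("m5", "n5")]
sets_u = candidate_sets(pairs, 3)
expected_count = comb(len(pairs), 3)
```

The number of sets grows quickly with J, so an implementation working with many corresponding points may prefer to sample a limited number of subsets; that variation is not part of the description.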

In step S24, the composite image generation unit 103 calculates, for each generated set U_k, the projection matrix P_k obtained from the Q combination corresponding points (m_j, n_j) constituting that set U_k.

Specifically, for example, for the Q combination corresponding points (m_j, n_j) constituting the set U_k, the composite image generation unit 103 uses the three-dimensional position (x_j, y_j, z_j)^T corresponding to the feature point m_j and the two-dimensional position (u_j, v_j)^T of the feature point n_j to calculate, by the least squares method or the like, the projection matrix P_k that minimizes the error Err1 = Σ_{j=1}^{J} {(u_j, v_j)^T − P_k (x_j, y_j, z_j)^T}.

In step S25, among the calculated projection matrices P_k, the composite image generation unit 103 takes the projection matrix P_k for which the calculated error Err1 = Σ_{j=1}^{J} {(u_j, v_j)^T − P_k (x_j, y_j, z_j)^T} is smallest as the projection matrix P_min.

In step S26, for the J combination corresponding points (m_j, n_j), the composite image generation unit 103 uses the three-dimensional position (x_j, y_j, z_j)^T corresponding to the feature point m_j and the two-dimensional position (u_j, v_j)^T of the feature point n_j to calculate the error E_n = Σ_{j=1}^{J} {(u_j, v_j)^T − P_n (x_j, y_j, z_j)^T} = Σ_{j=1}^{J} {(u_j, v_j)^T − (P_min + d_n)(x_j, y_j, z_j)^T}.

Then, based on the calculated value E_n, the composite image generation unit 103 calculates the function g(E_n) = αE_n² / (1 + αE_n²). Here α is a constant, set for example to a value of 0 or more and 1 or less.
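
The behavior of the function g can be checked numerically with a short Python sketch. It treats E_n as a scalar error magnitude, which is an assumption, since the text writes E_n as a sum of vector differences; the default value of `alpha` is likewise only an example within the stated range.

```python
def g(e, alpha=0.5):
    """g(E) = alpha * E^2 / (1 + alpha * E^2), as given in the text."""
    return alpha * e ** 2 / (1 + alpha * e ** 2)


small = g(0.0)      # no error
medium = g(1.0)     # alpha = 0.5 gives 0.5 / 1.5
large = g(1000.0)   # very large error
```

The value grows like αE² for small errors and levels off below 1 for large ones, so a few grossly mismatched corresponding points cannot dominate the quantity being minimized.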

In step S27, the composite image generation unit 103 uses the steepest descent method to calculate the E_n that minimizes the function g(E_n) and takes the P_n (= P_min + d_n) corresponding to the calculated E_n as the projection matrix P₀. After step S27 ends, the process returns to step S1 of FIG. 4 and proceeds to step S2, and the subsequent processing is performed.

Although the composite image generation unit 103 calculates the projection matrix P₀ by the first initial matrix calculation process described above, it may instead calculate the projection matrix P₀ by the second initial matrix calculation process shown in the flowchart of FIG. 6.

[Details of second initial matrix calculation process]
Next, the second initial matrix calculation process, which the composite image generation unit 103 may perform as the initial matrix calculation process in step S1 of the parameter update process, will be described with reference to the flowchart of FIG. 6.

In steps S41 to S45, the same processing as in steps S21 to S25 of FIG. 5 is performed.

In step S46, for each of the J combination corresponding points (m_j, n_j), the composite image generation unit 103 calculates the error Err2 = (u_j, v_j)^T − P_min (x_j, y_j, z_j)^T based on the three-dimensional position (x_j, y_j, z_j)^T corresponding to the feature point m_j and the two-dimensional position (u_j, v_j)^T of the feature point n_j. The composite image generation unit 103 then selects, from the J combination corresponding points (m_j, n_j), those whose error Err2 is less than a predetermined threshold.
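
The selection in step S46 can be sketched as follows. In this Python illustration the projection matrix is assumed to be a 3 x 4 homogeneous matrix and the size of Err2 is measured as the Euclidean distance between the observed and projected positions; the description gives Err2 only as a vector difference, so both choices, along with the names `project` and `select_inliers`, are assumptions.

```python
import math


def project(P, xyz):
    """Apply an assumed 3x4 homogeneous projection matrix P to (x, y, z)."""
    x, y, z = xyz
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    return h[0] / h[2], h[1] / h[2]


def select_inliers(pairs, P_min, threshold):
    """Keep the corresponding points whose reprojection error is small.

    pairs: list of ((x, y, z), (u, v)) for each combination corresponding point.
    """
    inliers = []
    for xyz, (u, v) in pairs:
        pu, pv = project(P_min, xyz)
        if math.hypot(u - pu, v - pv) < threshold:
            inliers.append((xyz, (u, v)))
    return inliers


# Hypothetical data: the third pair is a gross mismatch.
P_min = [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 1]]
pairs = [((1, 2, 0), (1.0, 2.0)),
         ((3, 4, 0), (3.5, 4.0)),
         ((5, 6, 0), (9.0, 9.0))]
inliers = select_inliers(pairs, P_min, threshold=1.0)
```

Only the selected points would then be passed to the least squares fit of step S47.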

In step S47, for the selected combination corresponding points (m_j, n_j), the composite image generation unit 103 uses the three-dimensional position (x_j, y_j, z_j)^T corresponding to the feature point m_j and the two-dimensional position (u_j, v_j)^T of the feature point n_j to calculate, by the least squares method or the like, the projection matrix P₀ that minimizes (u_j, v_j)^T − P₀ (x_j, y_j, z_j)^T. The process then returns to step S1 of FIG. 4 and proceeds to step S2, and the subsequent processing is performed.

[Details of composite image generation processing]
Next, the composite image generation process performed by the composite image generation unit 103 in step S2 of the parameter update process will be described with reference to the flowchart of FIG. 7.

In step S61, the composite image generation unit 103 reads, from the three-dimensional information holding unit 101, the three-dimensional positions (x, y, z)^T associated with the pixels constituting the reference image 41.

In step S62, when the projection matrix P₀ has been generated in step S1 and the process has proceeded to step S2, the composite image generation unit 103 converts each read three-dimensional position (x, y, z)^T on the reference image 41 into a two-dimensional position (u, v)^T on the captured image 42 based on the generated projection matrix P₀.

Also in step S62, when a new projection matrix P_{i+1} has been supplied from the parameter update unit 106 and the process has proceeded from step S5 to step S2, the composite image generation unit 103 converts each three-dimensional position (x, y, z)^T read from the three-dimensional information holding unit 101 into a two-dimensional position (u, v)^T on the captured image 42 based on the new projection matrix P_{i+1} from the parameter update unit 106.

In step S63, from the pixels constituting the captured image 42 supplied by the camera 22, the composite image generation unit 103 extracts the pixels located at the two-dimensional positions (u, v)^T on the captured image 42 obtained in step S62.

The composite image generation unit 103 then generates the image composed of the extracted pixels as the composite image 61, the process returns to step S2 of FIG. 4, the generated composite image 61 is supplied to the evaluation unit 104, and the subsequent processing is performed.

As described above, in step S3 of the parameter update process, the evaluation function f(x_n) representing the correlation between the composite image 61 and the reference image 41 is calculated as an evaluation function that is not easily affected by the level differences that arise between the pixel values of the captured image 42 (composite image 61) and the reference image 41 because of differences in performance between the camera 21 and the camera 22 or differences in imaging conditions.

Therefore, for example, unlike the case where the sum of squared differences Σ(a_i − b_i)² between the pixel values of corresponding pixels of the composite image 61 and the reference image 41 is used as the evaluation function, the parameter update process does not require imaging a subject such as wallpaper bearing a specific pattern, nor normalizing the composite image 61 and the reference image 41 and converting them into binarized images.

That is, with the parameter update process, the subject imaged by the cameras 21 and 22 is not limited to wallpaper or the like bearing a specific pattern, and any object may be imaged as the subject. In addition, the composite image 61 and the reference image 41 need not be binarized images and can be used as grayscale images or as images represented by RGB (Red, Green, Blue) values.
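
The insensitivity to level differences can be illustrated with a small Python comparison. It assumes that the correlation is the normalized cross-correlation and models a level difference as a uniform gain of 2 on the composite image; the function names and pixel values are hypothetical.

```python
import math


def ssd(a, b):
    """Sum of squared differences between corresponding pixel values."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def correlation(a, b):
    """Normalized cross-correlation (assumed form of the correlation)."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot / (norm_a * norm_b)


reference = [10, 20, 30, 40]
composite = [2 * p for p in reference]   # same scene, twice as bright

ssd_value = ssd(reference, composite)
corr_value = correlation(reference, composite)
```

Here the sum of squared differences is large even though the two images show the same content, whereas the correlation stays at its maximum, so no prior normalization or binarization is needed to compare them.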

Furthermore, because the parameter update process uses the evaluation function f(x_n) expressed as a quadratic polynomial, as shown in equation (4), the new projection matrix P_{i+1} that is the solution of the steepest descent method can be calculated more quickly than when the evaluation function is not a quadratic polynomial.

In addition, because the parameter update process uses the evaluation function f(x_n) expressed as a quadratic polynomial, as shown in equation (4), a relatively accurate projection matrix P_{i+1} can be calculated even when the number of pixel values a_i, a_j, b_i, and b_k used in the evaluation function f(x_n) of equation (4) is reduced in order to speed up its calculation.

That a relatively accurate projection matrix P_{i+1} can be calculated even when the number of pixel values a_i, a_j, b_i, and b_k used in the evaluation function f(x_n) is reduced has been confirmed by experiments conducted by the present applicant. These experiments and the matters they confirmed are described later with reference to FIGS. 16 to 20.

Moreover, even if noise occurs in the captured image 42 during the parameter update process, using the evaluation function f(x_n) shown in equation (4) allows a more accurate projection matrix P_{i+1} to be calculated than using, for example, the sum of squares Σ(a_i − b_i)² as the evaluation function.

That a relatively accurate projection matrix P_{i+1} can be calculated even when noise is present in the captured image 42 is described later with reference to FIG. 16.
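As a rough illustration of why a plain sum of squares is sensitive to this kind of noise, the following sketch adds a constant pixel value of 3 to every pixel and compares the sum of squares Σ(a_i - b_i)² with a zero-mean normalized correlation. The zero-mean normalized correlation is a commonly used measure chosen here as an assumption; it is not claimed to be the form of equation (4). The pixel values and function names are hypothetical.

```python
import math

def sum_of_squares(a, b):
    """Sum of squared differences, sum((a_i - b_i)^2)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def zero_mean_ncc(a, b):
    """Zero-mean normalized cross-correlation (assumed stand-in measure)."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = math.sqrt(sum((ai - ma) ** 2 for ai in a) *
                    sum((bi - mb) ** 2 for bi in b))
    return num / den if den else 0.0

reference = [10.0, 20.0, 30.0, 40.0]   # hypothetical pixel values a_i
noisy = [v + 3.0 for v in reference]   # pixel value 3 added to every pixel

ssd_noisy = sum_of_squares(reference, noisy)  # grows with the offset
ncc_noisy = zero_mean_ncc(reference, noisy)   # unaffected by a constant offset
```

In this toy case the sum of squares rises from 0 to a nonzero value although the image content is unchanged, whereas the correlation-style measure still reports a perfect match.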

[Estimation results based on the projection matrix P₀ obtained by the first and second initial matrix calculation processes]
Next, FIG. 8 shows the estimation results obtained from the projection matrix P₀ when 15 different patterns of 121 combination corresponding points (m_j, n_j) are prepared and the first and second initial matrix calculation processes are performed on each of the 15 patterns.

In FIGS. 8A and 8B, the black bars show the average error of the estimation results based on the projection matrix P₀ obtained when the first initial matrix calculation process is performed on the 15 different patterns of 121 combination corresponding points (m_j, n_j).

Likewise, in FIGS. 8A and 8B, the white bars show the average error of the estimation results based on the projection matrix P₀ obtained when the second initial matrix calculation process is performed on the 15 different patterns of 121 combination corresponding points (m_j, n_j).

Specifically, the center of FIG. 8A shows the average error of the three-dimensional position when the noise on the captured image 42 is 0 (that is, when no noise is present in the captured image 42): the white bar represents an error of 0.00011633, and the black bar represents an error of 0.000145667.

Here, the average error of the three-dimensional position is the average, over the 15 different patterns, of the error |x-x'|+|y-y'|+|z-z'| between the three-dimensional position (x, y, z) of the camera 22 estimated for each pattern and the actual three-dimensional position (x', y', z').
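The error measure defined above can be written as a short sketch. The function names and the sample positions are hypothetical and serve only to show the arithmetic; three estimates are used instead of 15 for brevity.

```python
def position_error(estimated, actual):
    """Error |x-x'| + |y-y'| + |z-z'| between estimated and actual positions."""
    x, y, z = estimated
    xa, ya, za = actual
    return abs(x - xa) + abs(y - ya) + abs(z - za)

def average_position_error(estimates, actual):
    """Average of the position error over all estimation patterns."""
    return sum(position_error(e, actual) for e in estimates) / len(estimates)

# Hypothetical estimates from three patterns and one actual camera position.
estimates = [(1.0, 2.0, 3.0), (1.5, 2.0, 2.5), (1.0, 2.5, 3.0)]
actual = (1.0, 2.0, 3.0)
avg = average_position_error(estimates, actual)  # (0 + 1.0 + 0.5) / 3
```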

The right side of FIG. 8A shows the average error of the rotation angle when the noise on the captured image 42 is 0: the white bar represents an error of 0.000123333, and the black bar represents an error of 0.000142667.

The left side of FIG. 8A shows, for the case where the noise on the captured image 42 is 0, the average error sum obtained by adding the average errors of the three-dimensional position and the rotation angle estimated from the projection matrix P₁ that the parameter update process newly generates from the projection matrix P₀. On the left side of FIG. 8A, both the white bar and the black bar represent an average error sum of 0 for the projection matrix P₁.

Furthermore, the center of FIG. 8B shows the average error of the three-dimensional position when the noise on the captured image 42 is 3.0 (that is, when a pixel value of 3 is added to the pixel value of each pixel of the captured image 42 that would otherwise be obtained without noise): the white bar represents an error of 1.58628, and the black bar represents an error of 1.067658667.

The right side of FIG. 8B shows the average error of the rotation angle when the noise on the captured image 42 is 3.0: the white bar represents an error of 6.623128333, and the black bar represents an error of 1.421813333.

The left side of FIG. 8B shows, for the case where the noise on the captured image 42 is 3.0, the average error sum obtained by adding the average errors of the three-dimensional position and the rotation angle estimated from the projection matrix P₁ that the parameter update process newly generates from the projection matrix P₀. On the left side of FIG. 8B, the white bar represents an average error sum of 0.032973 and the black bar represents an average error sum of 0.333367333 for the projection matrix P₁.

As shown in the center and right side of FIG. 8A and the center and right side of FIG. 8B, when no noise is present in the captured image 42 (noise of 0), the errors of the estimation results do not differ greatly between the projection matrices P₀ generated by the first and second initial matrix calculation processes.

However, as shown in the center and right side of FIG. 8A and the center and right side of FIG. 8B, when noise is present in the captured image 42 (noise of 3.0), the estimation results for the projection matrix P₀ generated by the first initial matrix calculation process have a much smaller error than those for the projection matrix P₀ generated by the second initial matrix calculation process.

Next, with reference to FIGS. 9 to 13, the three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) of the camera 22 estimated from the projection matrix P₀ calculated using the first initial matrix calculation process are compared with those estimated from the projection matrix P₀ calculated using the second initial matrix calculation process.

[How the estimation results change with the noise in the captured image 42]
FIG. 9 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change with the noise in each pixel constituting the captured image 42.

The horizontal axes of FIGS. 9A and 9B represent the pixel value added to each pixel as noise.

The vertical axis of FIG. 9A represents the error |x-x'|+|y-y'|+|z-z'| between the estimated three-dimensional position (x, y, z) and the actual three-dimensional position (x', y', z'). The same applies to the vertical axes of FIGS. 10A, 11A, 12A, and 13A.

In FIG. 9, graphs 121 to 123 show the estimation results based on the projection matrix P₀ calculated using the second initial matrix calculation process.

Also in FIG. 9, graphs 141 to 143 show the estimation results based on the projection matrix P₀ calculated using the first initial matrix calculation process.

The pairs of graphs 121 and 141, graphs 122 and 142, and graphs 123 and 143 each show the graphs obtained when a different set of feature points is used as the feature points m_j on the reference image 41 for each pair.

The above description of graphs 121 to 123 and graphs 141 to 143 also applies to FIGS. 10 to 13 described later.

[How the estimation results change with the pitch angle θ_p of the camera 22]
FIG. 10 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change as the pitch angle θ_p of the camera 22 changes. The horizontal axes of FIGS. 10A and 10B represent the pitch angle θ_p of the camera 22; the roll angle θ_r and the yaw angle θ_y of the camera 22 are set to the same values as those of the camera 21.

[How the estimation results change with the yaw angle θ_y of the camera 22]
FIG. 11 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change as the yaw angle θ_y of the camera 22 changes. The horizontal axes of FIGS. 11A and 11B represent the yaw angle θ_y of the camera 22; the roll angle θ_r and the pitch angle θ_p of the camera 22 are set to the same values as those of the camera 21.

[How the estimation results change with the roll angle θ_r of the camera 22]
FIG. 12 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change as the roll angle θ_r of the camera 22 changes. The horizontal axes of FIGS. 12A and 12B represent the roll angle θ_r of the camera 22; the pitch angle θ_p and the yaw angle θ_y of the camera 22 are set to the same values as those of the camera 21.

[How the estimation results change with the area of the partial region 42a in the captured image 42]
FIG. 13 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change with the area of the partial region 42a in the captured image 42. The horizontal axes of FIGS. 13A and 13B represent the area of the partial region 42a as a proportion of the entire area of the captured image 42, taken as 100.

In every case shown in FIGS. 9 to 13, the three-dimensional position (x', y', z') and rotation angles (θ_r', θ_p', θ_y') estimated from the projection matrix P₀ calculated using the first initial matrix calculation process are more accurate than those estimated from the projection matrix P₀ calculated using the second initial matrix calculation process.

Next, with reference to FIGS. 14 and 15, a case is described in which combination corresponding points (m_j, n_j) between the reference image 41 and the composite image 61 are extracted by a region matching process, and the projection matrix P₁ is generated from the extracted combination corresponding points (m_j, n_j).

The region matching process described with reference to FIGS. 14 and 15 is used in FIGS. 16 to 20, described later, for comparison with the present invention.

FIG. 14 shows how the combination corresponding points (m_j, n_j) are extracted by the region matching process.

The description assumes that, as in the parameter update process of the present invention, the projection matrix P₀ has been calculated and the composite image 61 has been generated based on the calculated projection matrix P₀.

In the region matching process, each feature point m_j on the reference image 41 is taken in turn as the feature point of interest, and a rectangular region 41a is set that consists of a plurality of pixels around the feature point of interest, including the pixel at the feature point of interest itself. The region matching process then detects, among all regions of the composite image 61, the region most similar to the rectangular region 41a, and extracts the position of the pixel at the center of the detected region as the feature point n_j on the composite image 61 corresponding to the feature point m_j.
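The region matching described above can be sketched as an exhaustive search. The text does not state how similarity is measured, so the sum of squared differences used here is an assumption, as are the function names, the 3x3 region size, and the toy 6x6 images.

```python
def extract_patch(image, cx, cy, half):
    """Return the (2*half+1) x (2*half+1) patch centered on (cx, cy) as a flat list."""
    return [image[y][x]
            for y in range(cy - half, cy + half + 1)
            for x in range(cx - half, cx + half + 1)]

def match_region(reference, composite, mx, my, half=1):
    """Find the composite-image position whose surrounding patch is most similar
    (smallest sum of squared differences) to the reference patch around (mx, my)."""
    template = extract_patch(reference, mx, my, half)
    height, width = len(composite), len(composite[0])
    best_score, best_pos = None, None
    for cy in range(half, height - half):
        for cx in range(half, width - half):
            candidate = extract_patch(composite, cx, cy, half)
            score = sum((t - c) ** 2 for t, c in zip(template, candidate))
            if best_score is None or score < best_score:
                best_score, best_pos = score, (cx, cy)
    return best_pos

def blank(n):
    """Create an n x n image filled with zeros."""
    return [[0] * n for _ in range(n)]

def paint(image, cx, cy):
    """Draw a distinctive 3x3 pattern (values 1..9) centered on (cx, cy)."""
    value = 1
    for y in range(cy - 1, cy + 2):
        for x in range(cx - 1, cx + 2):
            image[y][x] = value
            value += 1

reference = blank(6)
paint(reference, 2, 2)   # feature point m_j at (2, 2)
composite = blank(6)
paint(composite, 3, 3)   # the same pattern, shifted by one pixel in x and y
n_j = match_region(reference, composite, 2, 2)
```

Because every candidate position of the composite image is compared, the cost grows with the image size and the number of feature points, which is one reason such matching is usually restricted to a sparse set of points.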

The region matching process then calculates the projection matrix P₁ by the least squares method or the like, based on the combination corresponding points (m_j, n_j), each of which pairs a feature point m_j on the reference image 41 with the corresponding feature point n_j on the composite image 61.

As a result, as shown in FIG. 15, the region matching process calculates the projection matrix P₁ for converting the three-dimensional position (x, y, z) corresponding to a feature point m_j on the reference image 41 into the corresponding two-dimensional position (u, v) on the captured image 42.
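One common way to obtain such a matrix by least squares is the direct linear transformation (DLT), sketched below with NumPy. This is an illustrative assumption, not necessarily the procedure used in the source. It treats the projection matrix as a 3x4 matrix acting on homogeneous coordinates and requires at least six correspondences whose three-dimensional points are not all coplanar. The sample matrix and points are hypothetical.

```python
import numpy as np

def estimate_projection_matrix(points_3d, points_2d):
    """Least-squares (DLT) estimate of a 3x4 matrix P with (u, v, 1) ~ P (x, y, z, 1)."""
    rows = []
    for (x, y, z), (u, v) in zip(points_3d, points_2d):
        X = [x, y, z, 1.0]
        rows.append([*X, 0.0, 0.0, 0.0, 0.0, *[-u * c for c in X]])
        rows.append([0.0, 0.0, 0.0, 0.0, *X, *[-v * c for c in X]])
    A = np.asarray(rows, dtype=float)
    # The right singular vector of the smallest singular value
    # minimizes |A p| subject to |p| = 1.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)

def project(P, point_3d):
    """Project a 3D position to a 2D position (u, v) with projection matrix P."""
    u, v, w = P @ np.append(point_3d, 1.0)
    return u / w, v / w

# Hypothetical ground-truth matrix and non-coplanar 3D points.
P_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, 20.0],
                   [0.0, 0.0, 1.0, 5.0]])
points_3d = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
             (1, 1, 0.5), (1, 0.5, 1), (0.5, 1, 1), (2, 1, 3)]
points_2d = [project(P_true, p) for p in points_3d]
P_est = estimate_projection_matrix(points_3d, points_2d)
```

The estimate is defined only up to scale, so it should be compared with the true matrix through reprojected positions rather than element by element.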

Next, with reference to FIGS. 16 to 20, the three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) of the camera 22 estimated from the projection matrix P₁ calculated by the region matching process are compared with those estimated from the projection matrix P₁ calculated by the parameter update process.

[How the estimation results change with the noise in the captured image 42]
FIG. 16 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change with the noise in each pixel constituting the captured image 42.

The horizontal axes of FIGS. 16A and 16B represent the pixel value added to each pixel as noise.

The vertical axis of FIG. 16A represents the error |x-x'|+|y-y'|+|z-z'| between the estimated three-dimensional position (x, y, z) and the actual three-dimensional position (x', y', z'). The same applies to the vertical axes of FIGS. 17A, 18A, 19A, and 20A.

In FIG. 16, graphs 161 to 163 show the estimation results based on the projection matrix P₁ calculated by the parameter update process.

Also in FIG. 16, graphs 181 to 183 show the estimation results based on the projection matrix P₁ calculated by the region matching process.

Graphs 161 and 181 show the case where the feature points m_j of the reference image 41 are the two-dimensional positions (x, y) of pixels spaced 1 pixel apart in the vertical and horizontal directions among the pixels constituting the reference image 41.

Graphs 162 and 182 show the case where the feature points m_j of the reference image 41 are the two-dimensional positions (x, y) of pixels spaced 5 pixels apart in the vertical and horizontal directions among the pixels constituting the reference image 41.

Graphs 163 and 183 show the case where the feature points m_j of the reference image 41 are the two-dimensional positions (x, y) of pixels spaced 25 pixels apart in the vertical and horizontal directions among the pixels constituting the reference image 41.
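The three feature point densities can be pictured with a small sketch that samples pixel positions on a regular grid. The 100 x 100 image size, the function name, and the reading of "spaced N pixels apart" as a grid step of N are assumptions made for illustration.

```python
def grid_feature_points(width, height, step):
    """2D positions (x, y) of pixels spaced `step` pixels apart in both directions."""
    return [(x, y)
            for y in range(0, height, step)
            for x in range(0, width, step)]

# Number of feature points for each spacing on a hypothetical 100 x 100 image.
counts = {step: len(grid_feature_points(100, 100, step)) for step in (1, 5, 25)}
```

Under these assumptions, a spacing of 25 pixels leaves only a few feature points per image, which gives a sense of how sparse the input to the evaluation function becomes in the third case.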

The above description of graphs 161 to 163 and graphs 181 to 183 also applies to FIGS. 17 to 20 described later.

[How the estimation results change with the pitch angle θ_p of the camera 22]
FIG. 17 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change as the pitch angle θ_p of the camera 22 changes. The horizontal axes of FIGS. 17A and 17B represent the pitch angle θ_p of the camera 22; the roll angle θ_r and the yaw angle θ_y of the camera 22 are set to the same values as those of the camera 21.

[How the estimation results change with the yaw angle θ_y of the camera 22]
FIG. 18 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change as the yaw angle θ_y of the camera 22 changes. The horizontal axes of FIGS. 18A and 18B represent the yaw angle θ_y of the camera 22; the roll angle θ_r and the pitch angle θ_p of the camera 22 are set to the same values as those of the camera 21.

[How the estimation results change with the roll angle θ_r of the camera 22]
FIG. 19 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change as the roll angle θ_r of the camera 22 changes. The horizontal axes of FIGS. 19A and 19B represent the roll angle θ_r of the camera 22; the pitch angle θ_p and the yaw angle θ_y of the camera 22 are set to the same values as those of the camera 21.

[How the estimation results change with the area of the partial region 42a in the captured image 42]
FIG. 20 shows an example of how the estimated three-dimensional position (x, y, z) and rotation angles (θ_r, θ_p, θ_y) change with the area of the partial region 42a in the captured image 42. The horizontal axes of FIGS. 20A and 20B represent the area of the partial region 42a as a proportion of the entire area of the captured image 42, taken as 100.

In every case shown in FIGS. 16 to 20, the three-dimensional position (x', y', z') and rotation angles (θ_r', θ_p', θ_y') estimated from the projection matrix P₁ calculated by the parameter update process are more accurate than those estimated from the projection matrix P₁ calculated by the region matching process.

Moreover, the three-dimensional position (x', y', z') and rotation angles (θ_r', θ_p', θ_y') estimated from the projection matrix P₁ calculated by the parameter update process are more accurate than with the region matching process even when the feature points of the reference image 41 are placed only every 25 pixels.

Therefore, with the parameter update process of the present invention, the three-dimensional position (x', y', z') and rotation angles (θ_r', θ_p', θ_y') of the camera 22 can be estimated with relatively high accuracy even if the feature points of the reference image 41 used in the evaluation function f(x_n) are sparse when the projection matrix P₁ is generated.
<2. Modification>
In the present embodiment, the three-dimensional information holding unit 101 holds in advance the three-dimensional position (x, y, z)^T of each pixel constituting the reference image 41; however, it may instead hold, for example, the two-dimensional position (x, y)^T.

In this case, if the two-dimensional position (x, y)^T is expressed as a three-dimensional position (x, y, c)^T, the parameter update process can be performed in the same way as described in the present embodiment. Here, c in the three-dimensional position (x, y, c)^T is a constant, for example c = 0.
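The conversion described here amounts to appending a constant third coordinate, as the following minimal sketch shows. The function name and the sample positions are hypothetical.

```python
def lift_to_3d(point_2d, c=0.0):
    """Express a 2D position (x, y) as a 3D position (x, y, c) with constant c."""
    x, y = point_2d
    return (x, y, c)

# Held 2D positions of reference-image pixels, lifted with c = 0.
positions_2d = [(0, 0), (5, 0), (0, 5)]
positions_3d = [lift_to_3d(p) for p in positions_2d]
```

Every lifted point lies on the plane z = c, so this representation corresponds to treating the reference image as a planar object.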

The image processing device 81 of the present embodiment is configured without the camera 22, but it may alternatively be configured to include the camera 22.

Furthermore, in the present embodiment, both the three-dimensional position (x, y, z) and the orientation of the camera 22 are estimated based on the projection matrix P_i; alternatively, only one of the three-dimensional position (x, y, z) and the orientation may be estimated.
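As one illustration of how a position can be read off a projection matrix, the sketch below uses a standard property of a 3x4 pinhole projection matrix: the camera center is the point that the matrix maps to zero. This section does not state how the position is derived from P_i, so this is an assumption, and the intrinsic matrix K, rotation R, and center C below are hypothetical values used only to build a test matrix.

```python
import numpy as np

def camera_center(P):
    """Return the 3D point C satisfying P @ (C, 1) = 0 for a 3x4 matrix P."""
    _, _, vt = np.linalg.svd(np.asarray(P, dtype=float))
    h = vt[-1]            # right null vector of P in homogeneous coordinates
    return h[:3] / h[3]   # dehomogenize (assumes a finite camera center)

# Build a hypothetical projection matrix P = K [R | -R C].
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(30.0)
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle), np.cos(angle), 0.0],
              [0.0, 0.0, 1.0]])
C = np.array([1.0, 2.0, -3.0])
P = K @ np.hstack([R, (-R @ C).reshape(3, 1)])

estimated_center = camera_center(P)
```

Recovering the rotation angles additionally requires separating the intrinsic parameters from the rotation, which is omitted here.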

As the image processing device 81, an electronic device that performs predetermined processing based on the position and orientation of the camera 22 can be employed, such as a camera-equipped mobile terminal or an HMD (head-mounted display). That is, for example, the image processing device 81 can be an electronic device that uses AR (Augmented Reality) technology to display an object corresponding to the position and orientation of the camera 22 on a captured image that is captured by the camera 22 and shown on a monitor or the like (not shown).

This electronic device acquires, from a server connected to a network, related information about an object present in the captured image obtained by the camera 22.

For example, suppose the server is configured to hold the related information of the corresponding object in association with each feature point pattern, that is, each pattern of feature points of an object present in a captured image (for example, when a rectangular sheet of paper is present as the object, the four feature points representing its four corners form one feature point pattern). In that case, the electronic device extracts feature points from the captured image obtained by the camera 22 and acquires from the server the related information associated with the feature point pattern of the extracted feature points.

Alternatively, when the server is configured to hold related information in association with each combination of the position and direction of the electronic device (for example, the imaging direction of the camera 22 built into the electronic device), the electronic device measures its position using GPS (global positioning system) or the like and measures its direction using an acceleration sensor, a gyro sensor, or the like.

The electronic device then acquires from the server the related information associated with the measured combination of position and direction.

When the server is configured to hold related information for each position of the electronic device, the electronic device acquires the related information associated with the measured position. When the server is configured to hold related information for each direction of the electronic device, the electronic device acquires the related information associated with the measured direction.

Based on the acquired related information, the electronic device generates an image for composition to be combined with (superimposed on) the captured image for display. The electronic device then estimates the position and orientation of the camera 22 as described above in the present embodiment and, taking the resulting estimate into account, combines the image for composition with the captured image and displays the result.

Although the electronic device described above acquires the related information from a server connected to a network, if the electronic device has a storage unit that holds the related information in advance, it can acquire the related information from the storage unit instead of the server.

In this case, the electronic device can acquire the related information from the storage unit without connecting to a network, so even in an environment where no network connection is available, it can combine the image for composition corresponding to the related information with the captured image and display it.

The series of processes described above can be executed by dedicated hardware or by software. When the series of processes is executed by software, the program constituting the software is installed from a recording medium onto a so-called embedded computer, or onto, for example, a general-purpose personal computer that can execute various functions when various programs are installed.

[Computer configuration example]
FIG. 21 shows a configuration example of a personal computer that executes the series of processes described above by means of a program.

A CPU (Central Processing Unit) 201 executes various processes according to a program stored in a ROM (Read Only Memory) 202 or a storage unit 208. A RAM (Random Access Memory) 203 stores, as appropriate, the programs executed by the CPU 201 and data. The CPU 201, the ROM 202, and the RAM 203 are connected to one another by a bus 204.

An input/output interface 205 is also connected to the CPU 201 via the bus 204. Connected to the input/output interface 205 are an input unit 206 including a keyboard, a mouse, and a microphone, and an output unit 207 including a display and a speaker. The CPU 201 executes various processes in response to commands input from the input unit 206 and outputs the processing results to the output unit 207.

The storage unit 208 connected to the input/output interface 205 is, for example, a hard disk, and stores the programs executed by the CPU 201 and various data. A communication unit 209 communicates with external devices via a network such as the Internet or a local area network.

A program may also be acquired via the communication unit 209 and stored in the storage unit 208.

When a removable medium 211 such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory is loaded, a drive 210 connected to the input/output interface 205 drives it and acquires the programs, data, and the like recorded on it. The acquired programs and data are transferred to the storage unit 208 and stored there as necessary.

As shown in FIG. 21, the recording medium that records the program to be installed in the computer and made executable by the computer is constituted by the removable medium 211, which is a package medium such as a magnetic disk (including a flexible disk), an optical disc (including a CD-ROM (compact disc-read only memory) and a DVD (digital versatile disc)), a magneto-optical disc (including an MD (mini-disc)), or a semiconductor memory; or by the ROM 202 in which the program is temporarily or permanently recorded, the hard disk constituting the storage unit 208, or the like. The program is recorded on the recording medium, as necessary, via the communication unit 209, which is an interface such as a router or a modem, using a wired or wireless communication medium such as a local area network, the Internet, or digital satellite broadcasting.

In this specification, the steps describing the program recorded on the recording medium include not only processes performed in time series in the described order, but also processes that are executed in parallel or individually and not necessarily in time series.

Furthermore, embodiments of the present invention are not limited to the embodiments described above, and various modifications can be made without departing from the gist of the present invention.

81 image processing apparatus, 101 three-dimensional information holding unit, 102 reference image holding unit, 103 composite image generation unit, 104 evaluation unit, 105 optimization calculation unit, 106 parameter update unit

Claims

An image processing apparatus that estimates at least one of the position or orientation of an imaging unit based on a captured image obtained by imaging with the imaging unit, the image processing apparatus comprising:
composite image generation means for generating, based on a projection parameter that projects the position of a pixel constituting a reference image serving as a reference for estimation onto the position of the corresponding pixel on the captured image, a composite image composed of the pixels present at the positions on the captured image onto which the projection parameter projects;
evaluation generation means for generating an evaluation function representing a correlation between the composite image and the reference image;
updating means for updating the projection parameter based on the evaluation function; and
estimation means for estimating at least one of the position or orientation of the imaging unit based on the updated projection parameter.

The image processing apparatus according to claim 1, wherein
the evaluation generation means generates the evaluation function expressed as a quadratic polynomial, and
the updating means adopts, as the new projection parameter, a projection parameter calculated by the steepest descent method using the evaluation function.

The image processing apparatus according to claim 2, wherein
the evaluation generation means generates the evaluation function having, as a variable, a candidate projection parameter representing a candidate for the updated projection parameter, and
the updating means adopts, as the new projection parameter, the candidate projection parameter at which the evaluation function is minimized.

The image processing apparatus according to claim 1, further comprising
initial parameter generation means for generating the projection parameter based on the position of each pixel constituting the reference image and the corresponding position on the captured image, wherein
the composite image generation means generates the composite image based on the projection parameter in response to the projection parameter being generated by the initial parameter generation means, and
generates the composite image based on the updated projection parameter in response to the projection parameter being updated by the updating means.

The image processing apparatus according to claim 1, wherein
the reference image is obtained by imaging with another imaging unit that is different from the imaging unit and whose position and orientation are known, and
the estimation means estimates at least one of the position or orientation of the imaging unit based on the projection parameter, which represents the difference in position and orientation of the imaging unit relative to the other imaging unit.

An image processing method for an image processing apparatus that estimates at least one of the position or orientation of an imaging unit based on a captured image obtained by imaging with the imaging unit,
the image processing apparatus including:
composite image generation means;
evaluation generation means;
updating means; and
estimation means,
the image processing method comprising the steps of:
the composite image generation means generating, based on a projection parameter that projects the position of a pixel constituting a reference image serving as a reference for estimation onto the position of the corresponding pixel on the captured image, a composite image composed of the pixels present at the positions on the captured image onto which the projection parameter projects;
the evaluation generation means generating an evaluation function representing a correlation between the composite image and the reference image;
the updating means updating the projection parameter based on the evaluation function; and
the estimation means estimating at least one of the position or orientation of the imaging unit based on the updated projection parameter.

A program for causing a computer of an image processing apparatus that estimates at least one of the position or orientation of an imaging unit based on a captured image obtained by imaging with the imaging unit to function as:
composite image generation means for generating, based on a projection parameter that projects the position of a pixel constituting a reference image serving as a reference for estimation onto the position of the corresponding pixel on the captured image, a composite image composed of the pixels present at the positions on the captured image onto which the projection parameter projects;
evaluation generation means for generating an evaluation function representing a correlation between the composite image and the reference image;
updating means for updating the projection parameter based on the evaluation function; and
estimation means for estimating at least one of the position or orientation of the imaging unit based on the updated projection parameter.

An electronic device that estimates at least one of the position or orientation of an imaging unit based on a captured image obtained by imaging with the imaging unit, the electronic device comprising:
composite image generation means for generating, based on a projection parameter that projects the position of a pixel constituting a reference image serving as a reference for estimation onto the position of the corresponding pixel on the captured image, a composite image composed of the pixels present at the positions on the captured image onto which the projection parameter projects;
evaluation generation means for generating an evaluation function representing a correlation between the composite image and the reference image;
updating means for updating the projection parameter based on the evaluation function;
estimation means for estimating at least one of the position or orientation of the imaging unit based on the updated projection parameter; and
execution means for executing predetermined processing based on the estimation result obtained by the estimation means.
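
For illustration only, the loop recited in the claims (generate a composite image from the projection parameter, evaluate it against the reference image, update the parameter) might be sketched as follows. This is a minimal sketch under strong assumptions and is not the implementation described in the specification: the projection parameter is reduced to a hypothetical 2-D translation `p = (dx, dy)`, the evaluation function is assumed to be a mean squared difference, and the steepest descent direction is obtained from a numerical gradient with a backtracking step rather than from a quadratic polynomial. All function names are hypothetical, and recovering the position or orientation of the imaging unit from the parameter is omitted.

```python
import numpy as np

def sample_bilinear(img, xs, ys):
    """Sample img at float coordinates (xs, ys), clamping to the image border."""
    h, w = img.shape
    xs = np.clip(xs, 0.0, w - 1.0)
    ys = np.clip(ys, 0.0, h - 1.0)
    x0 = np.minimum(np.floor(xs).astype(int), w - 2)
    y0 = np.minimum(np.floor(ys).astype(int), h - 2)
    fx, fy = xs - x0, ys - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x0 + 1]
    bottom = (1 - fx) * img[y0 + 1, x0] + fx * img[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom

def make_composite(captured, p, shape):
    """Composite image: captured pixels found where p projects each reference pixel.
    Assumption: p is a pure translation (dx, dy)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    return sample_bilinear(captured, xs + p[0], ys + p[1])

def evaluate(captured, reference, p):
    """Assumed evaluation function: mean squared difference (lower is better)."""
    diff = make_composite(captured, p, reference.shape) - reference
    return float(np.mean(diff ** 2))

def numerical_gradient(captured, reference, p, eps=1e-3):
    """Central-difference gradient of the evaluation function with respect to p."""
    grad = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = eps
        grad[i] = (evaluate(captured, reference, p + step)
                   - evaluate(captured, reference, p - step)) / (2 * eps)
    return grad

def update_parameter(captured, reference, p0, iters=50):
    """Steepest descent with a backtracking step length (in pixels)."""
    p = np.asarray(p0, dtype=float)
    e = evaluate(captured, reference, p)
    for _ in range(iters):
        g = numerical_gradient(captured, reference, p)
        norm = np.linalg.norm(g)
        if norm < 1e-12:
            break
        direction = -g / norm
        t = 1.0
        while t > 1e-4:
            cand = p + t * direction
            e_cand = evaluate(captured, reference, cand)
            if e_cand < e:  # accept only steps that lower the evaluation value
                p, e = cand, e_cand
                break
            t *= 0.5
        else:
            break  # no step length improved the evaluation value
    return p, e

# Synthetic data: a smooth blob as the captured image, and a reference image
# built from it with a known (hypothetical) translation.
yy, xx = np.mgrid[0:32, 0:32].astype(float)
captured = np.exp(-((xx - 16) ** 2 + (yy - 16) ** 2) / (2 * 4.0 ** 2))
p_true = np.array([2.0, 1.0])
reference = make_composite(captured, p_true, captured.shape)

p_init = np.zeros(2)
e_init = evaluate(captured, reference, p_init)
p_est, e_est = update_parameter(captured, reference, p_init)
print("estimated parameter:", p_est, "evaluation value:", e_est)
```

The translation-only parameter keeps the sketch short; a projection that also depends on rotation would change only `make_composite`, while the evaluate-and-update loop would keep the same shape.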