JPH0799644A

JPH0799644A - Image communication device

Info

Publication number: JPH0799644A
Application number: JP5242486A
Authority: JP
Inventors: Katsumi Iijima; 克己飯島; Tomotaka Muramoto; 知孝村本; Akira Suga; 章菅; Masayoshi Sekine; 正慶関根; Hideaki Mitsutake; 英明光武
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1993-09-29
Filing date: 1993-09-29
Publication date: 1995-04-11

Abstract

(57) [Summary]
[Purpose] To match the lines of sight of the participants in a video conference.
[Configuration] Cameras 1 and 2 capture the same person. The captured images are processed by video signal processing units 3 and 4 and supplied to a corresponding point extraction unit 5 and a normal vector extraction unit 6. The corresponding point extraction unit 5 extracts the corresponding points of the two images, and the normal vector extraction unit 6 extracts normal vectors. A three-dimensional structure processing unit 7 uses the information extracted by the corresponding point extraction unit 5 and the normal vector extraction unit 6 to calculate approximate three-dimensional position information of the subject. Using the structure information calculated in this way, a coordinate conversion unit 8 points the cameras 1 and 2 in the direction in which the lines of sight coincide, and a transmission/reception unit 9 transmits the captured image to the communication partner.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an image communication device and, more specifically, to an image communication device that enables a video conference in which the participants' lines of sight are matched.

[0002]

[Description of the Related Art] The following compound-eye imaging systems, which combine the images captured by two imaging devices, are known. For example, displaying an ultra-high-definition image of 4,000 × 4,000 pixels on an ultra-high-definition monitor requires higher pixel density and higher sensitivity in the imaging system. As one solution, the principle of a compound-eye imaging device has been proposed in which two imaging systems, each with a small number of pixels, image a common subject and the two resulting images are combined into a single high-definition image (Aizawa et al., "Fundamental Study for Acquisition of Ultra High Definition Images," Institute of Image Electronics Engineers of Japan, Preprint 90-03-04, pp. 23-28).

[0003] As shown in Fig. 7, a compound-eye imaging device based on this principle has a left imaging system 110_L and a right imaging system 110_R. The two systems image a subject 101 with their sampling points offset from each other by 1/2 pitch in spatial phase, and a microprocessor (hereinafter "CPU") 120 combines the left image I_L obtained by the left imaging system 110_L with the right image I_R obtained by the right imaging system 110_R. This yields a single output image I_OUT of higher definition than would be obtained by imaging the subject 101 with one imaging system.

[0004] Fig. 8 shows the basic optical structure of the left imaging system 110_L and the right imaging system 110_R shown in Fig. 7. The left imaging system 110_L consists of a left imaging optical system 111_L and a left image sensor 112_L. The right imaging system 110_R consists of a right imaging optical system 111_R and a right image sensor 112_R. The left and right imaging optical systems 111_L and 111_R have equivalent specifications (or performance) and are zoom lenses here. The left and right image sensors 112_L and 112_R likewise have equivalent specifications (or performance) and are image pickup tubes such as Saticons or solid-state image sensors such as CCDs.

[0005] The left imaging system 110_L and the right imaging system 110_R are arranged so that their optical axes L_L and L_R substantially intersect at a point O on the subject plane 102 and are line-symmetric with respect to the normal O-O' of the subject plane 102. When the angle between each optical axis L_L, L_R and the normal O-O' of the subject plane 102 (hereinafter the "tilt angle") is θ, 2θ is defined as the convergence angle.

[0006] In this compound-eye imaging device, when the subject distance changes, the left imaging system 110_L and the right imaging system 110_R are each rotated, for example about the points marked × in Fig. 8, so that the convergence angle 2θ is changed according to the subject distance before imaging is performed. Next, the distance image is described. Fig. 9 illustrates the triangulation used to obtain a distance image. In the following description, unless otherwise noted, the image sensors of the right and left cameras are drawn as placed on the positive plane. According to triangulation, when an object (subject) in three-dimensional space is imaged with two cameras (a right camera and a left camera), with O_R the center of the right camera's lens and O_L the center of the left camera's lens, the three-dimensional coordinates of a point P on the object can be obtained from its projection point P_R on the sensor plane A_SR of the right camera and its projection point P_L on the sensor plane A_SL of the left camera.

[0007] In Fig. 9, the baseline B, the baseline length L_B, the epipolar plane (line-of-sight plane) A_e, and the epipolar lines (line-of-sight images) L_eR and L_eL are defined as follows. The baseline B is the line connecting the center O_R of the right camera's lens and the center O_L of the left camera's lens. The baseline length L_B is the length of the baseline B. The epipolar plane A_e is the plane through the three points P (the point on the object), P_R, and P_L (its projection points). The epipolar line (line-of-sight image) L_eR is the intersection of the epipolar plane A_e with the sensor plane A_SR of the right camera, and the epipolar line L_eL is the intersection of the epipolar plane A_e with the sensor plane A_SL of the left camera.

[0008] As shown in Fig. 10, take the midpoint of the baseline B as the origin O(0, 0, 0), the x axis along the baseline B, the y axis perpendicular to the page (not shown), and the z axis perpendicular to both the baseline B and the y axis. Let f be the focal length of both the right and left camera lenses, and let the coordinates of the point P on the object be (x_P, y_P, z_P), those of the projection point P_R be (x_PR, y_PR, z_PR), and those of the projection point P_L be (x_PL, y_PL, z_PL). When the optical axes of the right and left cameras are each perpendicular to the baseline B, as in Fig. 10 (that is, when the two optical axes are parallel), the following equations hold.

[0009]

[Equation 1] $\dfrac{x_{PL} + L_B/2}{f} = \dfrac{x_P + L_B/2}{z_P}$

[0010]

[Equation 2] $\dfrac{x_{PR} - L_B/2}{f} = \dfrac{x_P - L_B/2}{z_P}$

[0011]

[Equation 3] $\dfrac{y_{PL}}{f} = \dfrac{y_{PR}}{f} = \dfrac{y_P}{z_P}$

[0012]

[Equation 4] $\dfrac{L_B + x_{PL} - x_{PR}}{f} = \dfrac{L_B}{z_P}$

Therefore, the coordinates (x_P, y_P, z_P) of the point P on the object are given by the following equations.

[0013]

[Equation 5] $x_P = \dfrac{L_B \,(x_{PL} + x_{PR})/2}{L_B + x_{PL} - x_{PR}}$

[0014]

[Equation 6] $y_P = \dfrac{L_B \,(y_{PL} + y_{PR})/2}{L_B + x_{PL} - x_{PR}}$

[0015]

[Equation 7] $z_P = \dfrac{L_B f}{L_B + x_{PL} - x_{PR}}$

When the optical axes of the right and left cameras instead each make a predetermined angle (convergence angle) θ with respect to the baseline B, as shown in Fig. 11, the following equations hold.

[0016]

[Equation 8] $\dfrac{x_{PL} + L_B/2}{z_{PL}} = \dfrac{x_P + L_B/2}{z_P}$

[0017]

[Equation 9] $\dfrac{x_{PR} - L_B/2}{z_{PR}} = \dfrac{x_P - L_B/2}{z_P}$

[0018]

[Equation 10] $\dfrac{y_{PL}}{z_{PL}} = \dfrac{y_{PR}}{z_{PR}} = \dfrac{y_P}{z_P}$

[0019]

[Equation 11] $\dfrac{L_B}{z_P} = \dfrac{(x_{PL} + L_B/2) - (z_{PL}/z_{PR})(x_{PR} - L_B/2)}{z_{PL}}$, where $|x_{PR}| \ge |x_{PL}|$

[0020]

[Equation 12] $\dfrac{L_B}{z_P} = \dfrac{-(x_{PR} - L_B/2) + (z_{PR}/z_{PL})(x_{PL} + L_B/2)}{z_{PR}}$, where $|x_{PR}| < |x_{PL}|$

[0021]

[Equation 13] $z_{PR} = (x_{PR} - L_B/2)\tan\theta + f\cos\theta$

[0022]

[Equation 14] $z_{PL} = -(x_{PL} + L_B/2)\tan\theta + f\cos\theta$

[0023] From Equations 1 to 14, the coordinates (x_P, y_P, z_P) of the point P on the object can be obtained.
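The parallel-axis case (Equations 5 to 7) can be sketched in Python as follows. This is only an illustration under the coordinate conventions of Fig. 10; the function names `project_parallel` and `triangulate_parallel` are hypothetical and do not appear in the source, and the projection helper simply rearranges Equations 1 to 3 so that the result can be checked by a round trip.

```python
def project_parallel(p, f, base):
    """Project point p = (x, y, z) onto both sensors (Equations 1 to 3).

    f is the common focal length and base is the baseline length L_B.
    Returns ((x_PL, y_PL), (x_PR, y_PR)) in the global x/y coordinates.
    """
    x, y, z = p
    x_pl = f * (x + base / 2) / z - base / 2   # Equation 1 solved for x_PL
    x_pr = f * (x - base / 2) / z + base / 2   # Equation 2 solved for x_PR
    y_img = f * y / z                          # Equation 3
    return (x_pl, y_img), (x_pr, y_img)


def triangulate_parallel(left, right, f, base):
    """Recover (x_P, y_P, z_P) from a pair of corresponding projection points."""
    (x_pl, y_pl), (x_pr, y_pr) = left, right
    denom = base + x_pl - x_pr                 # common denominator of Eq. 5 to 7
    if denom == 0:
        raise ValueError("zero denominator: point at infinity")
    x_p = base * ((x_pl + x_pr) / 2) / denom   # Equation 5
    y_p = base * ((y_pl + y_pr) / 2) / denom   # Equation 6
    z_p = base * f / denom                     # Equation 7
    return x_p, y_p, z_p


if __name__ == "__main__":
    left, right = project_parallel((1.0, 2.0, 4.0), f=1.0, base=2.0)
    print(triangulate_parallel(left, right, f=1.0, base=2.0))
```

The denominator L_B + x_PL - x_PR shrinks as the point moves away, so small errors in the corresponding points produce large depth errors for distant subjects.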

[0024] With the triangulation described above, the distance to an object (subject) can be obtained from the two images captured by a compound-eye imaging system consisting of a right imaging system and a left imaging system. Triangulation, however, presupposes that the projection point P_R on the sensor plane A_SR of the right camera and the projection point P_L on the sensor plane A_SL of the left camera are projections of the same point P. The projection point P_R on the right sensor plane A_SR that corresponds to the projection point P_L on the left sensor plane A_SL must therefore have been extracted beforehand. Consequently, the problem in obtaining distance information with a compound-eye imaging system is how to extract corresponding points (the corresponding point extraction method). A typical corresponding point extraction method is template matching, which is already in use in factories and elsewhere.

[0025] The template matching method is as follows. A template is set around an arbitrary point of the left image formed on the sensor plane A_SL of the left camera; the image within this template is compared with the right image formed on the sensor plane A_SR of the right camera, and the corresponding point is determined from the similarity. Similarity is judged either by the SSDA (Sequential Similarity Detection Algorithm) method or by the correlation method.

[0026] In the SSDA method, as shown in Equation 15, the differences between the pixel values E_L of the image within the template of the left image and the pixel values E_R of the right image being searched are summed, for all pixels on the epipolar line L_eL of the left image and all pixels on the epipolar line L_eR of the right image, and the coordinates at which the resulting sum E(x, y) is smallest are taken as the coordinates of the corresponding point.

[0027]

[Equation 15]

[0028] In the SSDA method, if the running sum of pixel-value differences exceeds the minimum obtained so far at other coordinates, the calculation can be abandoned and moved on to the next coordinates. This removes unnecessary computation and shortens the calculation time.
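The early-termination idea of the SSDA method can be sketched as follows. Equation 15 is not reproduced in this text, so the sketch assumes a sum of absolute differences; it also searches a small two-dimensional window rather than only along an epipolar line. The name `ssda_match` is hypothetical.

```python
def ssda_match(template, image):
    """Find where `template` best matches `image` (both lists of rows).

    Returns ((x, y), best_sum), where (x, y) is the top-left corner of the
    best window and best_sum is its sum of absolute differences (assumed form).
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best_pos, best_sum = None, float("inf")
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            total = 0
            aborted = False
            for j in range(th):
                for i in range(tw):
                    total += abs(template[j][i] - image[y + j][x + i])
                # Sequential test: stop as soon as this candidate
                # can no longer beat the best sum found so far.
                if total >= best_sum:
                    aborted = True
                    break
            if not aborted:
                best_pos, best_sum = (x, y), total
    return best_pos, best_sum


if __name__ == "__main__":
    image = [
        [0, 0, 0, 0, 0],
        [0, 0, 9, 8, 0],
        [0, 0, 7, 6, 0],
        [0, 0, 0, 0, 0],
    ]
    template = [[9, 8], [7, 6]]
    print(ssda_match(template, image))
```

Checking the running sum once per template row, rather than once per pixel, is a design choice of this sketch; a finer check aborts earlier at the cost of more comparisons.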

[0029] In the correlation method, as shown in Equation 16, a correlation value P(x, y) is obtained by taking the cross-correlation between the pixel values E_L of the image within the template of the left image and the pixel values E_R of the right image being searched, and the coordinates at which P(x, y) is largest are taken as the coordinates of the corresponding point. With the normalized cross-correlation of Equation 16, the maximum value is 1.

[0030]

[Equation 16]
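A minimal sketch of matching by normalized cross-correlation follows. Equation 16 is not reproduced in this text, so the normalization used here (cross term divided by the square root of the product of the two energies, without mean subtraction) is an assumption chosen so that the maximum value is 1, as the description states. The name `ncc_match` is hypothetical.

```python
import math


def ncc_match(template, image):
    """Return ((x, y), score) of the window with the largest correlation."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_energy = sum(v * v for row in template for v in row)
    best_pos, best_score = None, -1.0
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            cross = 0.0
            w_energy = 0.0
            for j in range(th):
                for i in range(tw):
                    w = image[y + j][x + i]
                    cross += template[j][i] * w
                    w_energy += w * w
            denom = math.sqrt(t_energy * w_energy)
            # An all-zero window has no defined correlation; score it as 0.
            score = cross / denom if denom > 0 else 0.0
            if score > best_score:
                best_pos, best_score = (x, y), score
    return best_pos, best_score


if __name__ == "__main__":
    image = [
        [0, 0, 0, 0, 0],
        [0, 0, 9, 8, 0],
        [0, 0, 7, 6, 0],
        [0, 0, 0, 0, 0],
    ]
    template = [[9, 8], [7, 6]]
    print(ncc_match(template, image))
```

Unlike the SSDA sum, this score cannot be abandoned early, because every term is needed before normalizing.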

[0031] A technique for matching lines of sight is described in Japanese Patent Application Laid-Open No. 196696/1992 (Heisei 4). For face-to-face communication, in which a user views the face image of the communication partner while the user's own image is simultaneously captured and sent to the partner, that publication proposes the following optical methods of achieving eye contact with the partner. For example, to obtain good eye contact when directly facing a partner displayed as a moving image, it discloses a configuration in which the optical axes of the imaging system (camera) and the display system (display device) are combined by a half mirror.

[0032] As another example, as shown in Fig. 12, a camera is placed close to the display device. Fig. 12(1) is a front view and Fig. 12(2) is a side view; the camera 2 is placed above the display device (display). Relative to the photographed person 3, the camera 2 is at an angle θ (the angle formed with the line connecting the camera 2 and the eye E of the photographed person 3, who is gazing at the center point T of the moving-image window on the surface of the display 1). This arrangement avoids a perceptible gaze mismatch when the distance L between the display surface and the photographed person is large.

[0033]

[Problems to Be Solved by the Invention] The conventional examples, however, leave the viewer with a visually unnatural impression. In addition, a large-screen display is essential for a greater sense of presence, and as the screen of the display device grows, the half mirror must grow with it. As a result, the depth increases and the display/imaging device itself becomes large.

[0034] A larger screen also makes a wide-angle lens indispensable for the imaging device, and the price of the half mirror rises with its size, so the device as a whole becomes expensive.

[0035] Furthermore, the conventional examples offer no effective means for a user who wishes to avoid eye contact.

[0036] An object of the present invention is to provide an image communication device that solves these problems.

[0037]

[Means for Solving the Problems] An image communication device according to the present invention is an image communication device in which the users converse while the image of each communication partner is displayed on the other's display device, characterized in that each communication terminal comprises a plurality of imaging devices and a line-of-sight matching processing device that processes the images captured by the plurality of imaging devices so that the line of sight matches that of the communication partner.

[0038]

[Operation] With the above means, an image whose line of sight matches that of the communication partner can be transmitted to the partner, so that the two lines of sight can be made to coincide.

[0039]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0040] Fig. 1 is a schematic block diagram of one embodiment of the present invention, and Fig. 2 is an external view of a terminal.

[0041] As shown in Fig. 2, a terminal consists of a display device, two cameras, and a man-machine interface section (keyboard, mouse, switches, and the like). In Fig. 1, cameras 1 and 2 image the same subject. In a video conference, the main subject is normally the person taking part in the conference. The captured images are sent to video signal processing units 3 and 4, each of which has an image memory. The images processed by the video signal processing units 3 and 4 are supplied to the corresponding point extraction unit 5 and the normal vector extraction unit 6. The corresponding point extraction unit 5 extracts corresponding points, and the normal vector extraction unit 6 extracts normal vectors.

[0042] The three-dimensional structure processing unit 7 uses the information extracted by the corresponding point extraction unit 5 and the normal vector extraction unit 6 to calculate approximate three-dimensional position information of the subject. Using the structure information calculated in this way, the coordinate conversion unit 8 turns the subject toward an arbitrarily specified direction. The image of the subject turned toward the specified direction is sent to the communication partner from the transmission/reception unit 9.

[0043] The above is only the basic flow of information; the details are described below.

[0044] The corresponding point extraction unit 5 is described first. It calculates a disparity line by performing, at each intersection on the disparity plane formed from two epipolar lines extracted from the two captured images respectively, a local operation with excitatory and inhibitory connections based on the real pixels of the two binary images. When extracting the corresponding points of the two binary images, a virtual pixel indicating the continuity between real pixels is placed between each pair of adjacent real pixels of each binary image, and a local operation with excitatory and inhibitory connections based on the virtual pixels is performed in parallel with the local operation based on the real pixels. The excitatory connections based on the virtual pixels compete with the excitatory connections based on the real pixels, and the inhibitory connections based on the virtual pixels compete with the inhibitory connections based on the real pixels.

[0045] The cooperative algorithm proposed by David Marr has the following three rules.
Rule 1 (compatibility): a black point can match only a black point.

[0046] Rule 2 (uniqueness): almost always, one black point in one image can match only one black point in the other image.

[0047] Rule 3 (continuity): the disparity of matching points varies smoothly almost everywhere.

[0048] This embodiment adds the following rules 4 to 6 to these three rules.
Rule 4: the degree of disparity continuity is made stronger toward the center of the local processing and weaker toward its periphery.

[0049] Rule 5: to reinforce uniqueness, the inhibitory connections are made stronger toward the periphery of the local processing.

[0050] Rule 6: a virtual pixel indicating the continuity between real pixels is placed between each pair of adjacent real pixels in the image, and processing that competes with rules 1 to 5 is applied on the basis of each virtual pixel.

[0051] The local processing in rule 5 means processing applied to a local region centered on one real pixel of interest in an image. The center of the local processing is the position of that real pixel, and the periphery of the local processing is a position away from it. The disparity line is calculated, and the corresponding points are extracted, by repeating this local processing.

[0052] The normal vector extraction unit 6 consists of three modules, as shown in Fig. 3. The SS module extracts three-dimensional structure information based on the normal directions along the occluding contour of an object in the image. Assuming a smoothness constraint on the surface of the three-dimensional object, this module uses a relaxation method to find the state with the smallest global error. The normal direction information of the occluding contour is used as the initial condition. Let f(x, y) and g(x, y) be the functions representing the normal direction at the pixel (x, y). The quantity e_g, which represents the smoothness with respect to neighboring normals, is defined as follows.

[0053]

[Equation 17] $e_g = \iint \left( (f_x^2 + f_y^2) + (g_x^2 + g_y^2) \right) dx\,dy$

Here f_x is the partial derivative of the function f in the x direction, f_y is the partial derivative of f in the y direction, g_x is the partial derivative of the function g in the x direction, and g_y is the partial derivative of g in the y direction.

[0054] The quantity e_f, which represents the error between the measured brightness value I and the theoretical value R(f, g), is defined as follows.

[0055]

[Equation 18] $e_f = \iint \left( I(x, y) - R(f, g) \right)^2 dx\,dy$

To minimize e_g + λe_f, the SS module solves for f and g by the calculus of variations, discretizes the result, and runs the relaxation method to extract the normal direction information.

[0056] The LSAM module assumes a spherical object surface and calculates the normal direction from the local brightness variation at each pixel. The second derivative d²I of the brightness I in a direction ξ at a pixel equals the second partial derivative I_uu in the u direction of the coordinate system (u, v) rotated by ξ. This value is obtained from the coordinate transformation $I_{uu} = I_{xx}\cos^2\xi + I_{yy}\sin^2\xi + 2 I_{xy}\cos\xi\sin\xi$.

[0057] I_uu is smallest when dI_uu/dξ = 0. Therefore, $\tan 2\xi = 2 I_{xy} / (I_{xx} - I_{yy})$.

[0058] Assuming that the object surface is spherical and substituting the corresponding values of I_xx, I_yy, and I_xy gives $\tan 2\xi = 2xy/(x^2 - y^2)$. Since $\tan\theta = y/x$, it follows that $\tan 2\theta = 2\tan\theta/(1 - \tan^2\theta) = 2xy/(x^2 - y^2) = \tan 2\xi$.

[0059] Therefore, when the surface is assumed to be spherical, the azimuth angle θ of the normal equals the direction in which the second derivative d²I of the brightness is smallest. The LSAM module calculates the normal direction by this computation.
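The direction of the smallest second derivative can be sketched as follows. Setting dI_uu/dξ = 0 in the rotation formula gives (I_xx - I_yy) sin 2ξ = 2 I_xy cos 2ξ, which has two solutions 90 degrees apart (a maximum and a minimum); the sketch picks the minimum. The function names and the radially symmetric test brightness I = -(x² + y²)² are hypothetical illustrations, not taken from the source.

```python
import math


def second_derivative_along(i_xx, i_yy, i_xy, xi):
    """I_uu in the direction xi, from the coordinate transformation formula."""
    c, s = math.cos(xi), math.sin(xi)
    return i_xx * c * c + i_yy * s * s + 2.0 * i_xy * c * s


def min_second_derivative_direction(i_xx, i_yy, i_xy):
    """Angle xi in (-pi/2, pi/2] at which I_uu is smallest.

    I_uu = (I_xx + I_yy)/2 + ((I_xx - I_yy)/2) cos 2xi + I_xy sin 2xi,
    so the maximum lies at 2xi = atan2(2 I_xy, I_xx - I_yy) and the
    minimum lies pi away from it. If both arguments are zero the
    direction is undefined and 0 is returned.
    """
    return 0.5 * math.atan2(-2.0 * i_xy, i_yy - i_xx)


def example_hessian(x, y):
    """Second derivatives of the hypothetical brightness I = -(x^2 + y^2)^2."""
    i_xx = -12.0 * x * x - 4.0 * y * y
    i_yy = -4.0 * x * x - 12.0 * y * y
    i_xy = -8.0 * x * y
    return i_xx, i_yy, i_xy


if __name__ == "__main__":
    i_xx, i_yy, i_xy = example_hessian(2.0, 1.0)
    xi = min_second_derivative_direction(i_xx, i_yy, i_xy)
    print(xi, math.atan2(1.0, 2.0))  # xi should equal the azimuth of (2, 1)
```

For this radially symmetric example the returned angle coincides with the azimuth θ = atan2(y, x), which mirrors the relation tan 2θ = tan 2ξ in the text.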

[0060] Details of the SS module and the LSAM module are given in Ikeuchi, K. & B. K. P. Horn, "Numerical Shape from Shading and Occluding Boundaries", Artificial Intelligence, vol. 17, nos. 1-3, pp. 141-184, 1981, and in Pentland, A. P., "Local Shading Analysis", IEEE Trans. PAMI-6, no. 2, 1984.

[0061] From the normal direction information calculated by the SS module and the LSAM module, the normal direction information that is actually used is defined as follows. Let f_SS be the normal vector calculated by the SS module and f_LSAM the normal vector calculated by the LSAM module. The new normal direction information f is then given by
$f = (f_{SS} + f_{LSAM})/2$.
Alternatively, f may be defined as the overall average of f_SS and f_LSAM over a neighborhood of an arbitrary point (x, y). For example, when four values f_SS1, f_SS2, f_SS3, f_SS4 and four values f_LSAM1, f_LSAM2, f_LSAM3, f_LSAM4 are obtained from the neighborhood,
$f = \frac{1}{8}\sum_{i=1}^{4} (f_{SSi} + f_{LSAMi})$.
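The two ways of combining the module outputs can be sketched as follows. Normals are represented here as plain tuples of components, which is an assumption of this sketch, and the function names are hypothetical.

```python
def average_normals(f_ss, f_lsam):
    """Pointwise combination: f = (f_SS + f_LSAM) / 2."""
    return tuple((a + b) / 2 for a, b in zip(f_ss, f_lsam))


def average_normals_neighborhood(ss_list, lsam_list):
    """Mean of all SS and LSAM normals sampled in a neighborhood.

    With four normals from each module this is (1/8) * sum(f_SSi + f_LSAMi).
    """
    vectors = list(ss_list) + list(lsam_list)
    if not vectors:
        raise ValueError("no normals supplied")
    dim = len(vectors[0])
    return tuple(sum(v[k] for v in vectors) / len(vectors) for k in range(dim))


if __name__ == "__main__":
    print(average_normals((1.0, 0.0), (0.0, 1.0)))
    print(average_normals_neighborhood([(1.0, 0.0)] * 4, [(0.0, 1.0)] * 4))
```

Averaging over a neighborhood trades spatial resolution for robustness against noise in either module's estimate.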

[0062] The coordinate conversion unit 8 is described next. It selects and executes either an automatic selection mode or an arbitrary selection mode. The automatic selection mode uses the normal vectors calculated by the normal vector extraction unit 6. Normal vectors generally point in various directions depending on the shape of the subject. In a video conference, however, where the subject is mainly a person, the person is located roughly at the center of the captured image, so the overall direction is estimated from the average over a region near the center, and the subject is rotated so that this direction coincides with the optical axis of the camera. Mathematically, this is the following transformation.

【００６３】[0063]

【数１９】 [Formula 19]

【００６４】Here, (ω, φ, χ) are the rotation angles.
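Formula 19 itself is not reproduced in this text, so the following is only a sketch of a rotation by three angles under an assumed convention, R = Rz(χ)·Ry(φ)·Rx(ω) with the camera optical axis taken as the z axis. The convention, the function names and the helper `angles_to_align_with_axis` are assumptions made for illustration and may differ from the formula in the specification. The sketch also shows how angles could be chosen so that an averaged normal direction is rotated onto the optical axis, as in the automatic selection mode.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = List[List[float]]


def matmul(a: Mat3, b: Mat3) -> Mat3:
    """3x3 matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(omega: float, phi: float, chi: float) -> Mat3:
    """Assumed convention: R = Rz(chi) @ Ry(phi) @ Rx(omega), angles in radians."""
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    cc, sc = math.cos(chi), math.sin(chi)
    rx = [[1.0, 0.0, 0.0], [0.0, co, -so], [0.0, so, co]]
    ry = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    rz = [[cc, -sc, 0.0], [sc, cc, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rz, matmul(ry, rx))


def rotate(r: Mat3, p: Vec3) -> Vec3:
    """Apply rotation matrix r to point or vector p."""
    return tuple(sum(r[i][k] * p[k] for k in range(3)) for i in range(3))


def angles_to_align_with_axis(n: Vec3) -> Tuple[float, float, float]:
    """Angles (omega, phi, chi) that rotate n onto the +z axis (assumed optical axis)."""
    nx, ny, nz = n
    omega = math.atan2(ny, nz)            # rotation about x removes the y component
    r_yz = math.hypot(ny, nz)
    phi = math.atan2(-nx, r_yz)           # rotation about y removes the x component
    return (omega, phi, 0.0)              # rotation about the axis itself is left at 0


mean_normal = (1.0, 2.0, 2.0)             # hypothetical averaged normal near the image center
R = rotation_matrix(*angles_to_align_with_axis(mean_normal))
aligned = rotate(R, mean_normal)          # approximately (0, 0, 3)
```

In the arbitrary selection mode, the same `rotation_matrix` would simply be evaluated with angles received from the communication partner instead of angles derived from the normals.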

【００６５】In the arbitrary selection mode, the communication partner in the video conference can change the orientation of the image captured on the transmitting side. The transformation is the same as Formula 19, but is characterized in that the rotation angle information for the transformation is transmitted from the communication partner.

【００６６】Next, the compound-eye optical system will be described. FIG. 4 shows the basic arrangement of the compound-eye imaging system in this embodiment. The right imaging system 100 consists of a right imaging optical system 102 and a right image sensor 103, and the left imaging system 200 consists of a left imaging optical system 202 and a left image sensor 203. The right imaging optical system 102 and the left imaging optical system 202 have equivalent specifications (or performance) and are zoom lenses here. Likewise, the right image sensor 103 and the left image sensor 203 have equivalent specifications (or performance) and are here either image pickup tubes such as a Saticon or solid-state image sensors such as CCD image sensors.

【００６７】The right imaging system 100 and the left imaging system 200 are arranged so that their optical axes 101 and 201 substantially intersect at a point O on the object plane 1000 and are line-symmetric with respect to the normal O-O′ of the object plane 1000. When the angle formed by each of the optical axes 101 and 201 with the normal O-O′ of the object plane 1000 (hereinafter referred to as the tilt angle) is θ, 2θ is defined as the convergence angle. In this compound-eye imaging device, when the subject distance changes, the right imaging system 100 and the left imaging system 200 are each rotated, for example about the rotation centers F₁ and F₂, so that the convergence angle 2θ is changed according to the change in subject distance, after which imaging is performed.
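The relation between subject distance and convergence angle can be illustrated with a small sketch. It assumes an idealized geometry that is not spelled out in the text: the rotation centers F₁ and F₂ are separated by a baseline B, lie symmetrically about the normal O-O′, and each optical axis passes through its rotation center. The names `baseline` and `subject_distance` are hypothetical parameters for this example.

```python
import math


def convergence_angle(baseline: float, subject_distance: float) -> float:
    """Return the convergence angle 2*theta in radians.

    Assumed geometry: the point O lies at subject_distance along the normal
    from the midpoint of the baseline, so tan(theta) = (baseline / 2) / distance.
    """
    if baseline <= 0 or subject_distance <= 0:
        raise ValueError("baseline and subject_distance must be positive")
    theta = math.atan2(baseline / 2.0, subject_distance)
    return 2.0 * theta


# A nearer subject needs a larger convergence angle than a distant one.
near = convergence_angle(baseline=0.2, subject_distance=0.5)
far = convergence_angle(baseline=0.2, subject_distance=2.0)
```

Under this assumption, each imaging system would be rotated until its angle encoder reports θ, half of the returned value.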

【００６８】Next, the compound-eye imaging device of this embodiment and its processing system will be described in detail. FIG. 5 is a schematic configuration diagram showing a first example of the compound-eye imaging device of this embodiment, and FIG. 6 is its processing block diagram. Based on the basic arrangement shown in FIG. 4, the compound-eye imaging device shown in FIG. 5 uses two imaging systems (the right imaging system 100 and the left imaging system 200) to image a common subject and combines the two resulting images into a single high-definition image.

【００６９】The configurations of the right imaging system 100 and the left imaging system 200 will be described in detail with reference to FIG. 5. The right imaging system 100 consists of the right imaging optical system 102 and the right image sensor 103, which is an image pickup tube. The left imaging system 200 consists of the left imaging optical system 202 and the left image sensor 203, which is also an image pickup tube.

【００７０】The right imaging optical system 102 and the left imaging optical system 202 each include: lens groups 102a to 102d and 202a to 202d, which include the zooming groups 102b, 202b and the focusing groups 102d, 202d; zoom motors 106, 206, which are the drive systems for the zooming groups 102b, 202b; focus motors 107, 207, which are the drive systems for the focusing groups 102d, 202d; a mechanism (not shown) and a drive system (convergence angle motors 104, 204) for rotating the imaging optical systems 102, 202 and the image sensors 103, 203 as a unit within a plane containing the optical axes 101, 201; and angle encoders 105, 205 for detecting the rotation angles of the convergence angle motors 104, 204. The angle encoders 105, 205 may be external components such as potentiometers. Alternatively, for a motor whose rotation angle can be determined from its drive signal, such as a pulse motor, they may be means for converting the drive signal into a rotation angle.

【００７１】The zoom motor 106 and the focus motor 107 of the right imaging optical system 102 are controlled by the output signal of the zoom encoder 108, which obtains position information of the zooming group 102b along the optical axis, and the output signal of the focus encoder 109, which obtains position information of the focusing group 102d along the optical axis. Similarly, the zoom motor 206 and the focus motor 207 of the left imaging optical system 202 are controlled by the output signal of the zoom encoder 208, which obtains position information of the zooming group 202b along the optical axis, and the output signal of the focus encoder 209, which obtains position information of the focusing group 202d along the optical axis. In this way, the focal length of the right imaging optical system 102 and that of the left imaging optical system 202 are controlled so as to always match, and as a result the imaging magnifications of the two optical systems always match.

【００７２】With the above configuration, the rotation angle information detecting means (angle encoders) 105, 205 first detect the convergence angle 2θ. Next, the focal lengths of the imaging optical systems 102, 202 are obtained from the output signals of the zoom encoders 108, 208. The subject distance for each of the imaging optical systems 102, 202 is then obtained from the output signals of the focus encoders 109, 209 and, together with the focal lengths obtained above, gives the lens back of each of the imaging optical systems 102, 202. Because control is based on the output signals of the zoom encoders 108, 208 and the focus encoders 109, 209, the focal lengths and lens backs of the imaging optical systems 102, 202 also always match.
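How a lens back could follow from the focal length and the subject distance can be sketched with the thin-lens equation 1/f = 1/a + 1/b. This is an assumption for illustration only: the text does not state which relation is used, and a real zoom lens is not a thin lens, so the image distance b computed below is only a stand-in for the lens back. The function names are hypothetical.

```python
def image_distance(focal_length: float, subject_distance: float) -> float:
    """Thin-lens image distance b from 1/f = 1/a + 1/b (same units as inputs)."""
    if focal_length <= 0 or subject_distance <= focal_length:
        raise ValueError("need 0 < focal_length < subject_distance")
    return subject_distance * focal_length / (subject_distance - focal_length)


def magnification(focal_length: float, subject_distance: float) -> float:
    """Imaging magnification b / a under the same thin-lens assumption."""
    return image_distance(focal_length, subject_distance) / subject_distance


# Hypothetical values in millimetres: f = 50 mm, subject at 1050 mm.
b = image_distance(50.0, 1050.0)
m = magnification(50.0, 1050.0)
```

Under this model, equal focal lengths and equal subject distances for the two optical systems give equal image distances and equal magnifications, which is consistent with the matching described in the text.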

【００７３】As shown in FIG. 6, the conversion signal generation units 12R and 12L reconstruct the epipolar lines, described later, on the basis of the image signals from the image sensors 103, 203 and the angle signals from the angle encoders 105, 205. The coordinate conversion units 11R and 11L apply coordinate conversion to the output images of the image sensors 103, 203 in accordance with the output signals of the conversion signal generation units 12R and 12L. The image memories 111, 211 temporarily store the video signals 112, 212 that have been coordinate-converted by the coordinate conversion units 11R and 11L, respectively. The interpolation processing units 13R and 13L apply interpolation to the video signals 112, 212 stored in the image memories 111, 211 to interpolate additional sampling points.
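The interpolation of sampling points can be illustrated as follows. The text does not specify the interpolation method, so linear interpolation along one image row is assumed here. The function name `upsample_row` and the parameter `factor` are hypothetical.

```python
from typing import List, Sequence


def upsample_row(row: Sequence[float], factor: int) -> List[float]:
    """Insert (factor - 1) linearly interpolated samples between neighbors."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if not row:
        return []
    out: List[float] = []
    for a, b in zip(row, row[1:]):
        for k in range(factor):
            t = k / factor                  # fractional position between a and b
            out.append(a + (b - a) * t)
    out.append(float(row[-1]))              # keep the final original sample
    return out


dense = upsample_row([0, 10, 20], 2)        # one new sample between each pair
```

A denser sampling of this kind lets corresponding points be located with finer than one-pixel resolution in the subsequent matching step.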

【００７４】The corresponding point detection unit 5 detects corresponding points in the video signals 113, 213 that have been interpolated by the interpolation processing units 13R and 13L, respectively. The memory 15 stores the positions of the corresponding points obtained by the corresponding point detection unit 5 and the sum of the absolute values of the differences between the pixels at the corresponding positions (hereinafter referred to as the "residual"). The normal vector extraction unit 6 obtains normal vectors using the two images. The image extracted by the normal vector extraction unit 6 is written to the image memory 311.
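The residual defined above, the sum of absolute pixel differences, can be sketched together with a simple search for the best-matching position. The search strategy is an assumption for illustration: it presumes that the coordinate conversion has placed corresponding points on the same image row, and it compares a small window along that row. The names `residual`, `best_match`, `half` and `max_disp` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def residual(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two equally sized pixel windows."""
    if len(a) != len(b):
        raise ValueError("windows must have the same length")
    return sum(abs(p - q) for p, q in zip(a, b))


def best_match(left_row: Sequence[float], right_row: Sequence[float],
               x: int, half: int, max_disp: int) -> Tuple[int, float]:
    """Find the shift d in [0, max_disp] minimizing the residual.

    The window centered at x in the left row is compared with the window
    centered at x - d in the right row. Returns (d, residual).
    """
    if x - half < 0 or x + half >= len(left_row):
        raise ValueError("window does not fit in the left row")
    window = left_row[x - half:x + half + 1]
    best: Optional[Tuple[int, float]] = None
    for d in range(max_disp + 1):
        c = x - d
        if c - half < 0 or c + half >= len(right_row):
            continue                        # candidate window falls outside the row
        r = residual(window, right_row[c - half:c + half + 1])
        if best is None or r < best[1]:
            best = (d, r)
    if best is None:
        raise ValueError("no candidate window fits in the right row")
    return best


left = [0, 0, 0, 10, 20, 30, 0, 0, 0, 0]
right = [0, 10, 20, 30, 0, 0, 0, 0, 0, 0]   # same pattern shifted by two pixels
shift, res = best_match(left, right, x=4, half=1, max_disp=3)
```

The pair of values returned here corresponds to what the memory 15 is described as storing: the position of the corresponding point and its residual.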

【００７５】Finally, the transmission unit 9 will be described. The transmission unit 9 transmits images to the communication partner and receives a coordinate conversion control signal from the communication partner. The communication partner's side is normally equipped with a pointing device such as a joystick. The communication partner designates an arbitrary point on the reproduced image, and information corresponding to the arbitrary orientation angles (ω, φ, χ) used by the conversion processing device is transmitted.

【００７６】As another embodiment, the coordinate conversion unit may perform the coordinate conversion on the basis of the orientation angle of the coordinate conversion and the normal direction of the region around the highest part of the subject, that is, the part nearest to the display. In a video conference, for example, the subject's face normally faces the display, and the area around the nose is then closest to the display. From the normal direction of a region around that point, whose size is determined by the camera magnification and the approximate size of a nose, the image is converted so that this normal direction is perpendicular to the display, and is then transmitted.
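Selecting the region around the nearest point can be sketched as follows. The data layout is an assumption: a depth map holding the distance of each pixel from the display and a normal map of the same size. The window half-width `half`, which the text derives from the camera magnification and the approximate size of a nose, is simply passed in as a hypothetical parameter.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def nearest_point(depth: Sequence[Sequence[float]]) -> Tuple[int, int]:
    """Return (row, col) of the smallest depth value, i.e. the nearest point."""
    best = (0, 0)
    for r, row in enumerate(depth):
        for c, d in enumerate(row):
            if d < depth[best[0]][best[1]]:
                best = (r, c)
    return best


def mean_normal_around(normals: Sequence[Sequence[Vec3]],
                       center: Tuple[int, int], half: int) -> Vec3:
    """Average the normals in a (2*half + 1)-wide window clipped to the image."""
    r0, c0 = center
    rows = range(max(0, r0 - half), min(len(normals), r0 + half + 1))
    cols = range(max(0, c0 - half), min(len(normals[0]), c0 + half + 1))
    sums: List[float] = [0.0, 0.0, 0.0]
    count = 0
    for r in rows:
        for c in cols:
            for k in range(3):
                sums[k] += normals[r][c][k]
            count += 1
    return (sums[0] / count, sums[1] / count, sums[2] / count)


# Hypothetical 3x3 maps: the nearest point is at row 1, column 2.
depth_map = [[5.0, 5.0, 4.0], [5.0, 4.0, 2.0], [5.0, 5.0, 4.0]]
normal_map = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)] for _ in range(3)]
tip = nearest_point(depth_map)
region_normal = mean_normal_around(normal_map, tip, half=1)
```

The resulting direction would then be used as the direction to be rotated, in place of the average over the central region used in the automatic selection mode.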

【００７７】[0077]

【発明の効果】[Effects of the Invention] As can be readily understood from the above description, according to the present invention the lines of sight of the parties coincide, so that a face-to-face conference or call can be held with the communication partner without any sense of unnaturalness. Moreover, the configuration required for this is simple, so the terminal device can be built at low cost.

[Brief Description of the Drawings]

【図１】FIG. 1 is a schematic block diagram of an embodiment of the present invention.

【図２】FIG. 2 is an external view of a terminal.

【図３】FIG. 3 is a block diagram of the internal configuration of the normal vector extraction unit 6.

【図４】FIG. 4 is a schematic configuration diagram of the compound-eye imaging device.

【図５】FIG. 5 is a schematic configuration diagram of the imaging systems.

【図６】FIG. 6 is a schematic functional block diagram following the flow of processing.

【図７】FIG. 7 is an explanatory diagram of the basic principle of the compound-eye imaging device.

【図８】FIG. 8 is an explanatory diagram of the basic structure of the left imaging system and the right imaging system of FIG. 7.

【図９】FIG. 9 is an explanatory diagram of triangulation for obtaining a distance image.

【図１０】FIG. 10 is an explanatory diagram of object coordinate calculation when the optical axes of the right camera and the left camera are parallel.

【図１１】FIG. 11 is an explanatory diagram of object coordinate calculation when the optical axes of the right camera and the left camera intersect.

【図１２】FIG. 12 shows a front view of the display and a side view of a conventional terminal device that makes the lines of sight substantially coincide in a video conference.

[Explanation of symbols]

1, 2: cameras
3, 4: video signal processing units
5: corresponding point extraction unit
6: normal vector extraction unit
7: three-dimensional structure processing unit
8: coordinate conversion unit
9: transmission/reception unit
11R, 11L: coordinate conversion units
12R, 12L: conversion signal generation units
13R, 13L: interpolation processing units
15: memory
100: right imaging system
101: optical axis
102: right imaging optical system
103: right image sensor
200: left imaging system
201: optical axis
202: left imaging optical system
203: left image sensor
102b, 202b: zooming groups
102d, 202d: focusing groups
102a to 102d, 202a to 202d: lens groups
104, 204: convergence angle motors
105, 205: angle encoders
106, 206: zoom motors
107, 207: focus motors
108, 208: zoom encoders
109, 209: focus encoders
111, 211: image memories
112, 212: coordinate-converted video signals
101: subject
102: object plane
110_L: left imaging system
110_R: right imaging system
111_L: left imaging optical system
112_L: left image sensor
111_R: right imaging optical system
112_R: right image sensor
120: microprocessor
311: image memory
1000: object plane
L_L, L_R: optical axes
I_L: left image
I_R: right image
F₁, F₂: rotation centers

Continuation of front page
(72) Inventor: Masayoshi Sekine, c/o Canon Inc., 3-30-2 Shimomaruko, Ota-ku, Tokyo
(72) Inventor: Hideaki Mitsutake, c/o Canon Inc., 3-30-2 Shimomaruko, Ota-ku, Tokyo

Claims

[Claims]

1. An image communication apparatus in which parties converse while an image of the communication partner is displayed on each party's display device, wherein each communication terminal comprises: a plurality of imaging devices; and a line-of-sight matching processing device that processes the images captured by the plurality of imaging devices so that the lines of sight coincide.

2. The image communication apparatus according to claim 1, wherein the line-of-sight matching processing device includes a calculation unit that calculates distance structure and normal information from the images captured by the plurality of imaging devices and performs coordinate conversion.