JP2004185611A

JP2004185611A - Face position extraction method, program for causing computer to execute face position extraction method, and face position extraction device

Info

Publication number: JP2004185611A
Application number: JP2003391148A
Authority: JP
Inventors: Shinjiro Kawato; 慎二郎川戸; Yasutaka Senda; 康隆千田
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2002-11-21
Filing date: 2003-11-20
Publication date: 2004-07-02
Anticipated expiration: 2023-11-20
Also published as: JP4166143B2

Abstract

PROBLEM TO BE SOLVED: To provide a method capable of extracting a face image from image information for real-time tracking of its position.

SOLUTION: The method prepares digital data of each pixel value within a target image region that includes a human face region, and applies filtering (steps S102-S110) step by step across the target image region using a between-the-eyes detection filter composed of six connected rectangles Si (1≤i≤6) to extract between-the-eyes candidate points. The method then cuts out the target image at a predetermined size centered on each extracted candidate point and selects the true candidate point from among the between-the-eyes candidate points by pattern discrimination processing.

COPYRIGHT: (C)2004, JPO&NCIPI

Description

The present invention relates to image processing for processing images from a camera or the like, and more particularly to the field of image recognition for extracting a person's face from an image.

Video conference systems, in which several people at remote locations hold a meeting over a communication link, are in practical use. In these systems, however, transmitting the video itself increases the amount of communication data. For this reason, techniques have been studied in which feature data on, for example, the gaze, face orientation, and facial expression of the target person are extracted at each site and only the extracted data are exchanged. The receiving side generates and displays a virtual image of the person's face based on these data. This allows a video conference to be held efficiently while reducing the amount of communication data.

Moreover, technology for detecting a person in such images is being actively studied as indispensable to the development of fields such as human-computer interaction, gesture recognition, and security.

Applications of these person detection techniques require a stable system that 1) has a high detection rate, 2) is robust to changes in the lighting environment, and 3) operates in real time. The need for real-time person detection on high-quality images (images with a large number of pixels per frame) is also expected to grow, so still faster person detection algorithms will have to be developed.

To detect a person, it is effective to detect the face first. The face carries important information such as facial expression, and once the face has been detected it becomes easy to estimate and search for the positions of the limbs.

Many face detection systems that use skin color information have been reported (see, for example, Patent Document 1 and Non-Patent Documents 1 to 4).

These methods extract skin color regions from the image to obtain face candidate regions. Because the face candidate regions can be limited, the range of processing is restricted and the amount of computation is greatly reduced, which makes a fast system possible. However, methods that rely on color information are sensitive to changes in the lighting environment, and stable performance cannot be expected when they run in a general environment.

Meanwhile, for face detection methods that do not use color information (that is, methods that use grayscale information), many approaches based on template matching or on learning techniques such as neural networks have been reported (see, for example, Non-Patent Documents 5 and 6). These methods are characterized by high detection rates and robustness to the lighting environment. For example, the technique disclosed in Non-Patent Document 5 applies a neural network and achieves a very high detection rate.

However, these methods must match a template (model) over the entire image while varying its size, and therefore require a large amount of computation. As the image size in pixels grows, the amount of computation increases dramatically, which makes it very difficult to build a real-time system.

The technique disclosed in Non-Patent Document 7, on the other hand, detects a face from the light-dark relationships among the average brightness values of divided regions. However, it uses 16 divided regions distributed from the forehead to the chin, so it is directly affected by hairstyle and beard.

Non-Patent Document 1: Shinjiro Kawato and Nobuji Tetsutani, "Real-time detection of between-the-eyes using a ring frequency filter," IEICE Transactions (D-II), vol. J84-D-II, no. 12, pp. 2577-2584, Dec. 2001.
Non-Patent Document 2: Shinjiro Kawato and Nobuji Tetsutani, "Real-time detection and tracking of eyes," IEICE Technical Report, PRMU2000-63, pp. 15-22, Sept. 2000.
Non-Patent Document 3: D. Chai and K. N. Ngan, "Face Segmentation Using Skin-Color Map in Videophone Application," IEEE Trans. on Circuits and Systems for Video Technology, vol. 9, no. 4, pp. 551-564, 1999.
Non-Patent Document 4: J. Yang and A. Waibel, "A real-time face tracker," Proc. 3rd IEEE Workshop on Application of Computer Vision, pp. 142-147, 1996.
Non-Patent Document 5: H. Rowley, S. Baluja, and T. Kanade, "Neural-Network-Based Face Detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23-38, Jan. 1998.
Non-Patent Document 6: E. Hjelmas and B. K. Low, "Face Detection: A Survey," Computer Vision and Image Understanding, 83(3), pp. 236-274, 2001.
Non-Patent Document 7: Brian Scassellati, "Eye Finding via Face Detection for a Foveated, Active Vision System," Proc. AAAI '98, pp. 969-976.
Patent Document 1: Japanese Patent Laying-Open No. 2001-52176

The technique disclosed in Patent Document 1 mentioned above focuses on the point between the two eyes (hereinafter called the "between-the-eyes" point) as a stable facial feature point. Around this point the forehead and the bridge of the nose are relatively bright, while the eyes and eyebrows on either side are dark, and a ring frequency filter is used to detect this pattern.

However, the ring frequency filter requires preprocessing that extracts skin color regions to limit the search area, and it may fail to detect faces whose bangs hang down to the eyebrows, because the pattern described above does not appear.

Accordingly, an object of the present invention is to provide a face position extraction device that can extract a face image from image information while suppressing the influence of lighting conditions, the person's hairstyle, and the like, together with a method therefor and a program for implementing the method on a computer.

Another object of the present invention is to provide a face position extraction device that can identify the between-the-eyes position of a face and track it in real time while suppressing the influence of lighting conditions, the person's hairstyle, and the like, together with a method therefor and a program for implementing the method on a computer.

According to one aspect of the present invention, a face position extraction method includes the steps of: preparing digital data of the value of each pixel in a target image region that includes a human face region; extracting the positions of between-the-eyes candidate points in the target image region by filtering with a between-the-eyes detection filter formed of six connected rectangles; and cutting out the target image at a predetermined size centered on the position of each extracted candidate point and selecting the true candidate point from among the between-the-eyes candidate points by pattern discrimination processing.

Preferably, the between-the-eyes detection filter is a single rectangle divided into six parts.

Preferably, the six rectangles include two first rectangles adjacent to each other in the vertical direction, two second rectangles that are offset from the first rectangles by a predetermined amount in the vertical direction and are adjacent to each other in the vertical direction, and two third rectangles that are offset from the second rectangles by a predetermined amount in the vertical direction and are adjacent to each other in the vertical direction.

Preferably, the step of selecting the true candidate point includes the steps of: detecting the eye positions by eye pattern discrimination processing on the parts of the target image that correspond to two predetermined rectangles among the rectangles forming the between-the-eyes detection filter at a candidate point; correcting the position of the candidate point to the midpoint of the two eyes based on the detected eye positions; rotating the input image about the corrected candidate point position so that the two eyes are level; and, for the rotated input image, cutting out the target image at a predetermined size centered on the corrected candidate point position and selecting the true candidate point from among the between-the-eyes candidate points by pattern discrimination processing.
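The correction and rotation steps can be illustrated with a minimal sketch. It assumes the two eye positions are already available as (x, y) pixel coordinates; the function names `align_from_eyes` and `rotate_point` and the angle convention are illustrative assumptions, not taken from the patent.

```python
import math

def align_from_eyes(eye_a, eye_b):
    """Return (midpoint, tilt_deg) for two eye positions given as (x, y).

    midpoint is the corrected between-the-eyes position; tilt_deg is the
    angle of the line through the eyes relative to the x axis.
    """
    (x1, y1), (x2, y2) = eye_a, eye_b
    midpoint = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    tilt_deg = math.degrees(math.atan2(y2 - y1, x2 - x1))
    return midpoint, tilt_deg

def rotate_point(p, center, angle_deg):
    """Rotate point p about center by angle_deg in the (x, y) plane."""
    a = math.radians(angle_deg)
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (center[0] + dx * math.cos(a) - dy * math.sin(a),
            center[1] + dx * math.sin(a) + dy * math.cos(a))

# Rotating by -tilt_deg about the midpoint puts both eyes on one horizontal line.
mid, tilt = align_from_eyes((10, 20), (30, 40))
leveled = rotate_point((30, 40), mid, -tilt)
```

Applying the same rotation to every pixel coordinate (or its inverse when resampling) would level the eyes in the whole image; how the patent's implementation resamples the image is not specified here.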

Preferably, the step of preparing digital data includes preparing the target image as a stereo image, and the step of selecting the true candidate point includes selecting the true candidate point from among the between-the-eyes candidate points according to the distance from the observation point to each candidate point, as detected from the stereo image.

According to another aspect of the present invention, a program causes a computer to execute a method of extracting a face position within a target image region, the program including the steps of: preparing digital data of the value of each pixel in a target image region that includes a human face region; extracting the positions of between-the-eyes candidate points in the target image region by filtering with a between-the-eyes detection filter formed of six connected rectangles; and cutting out the target image at a predetermined size centered on the position of each extracted candidate point and selecting the true candidate point from among the between-the-eyes candidate points by pattern discrimination processing.

Preferably, the between-the-eyes detection filter is a single rectangle divided into six parts.

According to yet another aspect of the present invention, a face position extraction device includes: imaging means for preparing digital data of the value of each pixel in a target image region that includes a human face region; means for extracting the positions of between-the-eyes candidate points in the target image region by filtering with a between-the-eyes detection filter formed of six connected rectangles; and selection means for cutting out the target image at a predetermined size centered on the position of each extracted candidate point and selecting the true candidate point from among the between-the-eyes candidate points by pattern discrimination processing.

Preferably, the selection means includes: means for detecting the eye positions by eye pattern discrimination processing on the parts of the target image that correspond to two predetermined rectangles among the rectangles forming the between-the-eyes detection filter at a candidate point; means for correcting the position of the candidate point to the midpoint of the two eyes based on the detected eye positions; means for rotating the input image about the corrected candidate point position so that the two eyes are level; and means for cutting out, from the rotated input image, the target image at a predetermined size centered on the corrected candidate point position and selecting the true candidate point from among the between-the-eyes candidate points by pattern discrimination processing.

Preferably, the imaging means includes means for preparing the target image as a stereo image, and the selection means includes means for selecting the true candidate point from among the between-the-eyes candidate points according to the distance from the observation point to each candidate point, as detected from the stereo image.

As described above, the present invention makes it possible to detect the position of a person's face, in particular the between-the-eyes position or the eye positions, from successive frames in real time.

[Embodiment 1]
[Hardware configuration]
A face position extraction device according to Embodiment 1 of the present invention is described below. The device is implemented by software running on a computer such as a personal computer or workstation. It extracts a person's face from a target image and then detects the between-the-eyes position and the eye positions from the image of the face. FIG. 1 shows the external appearance of the device.

Referring to FIG. 1, the system 20 includes a computer main body 40 equipped with a CD-ROM (Compact Disc Read-Only Memory) drive 50 and an FD (Flexible Disk) drive 52; a display 42 connected to the computer main body 40 as a display device; a keyboard 46 and a mouse 48 likewise connected to the computer main body 40 as input devices; and a camera 30 connected to the computer main body 40 for capturing images. In this embodiment, the camera 30 is a video camera containing a CCD (a solid-state image sensor), and the device detects the between-the-eyes position or the eye positions of the person who is in front of the camera 30 operating the system 20.

That is, the camera 30 prepares digital data of the value of each pixel in the target image region of an image that includes a human face region.

FIG. 2 shows the configuration of the system 20 in block diagram form. As shown in FIG. 2, the computer main body 40 of the system 20 includes, in addition to the CD-ROM drive 50 and the FD drive 52, a CPU (Central Processing Unit) 56, a ROM (Read Only Memory) 58, a RAM (Random Access Memory) 60, a hard disk 54, and an image capture device 68 for capturing images from the camera 30, each connected to a bus 66. A CD-ROM 62 is loaded in the CD-ROM drive 50, and an FD 64 is loaded in the FD drive 52.

As already stated, the main part of this face position extraction device is implemented by computer hardware and software executed by the CPU 56. Such software is generally distributed on a storage medium such as the CD-ROM 62 or the FD 64, read from the medium by the CD-ROM drive 50, the FD drive 52, or the like, and stored on the hard disk 54. Alternatively, if the device is connected to a network, the software is first copied from a server on the network to the hard disk 54. It is then read from the hard disk 54 into the RAM 60 and executed by the CPU 56. When the device is connected to a network, the software may also be loaded directly into the RAM 60 and executed without being stored on the hard disk 54.

The computer hardware shown in FIGS. 1 and 2 and its operating principles are conventional. The most essential part of the present invention is therefore the software stored on a storage medium such as the FD 64 or the hard disk 54.

As a general trend in recent years, various program modules are provided as part of the computer's operating system, and application programs commonly proceed by calling these modules in a prescribed sequence as needed. In that case, the software that implements the face position extraction device does not itself include those modules, and the device is realized only when the software works together with the operating system on the computer. As long as a common platform is used, however, there is no need to distribute software that includes such modules; the software without those modules, the recording medium on which it is recorded (and the data signal when the software is distributed over a network) can be regarded as constituting the embodiment.

[Basic principle of face image extraction]
To summarize the procedure of the present invention: when processing video frames in which a face is captured continuously, the frame is scanned with a rectangular filter about as wide as the face and about half as tall. The rectangle is divided into six segments, for example 3 × 2, and the average brightness of each segment is computed. When the relative light-dark relationships among the segments satisfy a certain condition, the center of the rectangle is taken as a between-the-eyes candidate.

When contiguous pixels become between-the-eyes candidates, only the center of the frame enclosing them is kept as a candidate. The remaining candidates are then compared with a standard pattern, for example by template matching, so that false candidates obtained by the above procedure are discarded and the true between-the-eyes point is extracted.

The face detection procedure of the present invention is described in more detail below.

(Six-segmented rectangular filter)
FIG. 3 shows the rectangular filter divided 3 × 2 into six segments described above (hereinafter called the "six-segmented rectangular filter").

The six-segmented rectangular filter extracts the facial features that 1) the bridge of the nose is brighter than the two eye regions and 2) the eye regions are darker than the cheeks, and thereby finds the between-the-eyes position of the face. A rectangular frame i pixels wide and j pixels tall (i, j: natural numbers) is placed with its center at the point (x, y).

As shown in FIG. 3, this rectangular frame is divided into three equal parts horizontally and two equal parts vertically, giving six blocks S1 to S6.

FIG. 4 is a conceptual diagram of this six-segmented rectangular filter applied to a face image. FIG. 4(a) shows the shape of the filter, and FIG. 4(b) shows the filter placed over the eye regions and cheeks of a face image.

Because the bridge of the nose is usually narrower than the eye regions, it is preferable for the width w2 of blocks S2 and S5 to be narrower than the width w1 of blocks S1, S3, S4, and S6. Preferably, w2 can be half of w1. FIG. 5 is a conceptual diagram of the six-segmented rectangular filter configured in this way.

Embodiment 1 uses the six-segmented rectangular filter shown in FIG. 5.

The height h1 of blocks S1, S2, and S3 and the height h2 of blocks S4, S5, and S6 also need not be the same. In the following description, however, h1 and h2 are assumed to be equal.
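The block geometry described above can be sketched as follows. The sketch assumes that S1, S2, S3 form the upper row from left to right and S4, S5, S6 the lower row, which is consistent with the widths and heights given in the text but is not confirmed without FIG. 5; the function name `six_blocks` and the (left, top, width, height) tuples are illustrative.

```python
def six_blocks(x0, y0, w1, w2, h1, h2):
    """Return {1..6: (left, top, width, height)} for blocks S1..S6.

    Assumed layout: upper row S1 S2 S3, lower row S4 S5 S6.
    S1, S3, S4, S6 have width w1; S2 and S5 have width w2.
    The upper row has height h1 and the lower row has height h2.
    """
    lefts = [x0, x0 + w1, x0 + w1 + w2]    # left edge of each column
    widths = [w1, w2, w1]                  # width of each column
    rows = [(y0, h1), (y0 + h1, h2)]       # (top, height) of each row
    blocks = {}
    k = 1
    for top, height in rows:
        for left, width in zip(lefts, widths):
            blocks[k] = (left, top, width, height)
            k += 1
    return blocks

# Example with w2 equal to half of w1 and h1 equal to h2.
blocks = six_blocks(0, 0, w1=8, w2=4, h1=5, h2=5)
```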

In the six-segmented rectangular filter shown in FIG. 5, the average pixel luminance of each block Si (1 ≤ i ≤ 6), written "bar Si" (Si with an overbar), is computed.

Assuming that one eye and eyebrow lie in block S1 and the other eye and eyebrow lie in block S3, the following relational expression (1) holds.
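Expression (1) itself is not reproduced in this text. Judging from the two facial features the filter is built on (the bridge of the nose is brighter than the eye regions, and the eye regions are darker than the cheeks), and assuming S2 lies between S1 and S3 with S4 and S6 below S1 and S3, it presumably takes the following form, where $\bar{S}_i$ is the mean luminance of block $S_i$. This is a reconstruction under those assumptions, not the original expression.

```latex
\bar{S}_1 < \bar{S}_2 \quad\text{and}\quad \bar{S}_1 < \bar{S}_4, \qquad
\bar{S}_3 < \bar{S}_2 \quad\text{and}\quad \bar{S}_3 < \bar{S}_6 \tag{1}
```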

FIG. 6 is a conceptual diagram of an image to be scanned with this six-segmented rectangular filter.

As shown in FIG. 6, the target image in which a face is to be detected consists of M × N pixels, M pixels horizontally and N pixels vertically. In principle, it would suffice to apply the six-segmented rectangular filter starting from the upper-left pixel (0, 0), shifting it one pixel at a time horizontally and vertically, and to check at each position whether relational expression (1) holds. However, computing the average luminance within each block from scratch every time the filter is shifted is inefficient.

The present invention therefore adopts, for computing the sum of the pixels within a rectangular frame, the speed-up technique based on the integral image disclosed in the known literature (P. Viola and M. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features," Proc. of IEEE Conf. CVPR, 1, pp. 511-518, 2001).

The "integral image" of an image i(x, y) is defined by the following expression (2).
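Expression (2) is missing from this text. In the cited Viola and Jones paper, the integral image at (x, y) is the sum of all pixels above and to the left of (x, y), inclusive, which would read as follows. The notation ii(x, y) matches the boundary condition used later in this description.

```latex
ii(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y') \tag{2}
```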

The integral image can be computed with the following recurrences.

Here s(x, y) denotes the cumulative row sum of pixels, with s(x, -1) = 0 and ii(-1, y) = 0. The important point is that the integral image can be obtained in a single pass over the image.
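The recurrences themselves are not reproduced in this text. A minimal sketch of the single-pass computation follows, assuming the pair of recurrences from the cited Viola and Jones paper, s(x, y) = s(x, y-1) + i(x, y) and ii(x, y) = ii(x-1, y) + s(x, y), which agree with the boundary conditions stated above. The function name and the list-of-rows image layout (`img[y][x]`) are illustrative choices.

```python
def integral_image(img):
    """Return ii with ii[y][x] = sum of img[y'][x'] for all x' <= x, y' <= y."""
    height = len(img)
    width = len(img[0]) if height else 0
    s = [[0] * width for _ in range(height)]
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # s(x, y) = s(x, y-1) + i(x, y), with s(x, -1) = 0
            s[y][x] = (s[y - 1][x] if y > 0 else 0) + img[y][x]
            # ii(x, y) = ii(x-1, y) + s(x, y), with ii(-1, y) = 0
            ii[y][x] = (ii[y][x - 1] if x > 0 else 0) + s[y][x]
    return ii

ii = integral_image([[1, 2, 3],
                     [4, 5, 6]])
# ii is [[1, 3, 6], [5, 12, 21]]
```

Each pixel is visited once, so the cost is proportional to the number of pixels in the image.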

With the integral image, the sum of the luminance values of the pixels in a rectangular region is easy to obtain. FIG. 7 shows a rectangular region whose sum is obtained from the integral image.

Using the integral image, the sum Sr of the luminance of the pixels within the frame of rectangle D shown in FIG. 7 can be obtained from the values at four points, as follows.
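The four-point expression is not reproduced in this text. The sketch below shows the standard four-lookup rectangle sum on an integral image; the zero-padded table layout, the function names, and the (left, top, width, height) parameterization are assumptions made for illustration and need not match the notation of FIG. 7.

```python
def padded_integral(img):
    """Return a (H+1) x (W+1) table P with P[y][x] = sum of img rows < y, cols < x."""
    h, w = len(img), len(img[0])
    P = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            P[y + 1][x + 1] = P[y][x + 1] + row_sum
    return P

def rect_sum(P, left, top, width, height):
    """Sum of the pixels inside the rectangle, from four table lookups."""
    right, bottom = left + width, top + height
    return P[bottom][right] - P[top][right] - P[bottom][left] + P[top][left]

def rect_mean(P, left, top, width, height):
    """Mean pixel value inside the rectangle."""
    return rect_sum(P, left, top, width, height) / float(width * height)

P = padded_integral([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9]])
total = rect_sum(P, 1, 1, 2, 2)   # 5 + 6 + 8 + 9
```

The extra row and column of zeros avoid special cases at the image border; the cost of each rectangle sum is independent of the rectangle's size.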

Because the integral image yields the sum of the pixel luminance values in a rectangular region, and hence their average, at high speed, the six-segmented rectangular filter can be evaluated quickly.

(Extraction of between-the-eyes candidate points)
The following describes how between-the-eyes candidate points are extracted with the six-segmented rectangular filter described above.

FIG. 8 is a flowchart of the process of extracting between-the-eyes candidate points.

Referring to FIG. 8, the variables m and n are first initialized to m = 0 and n = 0 (step S100).

Next, the upper-left corner of the six-segmented filter is aligned with pixel (m, n) of the image (step S102). The average density "bar Si" of the pixels in the region of each block Si is then computed (step S104).

次に、平均濃度バーＳｉの値の大小が、式（１）による眉間候補条件を満たすがどうかテストする（ステップＳ１０６）。 Next, it is tested whether or not the value of the average density bar Si satisfies the eyebrows candidate condition according to the equation (1) (step S106).

テスト条件を満たす場合は（ステップＳ１０８）、フィルタの中心点に相当する（ｍ＋ｉ/２，ｎ＋ｊ/２）の位置の画素に眉間候補マークをつける（ステップＳ１１０）。一方、テスト条件を満たさない場合は（ステップＳ１０８）、処理はステップＳ１１２に移行する。 If the test condition is satisfied (step S108), a pixel at the (m + i / 2, n + j / 2) position corresponding to the center point of the filter is marked with a forehead candidate mark (step S110). On the other hand, when the test condition is not satisfied (step S108), the process proceeds to step S112.

ステップＳ１１２では、変数ｍの値が１だけインクリメントされる。次に、変数ｍの値が対象画像の中で横方向にフィルタが動ける範囲内であるかが判定される（ステップＳ１１４）。フィルタが動ける範囲内であるときは、処理はステップＳ１０２に復帰する。一方、フィルタが横方向に動ける限界になっているときは、変数ｍの値を０にリセットし、変数ｎの値を１だけインクリメントする（ステップＳ１１６）。 In step S112, the value of the variable m is incremented by one. Next, it is determined whether the value of the variable m is within the range in which the filter can move in the horizontal direction in the target image (step S114). If it is within the range in which the filter can move, the process returns to step S102. On the other hand, if the filter has reached the limit in which it can move in the horizontal direction, the value of the variable m is reset to 0, and the value of the variable n is incremented by 1 (step S116).

次に、変数ｎの値が対象画像の中で縦方向にフィルタが動ける範囲内であるかが判定される（ステップＳ１１８）。フィルタが動ける範囲内であるときは、処理はステップＳ１０２に復帰する。一方、フィルタが縦方向に動ける限界になっているときは、眉間候補マークのついて、画素の連結性を調べ、各連結要素ごとに連結要素の外形枠の中央の画素を眉間候補点とする（ステップＳ１２０）。ここで、「中央の画素」とは、特に限定されないが、たとえば、各連結要素の重心位置とすることができる。 Next, it is determined whether the value of the variable n is within the range in which the filter can move in the vertical direction in the target image (step S118). If it is within the range in which the filter can move, the process returns to step S102. On the other hand, when the filter has reached the limit in which it can move in the vertical direction, the connectivity of the pixels is checked for the eyebrow candidate mark, and the center pixel of the outer shape frame of the connected element is determined as the eyebrow candidate point for each connected element ( Step S120). Here, the “center pixel” is not particularly limited, but may be, for example, the position of the center of gravity of each connected element.
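The scan of steps S100 to S118 and the grouping of step S120 can be sketched as below. This is only an illustration under stated assumptions: `passes_test` is a hypothetical callback standing in for the test of Equation (1) on the block averages, 8-connectivity is assumed for the connected components, and the center of the bounding box is used as the representative pixel (the centroid would be an equally valid choice according to the text).

```python
from collections import deque


def mark_candidates(width, height, fw, fh, passes_test):
    """Slide an fw x fh filter over a width x height image (steps S100 to S118).

    passes_test(m, n) is a hypothetical callback that reports whether the
    candidate condition holds with the filter's upper-left corner at (m, n).
    Returns the set of marked pixels (filter center points).
    """
    marks = set()
    for n in range(height - fh + 1):       # vertical range in which the filter fits
        for m in range(width - fw + 1):    # horizontal range in which the filter fits
            if passes_test(m, n):
                marks.add((m + fw // 2, n + fh // 2))
    return marks


def candidate_points(marks):
    """Group marked pixels into connected components (step S120).

    Assumes 8-connectivity and returns, for each component, the center of
    its bounding box, sorted for reproducibility.
    """
    remaining, points = set(marks), []
    while remaining:
        seed = remaining.pop()
        queue, component = deque([seed]), [seed]
        while queue:
            x, y = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbor = (x + dx, y + dy)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        queue.append(neighbor)
                        component.append(neighbor)
        xs = [p[0] for p in component]
        ys = [p[1] for p in component]
        points.append(((min(xs) + max(xs)) // 2, (min(ys) + max(ys)) // 2))
    return sorted(points)
```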

FIG. 9 shows the result of extracting between-the-eyes candidate points by the above process.

FIG. 9(a) shows the shape and size of the six-segment rectangular filter that was applied, and FIG. 9(b) shows the connected components bearing between-the-eyes candidate marks as hatched regions.

As to what size of six-segment rectangular filter to apply to a given target image: if the size of the face image in the target image is known in advance, the filter size can be set to match it. Alternatively, six-segment rectangular filters of several sizes may be prepared in advance, corresponding to the face sizes that occur when a person is within the range to be photographed (the range of distances from the camera 30). When a face is detected for the very first time, filters of different sizes are selected from this set and applied in turn, and the one giving the highest degree of fit in the face detection described below is chosen.

(Extraction of eye candidate points and of the true between-the-eyes candidate point)
The between-the-eyes candidate points extracted as above include false candidate points in addition to the true one. The true between-the-eyes candidate point is therefore extracted by the procedure described below.

First, candidate points for the eye positions are extracted based on the information on the between-the-eyes candidate points.

For this purpose, a plurality of eye images are extracted from a face image database and their average image is obtained.

FIG. 10 shows the right-eye template obtained in this way. The left-eye template is obtained by flipping the right-eye template horizontally.

Template matching with the right-eye and left-eye templates is performed in the regions of blocks S1 and S3 of the six-segment rectangular filter shown in FIG. 3, centered on the between-the-eyes candidate point, to extract candidate points for the right eye and the left eye.

FIG. 11 is a flowchart illustrating the process of extracting the eye candidate points in this way and then extracting the true between-the-eyes candidate point.

Referring to FIG. 11, in each of the regions of blocks S1 and S3 of the between-the-eyes candidate extraction filter, the point that best matches the eye template is first searched for, and the two points found are taken as the candidate points for the left and right eyes (step S200).

Next, the position of the between-the-eyes candidate point is corrected to the midpoint of the left and right eye candidate points (step S202). The input image is then rotated about the corrected candidate position so that the left and right eye candidate points are aligned horizontally (step S204).
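The geometry of steps S202 and S204 can be illustrated with a short sketch. The function names are hypothetical, points are assumed to be (x, y) tuples in image coordinates, and only the coordinate transform is shown; resampling the image itself is left out.

```python
import math


def corrected_midpoint(eye_a, eye_b):
    """Midpoint of the two eye candidate points (step S202)."""
    return ((eye_a[0] + eye_b[0]) / 2.0, (eye_a[1] + eye_b[1]) / 2.0)


def tilt_angle_deg(eye_a, eye_b):
    """Angle of the line through the eyes relative to the horizontal axis."""
    (x1, y1), (x2, y2) = sorted([eye_a, eye_b])  # order the eyes by x coordinate
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def rotate_about(point, center, angle_deg):
    """Rotate a point about a center by angle_deg (step S204, per point)."""
    a = math.radians(angle_deg)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (center[0] + dx * math.cos(a) - dy * math.sin(a),
            center[1] + dx * math.sin(a) + dy * math.cos(a))
```

Rotating by the negative of `tilt_angle_deg` about the corrected midpoint places both eye candidate points on the same horizontal line.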

The similarity between the pattern centered on the corrected between-the-eyes candidate point after rotation and a between-the-eyes template, formed in advance by a procedure described later, is calculated (step S206).

It is determined whether the similarity is equal to or greater than a predetermined threshold (step S208). If it is, the point is taken as a true between-the-eyes candidate point (step S210). If it is below the threshold, the point is taken as a false between-the-eyes candidate point (step S212).

This processing is performed for all between-the-eyes candidate points.

FIG. 12 illustrates the extraction of the eye candidate points in step S200 of FIG. 11.

In FIG. 12, the white circle is the between-the-eyes candidate point before correction, and the white crosses indicate the eye candidate points.

(Between-the-eyes template)
Next, the method of forming the between-the-eyes template used in step S206 of FIG. 11 is described.

FIG. 13 is a flowchart illustrating the procedure for forming the between-the-eyes template.

Referring to FIG. 13, a plurality of face image data are first prepared (step S300). An operator then inputs the positions of both eyes for each face image with a mouse or the like (step S302).

The following steps are carried out inside the computer. The image is rotated about the midpoint of the two eyes so that the eyes are level, which normalizes the orientation (step S304). The image is enlarged or reduced so that the distance between the eyes equals a predetermined distance, which normalizes the size (step S306). A between-the-eyes pattern of i × j pixels centered on the midpoint of the two eyes is then extracted (step S308).

The density of the extracted between-the-eyes pattern is then converted so that its mean becomes a predetermined value, for example zero, and its variance becomes another predetermined value, for example 1.0, which normalizes the density (step S310).

The average of the many normalized between-the-eyes patterns is calculated (step S312), and the resulting average pattern is used as the between-the-eyes template (step S314).

In the present invention, however, the between-the-eyes template obtained in step S314 is processed further as follows.

For a person whose hair hangs down to the eyebrows, the forehead has low luminance values, whereas the average template has high density values there; if matching is evaluated as is, the matching score becomes low. To avoid the influence of hairstyle, a predetermined number of pixel rows from the top, for example the 3 rows corresponding to the forehead, are therefore excluded from the evaluation. For example, if the between-the-eyes template obtained in step S314 is a 32 × 16 pixel pattern, template matching is ultimately performed with a 32 × 13 pixel pattern.
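Steps S310 to S314 and the removal of the forehead rows can be sketched as follows. This is a minimal illustration with hypothetical function names, assuming each pattern is a list of rows of density values and that all patterns have already been normalized in orientation and size.

```python
import math


def normalize(pattern):
    """Convert a 2-D pattern to mean 0 and variance 1.0 (step S310)."""
    values = [v for row in pattern for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard: a flat pattern has zero variance
    return [[(v - mean) / std for v in row] for row in pattern]


def average_pattern(patterns):
    """Pixel-wise mean of equally sized patterns (steps S312 and S314)."""
    count = len(patterns)
    rows, cols = len(patterns[0]), len(patterns[0][0])
    return [[sum(p[r][c] for p in patterns) / count for c in range(cols)]
            for r in range(rows)]


def drop_forehead_rows(template, skip_rows=3):
    """Exclude the top rows (the forehead) from the template."""
    return template[skip_rows:]
```

With a template of 16 rows by 32 columns and `skip_rows=3`, the result has 13 rows, which matches the 32 × 13 example in the text.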

FIG. 14 illustrates the between-the-eyes template.

FIG. 14(a) shows the between-the-eyes template obtained in step S314 of FIG. 13, and FIG. 14(b) shows the final between-the-eyes template, from which the influence of the forehead has been removed.

Because the way the face is lit can differ with its orientation, template matching may also be evaluated independently for the left and right sides. In that case, the between-the-eyes template described above is split into left and right halves and template matching is performed on each. For example, with a template of the size given above, matching may be performed with a 16 × 13 pixel pattern for each side.

Next, the template matching of step S206 in FIG. 11 is described in more detail.

FIG. 15 is a flowchart illustrating the template matching procedure of step S206.

Referring to FIG. 15, a between-the-eyes candidate point is first extracted (step S400); if necessary, the image is rotated about the candidate point and its scale is corrected (step S402).

Next, an image of the same size as the template is cut out, centered on the between-the-eyes candidate point (step S404). The correlation value between the cut-out candidate pattern and the between-the-eyes template is calculated and used as the similarity (step S406).

The similarity may also be calculated by normalizing the density of the cut-out candidate pattern (mean zero, variance 1.0), computing for each pixel the square of the difference from the corresponding template pixel, and summing the results. In this case the sum can be regarded as a dissimilarity, so the similarity may be evaluated by its reciprocal.

FIG. 16 shows an example in which the between-the-eyes position and the eye positions have been extracted from a target image in this way.

Even though the subject is wearing a hat and covering the mouth with a hand, the between-the-eyes position (the center of the rectangular frame in the figure) and the eye positions (crosses) are detected well.

In the present invention according to Embodiment 1, between-the-eyes candidate points are first extracted from gray-level information with the six-segment rectangular filter, and the eye positions are then finally specified. Face position extraction is therefore robust to changes in lighting conditions and fast.

Furthermore, if the above processing is performed on each frame of a captured video, the face image can also be tracked in the moving image.

In that case, the region to be filtered in the current frame can also be narrowed down based on information from the previous frame, in which the face image has already been detected.

In the above description, the filter used to search for between-the-eyes candidate points is a six-segment rectangular filter, that is, a rectangle divided 3 × 2 into six segments.

However, so that face images tilted from the horizontal can also be handled, the shape of the filter is not limited to those shown in FIG. 3 and FIG. 5.

FIG. 17 and FIG. 18 illustrate other such filter shapes.

That is, as shown in FIG. 17 and FIG. 18, blocks S1 and S4 and blocks S3 and S6 may be shifted vertically by a predetermined amount, in mutually opposite directions, relative to blocks S2 and S5 in FIG. 1.

In this case, between-the-eyes candidate points can be extracted well even when the face image is tilted by the angle corresponding to the amount of the shift.

In this specification, filters of the shapes shown in FIG. 3 and FIG. 5 (six-segment rectangular filters) and filters such as those shown in FIG. 17 and FIG. 18 are collectively called "between-the-eyes detection filters".

[Embodiment 2]
As described with reference to FIG. 11 for Embodiment 1, extracting the true candidate point from the between-the-eyes candidate points generally requires correcting the position of the candidate point, rotating the input image, and so on. When the person in the image moves relatively little, as in a video conference, the extraction of the true candidate point can be simplified.

FIG. 19 is a flowchart illustrating the process of extracting the true between-the-eyes candidate point in the face position extraction device of Embodiment 2.

Referring to FIG. 19, the similarity between the pattern centered on a between-the-eyes candidate point in the input image and a between-the-eyes template formed in advance is first calculated (step S500).

It is determined whether the similarity is equal to or greater than a predetermined threshold (step S502). If it is, the point is taken as a true between-the-eyes candidate point (step S504). If it is below the threshold, the point is taken as a false between-the-eyes candidate point (step S506).

The other processes and configurations are the same as those of the face position extraction device of Embodiment 1, and their description is not repeated.

Even with this configuration, the same effects as in Embodiment 1 are obtained when changes in the person's position relative to the camera 30 and in the orientation of the face are small.

[Embodiment 3]
In Embodiments 1 and 2, images are captured with a single camera 30.

In contrast, if two cameras 30, for example, are used in a binocular stereo configuration, information on the distance to the person can also be obtained.

In Embodiment 3 as well, the true face candidate point can in principle be detected from the candidate points extracted by the six-segment rectangular filter in the same way as in Embodiment 1 or 2.

In the face position extraction device of Embodiment 3, however, the cameras 30 are arranged in a binocular stereo configuration and the size at which the face candidate region is cut out is switched according to the distance information, in order to widen the range of detectable face sizes.

By switching the cut-out size of the face candidate region, the region can be scaled to the same face size as the average face template before matching, which widens the range of face sizes that can be detected.

In Embodiment 3, as stated above, a binocular stereo configuration is used and parallax information is obtained for each candidate point. Since the size of the face is considered to be inversely proportional to the parallax, the size at which the candidate point is cut out is determined from the parallax information. The face candidate region can therefore be cut out at the optimal size and matched against the template.

The following description includes an evaluation on a face image database containing 400 images in total, 10 images for each of 40 people, with slight variations in facial expression, face orientation, lighting conditions, and so on.

The face images in this database are monochrome images of 92 × 112 pixels. The rectangle size was based on the number of pixels between the left and right temples for the width and the number of pixels from the eyebrows to the tip of the nose for the height. From manual measurement, the reference rectangle size for the face images (92 × 112) was set to 60 × 30.

FIG. 20 shows over what range of sizes the six-segment rectangular filter can detect between-the-eyes candidate points for the same face images.

In FIG. 20, between-the-eyes extraction is performed while the rectangle size is varied from the reference size in steps of 20%. The experiment examined the extraction rate of true candidate points and the number of candidate points. Whether the candidate points included the true one was judged visually, by checking whether a candidate point existed near the between-the-eyes position.

According to FIG. 20, the extraction rate at the reference rectangle size (60 × 30) is 92.0%, so the filter is considered to work effectively. When the rectangle size is 84 × 42, on the other hand, the extraction rate is very poor; the rectangle is presumably too large to capture the facial features.

FIG. 20 confirms that between-the-eyes candidate points can be extracted with rectangles 0.6 to 1.2 times the reference rectangle size. Face size and rectangle size are considered to be simply proportional. A rectangular filter of a given size is therefore considered able to extract between-the-eyes candidate points from faces 0.83 to 1.67 times the reference face size.
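The range of 0.83 to 1.67 follows from inverting the range of 0.6 to 1.2, under the proportionality between face size and rectangle size assumed in the text. The symbol r below is introduced here only for illustration: if a rectangle scaled by r relative to the reference detects a face of the reference size, then a rectangle fixed at the reference size detects a face scaled by 1/r.

```latex
\[
0.6 \le r \le 1.2
\quad\Longrightarrow\quad
\frac{1}{1.2} \le \frac{1}{r} \le \frac{1}{0.6},
\qquad
\frac{1}{1.2} \approx 0.83, \quad \frac{1}{0.6} \approx 1.67 .
\]
```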

Next, to obtain the relationship between the person's distance and the size of the face candidate region to be cut out, a person's face is photographed with the camera configuration used in the face position extraction device, and, while the distance from the cameras 30 is varied, the parallax at the between-the-eyes position and the optimal face cut-out size for that face are measured.

For example, the parallax is obtained by manually measuring the difference, in pixels in the horizontal direction, between the between-the-eyes positions of the person as seen by the left and right cameras 30. The face cut-out size is obtained by manually measuring the number of pixels between the left and right temples. Although not particularly limited, the vertical size of the six-segment rectangular filter can be set to half its horizontal size.

FIG. 21 shows the relationship between the parallax and the optimal face cut-out size.

Based on FIG. 21, the size of the six-segment rectangular filter, the size at which a face candidate is cut out, and the relationship between the parallax and the cut-out size are determined.

FIG. 22 shows the relationship among the six-segment rectangular filter size, the parallax, and the size at which a candidate point is cut out, as set on the basis of FIG. 21. Using the fact that a six-segment rectangular filter of a given size can extract face candidate regions whose cut-out size spans a range of 0.83 to 1.67 times, the filter sizes were set so that two sizes, for example 40 × 20 and 24 × 12, cover the whole range. The cut-out size of the face candidate region was set to switch every 5 pixels of parallax. Setting the cut-out size more finely would presumably improve accuracy, but because matching against the average face template tolerates some variation in size, switching at this granularity is sufficient. In FIG. 22, for example, when the rectangular filter size is 40 × 20 and stereo matching gives a parallax of 20, the candidate point is cut out at a size of 48 × 24.

If a parallax that is not covered by this table appears, or if no match is found anywhere, the candidate point is discarded as a false candidate point.

By the above processing, the face position extraction device of Embodiment 3 can extract between-the-eyes candidate points from a target image.

FIG. 23 is a flowchart illustrating the process of extracting the true between-the-eyes candidate point in the face position extraction device of Embodiment 3.

Referring to FIG. 23, the distance of the candidate point from the cameras 30 is first estimated by the binocular stereo method (step S600).

Next, it is determined whether the distance is within a predetermined range (step S602). If the between-the-eyes candidate point does not lie within the predetermined range, it is judged to be a false candidate point (step S612).

If the distance is within the predetermined range, one of the between-the-eyes templates of different sizes prepared in advance is selected according to the distance (step S604).

The similarity between the pattern centered on the between-the-eyes candidate point in the input image and the selected template is calculated (step S606).

It is determined whether the similarity is equal to or greater than a predetermined threshold (step S608). If it is, the point is taken as a true between-the-eyes candidate point (step S610). If it is below the threshold, the point is taken as a false between-the-eyes candidate point (step S612).
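The decision flow of steps S602 to S612 for a single candidate point can be sketched as below. This is an illustration only: the function name, the layout of `templates_by_range`, and the `similarity` callback are hypothetical, and the distance ranges in any call are placeholders rather than values from the embodiment.

```python
def is_true_candidate(distance, templates_by_range, similarity, threshold):
    """Decide whether one candidate is a true between-the-eyes point.

    templates_by_range: list of (min_distance, max_distance, template)
        tuples (a hypothetical layout; the source does not specify one).
    similarity: callback that scores the pattern around the candidate
        against the given template.
    """
    for min_d, max_d, template in templates_by_range:
        if min_d <= distance < max_d:                 # S602, then S604
            return similarity(template) >= threshold  # S606 to S610 (or S612)
    return False                                      # distance out of range: S612
```

A candidate whose estimated distance falls outside every range is rejected without any template matching, which is where the saving in computation comes from.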

With this configuration, the true candidate point is extracted with the person's distance from the cameras 30 also taken into account, so the position of the face image can be detected at higher speed. By applying the processing of Embodiment 3 to each frame of a moving image, the face image can also be tracked.

In Embodiment 3 as well, as described with reference to FIG. 11 for Embodiment 1, the eye positions may be detected and the between-the-eyes candidate position corrected, the input image rotated, and so on, when the true candidate point is extracted from the between-the-eyes candidate points.

[Embodiment 4]
In Embodiment 3, a between-the-eyes template is selected, according to the distance of the candidate point from the cameras 30, from templates of different sizes prepared in advance.

It is also possible, however, to reduce (or enlarge) the input image according to the distance of the between-the-eyes candidate point from the cameras 30 so that it fits the size of a reference between-the-eyes template, and then perform template matching.

FIG. 24 is a flowchart illustrating the process of extracting the true between-the-eyes candidate point in the face position extraction device of Embodiment 4.

Referring to FIG. 24, the distance of the candidate point from the cameras 30 is first estimated by the binocular stereo method (step S700).

Next, it is determined whether the distance is within a predetermined range (step S702). If the between-the-eyes candidate point does not lie within the predetermined range, it is judged to be a false candidate point (step S712).

If the distance is within the predetermined range, the input image is reduced according to the distance so that the between-the-eyes image fits the template size (step S704).

The similarity between the reduced pattern centered on the candidate point of the input image and the between-the-eyes template is calculated (step S706).

It is determined whether the similarity is equal to or greater than a predetermined threshold (step S708). If it is, the point is taken as a true between-the-eyes candidate point (step S710). If it is below the threshold, the point is taken as a false between-the-eyes candidate point (step S712).

The other processes and configurations are the same as those of the face position extraction device of Embodiment 3, and their description is not repeated.

With this configuration, the true candidate point is extracted with the person's distance from the cameras 30 also taken into account, so the position of the face image can be detected at higher speed. By applying the processing of Embodiment 4 to each frame of a moving image, the face image can also be tracked.

In Embodiment 4 as well, as described with reference to FIG. 11 for Embodiment 1, the eye positions may be detected and the between-the-eyes candidate position corrected, the input image rotated, and so on, when the true candidate point is extracted from the between-the-eyes candidate points.

With the processing of each of the embodiments described above, the between-the-eyes position or the eye positions can be detected in real time from screen information that is continuous at predetermined intervals along the time axis, for example from successive frame images. By detecting the between-the-eyes position or the eye positions in each piece of such successive screen information, those positions can be tracked.

[Modifications of the process of selecting the true between-the-eyes point from the candidate points]
In the embodiments described above, face position extraction consists of extracting between-the-eyes candidate points from the image with the between-the-eyes detection filter and then selecting the true between-the-eyes point from the candidates.

この「真の眉間を検出する処理」は、言い換えると、複数の眉間候補点から、真の眉間に相当する候補点を選択するためのパターン判別処理を行なっていることに相当する。上述した実施の形態では、「眉間テンプレートとの類似度」に基づいて、パターン判別処理を行なうものとしたが、パターン判別の方法としては、必ずしもこのような方法に限定されるものではない。 In other words, this “process of detecting the true between-the-eyes point” amounts to pattern discrimination processing that selects, from a plurality of between-the-eyes candidate points, the candidate point corresponding to the true between-the-eyes point. In the embodiments described above, the pattern discrimination is based on the “similarity to a between-the-eyes template”, but the pattern discrimination method is not necessarily limited to this.

以下では、「眉間テンプレートとの類似度」によるパターン判別処理も含めて、このようなパターン判別処理として可能な変形例について説明する。 The following describes possible modifications of this pattern discrimination processing, including the one based on “similarity to a between-the-eyes template”.

（１）パターンテンプレートとの類似度によるパターン判別処理 (1) Pattern discrimination based on similarity to a pattern template
テンプレートをｔ＝｛ｔ_ij｝、評価されるパターンをｆ＝｛ｆ_ij｝とすると、単純な類似度評価値（ｑ）としては、以下の式（５）のような各対応画素値の差の絶対値の総和がある。 Let the template be t = {t_ij} and the pattern to be evaluated be f = {f_ij}. A simple similarity evaluation value q is the sum of the absolute differences between corresponding pixel values, as in equation (5) below.

あるいは、以下の式（６）のような差の絶対値の２乗和を用いることもできる。 Alternatively, the sum of squares of the absolute value of the difference as in the following equation (6) can be used.

式（５）または（６）を用いる場合は値が小さいほど類似度が高いと判断することになる。 When equation (5) or (6) is used, it is determined that the smaller the value is, the higher the similarity is.
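
The equation images for (5) and (6) are not reproduced in this text. From the description (template elements t_ij, evaluated pattern elements f_ij, a sum of absolute differences and a sum of squared differences), they presumably have the following standard forms. This is a reconstruction, not the patent's own typesetting:

```latex
\begin{align}
q &= \sum_{i,j} \left| t_{ij} - f_{ij} \right| \tag{5} \\
q &= \sum_{i,j} \left( t_{ij} - f_{ij} \right)^{2} \tag{6}
\end{align}
```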

一方、他の評価値としては次式（７）で表わされる正規化相関値を用いることもできる。 On the other hand, a normalized correlation value represented by the following equation (7) can be used as another evaluation value.

式（７）において、｛ｔ_ij｝と、｛ｆ_ij｝とが完全に一致していれば、ｑの値は１であり、完全に反転パターン（明暗が逆）ならばｑの値は−１になる。それ以外の場合は、ｑの値は１と−１の間の値となる。式（７）を用いるときは、ｑの値が大きいほど類似度は高いという評価になる。 In equation (7), q = 1 if {t_ij} and {f_ij} match completely, and q = −1 if they are exactly inverted patterns (light and dark reversed). Otherwise q takes a value between −1 and 1. When equation (7) is used, a larger q means a higher similarity.
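
The image for equation (7) is likewise missing. The properties stated here (q = 1 for identical patterns, q = −1 for inverted ones, differences taken from the mean, a normalizing denominator) match the usual normalized correlation coefficient, so the equation is presumably of the following form, where the bars denote the mean pixel value of each pattern (the bar notation is an assumption):

```latex
\begin{equation}
q = \frac{\sum_{i,j} \left( t_{ij} - \bar{t} \right)\left( f_{ij} - \bar{f} \right)}
         {\sqrt{\sum_{i,j} \left( t_{ij} - \bar{t} \right)^{2} \; \sum_{i,j} \left( f_{ij} - \bar{f} \right)^{2}}}
\tag{7}
\end{equation}
```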

正規化相関値は、平均値からの差で評価しているので、全体的に明るさレベルがシフトしてもその評価に影響がない。また、例えば照明が暗くなると明るさの平均値が下がるだけでなく、明暗の差も小さくなる。この場合でも、分母の正規化項のおかげでｑの値に影響がない。 Since the normalized correlation value is evaluated based on the difference from the average value, even if the brightness level is shifted as a whole, the evaluation is not affected. Further, for example, when the illumination becomes dark, not only does the average value of brightness decrease, but also the difference between light and dark decreases. Even in this case, the value of q is not affected by the normalization term of the denominator.
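
The three evaluation values can be compared with a small sketch. It assumes patterns are flattened into plain Python lists of pixel values; the function names `sad`, `ssd`, and `ncc` are illustrative and do not come from the patent.

```python
import math

def sad(t, f):
    """Sum of absolute differences, in the manner of equation (5)."""
    return sum(abs(a - b) for a, b in zip(t, f))

def ssd(t, f):
    """Sum of squared differences, in the manner of equation (6)."""
    return sum((a - b) ** 2 for a, b in zip(t, f))

def ncc(t, f):
    """Normalized correlation, in the manner of equation (7).

    Undefined (division by zero) when either pattern is perfectly flat.
    """
    mean_t = sum(t) / len(t)
    mean_f = sum(f) / len(f)
    num = sum((a - mean_t) * (b - mean_f) for a, b in zip(t, f))
    den = math.sqrt(sum((a - mean_t) ** 2 for a in t) *
                    sum((b - mean_f) ** 2 for b in f))
    return num / den

template = [10.0, 200.0, 30.0, 180.0]
# Same pattern under dimmer lighting: lower mean and lower contrast.
darker = [0.5 * v - 3.0 for v in template]
# Light and dark reversed.
inverted = [255.0 - v for v in template]

print(sad(template, darker), ssd(template, darker))      # both grow with the lighting change
print(ncc(template, darker), ncc(template, inverted))    # close to 1 and -1
```

The absolute and squared differences change when the lighting changes, while the normalized correlation stays at 1 for the dimmed pattern, which is the robustness the paragraph above describes.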

また、以下の式（８）で示されるように、テンプレートとして多くのサンプルパターン（ｓⁿ＝｛ｓⁿ_ij｝）の平均パターンを使うこともできる。 As shown in equation (8) below, the average of many sample patterns s^n = {s^n_ij} can also be used as the template.
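
The image for equation (8) is not reproduced. An average over sample patterns would be written as follows, where N, the number of samples, is a symbol assumed here:

```latex
\begin{equation}
t_{ij} = \frac{1}{N} \sum_{n=1}^{N} s^{n}_{ij} \tag{8}
\end{equation}
```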

この場合は、重み付きの類似度評価を行なうことができる。例えば、右目の右上部分や左目の左上部分は、人によっては前髪が下がっていたりいなかったりする。このため、その部分は、テンプレートと差があっても、あまり重要でないと考えられる。 In this case, weighted similarity evaluation can be performed. For example, the upper right part of the right eye and the upper left part of the left eye may or may not have bangs depending on the person. Therefore, even if there is a difference from the template, it is considered that the part is not so important.

そこで、多くのサンプルパターンがある場合は、以下の式（９）に示すように、各画素位置で明るさがどれくらいサンプル間でばらついているかを示す分散をまず計算する。 Therefore, when there are many sample patterns, the variance indicating how much the brightness varies between samples at each pixel position is first calculated as shown in the following equation (9).
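
The image for equation (9) is missing. A per-pixel variance over the samples, taken about the mean template t_ij, would read as below. N is an assumed symbol for the number of samples, and whether the patent divides by N or N − 1 cannot be determined from this text:

```latex
\begin{equation}
\sigma_{ij}^{2} = \frac{1}{N} \sum_{n=1}^{N} \left( s^{n}_{ij} - t_{ij} \right)^{2} \tag{9}
\end{equation}
```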

次に、その分散の逆数を重みづけに使って、以下の式（１０）に示すような評価値ｑを用いた重み付き類似度評価を行なうこともできる。 Next, using the reciprocal of the variance for weighting, weighted similarity evaluation using an evaluation value q as shown in the following equation (10) can be performed.
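
Equation (10) is not reproduced in this text; weighting by the reciprocal of the variance suggests a form such as q = Σ (t_ij − f_ij)² / σ_ij². The sketch below follows that assumed form. The function names, the toy three-pixel patterns, and the small `eps` guard against zero variance are all illustrative additions.

```python
def mean_template(samples):
    """Average pattern over all samples (per pixel)."""
    n = len(samples)
    return [sum(pixel) / n for pixel in zip(*samples)]

def pixel_variance(samples, template):
    """Per-pixel variance of the samples about the template."""
    n = len(samples)
    return [sum((s[k] - template[k]) ** 2 for s in samples) / n
            for k in range(len(template))]

def weighted_score(template, variance, pattern, eps=1e-6):
    """Squared differences weighted by the reciprocal of the variance.

    Smaller means more similar. eps avoids division by zero (an assumption).
    """
    return sum((t - f) ** 2 / (v + eps)
               for t, f, v in zip(template, pattern, variance))

# Pixel 0 varies a lot between people (e.g. covered by bangs or not);
# pixels 1 and 2 are stable across samples.
samples = [
    [0.0, 100.0, 50.0],
    [200.0, 102.0, 50.0],
    [100.0, 98.0, 52.0],
    [100.0, 100.0, 48.0],
]
template = mean_template(samples)
variance = pixel_variance(samples, template)

off_at_unstable_pixel = [150.0, 100.0, 50.0]  # differs by 50 where variance is large
off_at_stable_pixel = [100.0, 110.0, 50.0]    # differs by 10 where variance is small

score_a = weighted_score(template, variance, off_at_unstable_pixel)
score_b = weighted_score(template, variance, off_at_stable_pixel)
print(score_a, score_b)
```

A large deviation at the unstable pixel is penalized less than a small deviation at a stable pixel, which is the intended effect of the weighting.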

あるいは、「右目位置に対称な位置には左目があって同じように黒いはず」であり、「その中央は鼻筋で明るいはず」というように、互いの画素間にも関係があって、その関係がどれくらいばらついているかを表わす指標である共分散を考慮して重み付けを行なうことができる。なお、これに対して、式（９）は、自己分散の場合である。 Alternatively, pixels are also related to one another: for example, “the position symmetric to the right eye should contain the left eye and be similarly dark”, and “the center between them is the bridge of the nose and should be bright”. The weighting can take into account the covariance, an index of how much such relationships vary. Equation (9), by contrast, covers only the variance of each pixel by itself.

このような共分散を考慮して重みづけをした類似度は、「マハラノビス距離」と呼ばれる。 A similarity measure weighted by taking the covariance into account in this way is called the “Mahalanobis distance”.

すなわち、ｔ_ijを１列にならべてベクトルのように表わすとすると、マハラノビス距離ｑは、以下の式（１１）のように表わされる。 That is, if the t_ij are arranged in a single column and written as a vector, the Mahalanobis distance q is given by equation (11) below.
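
The image for equation (11) is missing. With t and f written as column vectors and Σ the covariance matrix of the samples, the Mahalanobis distance is conventionally written as below. Some texts take the square root of this quantity, and this text does not show which variant the patent uses:

```latex
\begin{equation}
q = \left( \mathbf{t} - \mathbf{f} \right)^{\mathsf{T}} \Sigma^{-1} \left( \mathbf{t} - \mathbf{f} \right) \tag{11}
\end{equation}
```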

ここでΣはｓⁿの共分散行列である。このマハラノビス距離ｑを用いても、パターンテンプレートとの類似度によるパターン判別処理を実施することができる。 Here Σ is the covariance matrix of the s^n. Pattern discrimination based on similarity to the pattern template can also be carried out using this Mahalanobis distance q.
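
A minimal sketch of a squared Mahalanobis distance in pure Python follows. It assumes the form (t − f)ᵀ Σ⁻¹ (t − f), a covariance normalized by the number of samples, and a non-singular Σ. For real image patches the number of samples would have to exceed the number of pixels, or Σ would need regularizing. All names are illustrative.

```python
def mean_vector(samples):
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def covariance(samples):
    """Covariance matrix of the samples (normalized by N, an assumption)."""
    n = len(samples)
    m = mean_vector(samples)
    dim = len(m)
    return [[sum((s[a] - m[a]) * (s[b] - m[b]) for s in samples) / n
             for b in range(dim)] for a in range(dim)]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def mahalanobis_sq(t, f, cov):
    """(t - f)^T cov^{-1} (t - f), computed without forming the inverse."""
    d = [a - b for a, b in zip(t, f)]
    return sum(di * xi for di, xi in zip(d, solve(cov, d)))

samples = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
t = mean_vector(samples)      # template = mean of the samples
cov = covariance(samples)     # identity matrix for this toy data
print(mahalanobis_sq(t, [3.0, 1.0], cov))
```

With the toy data the covariance is the identity matrix, so the result reduces to the squared Euclidean distance; correlated samples would change the weighting between components.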

（２）統計的パターン判別処理 (2) Statistical pattern discrimination
眉間検出フィルタで画像中から眉間の候補点を抽出し、候補点の中から真の眉間を選択する、という処理は、言い換えれば、眉間の候補点の中から、顔のパターンに対応するのか、あるいは、顔パターンではないのかを判別することで、真の眉間を抽出する、との手続きとみることもできる。 The process of extracting between-the-eyes candidate points from the image with the between-the-eyes detection filter and selecting the true between-the-eyes point from them can, in other words, be viewed as a procedure that extracts the true between-the-eyes point by determining, for each candidate point, whether it corresponds to a face pattern or not.

この場合、「顔」と「非顔」の判別処理には、統計的パターン判別処理を適用することができる。 In this case, a statistical pattern determination process can be applied to the “face” and “non-face” determination process.

すなわち、統計的パターン判別処理は、多数の「顔」と「非顔」のサンプルが与えられたときに、それらのデータを元に「不明」のパターンを「顔」か「非顔」に判別するものである。これに対して、上述した類似度計算では「非顔」という概念は、必要ない。 That is, given a large number of “face” and “non-face” samples, statistical pattern discrimination uses those data to classify an “unknown” pattern as “face” or “non-face”. The similarity calculation described above, by contrast, does not need the concept of “non-face”.

（２−１）線形判別法 (2-1) Linear discriminant method
パターンｆ＝｛ｆ_ij｝を、その画素値を一列にならべたＩ×Ｊ次元のベクトルと考えると、１パターンはＩ×Ｊ次元空間の１点と考えられる。 If a pattern f = {f_ij} is regarded as an I × J-dimensional vector whose elements are its pixel values arranged in a single column, one pattern can be regarded as one point in an I × J-dimensional space.

以下の説明では、３次元以上は平面上に図示しにくいので、２次元の場合を例にとって説明する。 In the following description, since it is difficult to show three or more dimensions on a plane, a two-dimensional case will be described as an example.

図２５は、「顔」のサンプルと「非顔」のサンプルの分布の一例を示す概念図である。 FIG. 25 is a conceptual diagram illustrating an example of the distribution of “face” samples and “non-face” samples.

図２５に示すように、「顔」のサンプル（○）と「非顔」のサンプル（×）が分布していたとすると、「顔」（○）と「非顔」（×）を分離する直線Ｌ１を予め求めておき、「不明」の入力パターンが直線Ｌ１のどちらにあるかで、「顔」（○）と「非顔」（×）かを判定することができる。 Suppose the “face” samples (○) and the “non-face” samples (×) are distributed as shown in FIG. 25. A straight line L1 separating “face” (○) from “non-face” (×) is obtained in advance, and an “unknown” input pattern is judged to be “face” (○) or “non-face” (×) according to which side of the line L1 it lies on.

２次元では直線ａｘ＋ｂｙになるが、３次元ではａｘ＋ｂｙ＋ｃｚで表現される平面になる。より一般に、さらに高次元では各次元要素の線形結合で表わされる超平面となる。このような超平面による判別を、「線形判別法」と呼ぶ。 In two dimensions, it is a straight line ax + by, but in three dimensions, it is a plane represented by ax + by + cz. More generally, in a higher dimension, there is a hyperplane represented by a linear combination of each dimensional element. Such a discrimination based on a hyperplane is referred to as a “linear discrimination method”.

一般には、一つの超平面で完全に「顔」（○）と「非顔」（×）を分離できるとはかぎらないものの、「顔」（○）の側に「非顔」（×）がくる誤りと、「非顔」（×）の側に「顔」（○）がくる誤りの合計が最小になるように超平面を決定しておく。 In general, it is not always possible to completely separate “face” (○) and “non-face” (×) in one hyperplane, but “non-face” (×) is placed on the side of “face” (○). The hyperplane is determined so that the sum of the error that comes and the error that causes a “face” (○) on the side of the “non-face” (×) is minimized.
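
The decision rule and the two kinds of error can be sketched as follows. The sketch assumes a hyperplane written as w·x + b = 0 with the “face” side being positive, and it only compares two hand-picked candidate lines rather than optimizing. The names and the toy 2-D data are illustrative.

```python
def signed_side(w, b, x):
    """Value of w.x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    return "face" if signed_side(w, b, x) > 0 else "non-face"

def count_errors(w, b, faces, non_faces):
    """Return (non-faces on the face side, faces on the non-face side)."""
    false_accepts = sum(1 for x in non_faces if classify(w, b, x) == "face")
    false_rejects = sum(1 for x in faces if classify(w, b, x) == "non-face")
    return false_accepts, false_rejects

faces = [(2.0, 2.0), (3.0, 1.0), (1.0, 3.0), (0.5, 0.5)]
non_faces = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 1.5)]

candidates = {
    "x + y = 2.5": ((1.0, 1.0), -2.5),
    "x = 1.5": ((1.0, 0.0), -1.5),
}
# Choose the line whose total number of errors is smallest.
best = min(candidates,
           key=lambda name: sum(count_errors(*candidates[name], faces, non_faces)))
print(best)
```

The samples here are not perfectly separable, so the selected line still misclassifies some points; it is simply the candidate with the fewest errors in total.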

（２−２）サポートベクターマシン (2-2) Support vector machine
上述した線形判別法で誤りが最小になるように超平面を決定しても、実用上は、誤りが大きすぎる場合もあり得る。 Even if the hyperplane is determined by the linear discriminant method described above so that the error is minimized, the error may still be too large for practical use.

そのようなときであっても、例えば（ｘ，ｙ，ｚ）の３次元の空間の点を（ｘ²，ｙ²，ｚ²，ｘｙ，ｙｚ，ｚｘ）のようなより高次元（この場合６次元）の空間に写像してやると、その空間の超平面でうまく、上述したような「顔」（○）と「非顔」（×）とが分離できるようになる場合があることが知られている。しかも、サポートベクターマシンでは、実際には高次元の空間に写像することなく、もとの空間で写像先の高次元空間の超平面を計算することができる。 Even in such a case, it is known that if a point (x, y, z) in three-dimensional space is mapped into a higher-dimensional space, for example the six-dimensional space (x², y², z², xy, yz, zx), a hyperplane in that space can sometimes separate “face” (○) from “non-face” (×) well. Moreover, a support vector machine can compute the hyperplane of the mapped high-dimensional space while working in the original space, without actually performing the mapping.
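
Why the mapping never has to be carried out can be illustrated with a degree-2 polynomial kernel. The sketch below assumes the cross terms are scaled by √2, which the text above does not state; with that scaling, the inner product in the six-dimensional space equals the square of the ordinary inner product in the original three-dimensional space. The function names are illustrative.

```python
import math

def phi(p):
    """Explicit map from 3-D to 6-D: (x^2, y^2, z^2, sqrt2*xy, sqrt2*yz, sqrt2*zx)."""
    x, y, z = p
    r2 = math.sqrt(2.0)
    return (x * x, y * y, z * z, r2 * x * y, r2 * y * z, r2 * z * x)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def poly2_kernel(p, q):
    """Inner product in the 6-D space, computed entirely in the original 3-D space."""
    return dot(p, q) ** 2

p = (1.0, 2.0, 3.0)
q = (4.0, -5.0, 6.0)
print(dot(phi(p), phi(q)), poly2_kernel(p, q))  # the two values agree
```

Because the hyperplane in the mapped space is expressed through such inner products, only the kernel function on the original coordinates needs to be evaluated.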

サポートベクターマシンで顔の検出を行なう具体的な構成については、たとえば、文献：E.Osuna, R.Freund, and F.Girosi: "Training Support Vector Machines: an Application to Face Detection", Proc. of International Conference on Computer Vision and Pattern Recognition, pp.130-136(1997)に開示されている。 For a specific configuration for detecting a face using a support vector machine, see, for example, E. Osuna, R. Freund, and F. Girosi: "Training Support Vector Machines: an Application to Face Detection", Proc. Of International Conference on Computer Vision and Pattern Recognition, pp. 130-136 (1997).

以下では、サポートベクターマシンの概要について説明する。 Hereinafter, an outline of the support vector machine will be described.

図２６は、サポートベクターマシンが適用される写像先の高次元空間を示す図である。 FIG. 26 is a diagram illustrating a high-dimensional space of a mapping destination to which the support vector machine is applied.

図２６でも、高次元空間を２次元空間として説明している。 FIG. 26 also describes the high-dimensional space as a two-dimensional space.

サポートベクターマシンでは平行な超平面が２つ想定される。この２つの超平面は、１つは「非顔」（図では×）のサンプルに接する超平面Ｐ１であり、もう１つは「顔」（図では○）のサンプルに接する超平面Ｐ２のようなペアである。 A support vector machine considers two parallel hyperplanes forming a pair: hyperplane P1, which touches the “non-face” samples (× in the figure), and hyperplane P2, which touches the “face” samples (○ in the figure).

他のペアの超平面Ｐ３および超平面Ｐ４も考えられる。しかし、サポートベクターマシンでは、可能な超平面のペアの中で間隔が最大となるペアが採用される。この間隔が、判別の際の余裕と考えられ、余裕が最大となるようなペアが採用されることになる。 Other pairs of hyperplanes P3 and P4 are also conceivable. However, in the support vector machine, the pair having the largest interval among pairs of possible hyperplanes is adopted. This interval is considered to be a margin for determination, and a pair having the maximum margin is adopted.

図２６に示すような超平面による、「顔」パターンと「非顔」パターンの判別は、超平面Ｐ１と超平面Ｐ２から等距離にある中間の超平面を、上述した線形判別における判定のための超平面のようにみなして行なう。 Discrimination between “face” and “non-face” patterns with hyperplanes as shown in FIG. 26 is performed by treating the intermediate hyperplane, equidistant from hyperplanes P1 and P2, as the decision hyperplane of the linear discriminant method described above.
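
The relationship between the pair of parallel hyperplanes, their spacing, and the intermediate decision hyperplane can be sketched as follows. The sketch assumes each pair is written as w·x = b1 (touching the non-face samples) and w·x = b2 (touching the face samples) with b2 > b1. It does not perform the actual optimization of a support vector machine, and the numeric values are made up for illustration.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def margin(w, b1, b2):
    """Distance between the parallel hyperplanes w.x = b1 and w.x = b2."""
    return abs(b2 - b1) / math.sqrt(dot(w, w))

def middle_offset(b1, b2):
    """Offset of the hyperplane equidistant from both."""
    return (b1 + b2) / 2.0

def classify(w, b_mid, x):
    return "face" if dot(w, x) > b_mid else "non-face"

# Two candidate pairs, each given as (w, b1, b2): P1/P2 and P3/P4.
pairs = {
    "P1/P2": ((3.0, 4.0), 0.0, 10.0),
    "P3/P4": ((1.0, 0.0), 1.0, 2.0),
}
# Adopt the pair with the largest spacing.
best = max(pairs, key=lambda name: margin(*pairs[name]))
w, b1, b2 = pairs[best]
b_mid = middle_offset(b1, b2)
print(best, margin(w, b1, b2), classify(w, b_mid, (1.0, 1.0)))
```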

（２−３）ベイズ推定による判別 (2-3) Discrimination by Bayesian estimation
排反事象Ｈ₁（顔である）とＨ₂（非顔である）があって、Ａを任意の事象（切り出した濃淡パターン）としたとき、ベイズの定理は、以下の式で表わされる。 Let H₁ (face) and H₂ (non-face) be mutually exclusive events, and let A be an arbitrary event (a cut-out grayscale pattern). Bayes' theorem is then expressed by the following equation.

ここで、Ｐ（Ｈ₁｜Ａ）はＡが生じた時にそれがＨ₁である事後確率で、Ｐ（Ａ｜Ｈ₁）は、Ｈ₁の時にＡが生じる事前確率である。Ａが生じたとわかったあとで、それがＨ₁である事後確率またはＨ₂である事後確率の両者を比較して、ベイズ判定では確率の高い方のパターンであると判定を行なう。ふたつの事後確率の比は、以下の式で表わされる。 Here, P(H₁|A) is the posterior probability that the event is H₁ given that A has occurred, and P(A|H₁) is the probability, known beforehand, that A occurs when H₁ holds (the likelihood of A under H₁). Once A is known to have occurred, the posterior probability of H₁ and that of H₂ are compared, and the Bayes decision chooses the pattern with the higher probability. The ratio of the two posterior probabilities is expressed by the following equation.

式（９）が１より大きければＨ₁と判断することになる。式（９）は書き直せば、以下の式（１０）となる。 If the ratio in equation (9) is greater than 1, the pattern is judged to be H₁. Equation (9) can be rewritten as equation (10) below.
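
The three equation images in this subsection are not reproduced. Note that the text labels the ratio and its rewritten form “(9)” and “(10)”, reusing numbers that also appear in the template-matching part. From the surrounding description they presumably correspond to Bayes' theorem for two mutually exclusive hypotheses, the ratio of the two posteriors, and a likelihood-ratio test against a threshold λ. Taking λ = P(H₂)/P(H₁) is an assumption consistent with “greater than 1”:

```latex
\begin{align}
P(H_1 \mid A) &= \frac{P(A \mid H_1)\,P(H_1)}{P(A \mid H_1)\,P(H_1) + P(A \mid H_2)\,P(H_2)} \\[4pt]
\frac{P(H_1 \mid A)}{P(H_2 \mid A)} &= \frac{P(A \mid H_1)\,P(H_1)}{P(A \mid H_2)\,P(H_2)} \\[4pt]
\frac{P(A \mid H_1)}{P(A \mid H_2)} &> \lambda, \qquad \lambda = \frac{P(H_2)}{P(H_1)}
\end{align}
```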

そこで、事象Ｈ₁とＨ₂のサンプルをたくさん収集して、Ｐ（Ａ｜Ｈ₁）とＰ（Ａ｜Ｈ₂）を推定しておき、λをしきい値パラメータとして、式（１０）により判定すれば、事象Ａを事象Ｈ₁と判断するか事象Ｈ₂と判断するかを決めることができる。 Therefore, by collecting many samples of the events H₁ and H₂, estimating P(A|H₁) and P(A|H₂) in advance, and using λ as a threshold parameter, the decision according to equation (10) determines whether event A is judged to be H₁ or H₂.
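
A minimal sketch of such a decision follows. It assumes that each pattern A has already been reduced to one of a small number of discrete codes (here 0 to 3), that the two likelihoods are estimated as smoothed histograms of those codes, and that the test is “likelihood ratio greater than λ”. The quantization, the add-one smoothing, and all names are illustrative choices, not details from the patent.

```python
from collections import Counter

def estimate_likelihood(codes, n_bins):
    """Histogram estimate of P(A | H) with add-one smoothing (an assumption)."""
    counts = Counter(codes)
    total = len(codes) + n_bins
    return [(counts[b] + 1) / total for b in range(n_bins)]

def is_face(code, p_given_face, p_given_nonface, lam):
    """Likelihood-ratio test: decide 'face' when P(A|H1) / P(A|H2) > lambda."""
    return p_given_face[code] / p_given_nonface[code] > lam

# Codes observed for collected face (H1) and non-face (H2) samples.
face_codes = [0, 0, 0, 1, 0, 1, 0, 0]
nonface_codes = [2, 3, 1, 2, 3, 3, 2, 1]

p_face = estimate_likelihood(face_codes, 4)
p_nonface = estimate_likelihood(nonface_codes, 4)

print(is_face(0, p_face, p_nonface, lam=1.0))  # code seen mostly in faces
print(is_face(3, p_face, p_nonface, lam=1.0))  # code seen mostly in non-faces
```

Raising λ makes the detector stricter (fewer non-faces accepted, more faces rejected), and lowering it does the opposite.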

ベイズ判定方法で顔を検出する方法については、たとえば、文献：H.Shneiderman and T.Kanade:"Probabilistic Modeling Of Local Appearance and Spatial Relationships for Object Recognition", Proc. of International Conference on Computer Vision and Pattern Recognition, pp.45-51(1998)に開示されている。 For a method of detecting a face by the Bayes determination method, for example, see H. Shneiderman and T. Kanade: "Probabilistic Modeling Of Local Appearance and Spatial Relationships for Object Recognition", Proc. Of International Conference on Computer Vision and Pattern Recognition, pp. 45-51 (1998).

この他、ニューラルネットワークによる判別により、「顔」と「非顔」の判別処理を行なうことも可能である。 In addition, it is also possible to perform a “face” and “non-face” discrimination process by a discrimination using a neural network.

今回開示された実施の形態はすべての点で例示であって制限的なものではないと考えられるべきである。本発明の範囲は上記した説明ではなくて特許請求の範囲によって示され、特許請求の範囲と均等の意味および範囲内でのすべての変更が含まれることが意図される。 The embodiments disclosed this time are to be considered in all respects as illustrative and not restrictive. The scope of the present invention is defined by the terms of the claims, rather than the description above, and is intended to include any modifications within the scope and meaning equivalent to the terms of the claims.

FIG. 1: 本発明の１実施の形態にかかるシステムの外観図である。 External view of a system according to one embodiment of the present invention.
FIG. 2: 本発明の１実施の形態にかかるシステムのハードウェア的構成を示すブロック図である。 Block diagram showing the hardware configuration of the system according to one embodiment of the present invention.
FIG. 3: ６分割矩形フィルタを示す図である。 Diagram showing the six-segmented rectangular filter.
FIG. 4: ６分割矩形フィルタを顔画像に当てはめた場合を示す概念図である。 Conceptual diagram showing the six-segmented rectangular filter applied to a face image.
FIG. 5: ６分割矩形フィルタの他の構成を示す概念図である。 Conceptual diagram showing another configuration of the six-segmented rectangular filter.
FIG. 6: 分割矩形フィルタを走査する対象となる画像を示す概念図である。 Conceptual diagram showing an image to be scanned with the segmented rectangular filter.
FIG. 7: インテグラルイメージを用いて、総和を求める長方形領域を示す図である。 Diagram showing a rectangular area whose sum is obtained using the integral image.
FIG. 8: 眉間の候補点を抽出する処理を説明するためのフローチャートである。 Flowchart for explaining the process of extracting between-the-eyes candidate points.
FIG. 9: 眉間候補点の抽出結果を示す図である。 Diagram showing the result of extracting between-the-eyes candidate points.
FIG. 10: 右目のテンプレートを示す図である。 Diagram showing a right-eye template.
FIG. 11: 目の候補点の抽出を行なった上で、真の眉間候補点の抽出を行なう処理を説明するためのフローチャートである。 Flowchart for explaining the process of extracting eye candidate points and then extracting the true between-the-eyes candidate point.
FIG. 12: 図１１のステップＳ２００における目の候補点の抽出処理を説明するための図である。 Diagram for explaining the eye candidate point extraction process in step S200 of FIG. 11.
FIG. 13: 眉間テンプレートの形成手順を説明するためのフローチャートである。 Flowchart for explaining the procedure for forming the between-the-eyes template.
FIG. 14: 眉間テンプレートを説明するための図である。 Diagram for explaining the between-the-eyes template.
FIG. 15: ステップＳ２０６のテンプレートマッチングの手続きを説明するためのフローチャートである。 Flowchart for explaining the template matching procedure of step S206.
FIG. 16: 対象画像から眉間および目の位置を抽出した例を示す図である。 Diagram showing an example in which the between-the-eyes position and the eye positions are extracted from a target image.
FIG. 17: 眉間検出フィルタの他の形状を説明するための第１の図である。 First diagram for explaining another shape of the between-the-eyes detection filter.
FIG. 18: 眉間検出フィルタの他の形状を説明するための第２の図である。 Second diagram for explaining another shape of the between-the-eyes detection filter.
FIG. 19: 実施の形態２の顔位置抽出装置において、真の眉間候補点の抽出を行なう処理を説明するためのフローチャートである。 Flowchart for explaining the process of extracting the true between-the-eyes candidate point in the face position extraction device of the second embodiment.
FIG. 20: 異なるサイズの６分割矩形フィルタにより、同一の顔画像について、どの範囲で眉間候補点が検出可能であるかを示す図である。 Diagram showing the range over which between-the-eyes candidate points can be detected in the same face image with six-segmented rectangular filters of different sizes.
FIG. 21: 視差と最適な顔を切り出すサイズの関係を示す図である。 Diagram showing the relationship between parallax and the optimal size for cutting out the face.
FIG. 22: 図２１より設定した６分割矩形フィルタサイズ、視差、候補点を切り出すサイズの関係を示す図である。 Diagram showing the relationship, set on the basis of FIG. 21, among the six-segmented rectangular filter size, the parallax, and the size at which a candidate point is cut out.
FIG. 23: 実施の形態３の顔位置抽出装置において、真の眉間候補点の抽出を行なう処理を説明するためのフローチャートである。 Flowchart for explaining the process of extracting the true between-the-eyes candidate point in the face position extraction device of the third embodiment.
FIG. 24: 実施の形態４の顔位置抽出装置において、真の眉間候補点の抽出を行なう処理を説明するためのフローチャートである。 Flowchart for explaining the process of extracting the true between-the-eyes candidate point in the face position extraction device of the fourth embodiment.
FIG. 25: 「顔」のサンプルと「非顔」のサンプルの分布の一例を示す概念図である。 Conceptual diagram showing an example of the distribution of “face” samples and “non-face” samples.
FIG. 26: サポートベクターマシンが適用される写像先の高次元空間を示す図である。 Diagram showing the high-dimensional space of the mapping destination to which the support vector machine is applied.

Explanation of reference numerals

２０ 顔位置抽出装置、３０ カメラ、４０ コンピュータ本体、４２ モニタ。 20: face position extraction device; 30: camera; 40: computer main body; 42: monitor.

Claims

A face position extraction method comprising the steps of:
preparing digital data of the value of each pixel in a target image region that includes a human face region;
extracting the positions of between-the-eyes candidate points by filtering the target image region with a between-the-eyes detection filter in which six rectangles are combined; and
cutting out the target image at a predetermined size centered on the position of each extracted between-the-eyes candidate point, and selecting a true candidate point from the between-the-eyes candidate points by pattern discrimination processing.

The face position extraction method according to claim 1, wherein the between-the-eyes detection filter is a single rectangle divided into six rectangles.

The face position extraction method according to claim 1, wherein the six rectangles comprise:
two first rectangles adjacent to each other in the vertical direction;
two second rectangles that are shifted from the first rectangles by a predetermined amount in the vertical direction and are adjacent to each other in the vertical direction; and
two third rectangles that are shifted from the second rectangles by a predetermined amount in the vertical direction and are adjacent to each other in the vertical direction.

The face position extraction method according to claim 1, wherein the step of selecting the true candidate point comprises the steps of:
detecting eye positions by eye pattern discrimination processing on the portions of the target image corresponding to two predetermined rectangles among the rectangles that form the between-the-eyes detection filter at the between-the-eyes candidate point;
correcting the position of the between-the-eyes candidate point to the midpoint between the two eyes based on the detected eye positions;
rotating the input image about the corrected candidate point position so that the two eyes are horizontal; and
cutting out the target image from the rotated input image at a predetermined size centered on the corrected candidate point position, and selecting the true candidate point from the between-the-eyes candidate points by pattern discrimination processing.

The face position extraction method according to claim 1, wherein the step of preparing the digital data includes a step of preparing the target image as a stereo image, and
the step of selecting the true candidate point includes a step of selecting the true candidate point from the between-the-eyes candidate points according to the distance from the observation point to each between-the-eyes candidate point, detected on the basis of the stereo image.

A program for causing a computer to execute a method of extracting a face position in a target image region, the program causing the computer to execute the steps of:
preparing digital data of the value of each pixel in the target image region that includes a human face region;
extracting the positions of between-the-eyes candidate points by filtering the target image region with a between-the-eyes detection filter in which six rectangles are combined; and
cutting out the target image at a predetermined size centered on the position of each extracted between-the-eyes candidate point, and selecting a true candidate point from the between-the-eyes candidate points by pattern discrimination processing.

The program according to claim 6, wherein the between-the-eyes detection filter is a single rectangle divided into six rectangles.

The program according to claim 6, wherein the six rectangles comprise:
two first rectangles adjacent to each other in the vertical direction;
two second rectangles that are shifted from the first rectangles by a predetermined amount in the vertical direction and are adjacent to each other in the vertical direction; and
two third rectangles that are shifted from the second rectangles by a predetermined amount in the vertical direction and are adjacent to each other in the vertical direction.

The program according to claim 6, wherein the step of selecting the true candidate point comprises the steps of:
detecting eye positions by eye pattern discrimination processing on the portions of the target image corresponding to two predetermined rectangles among the rectangles that form the between-the-eyes detection filter at the between-the-eyes candidate point;
correcting the position of the between-the-eyes candidate point to the midpoint between the two eyes based on the detected eye positions;
rotating the input image about the corrected candidate point position so that the two eyes are horizontal; and
cutting out the target image from the rotated input image at a predetermined size centered on the corrected candidate point position, and selecting the true candidate point from the between-the-eyes candidate points by pattern discrimination processing.

The program according to claim 6, wherein the step of preparing the digital data includes a step of preparing the target image as a stereo image, and
the step of selecting the true candidate point includes a step of selecting the true candidate point from the between-the-eyes candidate points according to the distance from the observation point to each between-the-eyes candidate point, detected on the basis of the stereo image.

A face position extraction device comprising:
photographing means for preparing digital data of the value of each pixel in a target image region that includes a human face region;
means for extracting the positions of between-the-eyes candidate points by filtering the target image region with a between-the-eyes detection filter in which six rectangles are combined; and
selecting means for cutting out the target image at a predetermined size centered on the position of each extracted between-the-eyes candidate point, and selecting a true candidate point from the between-the-eyes candidate points by pattern discrimination processing.

The face position extraction device according to claim 11, wherein the between-the-eyes detection filter is a single rectangle divided into six rectangles.

The face position extraction device according to claim 11, wherein the six rectangles comprise:
two first rectangles adjacent to each other in the vertical direction;
two second rectangles that are shifted from the first rectangles by a predetermined amount in the vertical direction and are adjacent to each other in the vertical direction; and
two third rectangles that are shifted from the second rectangles by a predetermined amount in the vertical direction and are adjacent to each other in the vertical direction.

The face position extraction device according to claim 11, wherein the selecting means includes:
means for detecting eye positions by eye pattern discrimination processing on the portions of the target image corresponding to two predetermined rectangles among the rectangles that form the between-the-eyes detection filter at the between-the-eyes candidate point;
means for correcting the position of the between-the-eyes candidate point to the midpoint between the two eyes based on the detected eye positions;
means for rotating the input image about the corrected candidate point position so that the two eyes are horizontal; and
means for cutting out the target image from the rotated input image at a predetermined size centered on the corrected candidate point position, and selecting the true candidate point from the between-the-eyes candidate points by pattern discrimination processing.

The face position extraction device according to claim 11, wherein the photographing means includes means for preparing the target image as a stereo image, and
the selecting means includes means for selecting the true candidate point from the between-the-eyes candidate points according to the distance from the observation point to each between-the-eyes candidate point, detected on the basis of the stereo image.