JP2006174022A

JP2006174022A - Image processing apparatus and method

Info

Publication number: JP2006174022A
Application number: JP2004362904A
Authority: JP
Inventors: Kotaro Yano; 光太郎矢野
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2004-12-15
Filing date: 2004-12-15
Publication date: 2006-06-29

Abstract

PROBLEM TO BE SOLVED: To detect a subject pattern accurately even when the focus state is poor.
SOLUTION: An image processing apparatus having means for detecting a predetermined subject pattern in an image output from imaging means further includes focus detection means for detecting the in-focus distance of the imaging means. The subject pattern detection means collates the subject pattern while selectively switching the reference pattern used for collation according to the focus state estimated from the in-focus distance.
[Selected drawing] Figure 2

Description

The present invention relates to an image processing apparatus and method for automatically detecting a subject pattern in an image.

In recent years, proposals have been made to apply techniques for detecting a specific subject pattern in an image to camera functions such as focus control in order to improve their accuracy. For example, the present applicant has proposed a camera equipped with a subject recognition function (Patent Document 1). That conventional example proposes improving the accuracy of camera exposure and focusing by detecting a face in the image.

As for techniques for detecting a face in an image, Non-Patent Document 1, for example, surveys various methods.

However, although current face detection techniques detect faces accurately when the face is captured with a proper focus state (that is, whether or not the image is in focus) and proper brightness, or when the face is looking straight at the camera, their detection performance degrades markedly under poor shooting conditions. For example, Patent Document 1 proposes using the face detection result to improve the accuracy of camera exposure and focusing, but it does not explain how to detect a face accurately when the shooting conditions are poor in the first place.

A method is also known in which focus control is performed starting from the shortest distance based on multi-point distance measurement data, and faces are detected only in the in-focus region. However, this conventional method requires multi-point distance measurement data. Moreover, although it sidesteps the problem by detecting only in-focus faces when the detection result is used for camera control, it does not fundamentally solve the problem.
Patent Document 1: JP 2001-330882 A
Non-Patent Document 1: Yang, "Detecting Faces in Images: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, January 2002

The present invention has been made in view of the above problems, and its object is to detect a subject pattern accurately even when the focus state is poor.

To achieve the above object, the present invention provides an image processing apparatus having means for detecting a predetermined subject pattern in an image output from imaging means, characterized in that it includes focus detection means for detecting the in-focus distance of the imaging means, and in that the subject pattern detection means collates the subject pattern while selectively switching the reference pattern used for collation according to the focus state estimated from the in-focus distance.

With the configuration of the present invention, a subject pattern can be detected accurately even when the focus state is poor.

FIG. 1 shows the schematic configuration of this embodiment.

Reference numeral 10 denotes imaging means, which comprises an imaging lens, an image sensor such as a CCD, a gain adjustment circuit that adjusts the gain of the image sensor's output signal, and an A/D conversion circuit that converts the gain-adjusted output signal into a digital signal. It outputs the signal from the image sensor as digital image data.

Reference numeral 20 denotes an image memory, which temporarily stores the image data output from the imaging means 10.

Reference numeral 30 denotes focus control means, which controls the position of the focusing lens of the photographing lens, based on the contrast of a predetermined region of the image data, so that the contrast of the image data is maximized.

Reference numeral 40 denotes exposure control means, which determines the shutter speed and aperture value according to a predetermined program diagram, based on the luminance of a predetermined region of the image data, so that the image data has a predetermined brightness, and controls the exposure time of the image sensor of the imaging means 10 and the aperture of the photographing lens.

Reference numeral 50 denotes face detection means, which detects a face pattern in the image data stored in the image memory 20 and outputs its position in the image.

Reference numeral 60 denotes region setting means, which sets the image region to be controlled by the focus control means 30 and the exposure control means 40 according to the output of the face detection means 50.

Reference numeral 70 denotes focus detection means, which detects the position of the focusing lens of the photographing lens as controlled by the focus control means 30.

The operation of each of the above blocks is controlled by control means (not shown) that controls the camera as a whole.

Next, the operation of this embodiment will be described with reference to FIG. 2.

When the user turns on the power (not shown) to photograph a subject and the camera enters the shooting-preparation state, the imaging means 10 outputs image data, which is stored in the image memory 20 (S101). Specifically, the aperture value of the imaging lens, the shutter speed, and the gain of the image sensor, which determine the exposure state, and the focusing lens of the photographing lens, which determines the focus state, are set to their initial states. The signal output from the image sensor is then converted into a digital signal and output from the A/D conversion circuit as image data.

Next, the focus control means 30 controls the position of the focusing lens of the photographing lens, based on the contrast of a predetermined region of the image data stored in the image memory 20, so that the contrast of the image data is maximized (S102). Specifically, while moving the focusing lens, it repeatedly evaluates the luminance contrast in a predetermined region near the center of the image data output from the imaging means 10 and, based on how the luminance contrast changes as the lens moves, positions the focusing lens where the luminance contrast is highest.
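
The contrast-maximizing search of S102 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: `capture_at` is a hypothetical callable that returns a luminance image for a given lens position, the contrast measure (sum of squared horizontal differences) is one common choice that the text does not specify, and the search simply stops once the contrast starts to fall.

```python
def region_contrast(luma, region):
    """Sum of squared horizontal differences inside region (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = region
    total = 0
    for y in range(y0, y1):
        row = luma[y]
        for x in range(x0, x1 - 1):
            d = row[x + 1] - row[x]
            total += d * d
    return total


def contrast_autofocus(capture_at, positions, region):
    """Step through lens positions and stop once the contrast starts to fall.

    capture_at is a hypothetical camera hook, not part of the source text.
    """
    best_pos, best_score = None, -1
    for pos in positions:
        score = region_contrast(capture_at(pos), region)
        if score > best_score:
            best_pos, best_score = pos, score
        elif score < best_score:
            break  # contrast is decreasing: the peak has been passed
    return best_pos


def fake_capture(pos):
    """Toy stand-in for the camera: the edge is sharpest at lens position 3."""
    step = max(0, 100 - 30 * abs(pos - 3))
    return [[0, 0, step, step] for _ in range(4)]


best = contrast_autofocus(fake_capture, range(7), (0, 0, 4, 4))
```

With the toy capture function, the search settles on the lens position where the simulated edge is strongest.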

Next, the exposure control means 40 determines the shutter speed and aperture value according to a predetermined program diagram, based on the luminance of a predetermined region of the image data, so that the image data has a predetermined brightness, and controls the exposure time of the image sensor of the imaging means 10 and the aperture of the photographing lens (S103).

Next, the focus detection means 70 detects the position of the focusing lens of the photographing lens. From the focusing lens position, it outputs the distance from the camera to the best-focused subject (hereinafter, the in-focus distance) as focus information. At the same time, the focal length and aperture value of the photographing lens are also output as focus information (S104). The relationship between the focusing lens position and the object distance is stored in advance as a table; for example, when the photographing lens is a zoom lens with a variable focal length, a table is prepared for each focal length.
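
As a minimal sketch of the table lookup described for S104, the following converts a focusing lens position into an in-focus distance, with one table per focal length. The table values, the step units, and the use of linear interpolation between entries are all assumptions made for illustration; the text only states that such tables are stored in advance.

```python
from bisect import bisect_left

# Hypothetical calibration tables: focal length (mm) ->
# list of (focusing lens position in steps, in-focus distance in mm).
FOCUS_TABLES = {
    35.0: [(0, 500.0), (100, 1000.0), (200, 3000.0), (300, 10000.0)],
    70.0: [(0, 800.0), (100, 1500.0), (200, 4000.0), (300, 12000.0)],
}


def in_focus_distance(focal_length_mm, lens_position):
    """Look up the in-focus distance, interpolating between table entries."""
    table = FOCUS_TABLES[focal_length_mm]
    positions = [p for p, _ in table]
    # Clamp to the ends of the calibrated range.
    if lens_position <= positions[0]:
        return table[0][1]
    if lens_position >= positions[-1]:
        return table[-1][1]
    i = bisect_left(positions, lens_position)
    (p0, d0), (p1, d1) = table[i - 1], table[i]
    t = (lens_position - p0) / (p1 - p0)
    return d0 + t * (d1 - d0)
```

For example, `in_focus_distance(35.0, 50)` falls halfway between the first two entries of the 35 mm table.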

Next, the face detection means 50 uses the focus information to detect a face pattern in the image data stored in the image memory 20 and outputs its position in the image (S105). FIG. 3 shows the configuration of the face detection means 50, and FIG. 4 shows the processing flow. The processing of the face detection means is described in detail below with reference to FIGS. 3 and 4.

First, the focus information is input (S501).

Next, the luminance distribution generation means 51 generates luminance image data from the image data (S502). For example, when the input image data is RGB image data, it is converted into YUV image data by a predetermined matrix conversion, and luminance image data is generated with the Y component as the luminance value. The conversion from RGB to YUV is performed, for example, by a method such as that disclosed in JP 2003-102025 A.
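
A rough sketch of the luminance image generation in S502 follows. The specific conversion is an assumption: the text defers to JP 2003-102025 A for the matrix, so the widely used ITU-R BT.601 coefficients are substituted here purely for illustration.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 coefficients (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def luminance_image(rgb_image):
    """Keep only the Y component of each pixel of a row-major RGB image."""
    return [[rgb_to_yuv(r, g, b)[0] for (r, g, b) in row] for row in rgb_image]


# Usage: a 1x3 image with white, black, and pure green pixels.
luma = luminance_image([[(255, 255, 255), (0, 0, 0), (0, 255, 0)]])
```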

Next, the skin color mask generation means 52 generates a skin color mask from the image data (S503). For example, when the input image data is RGB image data, it is converted into YUV image data by a predetermined matrix conversion, and a pixel whose UV components fall within a predetermined range is treated as a skin color pixel. Binary image data in which skin color pixels are 1 and all other pixels are 0 is output as the skin color mask. The range used for the skin color decision is obtained by surveying in advance the distribution of YUV coordinate values over the skin regions of a large number of person images, and is stored as the skin color pixel determination parameter.
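
The skin color mask of S503 might look like the sketch below. The U and V bounds are invented placeholders (the text obtains them from a survey of many person images), and the BT.601-style conversion is likewise an assumption.

```python
# Hypothetical skin color bounds in the UV plane; real values would come
# from surveying skin regions in many person images, as the text describes.
U_RANGE = (-40.0, -5.0)
V_RANGE = (10.0, 60.0)


def skin_mask(rgb_image):
    """Return a binary mask: 1 for skin color pixels, 0 otherwise."""
    mask = []
    for row in rgb_image:
        out = []
        for r, g, b in row:
            y = 0.299 * r + 0.587 * g + 0.114 * b  # BT.601 luma (assumed)
            u = 0.492 * (b - y)
            v = 0.877 * (r - y)
            is_skin = (U_RANGE[0] <= u <= U_RANGE[1]
                       and V_RANGE[0] <= v <= V_RANGE[1])
            out.append(1 if is_skin else 0)
        mask.append(out)
    return mask


# Usage: a skin-like tone, pure blue, and mid gray.
mask = skin_mask([[(220, 170, 140), (0, 0, 255), (128, 128, 128)]])
```

Only the chrominance components are tested, so the decision is largely independent of brightness, which matches the text's use of the UV components alone.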

Next, the detection size setting means 53 sets the size of the face to be detected in the image (S504). Because the image data may contain faces of various sizes depending on the shooting magnification of the subject, the size is set here so that faces of a predetermined size are detected. As described later, this embodiment detects faces of multiple sizes by repeatedly setting the face size and then detecting faces of that size in the image.

Next, the reference pattern selection means 59 selects an appropriate reference pattern from the plurality of reference patterns stored in the reference pattern storage means 57, based on the focus information and the face detection size that has been set (S505). The focus state of a face image captured by the camera is uniquely determined by the distance from the camera to the subject (here, the face), the in-focus distance, the focal length of the photographing lens, and the aperture value. More precisely, the pixel size of the image sensor determines the amount of blur in units of pixels. Because the range of physical sizes of the faces to be detected is more or less fixed, the amount of blur corresponding to the size of the face to be detected in the image can be roughly estimated from the in-focus distance, the focal length of the photographing lens, and the aperture value supplied as focus information. This embodiment uses this principle to select an appropriate reference pattern from the focus information and the face size to be detected. In this embodiment, a luminance distribution and a skin color distribution are used as reference patterns. For the luminance distribution, the distributions of faces in a large number of person images with different focus states are surveyed in advance and stored in the reference pattern storage means 57 as a reference pattern for each focus state. Reference patterns for faces in a better focus state (that is, sharper faces) contain more high-frequency components and are therefore stored as larger patterns.
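
The principle behind S505 can be illustrated with a thin-lens estimate. This is a sketch under assumptions, not the method fixed by the text: the subject distance is inferred from the face size in pixels using an assumed typical face width, the blur is the standard circle-of-confusion diameter for a thin lens, and the reference bank (names, blur thresholds, pattern sizes) is hypothetical.

```python
FACE_WIDTH_MM = 160.0  # assumed typical physical face width

# Hypothetical reference bank: (max blur in pixels, name, pattern side).
# Sharper patterns are stored at larger sizes, as the text describes.
REFERENCE_BANK = [
    (1.0, "sharp", 24),
    (4.0, "soft", 16),
    (float("inf"), "blurred", 8),
]


def estimate_subject_distance(face_px, focal_mm, pixel_pitch_mm):
    """Distance at which a face of assumed width spans face_px pixels."""
    return focal_mm * FACE_WIDTH_MM / (face_px * pixel_pitch_mm)


def blur_diameter_px(subject_mm, focus_mm, focal_mm, f_number, pixel_pitch_mm):
    """Thin-lens circle of confusion, expressed in sensor pixels."""
    coc_mm = (abs(subject_mm - focus_mm) / subject_mm
              * focal_mm ** 2 / (f_number * (focus_mm - focal_mm)))
    return coc_mm / pixel_pitch_mm


def select_reference_pattern(face_px, focus_mm, focal_mm, f_number,
                             pixel_pitch_mm):
    """Pick the reference pattern whose blur range covers the estimate."""
    subject_mm = estimate_subject_distance(face_px, focal_mm, pixel_pitch_mm)
    blur = blur_diameter_px(subject_mm, focus_mm, focal_mm, f_number,
                            pixel_pitch_mm)
    for max_blur, name, side in REFERENCE_BANK:
        if blur <= max_blur:
            return name, side
```

A face whose inferred distance matches the in-focus distance selects the sharp pattern, while a face far from the focus plane selects a blurred, smaller pattern.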

Next, the search range setting means 54 sets the range in which faces are to be detected in the image, using the luminance image data and the skin color mask according to the face size (S506). When detecting faces of a given size, every region of the image could be examined, but the number of detection attempts would be enormous. Therefore, regions in which no face exists or in which a face could not be detected are excluded in advance using the luminance image data and the skin color mask, and the range to be searched is stored as a mask. Here, whether a position should be searched is decided by whether the mean luminance of a face region of the detection size lies within a predetermined range and whether the proportion of skin color pixels lies within a predetermined range.
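
The pre-filtering of S506 can be sketched as a window test on the mean luminance and skin pixel ratio. The threshold values below are placeholders chosen for illustration, and the brute-force window sums are kept for clarity rather than speed.

```python
def search_mask(luma, skin, size,
                luma_range=(40.0, 220.0), skin_ratio_range=(0.4, 1.0)):
    """Mark window origins (top-left corners) that are worth searching.

    luma_range and skin_ratio_range are hypothetical thresholds.
    """
    h, w = len(luma), len(luma[0])
    area = size * size
    result = [[0] * (w - size + 1) for _ in range(h - size + 1)]
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            total = 0.0
            skin_count = 0
            for dy in range(size):
                for dx in range(size):
                    total += luma[y + dy][x + dx]
                    skin_count += skin[y + dy][x + dx]
            mean = total / area
            ratio = skin_count / area
            if (luma_range[0] <= mean <= luma_range[1]
                    and skin_ratio_range[0] <= ratio <= skin_ratio_range[1]):
                result[y][x] = 1
    return result


# Usage: a 3x3 image with uniform luminance and skin in the top-left corner.
luma = [[100] * 3 for _ in range(3)]
skin = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
mask = search_mask(luma, skin, 2)
```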

Next, the position in the image from which the pattern to be collated with a face is cut out is set (S507). By repeating this setting and the subsequent detection processing, the position at which the face pattern is collated is scanned sequentially over the whole image. At this time, the settings are made so that only positions that should be searched, according to the search range stored by the search range setting means 54, are examined.

Next, the collation pattern extraction means 55 cuts out a collation pattern from the luminance image data and the skin color mask, based on the size set in S504 and the cut-out position set in S507 (S508).

Next, the collation pattern extraction means 55 normalizes the collation patterns cut out of the luminance image data and the skin color mask in S508 to the same size as the reference pattern selected by the reference pattern selection means 59 (S509). For the luminance image data, the luminance distribution is also normalized so that it has the same mean luminance and luminance variance as the reference pattern.
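
The two normalizations of S509 can be sketched as a resize followed by a mean and variance adjustment. Nearest-neighbor resampling is an assumption made for brevity; the text does not say which interpolation is used.

```python
def resize_nearest(patch, side):
    """Resize a 2D patch to side x side by nearest-neighbor sampling."""
    h, w = len(patch), len(patch[0])
    return [[patch[y * h // side][x * w // side] for x in range(side)]
            for y in range(side)]


def normalize_luminance(patch, target_mean, target_std):
    """Shift and scale a patch to the reference mean and standard deviation."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5
    scale = target_std / std if std > 0 else 0.0  # flat patch: keep the mean
    return [[target_mean + (v - mean) * scale for v in row] for row in patch]


# Usage: shrink a 4x4 patch to 2x2, then match mean 128 and std 20.
patch = [[0, 0, 10, 10], [0, 0, 10, 10], [20, 20, 30, 30], [20, 20, 30, 30]]
small = resize_nearest(patch, 2)
normalized = normalize_luminance(small, 128.0, 20.0)
```

Matching the mean and variance removes overall brightness and contrast differences, so the subsequent comparison responds mainly to the spatial pattern.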

Next, the pattern collation means 56 collates the collation pattern output by the collation pattern extraction means 55 with the reference pattern selected by the reference pattern selection means 59 and obtains a similarity (S510). For example, the reciprocal of the sum of absolute differences between the luminance distribution of the collation pattern and that of the reference pattern is taken as a first similarity, the reciprocal of the sum of absolute differences between the skin color distribution of the collation pattern and that of the reference pattern is taken as a second similarity, and the overall similarity is obtained as a predetermined weighted sum of the first and second similarities. The weights are set so that the better the focus state of the face for which the reference pattern was selected in S505, the larger the similarity that is output.
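
The similarity computation of S510 can be sketched as below. Several details are assumptions: 1 is added to each sum of absolute differences so that identical patterns do not cause a division by zero, the weights 0.7 and 0.3 are arbitrary, and `focus_gain` is a hypothetical factor standing in for the focus-dependent weighting that the text describes only qualitatively.

```python
def sum_abs_diff(a, b):
    """Sum of absolute differences between two equally sized 2D patterns."""
    return sum(abs(va - vb)
               for row_a, row_b in zip(a, b)
               for va, vb in zip(row_a, row_b))


def similarity(match_luma, ref_luma, match_skin, ref_skin,
               w_luma=0.7, w_skin=0.3, focus_gain=1.0):
    """Weighted sum of reciprocal-SAD similarities for luminance and skin.

    focus_gain (hypothetical) would be larger for reference patterns that
    correspond to a better focus state.
    """
    s1 = 1.0 / (1.0 + sum_abs_diff(match_luma, ref_luma))
    s2 = 1.0 / (1.0 + sum_abs_diff(match_skin, ref_skin))
    return focus_gain * (w_luma * s1 + w_skin * s2)


# Usage: a near match scores higher than a poor match.
ref_luma = [[10, 20], [30, 40]]
ref_skin = [[1, 1], [1, 1]]
near = similarity([[11, 20], [30, 40]], ref_luma, ref_skin, ref_skin)
far = similarity([[50, 60], [70, 80]], ref_luma, ref_skin, ref_skin)
```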

Next, it is determined whether the similarity of the collated pattern is equal to or greater than a predetermined threshold, and for patterns at or above the threshold, the position coordinates and the similarity are stored (S511).

Next, the processing from S507 to S511 is repeated so that the whole image is scanned in accordance with the search range stored by the search range setting means 54. In addition, the processing from S504 to S511 is repeated for each of the predetermined detection sizes.

Next, once faces of all sizes in the predetermined range have been searched for over the whole image, the face region determination means 58 examines the face candidate regions whose position coordinates were stored and, among candidates whose regions overlap, deletes the candidates with lower similarity (S512). Because the weights in S510 are set so that a larger similarity is output when the reference pattern corresponds to a face in a better focus state, when faces with different focus states are detected in overlapping positions, candidates are deleted in such a way that the face pattern with the better focus state is given priority.
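
The overlap removal of S512 resembles a greedy suppression of lower-scoring candidates. In the sketch below, each candidate is a hypothetical tuple `(x, y, size, similarity)`, and any intersection of the two squares counts as an overlap; the text does not define the overlap criterion, so that choice is an assumption.

```python
def rectangles_overlap(a, b):
    """True if two square candidates (x, y, size, similarity) intersect."""
    ax, ay, asz, _ = a
    bx, by, bsz, _ = b
    return (ax < bx + bsz and bx < ax + asz
            and ay < by + bsz and by < ay + asz)


def remove_overlapping(candidates):
    """Keep the highest-similarity candidate in each overlapping group."""
    kept = []
    for cand in sorted(candidates, key=lambda c: c[3], reverse=True):
        if not any(rectangles_overlap(cand, k) for k in kept):
            kept.append(cand)
    return kept


# Usage: the second candidate overlaps the first and has lower similarity.
faces = remove_overlapping([
    (10, 10, 20, 0.9),
    (15, 12, 20, 0.6),
    (100, 100, 20, 0.5),
])
```

Because the similarity is weighted toward well-focused reference patterns, this ordering also favors the sharper interpretation when the same face is found under two focus states.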

Furthermore, for each region in which a face was detected, the face region determination means 58 refers to the reference pattern selected by the reference pattern selection means 59, based on the luminance distribution of the image extracted by the collation pattern extraction means 55, and makes the final decision as to whether the region is a face pattern. False face patterns are deleted, and the final face detection result is output as position coordinates indicating the region in the image (S511). Because the face region determination means 58 makes the final face decision, it must be more accurate than the pattern collation means 56, so the reference pattern used here is preferably more precise than the one used by the pattern collation means 56. The reference pattern may be something like the luminance pattern of an image, or, when using a method such as that of Rowley et al., "Neural network-based face detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 1, January 1998, it may be the weight coefficients applied to a neural network.

Then, based on the result of the face detection means 50, if a face has been detected, the region setting means 60 changes the target region for exposure and focus control (S106). Specifically, when one face is detected in the image, the detected face region is taken as the control target region and output to the focus control means 30 and the exposure control means 40. Thereafter, as shown in FIG. 1, focus control by the focus control means 30 (S107), exposure control by the exposure control means 40 (S108), and imaging (S109) are performed, so that an image is captured with suitable focus and exposure control targeted at the person who is the main subject. Needless to say, the processing in S107, S108, and S109 is the same as in S102, S103, and S101, respectively. When multiple faces are detected in the image, the region setting means 60 refers to the similarity of each detected face to its reference pattern, takes the face region with the highest similarity as the control target region, and outputs it to the focus control means 30 and the exposure control means 40. The subsequent processing is the same as when one face is detected. Alternatively, the region setting means 60 may output multiple regions to the focus control means 30 and the exposure control means 40. For example, the focus control means 30 can perform depth-priority focus control so that the multiple face regions fall within the depth of focus. In addition, if the region setting means 60 outputs the similarity of each detected face region to the exposure control means 40, the exposure control means 40 can determine a weight for each face region based on its similarity and perform exposure control based on weighted evaluative metering.
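
The two multi-face strategies of S106 can be sketched as follows: choosing the face with the highest similarity as the control target, or weighting each face region's mean luminance by its similarity for metering. Faces are hypothetical `(x, y, size, similarity)` tuples, and using the similarity directly as the metering weight is an assumption; the text only says the weight is determined based on the similarity.

```python
def region_mean(luma, face):
    """Mean luminance inside a square face region (x, y, size, similarity)."""
    x, y, size, _ = face
    values = [luma[yy][xx]
              for yy in range(y, y + size)
              for xx in range(x, x + size)]
    return sum(values) / len(values)


def primary_face(faces):
    """Face with the highest similarity, used as the control target region."""
    return max(faces, key=lambda f: f[3])


def weighted_face_luminance(luma, faces):
    """Similarity-weighted mean luminance over all detected face regions."""
    total_weight = sum(f[3] for f in faces)
    return sum(f[3] * region_mean(luma, f) for f in faces) / total_weight


# Usage: a dark face on the left (similarity 3.0), a bright one on the right.
luma = [[100, 100, 200, 200] for _ in range(4)]
faces = [(0, 0, 2, 3.0), (2, 0, 2, 1.0)]
metered = weighted_face_luminance(luma, faces)
target = primary_face(faces)
```

An exposure controller could then drive the metered value toward its target brightness, which biases the exposure toward the faces detected with the greatest confidence.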

If no face is detected by the face detection means 50, it is determined that no person is present in the captured image, and imaging is performed after returning to the normal focus control and exposure control for the predetermined image region that were performed in the initial state in S102 and S103 (S110).

As described above, in this embodiment a face can be detected accurately even when the focus state is poor, and focus control and exposure control can be performed suitably based on the detection result. In particular, because this embodiment predicts the focus state of the face to be detected and collates face patterns optimized for each focus state, the accuracy of face detection does not deteriorate.

In this embodiment, focus control and exposure control are performed based on the face detection result, but the result can also be applied to other controls such as strobe control. White balance processing, color correction, and gamma correction may also be performed suitably by restricting the processing target to the detected face region.

In this embodiment, a person's face is detected as the subject pattern, but the pattern may be that of any other subject considered important in a given scene.

In this embodiment, the exposure control of a digital still camera is used to improve the detection accuracy of a specific subject pattern, which is then fed back to the control. The embodiment also applies, for example, to image input devices having an exposure control function, such as video camcorders and surveillance cameras.

Although this embodiment has been described for a digital still camera, it may also be applied to the case where an image captured by a digital still camera is processed on a computer. For example, shooting information representing the focus state, namely the in-focus distance, the focal length of the photographing lens, the aperture value, and the pixel size of the image sensor, may be acquired together with the image captured by the digital still camera.

The embodiment may also be realized by supplying a recording medium (or storage medium) on which the program code of software implementing the functions of the above embodiment is recorded to a system or apparatus, and having the computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the recording medium. In this case, the program code read from the recording medium itself implements the functions of the above embodiment.

The functions of the above embodiment may be implemented not only by the computer executing the program code it has read, but also by an operating system (OS) or the like running on the computer performing part or all of the actual processing in accordance with the instructions of the program code.

Furthermore, after the program code read from the recording medium has been written into a memory provided on a function expansion card inserted into the computer or in a function expansion unit connected to the computer, a CPU or the like provided on the function expansion card or function expansion unit may perform part or all of the actual processing in accordance with the instructions of the program code, and the functions of the above embodiment may be implemented by that processing.

When such a recording medium is used, it stores program code corresponding to the flowcharts described above.

FIG. 1 is a block diagram showing the configuration of the camera according to this embodiment.
FIG. 2 is a diagram showing the operation of this embodiment.
FIG. 3 is a diagram showing the configuration of the face detection means of this embodiment.
FIG. 4 is a diagram showing the operation of the face detection processing of this embodiment.

Claims

1. An image processing apparatus comprising means for detecting a predetermined subject pattern from an image output from imaging means, the apparatus further comprising focus detection means for detecting an in-focus distance of the imaging means, wherein the subject pattern detection means collates the subject pattern by selectively switching a reference pattern used for collation according to a focus state estimated based on the in-focus distance.

2. The image processing apparatus according to claim 1, wherein the subject pattern is a face.

3. The image processing apparatus according to claim 1, wherein the imaging means includes focus control means, and the focus control means determines an image area to be the focus control target of the imaging means according to a result of the subject pattern detection means and performs focus control of the imaging means.

4. An image processing method for detecting a predetermined subject pattern from an image, comprising a shooting information acquisition step of acquiring shooting information of the image, wherein the subject pattern is collated by selectively switching a reference pattern used for collating the subject pattern according to the shooting information.

5. The image processing method according to claim 4, wherein the shooting information is information representing a focus state of the image.

6. The image processing method according to claim 4, wherein the shooting information includes an in-focus distance of imaging means at the time the image was captured.

7. The image processing method according to claim 4, wherein the subject pattern is a face.

8. A recording medium storing a program for executing the image processing method according to claim 4.