JP2013120504A

JP2013120504A - Object extraction device, object extraction method and program

Info

Publication number: JP2013120504A
Application number: JP2011268456A
Authority: JP
Inventors: Kenji Tsukuba (健史筑波); Masahiro Shioi (正宏塩井); Takeaki Suenaga (健明末永); Atsutoshi Simeno (敦稔〆野)
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2011-12-08
Filing date: 2011-12-08
Publication date: 2013-06-17

Abstract

[Problem] To provide an object extraction device capable of tracking and extracting, with high accuracy, an object of interest designated by a user.
[Solution] An object extraction device 1 extracts an object region of interest from time-series images. It includes an identification feature amount calculation unit 20 that calculates, from the image of the object region on a base image of the time-series images and the image of its peripheral region, an identification feature amount for discriminating the object region from the object peripheral region, and an object region estimation unit 30 that estimates the object region on a processing target image from the correspondence between feature points within the object region on a reference image of the time-series images and feature points of the processing target image. The device calculates feature amounts of the images of the estimated object region and its peripheral region, and corrects and finalizes the object region in the processing target image by comparing the calculated feature amounts with the identification feature amount.
[Selected drawing] FIG. 1

Description

The present invention relates to an object extraction device and an object extraction method for extracting, from video, an object of interest desired by a user.

In recent years, terminals with an imaging function, such as digital video cameras, digital still cameras, and mobile phones, have spread rapidly. There are also devices that process video shot with these cameras to generate new video. For example, it is known to extract a specific image region from a video and use the extracted region as a component in video editing, or to extract each individual image region from a video and use the extracted video to manage or search the original video.

One technique for tracking and extracting, from video, an object of interest designated by a user is described in Patent Document 1.
Patent Document 1 discloses an object extraction device that detects a partial region of the region in which the object exists, based on a silhouette image indicating the object region obtained from a background difference and on an image feature amount specific to the object (color information), and that extracts the entire object region by growing the partial region based on the silhouette image.

Patent Document 1: Japanese Patent Laid-Open No. 2003-44860

However, because Patent Document 1 obtains the silhouette image indicating the object region from a background difference, when the background changes due to camera work or a scene change, background regions may be erroneously extracted as part of the object region in that silhouette image.

The present invention has been made in view of the above, and provides an object extraction device, an object extraction method, and a program capable of tracking and extracting, with high accuracy, an object of interest designated by a user.

To solve the above problem, a first technical means of the present invention is an object extraction device that extracts an object region of interest from time-series images, comprising: an identification feature amount calculation unit that calculates, from an image of the object region on a base image of the time-series images and an image of its peripheral region, an identification feature amount for discriminating the object region from the object peripheral region; an object region estimation unit that estimates the object region on a processing target image of the time-series images from the correspondence between feature points within the object region on a reference image of the time-series images and feature points of the processing target image; and an object region determination unit that calculates feature amounts of the images of the estimated object region and its peripheral region, and corrects and finalizes the object region in the processing target image by comparing the calculated feature amounts with the calculated identification feature amount.

A second technical means of the present invention is the first technical means, wherein the identification feature amount calculation unit comprises: a feature amount calculation range setting unit that sets, based on a silhouette image indicating the object region in the base image, a feature amount calculation range in which image feature amounts for discriminating the object region from the object peripheral region are calculated; an image feature amount calculation unit that calculates the image feature amount for each small region of a predetermined size within the set feature amount calculation range; a representative image feature amount calculation unit that clusters those image feature amounts that lie on the object region and calculates K different representative image feature amounts expressing the image feature amounts of the object region; a similarity vector calculation unit that calculates, for each small region, a similarity vector between the image feature amount of the small region and the K different representative image feature amounts; and a discriminant function calculation unit that calculates, from the calculated similarity vectors on the object region and the similarity vectors of the object peripheral region, a discriminant function representing the class boundary, in feature space, of whether a small region belongs to the object region; and wherein the K different representative image feature amounts and the discriminant function are output as the identification feature amount.

A third technical means of the present invention is the second technical means, wherein the object region determination unit comprises: a feature amount calculation range setting unit that sets, based on an estimated silhouette image indicating the estimated object region, a feature amount calculation range in which image feature amounts are calculated; an image feature amount calculation unit that calculates an image feature amount for each small region of a predetermined size within the set feature amount calculation range; and a similarity vector calculation unit that calculates, for each small region, the similarity vector from the image feature amount of the small region and the K different representative image feature amounts of the base image; and wherein, based on the calculated similarity vectors and the discriminant function, the object region determination unit identifies whether each small region belongs to the object region in feature space, and determines the set of small regions judged to be included in the object region to be the object region on the processing target image.

A fourth technical means of the present invention is any one of the first to third technical means, wherein the object region estimation unit comprises: a feature point calculation unit that calculates feature points within the object region on the reference image from the reference image and a silhouette image indicating the object region on the reference image; a corresponding point calculation unit that calculates, from the calculated feature points within the object region on the reference image, the silhouette image indicating the object region on the reference image, and the processing target image, the feature points on the processing target image that correspond to the feature points within the object region on the reference image; and a transformation matrix calculation unit that calculates, from the feature points within the object region on the reference image and the corresponding feature points on the processing target image, a transformation matrix that projects the former onto the latter; and wherein, using the transformation matrix and the silhouette image indicating the object region on the reference image, the object region estimation unit multiplies each pixel of the processing target image by the inverse of the transformation matrix, determines whether the result falls within the object region on the reference image, and estimates the set of pixels so determined to be the object region on the processing target image.

A fifth technical means of the present invention is an object extraction method for extracting an object region of interest from time-series images, comprising: an identification feature amount calculation step of calculating, from an image of the object region on a base image of the time-series images and an image of its peripheral region, an identification feature amount for discriminating the object region from the object peripheral region; an object region estimation step of estimating the object region on a processing target image of the time-series images from the correspondence between feature points within the object region on a reference image of the time-series images and feature points of the processing target image; and an object region determination step of calculating feature amounts of the images of the estimated object region and its peripheral region, and correcting and finalizing the object region in the processing target image by comparing the calculated feature amounts with the calculated identification feature amount.

A sixth technical means of the present invention is the fifth technical means, wherein the identification feature amount calculation step comprises: a feature amount calculation range setting step of setting, based on a silhouette image indicating the object region in the base image, a feature amount calculation range in which image feature amounts for discriminating the object region from the object peripheral region are calculated; an image feature amount calculation step of calculating the image feature amount for each small region of a predetermined size within the set feature amount calculation range; a representative image feature amount calculation step of clustering those image feature amounts that lie on the object region and calculating K different representative image feature amounts expressing the image feature amounts of the object region; a similarity vector calculation step of calculating, for each small region, a similarity vector between the image feature amount of the small region and the K different representative image feature amounts; a discriminant function calculation step of calculating, from the calculated similarity vectors on the object region and the similarity vectors of the object peripheral region, a discriminant function representing the class boundary, in feature space, of whether a small region belongs to the object region; and a step of outputting the K different representative image feature amounts and the discriminant function as the identification feature amount.

A seventh technical means of the present invention is the sixth technical means, wherein the object region determination step comprises: a feature amount calculation range setting step of setting, based on an estimated silhouette image indicating the estimated object region, a feature amount calculation range in which image feature amounts are calculated; an image feature amount calculation step of calculating an image feature amount for each small region of a predetermined size within the set feature amount calculation range; a similarity vector calculation step of calculating, for each small region, the similarity vector from the image feature amount of the small region and the K different representative image feature amounts of the base image; and a step of identifying, based on the calculated similarity vectors and the discriminant function, whether each small region belongs to the object region in feature space, and determining the set of small regions judged to be included in the object region to be the object region on the processing target image.

An eighth technical means of the present invention is any one of the fifth to seventh technical means, wherein the object region estimation step comprises: a feature point calculation step of calculating feature points within the object region on the reference image from the reference image and a silhouette image indicating the object region on the reference image; a corresponding point calculation step of calculating, from the calculated feature points within the object region on the reference image, the silhouette image indicating the object region on the reference image, and the processing target image, the feature points on the processing target image that correspond to the feature points within the object region on the reference image; a transformation matrix calculation step of calculating, from the feature points within the object region on the reference image and the corresponding feature points on the processing target image, a transformation matrix that projects the former onto the latter; and a step of multiplying each pixel of the processing target image by the inverse of the transformation matrix, determining, using the silhouette image indicating the object region on the reference image, whether the result falls within the object region on the reference image, and estimating the set of pixels so determined to be the object region on the processing target image.

A ninth technical means of the present invention is an object extraction program for causing a computer to function as each processing unit of the object extraction device of any one of the first to fourth technical means.

According to the present invention, an object of interest designated by a user can be tracked and extracted with high accuracy.

FIG. 1: Block diagram showing the configuration of an object extraction device according to one embodiment of the present invention.
FIG. 2: Flowchart showing the operation of the object extraction device of FIG. 1.
FIG. 3: Block diagram showing the configuration of the identification feature amount calculation unit of the object extraction device of FIG. 1.
FIG. 4: Flowchart showing the operation of the identification feature amount calculation unit of the object extraction device of FIG. 1.
FIG. 5: Diagram explaining how the feature amount calculation ranges of the object region and the object peripheral region are set.
FIG. 6: Schematic diagram of how the discriminant function is obtained in the identification feature amount calculation unit of the object extraction device of FIG. 1.
FIG. 7: Schematic diagram showing the operation of the identification feature amount calculation unit of the object extraction device of FIG. 1.
FIG. 8: Block diagram showing the configuration of the object region estimation unit of the object extraction device of FIG. 1.
FIG. 9: Flowchart showing the operation of the object region estimation unit of the object extraction device of FIG. 1.
FIG. 10: Schematic diagram showing the operation of the object region estimation unit of the object extraction device of FIG. 1.
FIG. 11: Block diagram showing the configuration of the object region determination unit of the object extraction device of FIG. 1.
FIG. 12: Flowchart showing the operation of the object region determination unit of the object extraction device of FIG. 1.
FIG. 13: Example showing the image at time t_1 (processing target image) with its estimated silhouette image, and the image at time t_2 (reference image) with its silhouette image.
FIG. 14: Example showing the feature amount calculation range in the image at time t_1 (processing target image) and the result of identifying the object region within that range.
FIG. 15: Schematic diagram showing how a gradient histogram is calculated.

Embodiments of the present invention are described in detail below with reference to the drawings. Parts having the same function are given the same reference numerals in the drawings, and repeated description is omitted.

FIG. 1 is a block diagram showing the schematic configuration of an object extraction device according to the present invention. In this figure, the object extraction device 1 includes an identification feature amount calculation unit 20, an object region estimation unit 30, and an object region determination unit 40. FIG. 2 is a flowchart illustrating an example of the operation of the object extraction device 1 of FIG. 1.

The object extraction device 1 of FIG. 1 acquires from outside an image F(t_0) at time t_0 (hereinafter also called the base image) and a previously created silhouette image S(t_0) indicating the region, on the base image, of the object to be extracted, and inputs them to the identification feature amount calculation unit 20 (step S1-1 in FIG. 2). Here, a silhouette image S(t) is a binary image in which the set of pixels with S(x, y | t) = 1 denotes the object region (also called the foreground region) and the set of pixels with S(x, y | t) = 0 denotes the background region. Although a silhouette image is used here as the information indicating the object region, the present invention is not limited to this. Instead of a silhouette image, contour information that expresses the outline of the object region as a chain code may be used. A chain code represents a line by encoding as a number the position of a point B adjacent to a point A, then encoding the position of a point C (other than A) adjacent to B, and so on, and concatenating these numbers.
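The chain-code representation mentioned above can be illustrated with a minimal Python sketch. The direction numbering (Freeman-style, eight directions) and the function names `encode_chain` and `decode_chain` are assumptions made for illustration; the text does not specify a numbering scheme.

```python
# Assumed 8-direction numbering, with y growing downward as in image coordinates:
# 0=E, 1=NE, 2=N, 3=NW, 4=W, 5=SW, 6=S, 7=SE.
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def encode_chain(points):
    """Encode a contour given as successive 8-adjacent (x, y) points."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"{(x0, y0)} and {(x1, y1)} are not adjacent")
        codes.append(DIRECTIONS[step])
    return codes


def decode_chain(start, codes):
    """Rebuild the point list from a start point and its chain code."""
    steps = {code: step for step, code in DIRECTIONS.items()}
    points = [start]
    for code in codes:
        dx, dy = steps[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# Closed contour around a 2x2 block of pixels: right, down, left, up.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
codes = encode_chain(square)
print(codes)
print(decode_chain(square[0], codes) == square)
```

Storing only a start point and the code sequence is enough to recover the outline, which is why such contour information can stand in for the full binary silhouette image.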

(Identification feature amount calculation unit 20)
Based on the input image F(t_0) and the silhouette image S(t_0) indicating the region of the extraction target object O_FG on the image F(t_0), the identification feature amount calculation unit 20 calculates N representative image feature amounts (representative local feature amounts) that express the object region O_FG, together with the image feature amount (local feature amount) of each small region in the image F(t_0). From the representative local feature amounts and the local feature amounts of the small regions, it calculates a discriminant function (the boundary between the object region and other regions) that identifies, in feature space, whether a small region belongs to the extraction target object, and outputs the calculated identification feature amount (the representative local feature amounts and the discriminant function) to the object region determination unit 40 (step S1-2 in FIG. 2). The identification feature amount calculation unit 20 is described below with reference to FIGS. 3 to 7, taking a color histogram as an example of the image feature amount.

As shown in FIG. 3, the identification feature amount calculation unit 20 comprises a feature amount calculation range setting unit 201, a local feature amount calculation unit 202, a representative local feature amount calculation unit 203, a similarity vector calculation unit 204, and a discriminant function calculation unit 205. FIG. 4 is a flowchart illustrating an example of the operation of the identification feature amount calculation unit 20.

Based on the input silhouette image S(t) (the silhouette image S(t_0) in the example of FIG. 3), the feature amount calculation range setting unit 201 sets the ranges over which the local feature amounts of the object region and of the object peripheral region are calculated, and outputs the set feature amount calculation range to the local feature amount calculation unit 202 and the discriminant function calculation unit 205 (step S20-1 in FIG. 4). The setting of the feature amount calculation ranges for the object region and the object peripheral region is described with reference to FIG. 5.

Suppose FIG. 5(A) is the silhouette image S(t_0) representing the object region O_FG at time t_0. First, as shown in FIG. 5(B), the feature amount calculation range setting unit 201 applies dilation, a morphological operation, to the object region O_FG K times with a predetermined kernel to obtain the expanded object region O^K_dilated. Next, as shown in FIG. 5(C), it obtains the region that remains when the object region O_FG is removed from the expanded object region O^K_dilated (the object peripheral region O_Around). To make the number of feature samples taken from the object region and from the other region (the object peripheral region) roughly equal, it is desirable to set the object peripheral region O_Around so that its area is roughly equal to that of the object region O_FG. For example, letting S(O^k_dilated) be the area of the expanded object region obtained by the k-th dilation and S(O_FG) be the area of the object region O_FG, the peripheral region may be obtained by removing the object region O_FG from the first expanded object region O^K_dilated that satisfies equation (20-1).
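The construction of the object peripheral region can be sketched in Python as follows. Equation (20-1) is not reproduced in this text, so the stopping rule used here, area(dilated) >= 2 * area(object), is only an assumed stand-in that makes the ring at least as large as the object. The 3x3 square kernel and the names `dilate`, `area`, and `peripheral_region` are likewise illustrative assumptions.

```python
def dilate(mask):
    """One binary dilation with a 3x3 square kernel (kernel shape is an assumption)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        out[ny][nx] = 1
    return out


def area(mask):
    """Number of foreground pixels in a binary mask."""
    return sum(map(sum, mask))


def peripheral_region(silhouette, max_iter=100):
    """Dilate until the assumed area condition holds, then remove the object itself.

    Returns the ring-shaped peripheral mask and the number of dilations used.
    """
    target = 2 * area(silhouette)  # assumed stand-in for equation (20-1)
    dilated, k = silhouette, 0
    while k < max_iter and area(dilated) < target:
        dilated = dilate(dilated)
        k += 1
    around = [[1 if d and not s else 0 for d, s in zip(drow, srow)]
              for drow, srow in zip(dilated, silhouette)]
    return around, k


# 7x7 silhouette with a 3x3 object in the middle.
silhouette = [[1 if 2 <= x <= 4 and 2 <= y <= 4 else 0 for x in range(7)]
              for y in range(7)]
around, k = peripheral_region(silhouette)
print(k, area(silhouette), area(around))
```

Balancing the two areas in this way keeps the number of object and non-object samples comparable when the discriminant function is later trained.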

The local feature amount calculation unit 202 of FIG. 3 calculates, from the input image F(t) (F(t_0) in the example of FIG. 3) and the feature amount calculation range set by the feature amount calculation range setting unit 201, the local feature amount of each small region contained in the object region O_FG and the object peripheral region O_Around (step S20-2 in FIG. 4). The calculation is described here taking a color histogram as the local feature amount. Each small region is assumed to be a circle of radius r centered on pixel (x, y). First, for each pixel in the small region, the local feature amount calculation unit 202 obtains the pixel value C_Q(r, g, b) quantized with quantization step size Q (equation (20-2)). Although the small region is described in this example as a circle of radius r centered on pixel (x, y), it is not limited to this; for example, it may be a rectangular region of N × M pixels centered on pixel (x, y).

In equation (20-2), the operator '//' denotes integer division. Next, the occurrences of each quantized pixel value C_Q(r, g, b) are counted, and each element of the histogram is normalized by the area of the small region to give the color histogram H_C(x, y). Although the local feature amount calculation unit 202 calculates the local feature amount using RGB as the color space of the image, the present invention is not limited to this; YCbCr (YUV), CIE L*a*b*, CIE L*u*v*, or another color space may be used.
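A minimal Python sketch of this local color histogram follows. Equation (20-2) is not reproduced in this text, so quantizing each channel as `value // q` is an assumed reading based on the integer-division remark; the function name `color_histogram` and the dictionary representation of the histogram are also illustrative choices.

```python
from collections import Counter


def color_histogram(image, cx, cy, radius, q):
    """Normalized histogram of quantized RGB values inside a circular small region.

    image is a list of rows of (r, g, b) tuples. Each channel is quantized as
    value // q (an assumed reading of equation (20-2)).
    """
    counts = Counter()
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                counts[(r // q, g // q, b // q)] += 1
    n = sum(counts.values())  # area of the small region in pixels
    if n == 0:
        return {}
    return {bin_: c / n for bin_, c in counts.items()}


RED, BLUE = (200, 10, 10), (10, 10, 200)
image = [[RED, BLUE, RED],
         [RED, RED, RED],
         [RED, RED, RED]]
# Radius 1 around the centre pixel covers the centre and its four direct neighbours.
hist = color_histogram(image, cx=1, cy=1, radius=1, q=64)
print(hist)
```

Because every bin is divided by the region area, the histogram entries sum to 1, which is what lets similarities between regions of different sizes be compared on a common 0 to 1 scale.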

The representative local feature amount calculation unit 203 of FIG. 3 clusters the input local feature amounts of the small regions of the object region and the object peripheral region, and calculates K representative local feature amounts that express the object region (step S20-3 in FIG. 4). For example, the K-means method may be applied to the local feature amounts to obtain the K representative local feature amounts.
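As a rough illustration of this clustering step, the following Python sketch runs a plain K-means (Lloyd's algorithm) on histograms stored as equal-length lists. The function name `kmeans`, the squared Euclidean distance, and the choice of the first k vectors as initial centroids are assumptions made to keep the sketch short and deterministic; the text only says that the K-means method may be used.

```python
def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's K-means on equal-length lists of floats.

    Initial centroids are simply the first k vectors (an assumption made for
    determinism; practical code would seed more carefully).
    """
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in vectors:
            dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(v)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


# Four 3-bin histograms: two mostly "bin 0", two mostly "bin 2".
histograms = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.8, 0.2, 0.0], [0.0, 0.2, 0.8]]
representatives = kmeans(histograms, k=2)
print(representatives)
```

Each returned centroid plays the role of one representative local feature amount, so K controls how many distinct appearance modes of the object are modeled.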

The similarity vector calculation unit 204 of FIG. 3 takes the local feature amount (color histogram) of each small region and the representative local feature amounts (color histograms) expressing the image features of the object region, calculates the similarity between the local feature amount of each small region in the object region and the object peripheral region and each of the K representative local feature amounts of the object region, and outputs the resulting similarity vector (step S20-4 in FIG. 4). Here, the similarity d(H_C(x, y), H_{C,k}(O_FG)) between the color histogram H_C(x, y) of a small region and the representative color histogram H_{C,k}(O_FG) is calculated by, for example, the histogram intersection shown in equation (20-3). The similarity ranges from 0 to 1; the closer it is to 1, the more alike the shapes of the histograms, and the closer to 0, the more they differ.
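The similarity vector can be sketched in Python as below. Equation (20-3) is not reproduced in this text, so the sketch uses the commonly used form of histogram intersection for normalized histograms (the sum of bin-wise minima), which matches the stated 0 to 1 range. The function names and the dictionary representation of histograms are illustrative assumptions.

```python
def histogram_intersection(h1, h2):
    """Sum of bin-wise minima of two normalized histograms stored as dicts.

    Gives 1.0 for identical histograms and 0.0 for histograms with no shared bins.
    """
    return sum(min(v, h2.get(b, 0.0)) for b, v in h1.items())


def similarity_vector(hist, representatives):
    """K-dimensional vector of similarities to the K representative histograms."""
    return [histogram_intersection(hist, rep) for rep in representatives]


region_hist = {"red": 0.8, "blue": 0.2}
representatives = [{"red": 1.0}, {"blue": 0.5, "green": 0.5}]
print(similarity_vector(region_hist, representatives))
```

Mapping each small region to a K-dimensional similarity vector turns a high-dimensional histogram into a compact feature on which the discriminant function can then be learned.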

The discriminant function calculation unit 205 in FIG. 3 uses the similarity vectors between the representative local feature amounts of the input object region O_FG and the local feature amounts of the small regions of the object region O_FG and the object peripheral region O_Around to calculate a discriminant function "φ(x) = 0" that separates (classifies) the object region O_FG and the object peripheral region O_Around in the feature space, as shown in FIG. 6 (step S20-5 in FIG. 4).

In FIG. 7, consider the problem of classifying each small region inside the area enclosed by the dotted line U on the image F(t₀) into two classes: the object region and the object peripheral region. Given data D = {(x^u, y^u), u = 1, 2, ..., N} consisting of N feature vectors x and teacher signals y, the two-class classification problem is formulated as the problem of finding the discriminant function φ(x) that minimizes the sum-of-squares error shown in Expression (20-4).

In Expression (20-4), sgn(·) is the sign function. In the data D, x^u is the feature vector of the u-th small region in the area U of FIG. 7, and y^u is the teacher signal, a 2×1 vector that gives the correct answer as to whether the u-th small region belongs to the object region O_FG. The teacher signal is y^u = (1, 0) when the u-th small region belongs to the object region O_FG (when the value of the silhouette image S(t) at the center pixel (x, y) of the small region is S(x, y | t) = 1), and y^u = (0, 1) otherwise (when S(x, y | t) = 0). If the discriminant function φ(x) is assumed to be given by Expression (20-5), Expression (20-4) reduces to the problem of finding the weight w that minimizes Expression (20-6).

Solving Expression (20-6) by the least squares method gives the weight w as Expression (20-7).
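
Expressions (20-5) to (20-7) are not reproduced in this text. For orientation only, one standard linear least-squares formulation that is consistent with the description is sketched below. It assumes a linear discriminant whose weights are collected in a matrix W (playing the role of the weight w), a matrix X whose rows are the feature vectors x^u, and a matrix Y whose rows are the teacher signals y^u. W, X, and Y are notation introduced for this sketch, and the closed form requires XᵀX to be invertible.

```latex
E(W) = \sum_{u=1}^{N} \left\| y^{u} - W^{\top} x^{u} \right\|^{2},
\qquad
W = \left( X^{\top} X \right)^{-1} X^{\top} Y
```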

As described above, the identification feature amount calculation unit 20 can calculate, from the base image F(t₀) and its silhouette image S(t₀), a discriminant function that identifies whether each small region belongs to the object region class or the object peripheral region class. The function is derived from the similarity vectors between the K representative local feature amounts, which represent the image feature amount of the object region, and the local feature amounts of the small regions of the object region and the object peripheral region. In this example, the identification feature amount calculation unit 20 assumes that the two classes are linearly separable and calculates the discriminant function by the least squares method, but the calculation is not limited to this. For example, instead of the least squares method, the discriminant function may be calculated by machine learning such as a nonlinear support vector machine, a decision tree, or a multilayer perceptron.

FIG. 7 also shows a color histogram H_C(x, y), an example of the local feature amount of a small region, and representative color histograms H_C,1(x, y) to H_C,K(x, y), examples of the K representative local feature amounts of the object region.

Returning to FIG. 2, the object extraction device 1 in FIG. 1 acquires from outside the image F(t₁) at time t₁ (hereinafter also called the processing target image), together with the silhouette image S(t₂) indicating the object region at time t₂, calculated immediately before time t₁, and the corresponding image F(t₂) (hereinafter also called the reference image), and inputs them to the object region estimation unit 30 (step S1-3 in FIG. 2).

(About the object region estimation unit 30)
Returning to FIG. 2, the object region estimation unit 30 in FIG. 1 takes the input image F(t₁), the externally stored previous image F(t₂), and the silhouette image S(t₂) indicating the object region on F(t₂), and obtains a transformation matrix H that projects the feature points in the object region at time t₂ onto the corresponding feature points at time t₁. From this transformation matrix it generates an estimated silhouette image S_pred(t₁), which estimates the object region at time t₁, and outputs it to the object region determination unit 40 (step S1-4 in FIG. 2).

Next, the object region estimation unit 30 in the present embodiment is described in detail with reference to FIGS. 8 to 10. FIG. 8 is a block diagram showing the schematic configuration of the object region estimation unit 30. As shown in FIG. 8, the object region estimation unit 30 comprises a feature point calculation unit 301, a corresponding point calculation unit 302, a transformation matrix calculation unit 303, and an estimated silhouette image generation unit 304. FIG. 9 is a flowchart explaining an operation example of the object region estimation unit 30. FIG. 10 is a schematic diagram showing the operation of estimating the object region at time t₁ from the correspondence between the feature points at time t₁ and time t₂.

In step S30-1 of FIG. 9, the feature point calculation unit 301 in FIG. 8 calculates N feature points Ks(t₂) (s = 1, ..., N) within the object region of the previous image F(t₂), in order to find the correspondence between the object region of F(t₂) and the input image F(t₁), and outputs the result to the corresponding point calculation unit 302 (step S30-1).
A feature point is a point extracted as part of an edge or as a vertex of a subject, based on changes in color or luminance between pixels. For example, the first eigenvalue λ₁ and the second eigenvalue λ₂ of the second moment matrix A (Expression (30-1)), which is expressed using the luminance gradients G_i(x, y) (i = x, y) in the x and y directions within a local region S centered on a pixel (x, y), are obtained, and a pixel (x, y) satisfying the condition shown in Expression (30-2) is detected as a feature point.

That is, a pixel is taken as a feature point when the smaller of the first eigenvalue λ₁ and the second eigenvalue λ₂ of the second moment matrix A is greater than (or greater than or equal to) a predetermined threshold λ_th (see, for example, J. Shi and C. Tomasi, "Good Features to Track," 9th IEEE Conference on Computer Vision and Pattern Recognition, June 1994). In Expression (30-1), the coefficient w(u, v) is a weighting coefficient for the pixel (x + u, y + v), which is offset from the pixel (x, y) by u in the x direction and v in the y direction. For example, values of a two-dimensional Gaussian distribution within the local region S, normalized so as to satisfy the condition of Expression (30-3), are used.
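
The minimum-eigenvalue test can be sketched in a few lines of Python. The example assumes that the gradients and weights of the pixels in the local region S have already been sampled into flat lists, and that the second moment matrix is the usual weighted sum of gradient products; Expressions (30-1) to (30-3) are not reproduced here, and the function names are hypothetical.

```python
import math


def min_eigenvalue(gx, gy, weights):
    """Smaller eigenvalue of the weighted 2x2 second moment matrix.

    gx, gy, weights: equal-length lists over the pixels of local region S.
    """
    a = sum(w * x * x for w, x in zip(weights, gx))
    b = sum(w * x * y for w, x, y in zip(weights, gx, gy))
    c = sum(w * y * y for w, y in zip(weights, gy))
    # Eigenvalues of [[a, b], [b, c]]: (a + c)/2 +/- sqrt(((a - c)/2)^2 + b^2)
    half_trace = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return half_trace - root


def is_feature_point(gx, gy, weights, threshold):
    """True when the smaller eigenvalue exceeds the threshold."""
    return min_eigenvalue(gx, gy, weights) > threshold
```

A region whose gradients all point the same way (a straight edge) yields a smaller eigenvalue near zero and is rejected, whereas a corner-like region with gradients in two directions passes.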

FIG. 10(A) shows an example of the result of detecting the feature points Ks(t₂) in the object region of the image F(t₂).
Proceeding to step S30-2 of FIG. 9, the corresponding point calculation unit 302 in FIG. 8 uses the input image F(t₁), the previous image F(t₂) read from outside, and the feature point information K(t₂) of F(t₂) obtained in step S30-1 to calculate, by optical flow, the point Ks(t₁) on the image F(t₁) that corresponds to each feature point Ks(t₂) (s = 1, ..., N) of F(t₂). It then outputs corresponding point information Q(t₁, t₂), which describes the positions of each feature point Ks at time t₁ and time t₂, to the transformation matrix calculation unit 303 (step S30-2). FIG. 10(A) shows the result of detecting the feature points Ks(t₂) in the object region of F(t₂) and the corresponding feature points Ks(t₁) on F(t₁). The dotted lines connecting each feature point Ks(t₂) in the object region of F(t₂) to the matching feature point Ks(t₁) on F(t₁) are a visual representation of the corresponding point information Q(t₁, t₂).

The point Ks(t₁) on the image F(t₁) corresponding to each feature point Ks(t₂) of the image F(t₂) can be obtained, for example, by solving the gradient-based optical flow constraint shown in Expression (30-4) for Ks(t₁) (see, for example, B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," Proceedings of the 1981 DARPA Image Understanding Workshop, pp. 121-130, 1981).

In Expression (30-4), G_i(x, y | t₁) (i = x, y, t) denotes the luminance gradient of the image F(t₁) in the x direction, y direction, and t direction (time direction), and S denotes a local region of a predetermined size centered on the feature point Ks.
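
Expression (30-4) is not reproduced here. In the Lucas-Kanade method, the displacement of a feature point is usually found by solving a 2×2 linear system built from the spatial and temporal gradients in the local region S. The sketch below assumes that standard form and the brightness-constancy constraint G_x·dx + G_y·dy + G_t = 0; the function name `lucas_kanade_step` and the flat-list inputs are assumptions made for this example.

```python
def lucas_kanade_step(gx, gy, gt):
    """Least-squares displacement (dx, dy) for one local region S.

    gx, gy, gt: equal-length lists of x, y and time gradients in S.
    Returns None when the 2x2 system is (nearly) singular.
    """
    sxx = sum(x * x for x in gx)
    sxy = sum(x * y for x, y in zip(gx, gy))
    syy = sum(y * y for y in gy)
    sxt = sum(x * t for x, t in zip(gx, gt))
    syt = sum(y * t for y, t in zip(gy, gt))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # gradients do not constrain the motion (aperture problem)
    # Solve [[sxx, sxy], [sxy, syy]] (dx, dy)^T = (-sxt, -syt)^T.
    dx = (-syy * sxt + sxy * syt) / det
    dy = (sxy * sxt - sxx * syt) / det
    return dx, dy
```

The corresponding point is then the feature point position plus (dx, dy); practical implementations typically iterate this step and use image pyramids for larger motions.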

Proceeding to step S30-3 of FIG. 9, the transformation matrix calculation unit 303 in FIG. 8 uses the corresponding point information Q(t₁, t₂) obtained in step S30-2 to calculate the transformation matrix H that projects the feature points Ks (s = 1, ..., N) from their positions on the previous image F(t₂) to their positions on the image F(t₁), and outputs information describing H to the estimated silhouette image generation unit 304 (step S30-3).

Let (x_pred, y_pred) be the coordinates of the point obtained by projecting a pixel (x, y) of the object region O(t₂) on the image F(t₂) onto the image F(t₁) with the transformation matrix H. The correspondence between the estimated object region O_pred(t₁) and the object region O(t₂) is then expressed by Expression (30-5) using H (see, for example, CG-ARTS Association, Digital Image Processing, 2nd edition, 2009).

In Expression (30-5), the symbol "~" denotes an equivalence relation, meaning equality up to a constant multiple. Because the transformation matrix H can express a general transformation, it is called a projective transformation. If the correspondence between the object regions at time t₂ and time t₁ is assumed to be expressible as a translation, Expression (30-5) takes the form of Expression (30-6).

The coefficients t_x and t_y in Expression (30-6) represent the amounts of movement in the x and y directions, respectively. If the correspondence between the images is assumed to be expressible as an affine transformation, which includes translation, rotation, and scaling, Expression (30-5) takes the form of Expression (30-7).

The coefficients a, b, c, and d in Expression (30-7) represent scaling and rotation, and the coefficients t_x and t_y are the same as in Expression (30-6). Each coefficient h_ij (i, j = 1, 2, 3) of the transformation matrix H in Expressions (30-5) to (30-7) is calculated by solving, by the least squares method, the simultaneous equations derived from the constraints of the chosen transformation model (translation, affine transformation, or projective transformation) and the corresponding point information Q(t₁, t₂). When there are too few corresponding points to calculate H, a predetermined transformation matrix H₀ is used. For example, the previously calculated transformation matrix may be used as H₀.
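
Expressions (30-5) to (30-7) are not reproduced in this text. The conventional homogeneous-coordinate forms that match the coefficients named above would look as follows; this is a reconstruction under that assumption, and the labels H_translation and H_affine are introduced here only to tell the two special cases apart.

```latex
\begin{pmatrix} x_{pred} \\ y_{pred} \\ 1 \end{pmatrix}
\sim
H \begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad
H = \begin{pmatrix}
h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & h_{33}
\end{pmatrix}

H_{\mathrm{translation}} = \begin{pmatrix}
1 & 0 & t_x \\
0 & 1 & t_y \\
0 & 0 & 1
\end{pmatrix},
\qquad
H_{\mathrm{affine}} = \begin{pmatrix}
a & b & t_x \\
c & d & t_y \\
0 & 0 & 1
\end{pmatrix}
```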

Proceeding to step S30-4 of FIG. 9, the estimated silhouette image generation unit 304 in FIG. 8 estimates the object region at time t₁ from the transformation matrix H and the silhouette image S(t₂), on the assumption that the object region at time t₁ lies in the region obtained by projecting the object region on the previous image F(t₂) onto the image F(t₁) with the transformation matrix H calculated by the transformation matrix calculation unit 303 in FIG. 8. It then generates an estimated silhouette image S_pred(t₁) representing that region (step S30-4). Specifically, each pixel (x, y) on the image F(t₁) (point Pn in FIG. 10(B)) is multiplied by the inverse H⁻¹ of the transformation matrix H, and it is determined whether the resulting back-projected position (x', y') on the image F(t₂) (point Pn' in FIG. 10(B)) is included in the object region O(t₂) at time t₂. The object region O(t₂) at time t₂ is the set of pixels for which S(x, y | t₂) = 1 on the silhouette image S(t₂).

As shown in Expression (30-8), when the position obtained by back-projecting a pixel (x, y) on the image F(t₁) onto the image F(t₂) is included in the object region O(t₂), the pixel (x, y) is determined to belong to the estimated object region O_pred(t₁), and S_pred(x, y | t₁) in the estimated silhouette image is set to "1". When the back-projected position is not included in the object region O(t₂), the pixel (x, y) is determined not to belong to the estimated object region O_pred(t₁), and S_pred(x, y | t₁) is set to "0". FIG. 10(B) shows an example of the estimated silhouette image S_pred(t₁) obtained from the silhouette image S(t₂).
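
The back-projection step can be sketched as follows. The example assumes binary silhouettes stored as lists of rows, a 3×3 transformation matrix H given as nested lists, and nearest-neighbor rounding of the back-projected coordinates; the rounding rule and the function names are choices made for this illustration, not details taken from Expression (30-8).

```python
def invert_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("transformation matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def predict_silhouette(prev_silhouette, h_matrix):
    """Estimate S_pred(t1) from S(t2) by back-projecting every pixel."""
    height, width = len(prev_silhouette), len(prev_silhouette[0])
    h_inv = invert_3x3(h_matrix)
    pred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            xs = h_inv[0][0] * x + h_inv[0][1] * y + h_inv[0][2]
            ys = h_inv[1][0] * x + h_inv[1][1] * y + h_inv[1][2]
            ws = h_inv[2][0] * x + h_inv[2][1] * y + h_inv[2][2]
            if abs(ws) < 1e-12:
                continue  # point at infinity, leave as background
            px, py = int(round(xs / ws)), int(round(ys / ws))
            inside = 0 <= px < width and 0 <= py < height
            if inside and prev_silhouette[py][px] == 1:
                pred[y][x] = 1
    return pred
```

Iterating over the pixels of the target image and mapping them backward, rather than pushing source pixels forward, guarantees that every target pixel receives a value and avoids holes in the estimated region.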

As described above, the object region estimation unit 30 obtains the correspondence between the object regions at time t₁ and time t₂ from feature points specific to the object region, so the object region can be estimated robustly even when the background changes, for example through camera work or a scene change.

(About the object region determination unit 40)
Returning again to FIG. 2, the object region determination unit 40 in FIG. 1 sets the range over which feature amounts are calculated from the estimated silhouette image S_pred(t₁), calculates a local feature amount for each small region on the image F(t₁), identifies whether each small region is part of the object region on the basis of that local feature amount and the identification feature amount of the base image (the representative local feature amounts and the discriminant function φ(x)), and outputs a silhouette image S(t₁) indicating the object region (step S1-5 in FIG. 2).
Next, the object region determination unit 40 in this embodiment is described in detail.

As shown in FIG. 11, the object region determination unit 40 comprises a feature amount calculation range setting unit 201, a local feature amount calculation unit 202, a similarity vector calculation unit 203, and an object region identification unit 404. FIG. 12 is a flowchart explaining an operation example of the object region determination unit 40. FIG. 13(A) shows the image F(t₁) at time t₁, FIG. 13(B) the image F(t₂) at time t₂, FIG. 13(C) the estimated silhouette image S_pred(t₁) estimating the object region at time t₁, and FIG. 13(D) an example of the silhouette image S(t₂) indicating the object region at time t₂. In the object region at time t₁ in FIG. 13(A), the region R1 enclosed by a dotted line is the part corresponding to the region R2, enclosed by a dotted line in the object region at time t₂ in FIG. 13(B), after a change of shape. FIG. 14(A) is an enlarged view of the region B1 enclosed by a dotted line in FIG. 13(C), and FIG. 14(B) is an example of the result of distinguishing the object region from the object peripheral region within the feature amount calculation range O^K_dilated inside the region B1.

The feature amount calculation range setting unit 201 in FIG. 11 sets the range over which local feature amounts are calculated on the basis of the input estimated silhouette image S_pred(t₁), which indicates the estimated object region at time t₁. As shown in FIG. 14(A), it applies dilation, a morphological operation, to the estimated object region O_pred K times with a predetermined kernel to obtain the expanded region O^K_dilated, and outputs this calculation range to the local feature amount calculation unit 202 and the object region identification unit 404 (step S40-1 in FIG. 12).
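
A minimal sketch of the repeated dilation is given below. The text only says that a predetermined kernel is used, so the 3×3 square kernel here is an assumption, as are the function name `dilate` and the list-of-rows mask representation.

```python
def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 square kernel, repeated `iterations` times."""
    height, width = len(mask), len(mask[0])
    current = [row[:] for row in mask]
    for _ in range(iterations):
        nxt = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                # A pixel is set if any pixel in its 3x3 neighborhood is set.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and current[ny][nx] == 1):
                            nxt[y][x] = 1
        current = nxt
    return current
```

With this kernel each iteration grows the region by one pixel in every direction, so the number of iterations K controls how much shape change between frames the calculation range can absorb.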

The local feature amount calculation unit 202 in FIG. 11 calculates, from the input image F(t₁), the local feature amount of each small region within the feature amount calculation range set by the feature amount calculation range setting unit 201 (step S40-2 in FIG. 12).

The similarity vector calculation unit 203 in FIG. 11 takes the local feature amount of each small region and the representative local feature amounts representing the image feature amount of the object region, calculates the similarity between the local feature amount of each small region and each of the K representative local feature amounts of the object region, and outputs the resulting similarity vector x (step S40-3 in FIG. 12).

The object region identification unit 404 in FIG. 11 uses the input similarity vector x of each small region and the discriminant function φ(x) to identify whether each small region in the feature amount calculation range O^K_dilated belongs to the object region or the object peripheral region, and outputs a silhouette image S(t₁) indicating the determined object region, as shown in FIG. 13(D) (step S40-4 in FIG. 12). Specifically, the similarity vector x of the small region centered on the pixel (x, y) is input to the discriminant function φ(x). When the response indicates the object region (φ(x) > 0), the pixel (x, y) is determined to belong to the object region O_FG, and S(x, y | t₁) in the silhouette image is set to "1". When the response indicates the object peripheral region (φ(x) < 0), the pixel (x, y) is determined not to belong to the object region O_FG, and S(x, y | t₁) is set to "0". When the response lies on the decision boundary (φ(x) = 0), either the object region or the object peripheral region is chosen at random, and S(x, y | t₁) is set to the corresponding value. The silhouette image S(t₁) is assumed to have been initialized to "0".
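
The three-way decision rule can be written compactly. In the sketch below the discriminant responses are assumed to have been computed already and are passed in as a dictionary keyed by the center pixel of each small region; that data layout and the function names are hypothetical.

```python
import random


def classify_small_region(phi_value, rng=random):
    """Return 1 for the object region, 0 for the object peripheral region."""
    if phi_value > 0:
        return 1
    if phi_value < 0:
        return 0
    # On the decision boundary the class is chosen at random.
    return rng.choice((0, 1))


def update_silhouette(silhouette, responses):
    """Write the class of each small region at its center pixel.

    silhouette: list of rows, initialized to 0.
    responses:  dict mapping a center pixel (x, y) to phi(x) for that region.
    """
    for (x, y), phi_value in responses.items():
        silhouette[y][x] = classify_small_region(phi_value)
    return silhouette
```

Pixels outside the feature amount calculation range never appear in `responses` and therefore keep their initial value of 0.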

FIG. 14(B) shows the result of identifying, for each small region in the feature amount calculation range O^K_dilated of FIG. 14(A), whether it belongs to the object region. In FIG. 14(A), the estimated object region O_pred, obtained by projectively transforming the object region at time t₂ to time t₁, does not include the region R1, the part of the object region at time t₁ into which the region R2 at time t₂ has changed shape. In other words, projective transformation of the object region at time t₂ to time t₁ alone cannot accurately extract the region of an object whose shape changes. However, when the region R1 is included in the feature amount calculation range O^K_dilated, which expands the estimated object region, R1 can be extracted by identifying whether each small region belongs to the object region. The region of an object whose shape changes can therefore be extracted accurately.
As described above, the object region determination unit 40 sets a feature amount calculation range from the estimated object region, identifies whether each part of that range belongs to the object region or the object peripheral region, and thereby determines the object region. An object region whose shape changes can therefore be extracted as long as it lies within the feature amount calculation range. In addition, because the feature amount calculation range is limited to the vicinity of the object region, the amount of computation required to calculate the feature amounts can be reduced.

Returning to FIG. 2, the object extraction device 1 in FIG. 1 returns to step S1-3 to perform object extraction on the image at the time (t₁ + 1) following time t₁ (Yes in step S1-6 of FIG. 2). If there is no image at time (t₁ + 1), the object extraction process ends (No in step S1-6 of FIG. 2).

(Effect)
As described above, the object extraction device uses the silhouette image indicating the object region in a first image (the base image) of the time-series images, together with the image signals of that object region and its peripheral region on the base image, to calculate the local feature amount of each small region and the local feature amounts that represent the object region (the representative local feature amounts). On the basis of the calculated representative local feature amounts and the local feature amount of each small region, it calculates a discriminant function that identifies whether each small region belongs to the object region. Next, on the basis of a second image (the processing target image) of the time-series images, a third image (the reference image) of the time-series images, and the silhouette image indicating the object region in the reference image, it calculates the correspondence of the object region between the processing target image and the reference image, and from that correspondence calculates a transformation matrix that projectively transforms the object region on the reference image onto the processing target image. On the basis of the calculated transformation matrix and the silhouette image of the reference image, it estimates the object region on the processing target image and generates an estimated silhouette image. Then, from the estimated object region indicated by the estimated silhouette image and the image signal of the processing target image, it calculates the local feature amount of each small region, identifies whether each small region belongs to the object region on the basis of that local feature amount, the representative local feature amounts of the base image, and the discriminant function, and determines the object region on the processing target image. The object of interest designated by the user can thus be tracked and extracted with high accuracy.

(Modification of the local feature amount calculation unit 202)
The above embodiment describes the case where the local feature amount calculation unit 202 uses a color histogram as an example of the image feature amount, but the present invention is not limited to this. The local feature amount calculation unit 202 may instead calculate a gradient histogram for each small region. The method by which the local feature amount calculation unit 202 calculates the gradient histogram in this case is described with reference to FIG. 15. FIG. 15 is a schematic diagram showing the gradient histogram calculation method.

First, the local feature amount calculation unit 202 calculates the gradient vector G(x, y) = (G_x, G_y) of the luminance component for each pixel in the small region. Next, from the calculated gradient vector G(x, y) of each pixel, it obtains the gradient angle θ_Q quantized with the quantization step size Q (Expression (20-8)).

Next, the occurrences of each quantized gradient angle θ_Q are counted, and a gradient histogram H_θ(x, y), shown in FIG. 15(A), is calculated by normalizing each element of the histogram by the area of the small region. Then the gradient angle θ_m with the highest frequency is obtained from the calculated gradient histogram H_θ(x, y), and, from H_θ(x, y) and θ_m, a gradient histogram H_θm(x, y) normalized so that the angle θ_m is the origin, shown in FIG. 15(B), is calculated. Using the luminance gradient histogram as the local feature amount makes the method robust against color changes such as changes in illumination. In addition to the color histogram and the gradient histogram, the local feature amount calculation unit 202 may use wavelet features, Haar-like features, Edgelet features, EOH features, or HOG features as the image feature amount. Different image feature amounts may also be used in combination.
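
The two stages described above, building the normalized angle histogram and then shifting it so that the dominant angle becomes the origin, can be sketched as follows. Expression (20-8) is not reproduced here, so the quantization (angle in degrees divided by a step that is assumed to divide 360 evenly) and the function names are assumptions of this example.

```python
import math


def gradient_histogram(gradients, step_deg=45):
    """Normalized histogram of quantized gradient angles.

    gradients: list of (gx, gy) luminance gradients in one small region.
    step_deg:  hypothetical quantization step; assumed to divide 360.
    """
    n_bins = 360 // step_deg
    hist = [0.0] * n_bins
    for gx, gy in gradients:
        angle = math.degrees(math.atan2(gy, gx)) % 360.0
        hist[int(angle // step_deg) % n_bins] += 1.0
    area = len(gradients)
    return [h / area for h in hist] if area else hist


def align_to_dominant(hist):
    """Rotate the histogram so the most frequent angle bin comes first."""
    m = max(range(len(hist)), key=hist.__getitem__)
    return hist[m:] + hist[:m]
```

Shifting the histogram to its dominant angle means that two small regions with the same texture at different orientations produce the same aligned histogram.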

(Modification of object extraction device 1 (object extraction device 1a))
In the above embodiment, the object extraction device 1 of FIG. 1 calculates, for the reference image, the representative local feature amounts that express the image feature amount of the extraction target object and the identification function that identifies whether a small region is an object region, from the silhouette image of the extraction target object and the image signal of the reference image; however, the invention is not limited to this. For example, the silhouette image S(t_1) indicating the object region at time t_1 and the image F(t_1), both obtained from the object region determination unit 40, may be fed back at the next time (t_1 + 1) and used as the reference image and the silhouette image of the reference image. This makes it possible to track and extract the object accurately even when the local feature amounts of the object region change slightly from one time to the next.
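The feedback arrangement of this modification can be sketched as a simple loop. The sketch below is illustrative only: the function name `track_with_feedback` and the three callables are hypothetical stand-ins for the identification feature amount calculation unit 20, the object region estimation unit 30, and the object region determination unit 40, and their signatures are assumptions rather than interfaces defined in this document.

```python
def track_with_feedback(frames, initial_silhouette,
                        compute_features, estimate_region, confirm_region):
    """Track an object through `frames`, feeding each result back.

    compute_features(ref_image, ref_silhouette) -> identification features
    estimate_region(ref_image, ref_silhouette, target_image) -> estimate
    confirm_region(target_image, estimate, features) -> confirmed silhouette
    (All three callables are hypothetical placeholders.)
    """
    reference_image = frames[0]
    reference_silhouette = initial_silhouette
    silhouettes = [initial_silhouette]

    for target_image in frames[1:]:
        # Identification features are recomputed from the current reference.
        features = compute_features(reference_image, reference_silhouette)
        # Estimate the object region, then correct and finalize it.
        estimate = estimate_region(reference_image, reference_silhouette,
                                   target_image)
        confirmed = confirm_region(target_image, estimate, features)
        silhouettes.append(confirmed)

        # Feedback: the image F(t_1) and silhouette S(t_1) just obtained
        # become the reference pair used at time t_1 + 1.
        reference_image = target_image
        reference_silhouette = confirmed

    return silhouettes
```

Updating the reference at every time step lets the features follow gradual appearance changes, at the cost that an error in one confirmed silhouette carries over into the next reference.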

Part of the object extraction device 1 in the above embodiment may be realized by a computer. In that case, a program for realizing this control function may be recorded on a computer-readable recording medium, and the program recorded on the medium may be read into a computer system and executed. The "computer system" here is a computer system built into the object extraction device 1 and includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" may also include media that hold a program dynamically for a short time, such as a communication line used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program for a certain period, such as the volatile memory inside a computer system serving as the server or client in that case. The program may realize only part of the functions described above, or may realize them in combination with a program already recorded in the computer system.

Part or all of the object extraction device 1 in the above embodiment may be realized as an integrated circuit such as an LSI (Large Scale Integration) circuit. Each functional block of the object extraction device 1 may be implemented as an individual processor, or some or all of the blocks may be integrated into a single processor. The circuit integration method is not limited to LSI; a dedicated circuit or a general-purpose processor may be used instead. If advances in semiconductor technology yield an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

An embodiment of the present invention has been described in detail above with reference to the drawings. The specific configuration is not limited to the one described, however, and various design changes and the like can be made without departing from the gist of the invention.

DESCRIPTION OF SYMBOLS 1 ... object extraction device, 20 ... identification feature amount calculation unit, 30 ... object region estimation unit, 40 ... object region determination unit, 201, 401 ... feature amount calculation range setting unit, 202, 402 ... local feature amount calculation unit, 203 ... representative local feature amount calculation unit, 204 ... identification function calculation unit, 301 ... feature point calculation unit, 302 ... corresponding point calculation unit, 303 ... transformation matrix calculation unit, 304 ... estimated silhouette image generation unit, 403 ... object region identification unit.

Claims

An object extraction device that extracts an object region of interest from time-series images, comprising:
an identification feature amount calculation unit that calculates, from an image of the object region on a reference image in the time-series images and an image of its peripheral region, an identification feature amount for distinguishing the object region from the object peripheral region;
an object region estimation unit that estimates the object region on a processing target image in the time-series images from the correspondence between feature points in the object region on the reference image in the time-series images and feature points of the processing target image; and
an object region determination unit that calculates feature amounts of the images of the estimated object region and its peripheral region, and corrects and finalizes the estimated object region in the processing target image by comparing the calculated feature amounts with the calculated identification feature amount.

The object extraction device according to claim 1, wherein the identification feature amount calculation unit comprises: a feature amount calculation range setting unit that sets, based on a silhouette image indicating the object region in the reference image, a feature amount calculation range in which image feature amounts for distinguishing the object region from the object peripheral region are calculated;
an image feature amount calculation unit that calculates the image feature amount for each small region of a predetermined size in the set feature amount calculation range;
a representative image feature amount calculation unit that clusters those image feature amounts that lie on the object region and calculates K different representative image feature amounts expressing the image feature amount of the object region;
a similarity vector calculation unit that calculates, for each small region, a similarity vector between the image feature amount of the small region and the K different representative image feature amounts; and
an identification function calculation unit that calculates, from the similarity vectors calculated on the object region and the similarity vectors of the object peripheral region, an identification function representing the class boundary in the feature amount space that determines whether a small region belongs to the object region; and wherein the K different representative image feature amounts and the identification function are output as the identification feature amount.

The object extraction device according to claim 2, wherein the object region determination unit comprises:
a feature amount calculation range setting unit that sets, based on an estimated silhouette image indicating the estimated object region, a feature amount calculation range in which image feature amounts are calculated;
an image feature amount calculation unit that calculates an image feature amount for each small region of a predetermined size in the set feature amount calculation range; and
a similarity vector calculation unit that calculates, for each small region, the similarity vector from the image feature amount of the small region and the K different representative image feature amounts in the reference image;
and wherein the object region determination unit identifies, based on the calculated similarity vectors and the identification function, whether each small region belongs to the object region in the feature amount space, and determines the set of small regions judged to be included in the object region as the object region on the processing target image.

The object extraction device according to claim 1, wherein the object region estimation unit comprises:
a feature point calculation unit that calculates feature points in the object region on the reference image from the reference image and a silhouette image indicating the object region on the reference image;
a corresponding point calculation unit that calculates, from the calculated feature points in the object region on the reference image, the silhouette image indicating the object region on the reference image, and the processing target image, the feature points on the processing target image that correspond to the feature points in the object region on the reference image; and
a transformation matrix calculation unit that calculates, from the feature points in the object region on the reference image and the corresponding feature points on the processing target image, a transformation matrix that projects the feature points in the object region on the reference image onto the corresponding feature points on the processing target image;
and wherein the object region estimation unit, using the transformation matrix and the silhouette image indicating the object region on the reference image, multiplies each pixel on the processing target image by the inverse of the transformation matrix to determine whether the pixel is included in the object region on the reference image, and estimates the set of pixels determined to be included in the object region as the object region on the processing target image.

An object extraction method for extracting an object region of interest from time-series images, comprising:
an identification feature amount calculation step of calculating, from an image of the object region on a reference image in the time-series images and an image of its peripheral region, an identification feature amount for distinguishing the object region from the object peripheral region;
an object region estimation step of estimating the object region on a processing target image in the time-series images from the correspondence between feature points in the object region on the reference image in the time-series images and feature points of the processing target image; and
an object region determination step of calculating feature amounts of the images of the estimated object region and its peripheral region, and correcting and finalizing the estimated object region in the processing target image by comparing the calculated feature amounts with the calculated identification feature amount.

The object extraction method according to claim 5, wherein the identification feature amount calculation step comprises: a feature amount calculation range setting step of setting, based on a silhouette image indicating the object region in the reference image, a feature amount calculation range in which image feature amounts for distinguishing the object region from the object peripheral region are calculated;
an image feature amount calculation step of calculating the image feature amount for each small region of a predetermined size in the set feature amount calculation range;
a representative image feature amount calculation step of clustering those image feature amounts that lie on the object region and calculating K different representative image feature amounts expressing the image feature amount of the object region;
a similarity vector calculation step of calculating, for each small region, a similarity vector between the image feature amount of the small region and the K different representative image feature amounts;
an identification function calculation step of calculating, from the similarity vectors calculated on the object region and the similarity vectors of the object peripheral region, an identification function representing the class boundary in the feature amount space that determines whether a small region belongs to the object region; and
a step of outputting the K different representative image feature amounts and the identification function as the identification feature amount.

The object extraction method according to claim 6, wherein the object region determination step comprises:
a feature amount calculation range setting step of setting, based on an estimated silhouette image indicating the estimated object region, a feature amount calculation range in which image feature amounts are calculated;
an image feature amount calculation step of calculating an image feature amount for each small region of a predetermined size in the set feature amount calculation range;
a similarity vector calculation step of calculating, for each small region, the similarity vector from the image feature amount of the small region and the K different representative image feature amounts in the reference image; and
a step of identifying, based on the calculated similarity vectors and the identification function, whether each small region belongs to the object region in the feature amount space, and determining the set of small regions judged to be included in the object region as the object region on the processing target image.

The object extraction method, wherein the object region estimation step comprises:
a feature point calculation step of calculating feature points in the object region on the reference image from the reference image and a silhouette image indicating the object region on the reference image;
a corresponding point calculation step of calculating, from the calculated feature points in the object region on the reference image, the silhouette image indicating the object region on the reference image, and the processing target image, the feature points on the processing target image that correspond to the feature points in the object region on the reference image;
a transformation matrix calculation step of calculating, from the feature points in the object region on the reference image and the corresponding feature points on the processing target image, a transformation matrix that projects the feature points in the object region on the reference image onto the corresponding feature points on the processing target image; and
a step of using the transformation matrix and the silhouette image indicating the object region on the reference image to multiply each pixel on the processing target image by the inverse of the transformation matrix, determine whether the pixel is included in the object region on the reference image, and estimate the set of pixels determined to be included in the object region as the object region on the processing target image.

An object extraction program for causing a computer to function as each processing unit of the object extraction device according to any one of claims 1 to 4.