JP2013206262A

JP2013206262A - Method and program for separating two or more subject areas

Info

Publication number: JP2013206262A
Application number: JP2012076102A
Authority: JP
Inventors: Kentaro Yamada; 健太郎山田; Hiroshi Sanko; 浩嗣三功; Hitoshi Naito; 整内藤
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2012-03-29
Filing date: 2012-03-29
Publication date: 2013-10-07
Anticipated expiration: 2032-03-29
Also published as: JP5838112B2

Abstract

Problem: To separate a plurality of subjects that occlude one another using video captured by a single camera, without requiring information from other cameras in the subject separation processing.
Solution: A method for separating a plurality of occluded subjects in a target video divides each frame image into a foreground region and a background region, tracks each subject region between the frame images, detects, from the foreground region and the tracking results of the subject regions, a foreground region of the frame image in which occlusion occurs as an occlusion region, and determines the boundaries between the plurality of subject regions tracked by the subject tracking process within the occlusion region.
Selected drawing: Figure 1

Description

The present invention relates to a method and a program for separating a plurality of overlapping subjects in a video. More specifically, it relates to a method and a program that use time-series subject tracking results to separate a plurality of overlapping subjects, so that a texture can be generated for each subject in the process of generating a virtual-viewpoint image from video of an event held in a large-scale space.

As a method of generating a virtual-viewpoint image from images of an event held in a large-scale space, Patent Document 1 proposes generating a texture from each subject and mapping it onto a plane placed in three-dimensional space for each subject.

When the texture of a subject is generated, if occlusion occurs, that is, if subjects overlap one another, a single texture contains a plurality of subjects and an inaccurate virtual-viewpoint image is generated.

To address this problem, Patent Document 1 proposes determining whether the subject is also occluded in the adjacent viewpoint. If it is not occluded there, the background region of the adjacent viewpoint is projected onto the region where occlusion occurs in order to cut out the texture of the subject region. If it is occluded in the adjacent viewpoint as well, the texture that still contains the plurality of subjects is used as is.

Also addressing this problem, Non-Patent Document 1 proposes restoring coarse three-dimensional information from images captured at a plurality of other viewpoints, thereby separating the subject in front to obtain its texture, and using, for the subject behind, a texture from another viewpoint in which no occlusion occurs. Specifically, a plane parallel to the occluded image is set near the subject position in three-dimensional space, and the region of the front subject extracted from the other viewpoints is projected onto this plane to obtain the texture region of the front subject.

Although it addresses a different problem from the present invention, Non-Patent Document 2 proposes, as technically similar prior art, a method of separating a subject from the background using time-series subject tracking results.

Patent Document 1: JP 2011-170487 A

Non-Patent Document 1: Takayoshi Koyama (古山孝好), Itaru Kitahara (北原格), Yuichi Ohta (大田友一), "Live distribution and presentation of large-scale space free-viewpoint video using virtualized reality technology", IEICE Technical Report, PRMU, Vol. 103, No. 96, pp. 61-66 (2003)
Non-Patent Document 2: Lili Ma, Jing Liu, Jinqiao Wang, Jian Cheng, Hanqing Lu, "A improved silhouette tracking approach integrating particle filter with graph cuts", in Proc. ICASSP2010, pp. 1142-1145

Patent Document 1 and Non-Patent Document 1 use information from images captured at a plurality of viewpoints to separate occluded subjects. Consequently, when the target subject is also occluded at the other viewpoints, Patent Document 1 cannot separate the subjects, and the separation accuracy of Non-Patent Document 1 decreases.

Moreover, in a large-scale space it is difficult to perform camera calibration and foreground extraction with high accuracy. Patent Document 1 and Non-Patent Document 1 separate the subjects by projecting the foreground extraction results of other viewpoints onto the occluded viewpoint using the camera calibration results, so camera calibration errors and foreground extraction errors reduce the separation accuracy. Because these errors accumulate with the number of cameras, it is difficult to improve accuracy by using more cameras. Both problems stem from the fact that Patent Document 1 and Non-Patent Document 1 separate subjects using information from video captured by a plurality of cameras.

Accordingly, an object of the present invention is to separate a plurality of occluded subjects using video captured by a single camera, without requiring information from other cameras in the subject separation processing.

Non-Patent Document 2 proposes a method of separating a subject from the background using time-series subject tracking results, but it tracks and separates a single subject rather than handling a plurality of subjects, so it cannot separate a plurality of occluded subjects.

To solve the above problems, the present invention detects the occlusion region that arises where subjects overlap, based on the time-series tracking results of each subject between frames, and separates the subjects appropriately based on the texture information of each subject contained in that occlusion region. Using inter-frame tracking results makes it possible to separate a plurality of occluded subjects from video captured by a single camera, without requiring information from other cameras. The specific processing procedure is as follows.

To solve the above problems, the subject separation method of the present invention is a method for separating a plurality of occluded subjects in a target video, and includes: a foreground region extraction step of dividing each frame image into a foreground region and a background region; a subject tracking step of tracking each subject region between the frame images; a step of detecting, from the foreground region and the tracking results of the subject regions, a foreground region in which occlusion occurs in the current frame image as an occlusion region; and a step of determining the boundaries between the plurality of subject regions tracked by the subject tracking step within the occlusion region.

It is also preferable that the subject tracking step be performed using a particle filter.

It is also preferable that, when tracked subjects are contained in a connected foreground region that contains a plurality of subjects, the step of detecting an occlusion region detect that region as the occlusion region.

It is also preferable that the step of determining the boundaries between the plurality of subject regions include a sub-step of extracting, for each subject present in the occlusion region, the region of that subject, and a sub-step of determining the boundaries between subject regions by integrating the per-subject region extraction results.

It is also preferable that the step of determining the boundaries between the plurality of subject regions include: a sub-step of extracting a rectangular region circumscribing the occlusion region; a sub-step of generating, within the rectangular region, a segmentation seed for each subject contained in the occlusion region; a sub-step of extracting the subject region by binary region segmentation based on the generated seed; and a sub-step of determining the boundaries between subject regions by integrating the per-subject region extraction results.

It is also preferable that the sub-step of generating the segmentation seed include: a sub-step of adding positive values to a subject map, which is a two-dimensional map representing how likely each position in the rectangular region is to belong to the subject, using the tracking result of that subject; a sub-step of adding negative values to the subject map using the tracking results of the other subjects; a sub-step of normalizing the subject map; and a sub-step of generating the segmentation seed using the normalized subject map and the background region information obtained by the foreground extraction processing.

It is also preferable that the region segmentation be performed by a graph cut algorithm.

To solve the above problems, a program according to the present invention causes a computer that separates a plurality of occluded subjects in a target video to function as: means for foreground region extraction that divides each frame image into a foreground region and a background region; means for subject tracking that tracks each subject region between the frame images; means for detecting, from the foreground region and the tracking results of the subject regions, a foreground region in which occlusion occurs in the frame image as an occlusion region; and means for determining the boundaries between the plurality of subject regions tracked by the subject tracking within the occlusion region, thereby separating the plurality of occluded subjects.

Because the present invention can separate subjects using video from a single camera rather than from a plurality of cameras, separation accuracy is not degraded by occlusion occurring in other cameras.

Furthermore, because the present invention can separate subjects using video from a single camera rather than from a plurality of cameras, separation accuracy is not degraded by camera calibration errors or foreground extraction errors in other cameras.

In addition, because conventional methods separate subjects using video from a plurality of cameras, they may not be applicable to a sparse camera arrangement, depending on the geometric relationship between the cameras. Because the present invention can separate subjects using video from a single camera, it can be applied even to a sparse camera arrangement, regardless of the geometric relationship between the cameras.

FIG. 1 is a flowchart showing the method for separating a plurality of subjects according to the present invention.
FIG. 2 shows the subject tracking process using a particle filter.
FIG. 3 shows example images of an occlusion region.
FIG. 4 shows the process of generating a graph-cut seed using the particle information of the particle filter.
FIG. 5 shows an example of the graph-cut seed generation process.
FIG. 6 shows an example of the region segmentation result before the boundaries between subject regions are determined.
FIG. 7 shows an example of the result of determining the boundaries between subject regions.
FIG. 8 shows an example of how the virtual-viewpoint image changes with and without the present invention.

The best mode for carrying out the present invention is described in detail below with reference to the drawings. FIG. 1 is a flowchart showing the method for separating a plurality of subjects according to the present invention. The description below follows this flowchart.

By using subject tracking results, the present invention separates subjects in images captured by a single camera without using information from multiple cameras, and thus separates the regions of a plurality of subjects in an occlusion region with high accuracy without being affected by occlusion in other cameras. The invention takes as input video captured by a single camera over a certain period and separates the occluded subjects in the current frame image.

Step 1: Each frame image is divided into a foreground region and a background region. The foreground region is a region whose position and shape change over time; when the target space is a soccer field, it corresponds to subject regions such as people. The background region is everything other than the foreground region. A background image that contains no foreground is acquired in advance, and background subtraction between the current frame image and the background image produces a mask image that separates the foreground region from the background region.
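The background subtraction in Step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes grayscale images stored as nested lists, and the function name `background_subtraction` and the threshold of 30 are hypothetical choices that the source does not specify.

```python
def background_subtraction(frame, background, threshold=30):
    """Return a binary mask image: 1 = foreground, 0 = background.

    Assumption: grayscale intensities and a fixed per-pixel threshold.
    """
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([1 if abs(f - b) > threshold else 0
                     for f, b in zip(frame_row, bg_row)])
    return mask


# Background image acquired in advance (no foreground present).
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
# Current frame with a bright subject in the middle.
frame = [[10, 200, 12, 10],
         [11, 210, 205, 10],
         [10, 10, 9, 10]]

mask = background_subtraction(frame, background)
for row in mask:
    print(row)
```

Small intensity changes such as 10 to 12 stay below the threshold and remain background, which is the usual reason for thresholding the difference instead of testing for inequality.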

Step 2: In the current frame, subject tracking is performed for each subject using a particle filter. Tracking with the particle filter is carried out as time-series processing between frames. FIG. 2 is a flowchart showing the time-series subject tracking process using the particle filter. The subject tracking process is described below with reference to this flowchart.

Step 21: The circumscribing frame of the particle filter for each subject in the initial frame is set to the circumscribing region of that subject in the mask image. An initial frame in which the subjects are not occluded is input. Alternatively, for an initial frame in which occlusion occurs, the mask image can be inspected visually and a circumscribing region set manually for each occluded subject.

Step 22: The state quantity of each particle in each subject's particle filter at time t consists of x(t), y(t), Δx(t), and Δy(t). Here, x(t) and y(t) are the two-dimensional coordinates of the particle, and Δx(t) and Δy(t) correspond to the horizontal and vertical velocity of the subject in the image, respectively. The state transition function assumes uniform linear motion between frames and is defined using Gaussian noise ω(t).
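The text of Step 22 does not give the state transition in explicit form. One form consistent with the description (uniform linear motion plus Gaussian noise) is x(t+1) = x(t) + Δx(t) + noise, and likewise for y, with the velocities carried over with noise. The sketch below assumes that form; the function name `predict`, the tuple layout `(x, y, dx, dy)`, and the noise standard deviation `sigma` are illustrative assumptions.

```python
import random


def predict(particle, sigma=1.0, rng=random):
    """Advance one particle (x, y, dx, dy) by one frame.

    Assumption: constant velocity between frames, with independent
    zero-mean Gaussian noise added to every state component.
    """
    x, y, dx, dy = particle
    return (x + dx + rng.gauss(0.0, sigma),
            y + dy + rng.gauss(0.0, sigma),
            dx + rng.gauss(0.0, sigma),
            dy + rng.gauss(0.0, sigma))


random.seed(0)
moved = predict((100.0, 50.0, 2.0, 0.0))
print(moved)

# With the noise switched off, the particle follows pure linear motion.
print(predict((1.0, 2.0, 3.0, -1.0), sigma=0.0))
```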

Step 23: The likelihood of each particle is computed in two stages. In the first stage, each particle's position is classified as foreground or background in the mask image, and particles classified as background are given a likelihood of 0. For particles classified as foreground, the likelihood is then determined from prior information given for each subject. When the subjects are soccer players, the uniform color information I_ref and a threshold th_ref are used to test whether the pixel value I at the particle position satisfies the following inequality; the likelihood is 0 if it does and 1 if it does not.
|I − I_ref| > th_ref
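The two-stage likelihood test of Step 23 can be sketched as below. The source does not say how |I − I_ref| is measured for color pixels; this sketch assumes the Euclidean distance between RGB triples. The function name `likelihood`, the rounding of particle coordinates to pixel indices, and the treatment of out-of-image particles as likelihood 0 are also assumptions.

```python
def likelihood(particle_xy, frame, mask, ref_color, th_ref):
    """Return 0 or 1 for one particle.

    Stage 1: particles on background pixels of the mask get 0.
    Stage 2: foreground particles get 0 if |I - I_ref| > th_ref, else 1.
    Assumption: |.| is the Euclidean distance in RGB space.
    """
    x, y = int(round(particle_xy[0])), int(round(particle_xy[1]))
    if not (0 <= y < len(mask) and 0 <= x < len(mask[0])):
        return 0  # assumption: particles outside the image are discarded
    if mask[y][x] == 0:
        return 0
    pixel = frame[y][x]
    distance = sum((a - b) ** 2 for a, b in zip(pixel, ref_color)) ** 0.5
    return 0 if distance > th_ref else 1


# One image row: a red-uniform pixel, a blue-uniform pixel, a grass pixel.
frame = [[(200, 20, 20), (20, 20, 200), (0, 120, 0)]]
mask = [[1, 1, 0]]
red_uniform = (210, 30, 30)

print([likelihood((x, 0), frame, mask, red_uniform, th_ref=50) for x in range(3)])
```

Only the first pixel is both foreground and close to the reference uniform color, so only a particle there would survive.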

Step 24: Particles whose likelihood is 0 are eliminated and relocated near particles whose likelihood is 1.

Step 25: For each subject, the circumscribing region of the relocated particle set is computed, and the circumscribing frame of that subject's particle filter is updated. Steps 22 to 25 are repeated until the final frame.
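Steps 24 and 25 can be sketched together as follows. How "near" a relocated particle is placed is not stated in the source; this sketch assumes a Gaussian jitter around a randomly chosen surviving particle, and it keeps the old particle set if no particle survives. The names `resample`, `bounding_box`, and `jitter` are hypothetical.

```python
import random


def resample(particles, likelihoods, jitter=2.0, rng=random):
    """Drop likelihood-0 particles and re-create them near survivors."""
    survivors = [p for p, w in zip(particles, likelihoods) if w == 1]
    if not survivors:
        return list(particles)  # assumption: keep the old set if all vanish
    resampled = list(survivors)
    while len(resampled) < len(particles):
        x, y, dx, dy = rng.choice(survivors)
        resampled.append((x + rng.gauss(0.0, jitter),
                          y + rng.gauss(0.0, jitter), dx, dy))
    return resampled


def bounding_box(particles):
    """Circumscribing frame (x_min, y_min, x_max, y_max) of a particle set."""
    xs = [p[0] for p in particles]
    ys = [p[1] for p in particles]
    return (min(xs), min(ys), max(xs), max(ys))


particles = [(10, 20, 1, 0), (50, 60, 0, 0), (12, 24, 1, 0)]
likelihoods = [1, 0, 1]

# jitter=0.0 places the relocated particle exactly on a survivor,
# which makes the resulting frame easy to check by hand.
new_particles = resample(particles, likelihoods, jitter=0.0)
print(bounding_box(new_particles))
```

The outlier at (50, 60) is removed, so the updated frame shrinks to the region covered by the two surviving particles.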

Step 3: The foreground region in which occlusion occurs is detected using the foreground/background segmentation result and the result of the particle-filter subject tracking in the current frame. For every subject tracked by the subject tracking process, the position of the circumscribing frame of its particle filter determines which connected foreground region contains that subject, and a connected foreground region that contains a plurality of subjects is detected as an occlusion region.
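The occlusion detection of Step 3 can be sketched as below. The source does not specify the connectivity rule or how a circumscribing frame is matched to a connected region; this sketch assumes 4-connectivity and assigns each subject to the connected region that its frame overlaps most. The names `label_components` and `detect_occlusions` are hypothetical.

```python
from collections import Counter, deque


def label_components(mask):
    """Label 4-connected foreground regions with 1, 2, ... (0 = background)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] == 1 and labels[sy][sx] == 0:
                next_label += 1
                labels[sy][sx] = next_label
                queue = deque([(sx, sy)])
                while queue:
                    x, y = queue.popleft()
                    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                        if (0 <= nx < w and 0 <= ny < h
                                and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((nx, ny))
    return labels


def detect_occlusions(mask, boxes):
    """boxes: {subject: (x0, y0, x1, y1)}, inclusive pixel coordinates.

    Returns {region label: [subjects]} for regions holding >= 2 subjects.
    """
    labels = label_components(mask)
    members = {}
    for subject, (x0, y0, x1, y1) in boxes.items():
        counts = Counter(labels[y][x]
                         for y in range(y0, y1 + 1)
                         for x in range(x0, x1 + 1)
                         if labels[y][x] != 0)
        if counts:
            region = counts.most_common(1)[0][0]
            members.setdefault(region, []).append(subject)
    return {region: subs for region, subs in members.items() if len(subs) >= 2}


mask = [[1, 1, 1, 0, 0, 0],
        [1, 1, 1, 0, 0, 1],
        [0, 0, 0, 0, 0, 1]]
boxes = {"A": (0, 0, 1, 1), "B": (1, 0, 2, 1), "C": (5, 1, 5, 2)}
print(detect_occlusions(mask, boxes))
```

Subjects A and B fall into the same connected blob, so that blob is reported as an occlusion region, while the isolated subject C is not.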

Step 4: For each occlusion region detected in Step 3, the rectangular region circumscribing that occlusion region is extracted from the current frame image. FIG. 3 shows example images of an occlusion region.
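Extracting the circumscribing rectangle in Step 4 amounts to taking the bounding box of the labeled occlusion region and cropping the image to it. A minimal sketch, assuming a label image in which each connected foreground region carries an integer label (the names `circumscribed_rectangle` and `crop` are hypothetical):

```python
def circumscribed_rectangle(labels, target):
    """Bounding box (x0, y0, x1, y1), inclusive, of all pixels with `target`."""
    coords = [(x, y)
              for y, row in enumerate(labels)
              for x, value in enumerate(row)
              if value == target]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def crop(image, rect):
    """Cut the rectangle out of a nested-list image."""
    x0, y0, x1, y1 = rect
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


labels = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 0, 1, 2]]

rect = circumscribed_rectangle(labels, target=1)
print(rect)
print(crop(labels, rect))
```

The same `crop` call can be applied to the frame image and to the mask image so that later steps work on aligned rectangles.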

FIG. 3a is an example of a rectangular region circumscribing an occlusion region. FIG. 3b is the mask image for that rectangular region. FIG. 3c is the foreground region image obtained by overlaying the mask image (FIG. 3b) on the original image of the rectangular region (FIG. 3a).

The processing of Steps 5, 6, and 7, which follow Step 4, operates on this rectangular region and is performed independently for each occlusion region.

Step 5: A seed for the graph cut is generated using the particle information of the particle filter from the subject tracking process. Steps 5 and 6 focus on one of the subjects contained in the occlusion region at a time, and both steps are performed independently for each subject in the occlusion region. Step 5 prepares for the graph-cut region segmentation performed in Step 6. Graph-cut segmentation requires labeled pixels, called a seed, as input. In this embodiment, the seed assigns every pixel one of four labels: "subject of interest", "subject-of-interest candidate", "background candidate", or "background". In Steps 5 and 6, all pixels in the rectangle other than the subject of interest are treated as background, and the subject of interest is separated from the background. Step 5 generates the four-label seed that serves as prior information for the graph-cut segmentation in Step 6.

FIG. 4 is a flowchart showing the process of generating the graph-cut seed using the particle information of the particle filter. The seed generation process is described below with reference to this flowchart.

Step 31: A subject map is generated from the particle information of the particle filter. The subject map is a two-dimensional map representing how likely each position in the rectangular region is to belong to the subject of interest. Specifically, particles of the subject of interest are treated as positive samples and particles of the other subjects as negative samples, and the subject map is generated by superimposing two-dimensional Gaussian distributions centered on each sample position.

Let n_m be the number of particles of the subject of interest and n_e the number of particles of the other subjects. For the i-th particle of the subject of interest and of the other subjects, let p_mi and p_ei be the positions in the rectangular region and c_mi and c_ei the likelihoods. The value v(p) at position p on the subject map is then expressed in terms of these positions and likelihoods using a two-dimensional Gaussian function G(p | p_0) centered at position p_0.
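The explicit formula for v(p) is not present in this text. One form consistent with the description (positive samples added, negative samples subtracted, each weighted by its likelihood) is v(p) = Σ_i c_mi G(p | p_mi) − Σ_i c_ei G(p | p_ei). The sketch below assumes that form with an isotropic, unnormalized Gaussian; the function names and the width `sigma` are illustrative assumptions.

```python
import math


def gaussian(p, center, sigma):
    """Isotropic 2-D Gaussian G(p | center), unnormalized (assumption)."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def subject_map(width, height, target_particles, other_particles, sigma=1.5):
    """Build v(p) over a width x height rectangle.

    Each particle is ((x, y), likelihood). Particles of the subject of
    interest add to the map; particles of other subjects subtract from it.
    """
    vmap = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            positive = sum(c * gaussian((x, y), p, sigma)
                           for p, c in target_particles)
            negative = sum(c * gaussian((x, y), p, sigma)
                           for p, c in other_particles)
            vmap[y][x] = positive - negative
    return vmap


# One row of five pixels: subject of interest on the left, another on the right.
vmap = subject_map(5, 1,
                   target_particles=[((0, 0), 1)],
                   other_particles=[((4, 0), 1)])
print([round(v, 3) for v in vmap[0]])
```

The map is positive near the subject of interest, negative near the other subject, and zero at the midpoint where both contributions cancel.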

Step 32: The subject map is normalized. Let v_max and v_min be the maximum and minimum values in the map. The normalized value v_norm(p) of the value v(p) at position p on the map is computed from v(p), v_max, and v_min so that it lies in the range −1 ≤ v_norm(p) ≤ 1.
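The normalization formula itself is not present in this text; only its inputs (v_max, v_min) and its output range (−1 to 1) are stated. The sketch below shows one scheme that satisfies those constraints and keeps the sign of v(p): positive values are divided by v_max and negative values by |v_min|. This is an assumption, and other schemes with the same range are possible.

```python
def normalize(vmap):
    """Scale a subject map into [-1, 1] while preserving the sign of v(p).

    Assumption: v >= 0 is divided by v_max, v < 0 by |v_min|.
    """
    v_max = max(max(row) for row in vmap)
    v_min = min(min(row) for row in vmap)

    def scale(v):
        if v > 0:
            return v / v_max          # v > 0 implies v_max > 0
        if v < 0:
            return v / -v_min         # v < 0 implies v_min < 0
        return 0.0

    return [[scale(v) for v in row] for row in vmap]


vmap = [[2.0, 1.0, 0.0, -0.5, -4.0]]
print(normalize(vmap))
```

Keeping zero fixed means the sign of v_norm(p) still tells whether a position leans toward the subject of interest or toward the other subjects.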

Step 33: A seed is generated from the normalized subject map and the result of the foreground extraction processing. Let v_norm(p) be the value of the normalized subject map at position p, let th be a threshold (where 0 < th < 1), and let r(p) be the result of the foreground extraction processing of Step 1, with r(p) = 0 when position p is background and r(p) = 1 when it is foreground. The seed label l(p) at position p is then determined from v_norm(p), th, and r(p).
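The labeling rule l(p) is not present in this text. The sketch below shows one plausible rule built only from the stated inputs v_norm(p), th, and r(p): background pixels of the mask and strongly negative map values become "background", strongly positive values become the subject of interest, and values in between become candidates. The exact thresholds, the default th = 0.5, and the label strings are assumptions.

```python
def seed_label(v_norm, r, th=0.5):
    """Assumed four-way labeling rule for one pixel."""
    if r == 0 or v_norm <= -th:
        return "background"
    if v_norm >= th:
        return "subject"
    if v_norm >= 0:
        return "subject_candidate"
    return "background_candidate"


def make_seeds(norm_map, mask, th=0.5):
    """Apply seed_label to every pixel of the rectangle."""
    return [[seed_label(v, r, th) for v, r in zip(v_row, r_row)]
            for v_row, r_row in zip(norm_map, mask)]


norm_map = [[0.9, 0.2, -0.2, -0.9, 0.9]]
mask = [[1, 1, 1, 1, 0]]
seeds = make_seeds(norm_map, mask)
print(seeds[0])
```

The last pixel has a high map value but lies outside the foreground mask, so it is labeled background regardless of the map.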

FIG. 5 shows an example of the graph-cut seed generation process. FIG. 5a shows example particle-filter particles of the subject of interest and of the other subjects. White dots are particles of the subject of interest, and dark gray dots are particles of the other subjects. FIG. 5b shows the normalized subject map with the mask image (FIG. 3b) overlaid; larger map values are rendered brighter. FIG. 5c shows an example graph-cut seed: black regions carry the background label, dark gray regions the background candidate label, light gray regions the subject-of-interest candidate label, and white regions the subject-of-interest label.

Step 6: Based on the seed generated in Step 5, binary graph-cut region segmentation divides the rectangular region into the region of the subject of interest and everything else, thereby extracting the region of the subject of interest.
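The source names graph cut but does not describe the graph construction. The following is a toy sketch of a binary graph cut on a single row of pixels, solved as a minimum s-t cut with a breadth-first augmenting-path max-flow. It assumes that definite labels become infinite-capacity terminal links, candidate labels become weak terminal links (capacity `weak`), and neighboring pixels are linked with a capacity that drops as their intensities differ. All of these choices, and the name `min_cut_labels`, are illustrative assumptions.

```python
from collections import deque

INF = 10 ** 9


def min_cut_labels(intensities, seeds, weak=2):
    """Return 1 (subject of interest) or 0 (everything else) per pixel."""
    n = len(intensities)
    S, T = n, n + 1                       # source = subject, sink = background
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i, seed in enumerate(seeds):
        if seed == "subject":
            cap[S][i] = INF
        elif seed == "subject_candidate":
            cap[S][i] = weak
        elif seed == "background_candidate":
            cap[i][T] = weak
        else:                             # "background"
            cap[i][T] = INF
    for i in range(n - 1):                # similar neighbors are costly to separate
        w = max(1, 10 - abs(intensities[i] - intensities[i + 1]))
        cap[i][i + 1] = cap[i + 1][i] = w

    def bfs():
        parent = [-1] * (n + 2)
        parent[S] = S
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if parent[T] == -1:
            break                         # no augmenting path: flow is maximal
        flow, v = INF, T
        while v != S:                     # bottleneck along the path
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = T
        while v != S:                     # update residual capacities
            cap[parent[v]][v] -= flow
            cap[v][parent[v]] += flow
            v = parent[v]
    # Pixels still reachable from the source lie on the subject side of the cut.
    return [1 if parent[i] != -1 else 0 for i in range(n)]


intensities = [100, 102, 101, 40, 42, 41]
seeds = ["subject", "subject_candidate", "background_candidate",
         "background_candidate", "background", "background"]
print(min_cut_labels(intensities, seeds))
```

In this example the third pixel starts as a background candidate but closely resembles its subject-side neighbors, so the cheapest cut falls at the intensity edge and that pixel ends up in the subject region.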

Step 7: The subject regions extracted for each subject are integrated, and the boundaries between the subject regions are determined. Steps 5 and 6 have already produced a subject region for each of the subjects contained in the occlusion region. However, because Steps 5 and 6 are processed independently for each subject, some areas may belong to more than one subject region, so the boundaries between the subject regions must be determined. FIG. 6 shows an example of the region segmentation result after Steps 5 and 6 have been applied to all subjects in the occlusion region and before the boundaries between subject regions are determined. The background region is black, one subject region is white, the other subject region is dark gray, and the area contained in both subject regions is light gray.

The integration process for determining the boundaries between subject regions can be implemented, for example, by fixing the subject regions one at a time, starting with the subject whose region has its lowest point highest in the image. FIG. 7 shows an example of the result of determining the boundaries between subject regions within the occlusion region by this integration process.
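The ordering rule above can be sketched as follows. The source states the processing order but not which subject keeps a contested pixel. This sketch assumes a painter-style reading: subjects whose lowest point is higher in the image (likely farther from the camera) are written first, and later, nearer subjects overwrite overlapping pixels. The opposite reading, in which the first subject keeps contested pixels, would need only the assignment changed to `owner.setdefault(pixel, subject)`. The name `integrate` is hypothetical, image y is assumed to grow downward, and every region is assumed non-empty.

```python
def integrate(regions):
    """regions: {subject: set of (x, y)}. Returns {(x, y): subject}."""
    def bottom(subject):
        # Largest y = lowest point in the image (y grows downward).
        return max(y for _, y in regions[subject])

    owner = {}
    for subject in sorted(regions, key=bottom):
        for pixel in regions[subject]:
            owner[pixel] = subject        # assumption: later subjects overwrite
    return owner


regions = {
    "near": {(1, 1), (1, 2), (1, 3)},
    "far": {(1, 0), (1, 1), (2, 1)},
}
owner = integrate(regions)
print(owner[(1, 1)])
```

Pixel (1, 1) is claimed by both regions; under the assumed rule it goes to the subject whose region extends lower in the image.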

FIG. 8 shows an example of how the virtual-viewpoint image changes with and without the present invention. FIG. 8a is part of an image near the goal post from one viewpoint. FIG. 8b is the region of FIG. 8a in which occlusion occurs. FIG. 8c is an example of a virtual-viewpoint image that uses a texture in which the occluded subjects of FIG. 8b are not separated. FIG. 8d is an example of a virtual-viewpoint image generated using, as textures, images in which the subjects have been separated by the present invention. It can be seen that separating the subjects with the present invention allows the position, shadow, and other details of the subject behind to be composited more accurately.

All of the embodiments described above illustrate the present invention by way of example and do not limit it; the present invention can be carried out in various other modified and altered forms. The scope of the present invention is therefore defined only by the claims and their equivalents.

Claims

1. A method for separating a plurality of occluded subjects in a target video, the method comprising:
a foreground region extraction step of dividing each frame image into a foreground region and a background region;
a subject tracking step of tracking each subject region between the frame images;
a step of detecting, from the foreground region and the tracking results of the subject regions, a foreground region in which occlusion occurs in the current frame image as an occlusion region; and
a step of determining boundaries between a plurality of subject regions tracked by the subject tracking step within the occlusion region.

2. The method for separating a plurality of occluded subjects according to claim 1, wherein the subject tracking step is performed using a particle filter.

3. The method for separating a plurality of occluded subjects, wherein, when tracked subjects are contained in a connected foreground region containing a plurality of subjects, the step of detecting an occlusion region detects that region as the occlusion region.

4. The method for separating a plurality of occluded subjects according to claim 1, wherein the step of determining boundaries between the plurality of subject regions comprises:
a sub-step of extracting, for each subject present in the occlusion region, the region of that subject; and
a sub-step of determining the boundaries between subject regions by integrating the per-subject region extraction results.

5. The method for separating a plurality of occluded subjects according to claim 1, wherein the step of determining boundaries between the plurality of subject regions comprises:
a sub-step of extracting a rectangular region circumscribing the occlusion region;
a sub-step of generating, within the rectangular region, a seed for region segmentation for each subject contained in the occlusion region;
a sub-step of extracting the subject region by binary region segmentation based on the generated seed; and
a sub-step of determining the boundaries between subject regions by integrating the per-subject region extraction results.

6. The method for separating a plurality of occluded subjects according to claim 5, wherein the sub-step of generating the seed for region segmentation comprises:
a sub-step of adding positive values to a subject map, which is a two-dimensional map representing how likely each position in the rectangular region is to belong to the subject, using the tracking result of that subject;
a sub-step of adding negative values to the subject map using the tracking results of subjects other than that subject;
a sub-step of normalizing the subject map; and
a sub-step of generating the seed for region segmentation using the normalized subject map and information on the background region obtained by the foreground extraction processing.

7. The method for separating a plurality of occluded subjects according to claim 5, wherein the region segmentation is performed by a graph cut algorithm.

8. A program that causes a computer for separating a plurality of occluded subjects in a target video to function as:
means for foreground region extraction that divides each frame image into a foreground region and a background region;
means for subject tracking that tracks each subject region between the frame images;
means for detecting, from the foreground region and the tracking results of the subject regions, a foreground region in which occlusion occurs in the frame image as an occlusion region; and
means for determining boundaries between a plurality of subject regions tracked by the subject tracking within the occlusion region,
thereby separating the plurality of occluded subjects.