JP2019080183A

JP2019080183A - Image transmission device, image transmission method, and program

Info

Publication number: JP2019080183A
Application number: JP2017206016A
Authority: JP
Inventors: Hiroyasu Ito (伊藤 博康)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2017-10-25
Filing date: 2017-10-25
Publication date: 2019-05-23

Abstract

PROBLEM TO BE SOLVED: To reduce congestion of the transmission band of a network that transmits the captured images needed to generate a virtual viewpoint image. SOLUTION: In an image processing system that generates a virtual viewpoint image using a plurality of captured images captured by a plurality of imaging devices, an image transmission device that transmits at least some of the captured images for generating the virtual viewpoint image determines whether a captured image captured by at least one of the imaging devices is suitable for generating the virtual viewpoint image, and restricts transmission of any captured image determined to be unsuitable. SELECTED DRAWING: FIG. 2

Description

The present invention relates to a technique for transmitting a plurality of captured images captured by a plurality of imaging devices.

There is a technique in which a plurality of imaging devices are installed at different positions to perform synchronized multi-viewpoint capture, and the resulting images are used to generate not only images as seen from the installation positions of the imaging devices but also a virtual viewpoint image whose viewpoint can be changed arbitrarily.
A virtual viewpoint image is generated by an image processing unit such as a server, which aggregates the images captured by the plurality of cameras, generates a three-dimensional model, and applies processing such as rendering. The result is transmitted to a user terminal for viewing.
Patent Document 1 discloses a technique in which a plurality of imaging devices are arranged so as to surround the same area, and images of that area are used to generate and display a virtual viewpoint image corresponding to an arbitrary designation.

Patent Document 1: Japanese Patent Laid-Open No. 2014-215828

In a system that transmits captured images from a plurality of imaging devices over a network to a server that generates a virtual viewpoint image, at least some of the images captured by the cameras may fail to reach the server when the transmission band of the network becomes congested. If captured images needed to generate the virtual viewpoint image cannot be transmitted to the server because of such congestion, parts of the generated virtual viewpoint image may be missing, its image quality may drop, or its generation may take longer. The same problem can arise not only for virtual viewpoint images but whenever an image, such as a panoramic image, is generated from a plurality of captured images.

The present invention has been made in view of the above problem, and its object is to reduce congestion of the transmission band of a network that transmits the captured images needed to generate a virtual viewpoint image.

The present invention is an image transmission device that, in an image processing system that generates a virtual viewpoint image using a plurality of captured images captured by a plurality of imaging devices, transmits at least some of the captured images for generating the virtual viewpoint image, the device comprising:
an acquisition unit that acquires a captured image captured by at least one of the plurality of imaging devices;
a transmission unit that transmits the captured image acquired by the acquisition unit;
a determination unit that determines whether the captured image acquired by the acquisition unit is suitable for generating the virtual viewpoint image; and
a control unit that restricts transmission of the captured image by the transmission unit when the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image.
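The relationship between the four units can be sketched as follows. This is a minimal illustration under stated assumptions, not the device itself: the class and field names (`CapturedImage`, `ImageTransmitter`), the `non_empty` suitability criterion, and the choice to drop an unsuitable image outright are all hypothetical. The text above only says that transmission is "restricted", which could equally mean reducing the data rather than dropping it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CapturedImage:
    """Hypothetical container for one captured frame."""
    camera_id: str
    timecode: int
    payload: bytes


class ImageTransmitter:
    """Hypothetical stand-in for the image transmission device."""

    def __init__(self, is_suitable: Callable[[CapturedImage], bool]):
        # Determination unit: the actual criterion is an assumption here.
        self.is_suitable = is_suitable
        # Stands in for the network link used by the transmission unit.
        self.sent: List[CapturedImage] = []

    def transmit(self, image: CapturedImage) -> None:
        """Transmission unit: forward the image downstream."""
        self.sent.append(image)

    def handle(self, image: CapturedImage) -> bool:
        """Control unit: transmit only images judged suitable."""
        if not self.is_suitable(image):
            return False  # transmission restricted (here: dropped)
        self.transmit(image)
        return True


def non_empty(image: CapturedImage) -> bool:
    # Placeholder criterion for illustration only, not from the source.
    return len(image.payload) > 0
```

Passing the criterion in as a callable keeps the control logic independent of how suitability is actually judged.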

According to the present invention, congestion of the transmission band of a network that transmits the captured images needed to generate a virtual viewpoint image can be reduced.

FIG. 1 is a block diagram for explaining the configuration of the image processing system 100.
FIG. 2 is a block diagram for explaining the functional configuration of the camera adapter 120.
FIG. 3 is a sequence diagram for explaining the capture start process.
FIG. 4 is a diagram showing the data format output by the camera adapter 120.
FIG. 5 is a flowchart for explaining the operation of the camera adapter 120.
FIG. 6 is a diagram showing the flow of data output by the camera adapter 120.
FIG. 7 is a flowchart for explaining the operation of the camera adapter 120.
FIG. 8 is a diagram showing the flow of data output by the camera adapter 120.
FIG. 9 is a block diagram showing the hardware configuration of the camera adapter 120.

Embodiments are described below with reference to the drawings. FIG. 1 is a diagram showing an example of the outline of the image processing system of the present embodiment. The image processing system shown in FIG. 1 generates a virtual viewpoint image using images captured by a plurality of cameras installed in, for example, a stadium or concert hall that has a field where a competition or the like actually takes place and spectator seating. The image processing system 100 includes camera systems 110a to 110z, an image computing server 200, a controller 300, a switching hub 180, and an end user terminal 190.

The controller 300 is an information processing apparatus that manages operating states, controls parameter settings, and so on for each component of the image processing system 100 through the networks 300a, 180a, 180b, and 170a to 170y. The networks may be GbE (Gigabit Ethernet) or 10GbE conforming to the IEEE Ethernet (registered trademark) standards, or may be configured by combining interconnects such as InfiniBand, industrial Ethernet, and the like. The networks are not limited to these and may be of another type, such as a wireless network. The form of connection between the devices is also not limited to that shown in FIG. 1; for example, the devices may be connected via a network such as the Internet.

The camera systems 110a to 110z are systems that transmit images captured by cameras, which are imaging devices. The operation of transmitting the images of the 26 camera systems 110a to 110z from the camera system 110z to the image computing server 200 is described below.

In the present embodiment, unless otherwise noted, the 26 systems from the camera system 110a to the camera system 110z are referred to as the camera system 110 without distinction. Likewise, the devices within each camera system 110 are referred to as the camera 112, the sensor 114, and the camera adapter 120 without distinction unless otherwise noted. The number of camera systems is given as 26, but this is only an example and the number is not limited to it. In the present embodiment, unless otherwise noted, the term "image" covers both moving images and still images. That is, the image processing system 100 of the present embodiment can process both still images and moving images.

The camera systems 110a to 110z each have one camera, 112a to 112z respectively. That is, the image processing system 100 has a plurality of cameras for capturing a subject from a plurality of directions. The camera systems 110 may be connected to one another in a daisy chain. This form of connection reduces the number of connection cables and the wiring effort as image data volumes grow with higher resolutions such as 4K and 8K and with higher frame rates.

The form of connection is not limited to this. A star network configuration may be used in which each of the camera systems 110a to 110z is connected to the switching hub 180 and data is exchanged between the camera systems 110 via the switching hub 180.

FIG. 1 shows a configuration in which all of the camera systems 110a to 110z are cascaded to form a single daisy chain, but the configuration is not limited to this. For example, the camera systems 110 may be divided into several groups, with the camera systems 110 daisy-chained within each group. The camera adapter 120 at the end of each group may then be connected to the switching hub to input images to the image computing server 200. Such a configuration is particularly effective in stadiums. For example, a stadium may have multiple floors with camera systems 110 deployed on each floor. In this case, input to the image computing server 200 can be performed per floor or per half of the stadium's circumference, which simplifies installation and makes the system more flexible even where wiring all camera systems 110 into a single daisy chain is difficult.

Control of image processing in the image computing server 200 is switched according to whether one camera adapter 120, or two or more, sits at the end of a daisy chain and inputs images to the image computing server 200. That is, the control is switched according to whether the camera systems 110 are divided into a plurality of groups. When only one camera adapter 120 inputs images, an image of the full circumference of the stadium is built up while images are transmitted along the daisy chain, so the timing at which the full set of image data becomes available at the image computing server 200 is synchronized. In other words, synchronization is achieved as long as the camera systems 110 are not divided into groups.

However, when a plurality of camera adapters 120 input images (that is, when the camera systems 110 are divided into groups), the delay may differ from one daisy-chain lane (path) to another. The image computing server 200 may therefore need to perform synchronization control, waiting until the full set of image data has arrived, and carry out the subsequent image processing while checking that the image data has been collected.
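The wait-until-complete behaviour described above can be sketched as a small buffer keyed by capture time. This is a hedged illustration only: the `FrameAssembler` class, the integer `timecode` key, and the camera identifiers are assumptions introduced here, and the source does not specify how the server tracks arrivals.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class FrameAssembler:
    """Hypothetical buffer that releases a frame set once every camera has reported."""

    def __init__(self, expected_cameras: Set[str]):
        self.expected = set(expected_cameras)
        # timecode -> {camera_id: image data}
        self.pending: Dict[int, Dict[str, bytes]] = defaultdict(dict)

    def add(self, timecode: int, camera_id: str, data: bytes) -> Optional[Dict[str, bytes]]:
        """Store one image; return the complete set for its timecode, or None if still waiting."""
        frame = self.pending[timecode]
        frame[camera_id] = data
        if set(frame) >= self.expected:
            # All lanes have delivered this capture time, so hand it to later processing.
            return self.pending.pop(timecode)
        return None
```

A production system would presumably also need a timeout so that one lost image does not stall a capture time indefinitely; that policy is outside what the text describes.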

The camera system 110a includes a camera 112a, a sensor 114a, and a camera adapter 120a. It is not limited to this configuration and may have other configurations. The camera system 110a may be configured with at least one camera adapter 120a and one camera 112a. For example, the camera system 110a may consist of one camera adapter 120a and a plurality of cameras 112a, or of one camera 112a and a plurality of camera adapters 120a. The camera system 110a may also consist of only a single camera 112a, with the camera 112a itself performing the operations of the camera adapter 120a described later.

The plurality of cameras 112 and the plurality of camera adapters 120 in the image processing system 100 correspond N to M (where N and M are both integers of 1 or more). The camera system 110 may include devices other than the camera 112 and the camera adapter 120. The camera 112 and the camera adapter 120 may be integrated into a single unit. Furthermore, the image computing server 200 may have at least some of the functions of the camera adapter 120. The camera systems 110a to 110z may all have the same configuration as the camera system 110a. However, the camera systems 110 are not required to share the same configuration, and each camera system 110 may be configured differently.

The camera adapter 120 is an image transmission device that transmits images. For example, an image captured by the camera 112a undergoes the image processing described later in the camera adapter 120a and is then transmitted to the camera adapter 120b of the camera system 110b through the daisy-chain network 170. Similarly, the camera system 110b transmits the image captured by the camera 112b to the camera system 110c together with the image acquired from the camera system 110a.

By continuing this operation, the captured images acquired by the camera systems 110a to 110z are passed from the camera system 110z to the switching hub 180 over the network 180b and then transmitted to the image computing server 200. Thus, in the system of the present embodiment, each camera system 110 sends to the downstream camera system 110 both the image captured by its own camera 112 (hereinafter, the own-camera image) and the images sent from the upstream camera system 110 (hereinafter, other-camera images). By repeating this routine in each camera system 110, the images captured by the camera systems 110a to 110z reach the image computing server 200. In such a distributed system based on daisy-chain connection, the number of captured images being transmitted, and therefore the amount of transmitted data, grows toward the downstream end, so the transmission band may become congested, particularly downstream. In the present embodiment, data is appropriately reduced before the transmission band is exceeded.
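The growth of traffic toward the downstream end of the daisy chain can be made concrete with a running sum. The following is an illustrative sketch only: the function names and the example figures in the usage note (a per-camera rate of 400 Mbit/s and a 10,000 Mbit/s link) are hypothetical and do not come from the source.

```python
from itertools import accumulate
from typing import List, Optional


def link_loads(per_camera_mbps: List[float]) -> List[float]:
    """Load on the link leaving each adapter: its own image plus all upstream images."""
    return list(accumulate(per_camera_mbps))


def first_overloaded_hop(per_camera_mbps: List[float], capacity_mbps: float) -> Optional[int]:
    """Index of the first adapter whose outgoing link exceeds capacity, or None."""
    for hop, load in enumerate(link_loads(per_camera_mbps)):
        if load > capacity_mbps:
            return hop
    return None
```

With 26 cameras at an assumed 400 Mbit/s each, the last link would carry 10,400 Mbit/s, so `first_overloaded_hop([400.0] * 26, 10000.0)` returns 25, the final adapter in the chain. This is the kind of situation in which data would have to be reduced before the band is exceeded.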

The sensor 114 is a sensor that senses information about the capture conditions of the camera system 110. The sensor 114 may be, for example, an infrared sensor, a gyro sensor, an acceleration sensor, a millimeter-wave sensor, a GPS receiver, or an electronic compass. The sensor 114 may be any one of these or a combination of several of them.

The time server 290 has a function of distributing the time and a synchronization signal, and distributes them to the camera systems 110a to 110z via the switching hub 180. The camera adapters 120a to 120z that receive the time and the synchronization signal genlock the cameras 112a to 112z based on them to time-synchronize the captured images. That is, the time server 290 synchronizes the capture timing of the plurality of cameras 112. This allows the image processing system 100 to generate a virtual viewpoint image from a plurality of images captured at the same timing, which suppresses degradation of the virtual viewpoint image caused by capture timing offsets. In the present embodiment the time server 290 manages time synchronization of the cameras 112, but this is not a limitation; each camera 112 or each camera adapter 120 may perform the time synchronization processing independently.

The image computing server 200 processes the data acquired from the camera system 110z. Based on the plurality of captured images acquired from the camera systems 110a to 110z, the image computing server 200 generates a three-dimensional model and applies processing such as rendering to generate a virtual viewpoint image. Model-based rendering (MBR) may be used as the method for generating the virtual viewpoint image. MBR is a method of generating a virtual viewpoint image using a three-dimensional model generated from a plurality of images of a subject captured from a plurality of directions. Specifically, MBR uses the three-dimensional shape (model) of the target scene obtained by a three-dimensional shape reconstruction technique, such as the visual hull method or multi-view stereo (MVS), to generate the appearance of the scene from a virtual viewpoint as an image. The method of generating the virtual viewpoint image is not limited to MBR; for example, image-based rendering may be used. Image-based rendering is a rendering method that generates a virtual viewpoint image from images captured at multiple viewpoints without modeling (the process of creating the shape of an object using geometric figures).
For the images acquired from the camera system 110z, the image computing server 200 accepts a viewpoint designation from the controller 300 or the end user terminal 190 and performs rendering based on the accepted viewpoint to generate a virtual viewpoint image.

The image computing server 200 transmits the rendered image to the end user terminal 190. The user operating the end user terminal 190 can view an image corresponding to the designated viewpoint. That is, the image computing server 200 generates a virtual viewpoint image based on the captured images (multi-viewpoint images) captured by the plurality of cameras 112 and on viewpoint information. More specifically, the image computing server 200 generates the virtual viewpoint image based on, for example, image data of a predetermined region extracted by the plurality of camera adapters 120 from the images captured by the plurality of cameras 112, and on a viewpoint designated by a user operation. The image computing server 200 then provides the generated virtual viewpoint image to the end user terminal 190.

The image computing server 200 generates a virtual viewpoint image, which is the image that would be obtained if the subject were captured from a virtual viewpoint. The virtual viewpoint image can also be described as an image representing the view from a designated viewpoint. The virtual viewpoint may be designated by the user or designated automatically based on, for example, the result of image analysis. That is, virtual viewpoint images include arbitrary viewpoint images (free viewpoint images) corresponding to a viewpoint freely designated by the user. They also include images corresponding to a viewpoint the user selected from a plurality of candidates and images corresponding to a viewpoint designated automatically by the apparatus.

The image computing server 200 may compress and encode the virtual viewpoint image using a standard technology such as H.264 or HEVC and then transmit it to the end user terminal 190 using the MPEG-DASH protocol. Alternatively, the virtual viewpoint image may be transmitted to the end user terminal 190 uncompressed. The former case, with compression encoding, assumes a smartphone or tablet as the end user terminal 190, while the latter assumes a display capable of showing uncompressed images. In other words, the image format can be switched according to the type of the end user terminal 190. The image transmission protocol is not limited to MPEG-DASH; for example, HLS (HTTP Live Streaming) or another transmission method may be used.
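The switch of image format by terminal type can be sketched as a simple lookup. This is only an illustration under assumptions: the `TerminalType` and `DeliveryFormat` names are hypothetical, H.264 over MPEG-DASH is just one of the combinations the text permits, and the transport used for uncompressed output is not stated in the source, so it is left unset here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TerminalType(Enum):
    """Hypothetical classification of the end user terminal 190."""
    SMARTPHONE = "smartphone"
    TABLET = "tablet"
    UNCOMPRESSED_DISPLAY = "uncompressed_display"


@dataclass(frozen=True)
class DeliveryFormat:
    compressed: bool
    codec: Optional[str]     # e.g. "H.264" or "HEVC"; None when uncompressed
    protocol: Optional[str]  # e.g. "MPEG-DASH" or "HLS"; None when not specified


def select_delivery(terminal: TerminalType) -> DeliveryFormat:
    """Pick a delivery format from the terminal type, following the text above."""
    if terminal in (TerminalType.SMARTPHONE, TerminalType.TABLET):
        # Mobile terminals: compression-encoded stream (one permitted choice).
        return DeliveryFormat(compressed=True, codec="H.264", protocol="MPEG-DASH")
    # Displays that can show uncompressed images: send without encoding.
    return DeliveryFormat(compressed=False, codec=None, protocol=None)
```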

The image computing server 200 may convert the image data generated by the camera systems 110a to 110z, and the meta information for that data, into a common schema and common data types. This reduces the risk that the controller 300 fails to operate properly even if the cameras 112 of the camera systems 110a to 110z are replaced with cameras of another model. The controller 300 may also acquire images directly from the camera systems 110a to 110z without going through the image computing server 200.

The configuration of the image computing server 200 is not limited to the above. The end user terminal 190 or the controller 300 may have at least some of the functions of the image computing server 200.

The end user terminal 190 is an information processing apparatus that acquires the rendered virtual viewpoint image from the image computing server 200 and displays it.

As described above, in the image processing system 100, the image computing server 200 generates a virtual viewpoint image from image data obtained by a plurality of cameras 112 that capture a subject from a plurality of directions. The image processing system 100 of the present embodiment is not limited to the physical configuration described above and may be configured logically.

FIG. 9 is a diagram showing an example of the hardware configuration of each device in the image processing system 100. The device 1200 includes a CPU 1201, a ROM 1202, a RAM 1203, an auxiliary storage device 1204, a display unit 1205, an operation unit 1206, a communication unit 1207, and a bus 1208.

The CPU 1201 controls the entire device 1200 using computer programs and data stored in the ROM 1202 and the RAM 1203. The ROM 1202 stores programs and parameters that do not need to be changed. The RAM 1203 temporarily stores programs and data supplied from the auxiliary storage device 1204, data supplied from outside via the communication unit 1207, and the like. The auxiliary storage device 1204 consists of, for example, a hard disk drive, and stores content data such as still images and moving images.

The display unit 1205 consists of, for example, a liquid crystal display, and displays a GUI (graphical user interface) or the like with which the user operates the device 1200. The operation unit 1206 consists of, for example, a keyboard and a mouse, and inputs various instructions to the CPU 1201 in response to user operations. The communication unit 1207 communicates with external devices. For example, when the device 1200 is connected to an external device by wire, a LAN cable or the like is connected to the communication unit 1207. When the device 1200 has a function of communicating wirelessly with an external device, the communication unit 1207 includes an antenna. The bus 1208 connects the units of the device 1200 and carries information between them.

In the present embodiment the display unit 1205 and the operation unit 1206 are inside the device 1200, but the device 1200 need not include one or both of them. At least one of the display unit 1205 and the operation unit 1206 may also exist as a separate device outside the device 1200, with the CPU 1201 operating as a display control unit that controls the display unit 1205 and as an operation control unit that controls the operation unit 1206.

The CPU 1201 may consist of one or more CPUs, or may be a multi-core CPU. Instead of or together with the CPU 1201, the device may include hardware such as an ASIC, an FPGA, or a GPU. In that case, the ASIC, FPGA, GPU, or similar hardware may perform some or all of the processing that the CPU 1201 would otherwise perform. Part of the processing of the device 1200 may be performed in hardware and another part implemented as software processing on a CPU. Not every device in the image processing system 100 needs to have the same configuration; depending on the device, some components may be omitted and others added.

次に本実施形態におけるカメラアダプタ１２０の機能ブロックについて図２を用いて説明する。図２に示すカメラアダプタ１２０の機能ブロックは、カメラアダプタ１２０のＣＰＵ１２０１が、ＲＯＭ１２０２やＲＡＭ１２０３に格納されているコンピュータプログラムを実行し、情報の演算及び各ハードウェアを制御することで実現される。なお、図２に示す各構成の一部またはすべてを、専用のハードウェアにより実現されてもよい。専用のハードウェアは、例えば、ＡＳＩＣ、ＦＰＧＡまたはＧＰＵ等である。 Next, functional blocks of the camera adapter 120 according to the present embodiment will be described with reference to FIG. The functional block of the camera adapter 120 shown in FIG. 2 is realized by the CPU 1201 of the camera adapter 120 executing a computer program stored in the ROM 1202 or the RAM 1203 to calculate information and control each hardware. Note that part or all of the components shown in FIG. 2 may be realized by dedicated hardware. The dedicated hardware is, for example, an ASIC, an FPGA, or a GPU.

カメラ制御部１２０１１は、カメラ１１２と接続し、カメラ１１２の制御、撮影画像取得、同期信号提供、及び時刻設定などを行う機能と、カメラ１１２で撮影した画像を取得する機能を有している。カメラ１１２の制御には、例えば撮影パラメータ（絞り、シャッタースピード、感度、画素数、色深度、フレームレート、及びホワイトバランスの設定など）の設定及び参照が含まれてもよい。また、カメラ１１２の制御には、カメラ１１２の状態（撮影中、停止中、同期中、及びエラーなど）の取得、撮影の開始及び停止や、ピント調整などが含まれてもよい。なお、本実施形態ではカメラ１１２を介してピント調整を行っているが、取り外し可能なレンズがカメラ１１２に装着されている場合は、カメラアダプタ１２０がレンズに接続し、直接レンズの調整を行ってもよい。また、カメラアダプタ１２０がカメラ１１２を介してズーム等のレンズ調整を行ってもよい。同期信号提供は、時刻同期制御部１２０３８がタイムサーバ２９０と同期した時刻を利用し、撮影タイミング（制御クロック）をカメラ１１２に提供することで行われる。時刻設定は、時刻同期制御部１２０３８がタイムサーバ２９０と同期した時刻を例えばＳＭＰＴＥ１２Ｍのフォーマットに準拠したタイムコードで提供することで行われてもよい。これにより、カメラ１１２から受取る画像データに提供したタイムコードが付与されることになる。なおタイムコードのフォーマットはＳＭＰＴＥ１２Ｍに限定されるわけではなく、他のフォーマットであってもよい。また、カメラ制御部１２０１１は、カメラ１１２に対するタイムコードの提供はせず、カメラ１１２から受取った画像データに自身がタイムコードを付与してもよい。 The camera control unit 12011 is connected to the camera 112 and has functions such as controlling the camera 112, acquiring captured images, providing a synchronization signal, and setting the time, as well as a function of acquiring images captured by the camera 112. Control of the camera 112 may include, for example, setting and referencing shooting parameters (aperture, shutter speed, sensitivity, number of pixels, color depth, frame rate, white balance, and the like). Control of the camera 112 may also include acquiring the state of the camera 112 (shooting, stopped, synchronizing, error, and the like), starting and stopping shooting, and focus adjustment. In the present embodiment, focus adjustment is performed via the camera 112; however, when a removable lens is attached to the camera 112, the camera adapter 120 may connect to the lens and adjust it directly. The camera adapter 120 may also perform lens adjustment such as zooming via the camera 112. The synchronization signal is provided by supplying the shooting timing (control clock) to the camera 112, using the time that the time synchronization control unit 12038 has synchronized with the time server 290. The time setting may be performed by providing the time that the time synchronization control unit 12038 has synchronized with the time server 290 as a time code conforming to, for example, the SMPTE 12M format. The provided time code is thereby attached to the image data received from the camera 112. The time code format is not limited to SMPTE 12M and may be another format. Alternatively, the camera control unit 12011 may attach a time code itself to the image data received from the camera 112 instead of providing a time code to the camera 112.

センサ制御部１２０１２は、センサ１１４と接続し、センサ１１４がセンシングした結果を示すセンサ情報を取得する機能を有する。例えば、センサ１１４としてジャイロセンサが利用される場合は、振動を表す情報を取得することができる。センサ制御部１２０１２で取得したセンサ情報は、画像補正部１２０１３に入力される。なお、カメラシステム１１０のセンサはセンサ１１４に限定するわけではなく、カメラアダプタ１２０またはカメラ１１２に内蔵されたセンサであってもよい。 The sensor control unit 12012 is connected to the sensor 114, and has a function of acquiring sensor information indicating the result of the sensing by the sensor 114. For example, when a gyro sensor is used as the sensor 114, information representing vibration can be obtained. The sensor information acquired by the sensor control unit 12012 is input to the image correction unit 12013. The sensor of the camera system 110 is not limited to the sensor 114, and may be a sensor built in the camera adapter 120 or the camera 112.

画像補正部１２０１３は、センサ制御部１２０１２で取得したセンサ情報を基に、カメラ制御部１２０１１から取得した画像データの補正を行い、補正後の画像データを画像処理部１２０２１に出力する。センサ１１４が、ジャイロセンサまたは加速度センサなどであり、センサ情報が振動やぶれを表す情報である場合は、画像補正部１２０１３では画像データのぶれ補正を行い、ぶれ補正量をメタ情報生成部１２０２２に出力する。また、センサ１１４が、赤外線センサまたはミリ波センサなどであり、センサ情報が撮影領域の遮蔽物体の有無を表す情報である場合は、画像補正部１２０１３では、画像中の遮蔽物体の領域検知を行う。そして、画像補正部１２０１３は、遮蔽物体の領域の大きさをメタ情報生成部１２０２２に出力する。また、センサ１１４が、電子コンパスなどであり、センサ情報が撮影領域の変更を表す情報である場合は、画像補正部１２０１３では撮影領域の変更度合いを検出し、その度合いをメタ情報生成部１２０２２に出力する。 The image correction unit 12013 corrects the image data acquired from the camera control unit 12011 based on the sensor information acquired by the sensor control unit 12012, and outputs the corrected image data to the image processing unit 12021. When the sensor 114 is a gyro sensor, an acceleration sensor, or the like and the sensor information represents vibration or shake, the image correction unit 12013 performs shake correction on the image data and outputs the shake correction amount to the meta information generation unit 12022. When the sensor 114 is an infrared sensor, a millimeter wave sensor, or the like and the sensor information indicates whether an occluding object is present in the shooting area, the image correction unit 12013 detects the region of the occluding object in the image and outputs the size of that region to the meta information generation unit 12022. When the sensor 114 is an electronic compass or the like and the sensor information represents a change in the shooting area, the image correction unit 12013 detects the degree of change in the shooting area and outputs that degree to the meta information generation unit 12022.

画像処理部１２０２１は、カメラ１１２が撮影した画像データに対して、エッジ処理、ノイズ除去、色変換等の処理を行う。なお、画像処理部１２０２１は、前景背景分離処理を行ってもよい。前景背景分離処理では、画像からボールや選手などの特定の人物である所定のオブジェクトの領域を前景領域として抽出し、画像から前景領域を抽出した残りの画像を背景領域としてもよい。カメラアダプタ１２０は、前景領域を示す前景画像データ、背景領域を示す背景画像データと分けて、それぞれ別個に通信してもよい。 The image processing unit 12021 performs processing such as edge processing, noise removal, and color conversion on the image data captured by the camera 112. The image processing unit 12021 may also perform foreground/background separation. In this processing, the region of a predetermined object, such as a ball or a specific person such as a player, may be extracted from the image as the foreground region, and the remainder of the image after the foreground region is extracted may be treated as the background region. The camera adapter 120 may separate the data into foreground image data representing the foreground region and background image data representing the background region, and transmit each separately.

メタ情報生成部１２０２２は、画像を撮影した時のタイムコードまたはシーケンス番号、データ種別、及びカメラ１１２の個体を示す識別子、画像補正部１２０１３から取得したぶれ補正量などをセンサ情報に基づく情報を画像データのメタ情報として生成する。なお、メタ情報生成部１２０２２が生成する情報は、これらのうちの一部であってもよいし、他の情報であってもよい。例えば、メタ情報生成部１２０２２は、他のカメラアダプタ１２０の後述する不適切画像判定部１２０３４による判定処理に必要な情報を生成するようにしてもよい。また、例えば、メタ情報生成部１２０２２は、不適切画像判定部１２０３４の判定結果を画像データのメタ情報として生成してもよい。また、例えば、メタ情報生成部１２０２２は、センサ制御部１２０１２により取得したセンサ情報を画像データのメタ情報として生成してもよい。 The meta information generation unit 12022 generates, as meta information of the image data, the time code or sequence number at the time the image was captured, the data type, an identifier indicating the individual camera 112, and information based on the sensor information, such as the shake correction amount acquired from the image correction unit 12013. The information generated by the meta information generation unit 12022 may be only part of these, or may be other information. For example, the meta information generation unit 12022 may generate information needed for the determination processing performed by the inappropriate image determination unit 12034 (described later) of another camera adapter 120. The meta information generation unit 12022 may also generate the determination result of the inappropriate image determination unit 12034 as meta information of the image data, or may generate the sensor information acquired by the sensor control unit 12012 as meta information of the image data.

データ量取得部１２０３１は、出力対象のデータであり、画像処理部１２０２１で処理した画像のデータ量を取得する。出力データ量算出部１２０３２は、データ量取得部１２０３１で取得した自カメラ画像のデータ量と、データ量計測部１２０３６で取得した他カメラ画像のデータ量とを加算することにより、当該カメラアダプタ１２０が出力すべきデータ量を算出する。即ち、カメラアダプタ１２０は、自システムで取得した撮影画像及び他のカメラシステムから取得した撮影画像のデータ量を取得する。出力データ量判定部１２０３３は、出力データ量算出部１２０３２で算出したデータ量と所定の伝送帯域制約量を比較し、伝送可能性を確認する。具体的には、データ送信部１２０４２へ出力するデータ量が予め指定された出力可能データ量の閾値を超えるか否かを判断する。また、出力データ量判定部１２０３３は、ネットワーク１７０の伝送帯域の使用状況に応じて、出力データ量の閾値を動的に変更してもよい。 The data amount acquisition unit 12031 acquires the data amount of the image processed by the image processing unit 12021, which is the data to be output. The output data amount calculation unit 12032 calculates the amount of data that the camera adapter 120 should output by adding the data amount of the own-camera image acquired by the data amount acquisition unit 12031 to the data amount of the other-camera images acquired by the data amount measurement unit 12036. In other words, the camera adapter 120 acquires the data amounts of the captured image obtained by its own system and of the captured images obtained from other camera systems. The output data amount determination unit 12033 compares the data amount calculated by the output data amount calculation unit 12032 with a predetermined transmission band constraint to check whether transmission is possible. Specifically, it determines whether the amount of data to be output to the data transmission unit 12042 exceeds a pre-specified threshold for the outputtable data amount. The output data amount determination unit 12033 may also change this threshold dynamically according to the usage of the transmission band of the network 170.
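The addition and threshold comparison described above can be sketched as follows. This is a minimal illustration only: the function names, the use of bits as the unit, and the example figures are assumptions and do not appear in the specification.

```python
def output_data_amount(own_bits: int, upstream_bits: list[int]) -> int:
    """Add the own-camera data amount to the data amounts received from upstream."""
    return own_bits + sum(upstream_bits)


def exceeds_limit(total_bits: int, limit_bits: int) -> bool:
    """Return True when the amount to output is above the outputtable amount."""
    return total_bits > limit_bits


# Hypothetical figures: a 30 Mbit own image plus two upstream images.
total = output_data_amount(30_000_000, [35_000_000, 40_000_000])
over = exceeds_limit(total, limit_bits=100_000_000)
```

With these figures the total is above the limit, so the adapter would need to reduce its output. A dynamically changing threshold, as the text allows, would simply pass a different `limit_bits` on each output cycle.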

不適切画像判定部１２０３４は、自システムで取得した撮影画像及び他のカメラシステムから取得した撮影画像のうち、仮想想視点画像の生成において不適切な画像を判定する。不適切画像判定部１２０３４は、例えば、ぶれ量が大きい撮影画像を仮想想視点画像の生成において不適切な画像を判定する。ぶれ量が大きい撮影画像を仮想視点画像の生成に用いると、仮想視点画像も同様にぶれが生じ、画質が劣化するためである。また、不適切画像判定部１２０３４は、例えば、遮蔽物体が写る撮影画像を仮想想視点画像の生成において不適切な画像を判定する。遮蔽物体が写る撮影画像を仮想視点画像の生成に用いると、遮蔽物体が写る領域が仮想視点画像においてマッピングされると、仮想視点の位置からは本来見えない遮蔽物体が仮想視点画像に写ってしまうためである。また、例えば、不適切画像判定部１２０３４は、カメラ１１２の撮影領域がいたずらなどによって意図せず変更されていることを検出した場合、仮想想視点画像の生成において不適切な画像を判定する。撮影領域がいたずらなどによって意図せず変更されている撮影画像が仮想視点画像においてマッピングされると、撮影領域がずれているため、適切な位置にマッピングできず、画質が劣化するためである。不適切画像判定部１２０３４は、これらのうちの一部のみを判定してもよいし、仮想視点画像の生成において不適切な画像を他の基準を用いて判定してもよい。 The inappropriate image determination unit 12034 determines which of the captured image acquired by its own system and the captured images acquired from other camera systems are inappropriate for generating a virtual viewpoint image. For example, the inappropriate image determination unit 12034 determines that a captured image with a large amount of shake is inappropriate for generating a virtual viewpoint image. This is because, if a captured image with a large amount of shake is used, the virtual viewpoint image is likewise blurred and its image quality is degraded. As another example, the inappropriate image determination unit 12034 determines that a captured image showing an occluding object is inappropriate for generating a virtual viewpoint image. This is because, if the region showing the occluding object is mapped into the virtual viewpoint image, an object that should not be visible from the position of the virtual viewpoint appears in the virtual viewpoint image. As a further example, when the inappropriate image determination unit 12034 detects that the shooting area of the camera 112 has been changed unintentionally, for instance through tampering, it determines that the image is inappropriate for generating a virtual viewpoint image. This is because, when such an image is mapped into the virtual viewpoint image, the shooting area is displaced, so the image cannot be mapped to the proper position and the image quality is degraded. The inappropriate image determination unit 12034 may apply only some of these criteria, or may use other criteria to determine which images are inappropriate for generating a virtual viewpoint image.

不適切画像判定部１２０３４は、取得した撮影画像のメタ情報に基づいて、当該画像が仮想視点画像の生成において不適切か否かを判定してもよい。不適切画像判定部１２０３４は、メタ情報生成部１２０２２で生成された自カメラ画像のメタ情報と、メタ情報抽出部１２０３７で抽出された他カメラ画像のメタ情報を取得し、適切でない画像を判定してもよい。また、例えば、不適切画像判定部１２０３４は、メタ情報に含まれるカメラのぶれ補正量を閾値と比較し、ぶれ補正量の大きな画像データを不適切な画像として判定してもよい。また、不適切画像判定部１２０３４は、自カメラ画像及び他カメラ画像のうちぶれ量がより大きいものを仮想視点画像の生成において不適切と判定してもよい。また、不適切画像判定部１２０３４は、撮影画像の画像処理結果に基づいて、当該画像が仮想視点画像の生成において不適切か否かを判定してもよい。例えば、不適切画像判定部１２０３４は、撮影画像における遮蔽物の有無を画像処理で検出し、遮蔽物が検出された撮影画像を仮想視点画像の生成において不適切と判定してもよい。また、例えば、不適切画像判定部１２０３４は、同一のカメラ１１２で異なる時間で撮影された複数の撮影画像を比較し、撮影範囲や撮影領域が変化していることが画像処理により検出してもよい。この場合、不適切画像判定部１２０３４は、撮影範囲が変化後の撮影画像を仮想視点画像の生成において不適切と判定してもよい。 The inappropriate image determination unit 12034 may determine whether a captured image is inappropriate for generating a virtual viewpoint image based on the meta information of that image. The inappropriate image determination unit 12034 may acquire the meta information of the own-camera image generated by the meta information generation unit 12022 and the meta information of the other-camera images extracted by the meta information extraction unit 12037, and determine from them which images are inappropriate. For example, the inappropriate image determination unit 12034 may compare the shake correction amount of the camera contained in the meta information with a threshold and determine that image data with a large shake correction amount is inappropriate. It may also determine that whichever of the own-camera image and the other-camera images has the larger amount of shake is inappropriate for generating a virtual viewpoint image. The inappropriate image determination unit 12034 may further determine whether a captured image is inappropriate based on the result of image processing applied to it. For example, it may detect by image processing whether an occluding object is present in the captured image, and determine that a captured image in which an occluding object is detected is inappropriate for generating a virtual viewpoint image. As another example, it may compare a plurality of images captured at different times by the same camera 112 and detect by image processing that the shooting range or shooting area has changed. In that case, the inappropriate image determination unit 12034 may determine that the images captured after the change are inappropriate for generating a virtual viewpoint image.
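The two meta-information checks described above (comparison against a threshold, and comparison between images) can be sketched as follows. This is an illustrative sketch only: the threshold value, the function names, and the dictionary keyed by camera number are assumptions, and the 0 to 9 scale is borrowed from the message example described with FIG. 4.

```python
SHAKE_THRESHOLD = 5  # hypothetical threshold on the 0-9 shake correction scale


def flagged_by_threshold(shake_amounts: dict[str, int],
                         threshold: int = SHAKE_THRESHOLD) -> list[str]:
    """Return the cameras whose shake correction amount is above the threshold."""
    return [cam for cam, amount in shake_amounts.items() if amount > threshold]


def larger_shake(shake_amounts: dict[str, int]) -> str:
    """Return the camera whose image has the largest shake correction amount."""
    return max(shake_amounts, key=lambda cam: shake_amounts[cam])


# Hypothetical meta information: camera number -> shake correction amount.
amounts = {"112a": 0, "112b": 7, "112c": 4}
flagged = flagged_by_threshold(amounts)
worst = larger_shake(amounts)
```

The threshold check can flag zero or several images, whereas the relative comparison always singles out exactly one, which matters when some image has to be dropped regardless of absolute quality.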

不適切画像判定部１２０３４は、センサ１１４のセンシング結果に基づいて、仮想視点画像の生成において不適切な画像を判定してもよい。センサ１１４がジャイロセンサまたは加速度センサである場合における不適切画像判定部１２０３４の判定の一例について説明する。不適切画像判定部１２０３４は、これらのセンシング結果が、カメラ１１２が撮影を行う際に振動していることを示す場合、当該カメラ１１２に撮影された画像を、仮想視点画像の生成において不適切な画像を判定してもよい。センサ１１４が赤外線センサまたはミリ波センサである場合についての不適切画像判定部１２０３４による判定についての一例を説明する。これらのセンシング結果が、所定の範囲（例えば、カメラ１１２から５ｍ）の撮影範囲を遮蔽する物体が侵入している場合、不適切画像判定部１２０３４は、当該カメラ１１２に撮影された画像を、仮想視点画像の生成において不適切な画像を判定してもよい。センサ１１４が電子コンパスである場合についての不適切画像判定部１２０３４による判定についての一例を説明する。電子コンパスのセンシング結果がカメラ１１２の撮影方向が変化したことを示す場合、不適切画像判定部１２０３４は、当該カメラ１１２に撮影された画像を、仮想視点画像の生成において不適切な画像を判定してもよい。センサ１１４がＧＰＳである場合についての不適切画像判定部１２０３４による判定についての一例を説明する。ＧＰＳのセンシング結果がカメラ１１２の位置が変動したがことを示す場合、不適切画像判定部１２０３４は、当該カメラ１１２に撮影された画像を、仮想視点画像の生成において不適切な画像を判定してもよい。 The inappropriate image determination unit 12034 may determine which images are inappropriate for generating a virtual viewpoint image based on the sensing result of the sensor 114. First, an example in which the sensor 114 is a gyro sensor or an acceleration sensor is described. When the sensing result indicates that the camera 112 was vibrating while shooting, the inappropriate image determination unit 12034 may determine that the image captured by that camera 112 is inappropriate for generating a virtual viewpoint image. Next, an example in which the sensor 114 is an infrared sensor or a millimeter wave sensor is described. When the sensing result indicates that an object occluding the shooting range has entered a predetermined range (for example, within 5 m of the camera 112), the inappropriate image determination unit 12034 may determine that the image captured by that camera 112 is inappropriate for generating a virtual viewpoint image. Next, an example in which the sensor 114 is an electronic compass is described. When the sensing result of the electronic compass indicates that the shooting direction of the camera 112 has changed, the inappropriate image determination unit 12034 may determine that the image captured by that camera 112 is inappropriate for generating a virtual viewpoint image. Finally, an example in which the sensor 114 is a GPS is described. When the sensing result of the GPS indicates that the position of the camera 112 has changed, the inappropriate image determination unit 12034 may determine that the image captured by that camera 112 is inappropriate for generating a virtual viewpoint image.
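The per-sensor rules above can be collected into a single dispatch function, sketched below. Everything about the data layout is an assumption made for illustration: the sensor kind strings, the reading keys, and the heading tolerance are hypothetical, and only the 5 m range comes from the example in the text.

```python
OCCLUSION_RANGE_M = 5.0      # from the "within 5 m of the camera" example
HEADING_TOLERANCE_DEG = 1.0  # hypothetical tolerance for compass jitter


def heading_delta(a_deg: float, b_deg: float) -> float:
    """Smallest angle between two compass headings, handling wrap at 360."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def unsuitable_from_sensor(kind: str, reading: dict) -> bool:
    """Apply the rule for one sensor kind to one (hypothetical) reading."""
    if kind in ("gyro", "accelerometer"):
        return bool(reading["vibrating"])
    if kind in ("infrared", "millimeter_wave"):
        distance = reading["object_distance_m"]  # None when nothing is detected
        return distance is not None and distance <= OCCLUSION_RANGE_M
    if kind == "compass":
        delta = heading_delta(reading["heading_deg"], reading["reference_deg"])
        return delta > HEADING_TOLERANCE_DEG
    if kind == "gps":
        return bool(reading["position_changed"])
    raise ValueError(f"unknown sensor kind: {kind}")
```

For example, `unsuitable_from_sensor("infrared", {"object_distance_m": 3.2})` flags the image because the object is inside the 5 m range, while a compass reading of 359.5 degrees against a reference of 0 degrees is treated as a 0.5 degree change and is not flagged.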

画像伝送処理部１２０３５は、画像データを、データ送信部１２０４２を介して他のカメラアダプタ１２０（他のカメラシステム１１０）または画像コンピューティングサーバ２００へ送信する制御を行う。画像伝送処理部１２０３５は、画像データ及び各データのメタ情報が含まれるメッセージを作成し、作成したメッセージを送信させる。データ量計測部１２０３６は、データ受信部１２０４１で受信した他カメラ画像のデータ量を計測する。メタ情報抽出部１２０３７は、データ受信部１２０４１で受信した他カメラ画像のメッセージからメタ情報を抽出する。 The image transmission processing unit 12035 performs control to transmit image data to another camera adapter 120 (another camera system 110) or the image computing server 200 via the data transmission unit 12042. The image transmission processing unit 12035 creates a message including the image data and the meta information of each data, and transmits the created message. The data amount measuring unit 12036 measures the data amount of the other camera image received by the data receiving unit 12041. The meta information extraction unit 12037 extracts meta information from the other camera image message received by the data reception unit 12041.

時刻同期制御部１２０３８は、タイムサーバ２９０と時刻同期に係わる処理を行う。データ受信部１２０４１及びデータ送信部１２０４２は、ネットワーク１７０、１８０ａ、１８０ｂ、２９１、３００ａを介し他のカメラアダプタ１２０、画像コンピューティングサーバ２００、タイムサーバ２９０、及びコントローラ３００とデータ通信を行う。例えばデータ送信部１２０４２は、画像処理部１２０２１により処理された、カメラ１１２が撮影した撮影画像を、次のカメラアダプタ１２０に対して出力する。各カメラアダプタ１２０が画像を出力することで、複数の視点から撮影された画像に基づいて仮想視点画像が生成される。時刻制御部１２０４３は、タイムサーバ２９０との間で送受信したデータのタイムスタンプを保存したり、タイムサーバ２９０と時刻同期を行ったりする。 The time synchronization control unit 12038 performs processing relating to time synchronization with the time server 290. The data receiving unit 12041 and the data transmitting unit 12042 perform data communication with the other camera adapters 120, the image computing server 200, the time server 290, and the controller 300 via the networks 170, 180a, 180b, 291 and 300a. For example, the data transmission unit 12042 outputs the captured image captured by the camera 112, which has been processed by the image processing unit 12021, to the next camera adapter 120. As each camera adapter 120 outputs an image, a virtual viewpoint image is generated based on the images captured from a plurality of viewpoints. The time control unit 12043 stores time stamps of data transmitted to and received from the time server 290, and performs time synchronization with the time server 290.

続いて上述の構成を有する画像処理システム１００の動作について説明する。まず、カメラシステム１１０における撮影処理について図３を用いて説明する。図３は、撮影処理のシーケンスを示す図である。 Subsequently, the operation of the image processing system 100 having the above-described configuration will be described. First, photographing processing in the camera system 110 will be described with reference to FIG. FIG. 3 is a diagram showing a sequence of photographing processing.

まず、タイムサーバ２９０は例えばＧＰＳ２２０１などと時刻同期を行い、タイムサーバ内で管理される時刻の設定を行う（０６８０１）。なお、タイムサーバ２９０の時刻設定は、ＧＰＳを用いた方法に限定されるものではなく、ＮＴＰ（ＮｅｔｗｏｒｋＴｉｍｅＰｒｏｔｏｃｏｌ）など他の方法で時刻を設定してもよい。 First, the time server 290 performs time synchronization with, for example, the GPS 2201 and the like, and sets the time managed in the time server (06801). The time setting of the time server 290 is not limited to the method using GPS, and the time may be set by another method such as NTP (Network Time Protocol).

次にカメラアダプタ１２０は、タイムサーバ２９０との間で通信を行い、カメラアダプタ１２０内で管理される時刻を補正しタイムサーバ２９０と時刻同期を行う（０６８０２）。カメラアダプタ１２０は、タイムサーバ２９０と同期した時刻に基づいて、Ｇｅｎｌｏｃｋ信号や３値同期信号等の同期撮影信号及びタイムコード信号を、カメラ１１２に対して提供する。カメラアダプタ１２０は、カメラ１１２の撮影フレームレート（例えば、６０ｆｐｓ）に同期してカメラ１１２に対して同期撮影信号及びタイムコード信号の提供を開始する（０６８０３）。なお提供される情報はタイムコードに限定されるものではなく、撮影フレームを識別できる識別子であれば他の情報でもよい。 Next, the camera adapter 120 communicates with the time server 290, corrects the time managed in the camera adapter 120, and performs time synchronization with the time server 290 (06802). The camera adapter 120 provides a synchronized photographing signal such as a Genlock signal or a ternary synchronization signal and a time code signal to the camera 112 based on the time synchronized with the time server 290. The camera adapter 120 starts provision of the synchronized shooting signal and the time code signal to the camera 112 in synchronization with the shooting frame rate (for example, 60 fps) of the camera 112 (06803). The information to be provided is not limited to the time code, and may be other information as long as it is an identifier that can identify the shooting frame.

次に、カメラアダプタ１２０はカメラ１１２に対して撮影開始指示を行う（０６８０４）。カメラ１１２は撮影開始指示を受けると、Ｇｅｎｌｏｃｋ信号に同期して撮影を行う（０６８０５）。次に、カメラ１１２は撮影した画像をカメラアダプタ１２０へ送信する（０６８０６）。カメラ１１２が撮影を停止するまでＧｅｎｌｏｃｋ信号に同期した撮影が行われる。 Next, the camera adapter 120 instructs the camera 112 to start photographing (06804). When the camera 112 receives the imaging start instruction, the imaging is performed in synchronization with the Genlock signal (06805). Next, the camera 112 transmits the photographed image to the camera adapter 120 (06806). Shooting synchronized with the Genlock signal is performed until the camera 112 stops shooting.

また、カメラアダプタ１２０は、センサ１１４がセンシング（０６８０７）したセンサ情報を取得する（０６８０８）。なお、カメラアダプタ１２０は、Ｇｅｎｌｏｃｋ信号や３値同期信号等の同期撮影信号及びタイムコード信号をカメラ１１２の撮影フレームレートに同期してセンサ１１４に送信してもよい。そして、センサ１１４は、同期撮影信号またはタイムコード信号に基づいてセンシングを行ってもよい。また、カメラアダプタ１２０は、撮影時刻とセンシング時刻とが近い時刻にセンシングされたセンサ１１４からのセンシング情報を画像データに関連付ける構成としてもよい。 The camera adapter 120 also acquires (06808) the sensor information sensed (06807) by the sensor 114. The camera adapter 120 may transmit a synchronized shooting signal, such as a Genlock signal or a ternary synchronization signal, and a time code signal to the sensor 114 in synchronization with the shooting frame rate of the camera 112. The sensor 114 may then perform sensing based on the synchronized shooting signal or the time code signal. The camera adapter 120 may also be configured to associate with the image data the sensing information from the sensor 114 whose sensing time is close to the shooting time.
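One simple way to associate sensing information with a frame by closeness in time is a nearest-timestamp lookup, sketched below. The specification does not say how closeness is decided; the microsecond timestamps, the tuple layout, and the function name are all assumptions for illustration.

```python
def nearest_reading(frame_time_us: int,
                    readings: list[tuple[int, float]]) -> tuple[int, float]:
    """Return the (sensing_time_us, value) pair closest in time to the frame."""
    return min(readings, key=lambda r: abs(r[0] - frame_time_us))


# Hypothetical sensor readings: (sensing time in microseconds, measured value).
readings = [(0, 0.1), (16_000, 0.4), (34_000, 0.2)]

# A frame captured at 16,667 us (the second frame at 60 fps).
match = nearest_reading(16_667, readings)
```

When the sensor is itself driven by the synchronized shooting signal, as the text also permits, the timestamps would coincide and the lookup reduces to an exact match.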

続いて、カメラアダプタ１２０間で送受信される画像データを伝送するメッセージの一例を図４に示す。図４において、４０１は、このメッセージに含まれる画像を撮影したカメラ１１２を識別するカメラ番号であり、それぞれのカメラ１１２ａ〜１１２ｚの識別番号が入力される。図４において、カメラ番号４０１には、カメラ１１２ａを示す番号１１２ａが設定されており、後述する画像データ４０４を撮影したのは、カメラ１１２ａであることを示している。 Subsequently, an example of a message for transmitting image data transmitted and received between the camera adapters 120 is shown in FIG. In FIG. 4, reference numeral 401 denotes a camera number for identifying the camera 112 that has captured the image included in this message, and the identification number of each of the cameras 112 a to 112 z is input. In FIG. 4, the camera number 401 is set with a number 112a indicating the camera 112a, and it is shown that it is the camera 112a that photographed the image data 404 described later.

４０２は画像データ４０４のタイムコードであり、画像データ４０４が撮影された時刻を示す。４０３は、センサ１１４がセンシングにより取得したセンス情報に基づく情報である。図４では、センス情報の基づく情報４０３に、画像補正部１２０１３で補正したデータの補正量を格納している例を示す。ぶれ補正量を０〜９の１０段階で表してもよい。この場合、ぶれ補正量０はカメラ１１２に振動がなくぶれ補正を行っていないことを表し、１〜９の値は数字が大きくなるにしたがってぶれ補正量が大きいことを表すようにしてよい。なお、センス情報に基づく情報４０３に含まれる情報は、ぶれ補正量に限定されない。例えば、センス情報に基づく情報４０３には、センサ１１４から取得したセンサ情報そのものが入力されてもよい。４０４は画像データである。 Reference numeral 402 denotes a time code of the image data 404, which indicates the time when the image data 404 was captured. 403 is information based on sense information acquired by the sensor 114 by sensing. FIG. 4 shows an example in which the correction amount of the data corrected by the image correction unit 12013 is stored in the information 403 based on the sense information. The shake correction amount may be expressed in 10 steps of 0-9. In this case, the shake correction amount 0 may indicate that the camera 112 has no vibration and shake correction is not performed, and the values 1 to 9 may indicate that the shake correction amount increases as the number increases. The information included in the information 403 based on the sense information is not limited to the blur correction amount. For example, the sensor information itself acquired from the sensor 114 may be input to the information 403 based on the sense information. 404 is image data.

上記４０１〜４０４のデータのうち、画像データ以外の４０１〜４０３をメタ情報と呼ぶ。なお、図４で示すメッセージは一例であり、他の情報が含まれていてもよいし、４０１〜４０４のデータのうちの一部がなくてもよい。例えば、メッセージに不適切画像判定部１２０３４の判定結果を画像データのメタ情報として付加されてもよい。また、４０１〜４０３などのメタ情報と画像データ４０４とが別のメッセージとして通信されてもよい。 Among the data 401 to 404, the data 401 to 403 other than the image data are called meta information. The message shown in FIG. 4 is an example, and other information may be included, or some of the data 401 to 404 may not be present. For example, the determination result of the inappropriate image determination unit 12034 may be added to the message as meta information of the image data. Also, meta information such as 401 to 403 and the image data 404 may be communicated as separate messages.
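The message layout of FIG. 4 can be modeled as a small record type, sketched below. The field names, the Python types, and the time code string are assumptions chosen for illustration; only the numbering 401 to 404 and the split between meta information and image data come from the description.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ImageMessage:
    camera_number: str  # 401: identifies the camera that captured the image
    time_code: str      # 402: time at which the image was captured
    sense_info: int     # 403: here, a shake correction amount from 0 to 9
    image_data: bytes   # 404: the image payload

    def meta(self) -> dict:
        """Return fields 401-403, i.e. everything except the image data."""
        fields = asdict(self)
        fields.pop("image_data")
        return fields


# Hypothetical message from camera 112a with shake correction amount 3.
msg = ImageMessage("112a", "01:23:45:10", 3, b"\x00\x01")
meta = msg.meta()
```

Keeping the meta information separable from the payload matches the note that the two may be sent as separate messages, and lets a downstream adapter inspect field 403 without touching the image data.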

次に、カメラアダプタ１２０の出力処理の流れについて図５のフローチャートを用いて説明する。なお、図５のフローチャートは、例えば、カメラアダプタ１２０のデータの出力周期ごとに開始されてもよい。カメラアダプタ１２０は、カメラ１１２の撮影フレームレートごとにデータを出力してもよく、図５のフローチャートに示す出力処理を開始してもよい。例えば、カメラ１１２が、６０ｆｐｓで撮影を行う場合、図５のフローチャートにより示される処理は、１／６０秒ごとに開始されてもよい。また、図５のフローチャートにより示される処理は、自システムのカメラ１１２が撮影した画像を示す画像データを取得するごとに開始されてもよい。後述するフローチャートに示す処理は、カメラアダプタ１２０のＣＰＵ１２０１がプログラムを実行することで情報の演算や各ハードウェアを制御することで実現される。なお、後述するフローチャートの少なくとも一部のステップが専用のハードウェアにより実行されてもよい。専用のハードウェアは、例えば、ＡＳＩＣ、ＦＰＧＡまたはＧＰＵである。 Next, the flow of the output process of the camera adapter 120 will be described using the flowchart of FIG. The flowchart in FIG. 5 may be started, for example, at each data output period of the camera adapter 120. The camera adapter 120 may output data for each shooting frame rate of the camera 112, or may start output processing shown in the flowchart of FIG. For example, when the camera 112 shoots at 60 fps, the process shown by the flowchart of FIG. 5 may be started every 1/60 seconds. Also, the process shown by the flowchart of FIG. 5 may be started each time image data indicating an image captured by the camera 112 of the own system is acquired. The process shown in the flowchart to be described later is realized by the CPU 1201 of the camera adapter 120 executing a program to calculate information and control each hardware. Note that at least a part of the steps of the flowchart to be described later may be executed by dedicated hardware. The dedicated hardware is, for example, an ASIC, an FPGA or a GPU.

カメラアダプタ１２０は、自システムのカメラ１１２により撮影された画像を取得し、取得した画像のメタ情報をメタ情報生成部１２０２２により生成する（Ｓ５０１）。メタ情報生成部１２０２２は、画像を撮影した時のタイムコードまたはシーケンス番号、データ種別、及びカメラ１１２の個体を示す識別子、画像補正部１２０１３から取得したぶれ補正量などを、画像データのメタ情報として生成する。 The camera adapter 120 acquires an image captured by the camera 112 of its own system, and the meta information generation unit 12022 generates meta information for the acquired image (S501). The meta information generation unit 12022 generates, as meta information of the image data, the time code or sequence number at the time the image was captured, the data type, an identifier indicating the individual camera 112, the shake correction amount acquired from the image correction unit 12013, and the like.

カメラアダプタ１２０は、自システムのカメラ１１２により撮影された画像を示す画像データのデータ量を取得する（Ｓ５０２）。Ｓ５０２の処理では、画像処理部１２０２１の処理結果である画像データのデータ量を取得してもよいし、圧縮処理を施した後の画像データのデータ量を取得してもよい。また、Ｓ５０２の処理では、カメラアダプタ１２０は、自システムのカメラ１１２により撮影された画像に前景背景分離処理を施した結果である前景画像を示す画像データ及び／または背景画像を示す画像データのデータ量を取得してもよい。 The camera adapter 120 acquires the data amount of the image data representing the image captured by the camera 112 of its own system (S502). In S502, the data amount of the image data resulting from processing by the image processing unit 12021 may be acquired, or the data amount of the image data after compression may be acquired. Also in S502, the camera adapter 120 may acquire the data amount of the image data representing the foreground image and/or the image data representing the background image, obtained by applying foreground/background separation to the image captured by the camera 112 of its own system.

カメラアダプタ１２０は、他のカメラシステム１１０のカメラ１１２が撮影した画像を示す画像データが含まれるメッセージを上流において隣接するカメラアダプタ１２０から受信する。そして、カメラアダプタ１２０は、受信したメッセージのデータ量を計測する（Ｓ５０３）。即ち、カメラアダプタ１２０は、データ受信部１２０４１を介して上流カメラアダプタ１２０から取得したメッセージのデータ量をデータ量計測部１２０３６により計測する。 The camera adapter 120 receives a message including image data indicating an image captured by the camera 112 of another camera system 110 from the adjacent camera adapter 120 upstream. Then, the camera adapter 120 measures the data amount of the received message (S503). That is, the camera adapter 120 measures the data amount of the message acquired from the upstream camera adapter 120 via the data receiving unit 12041 by the data amount measuring unit 12036.

次に、カメラアダプタ１２０は、データ量取得部１２０３１で取得したデータ量と、データ量計測部１２０３６で計測したデータ量を基に、下流カメラアダプタ１２０ｄに出力するデータ量を出力データ量算出部１２０３２により算出する（Ｓ５０４）。 Next, the output data amount calculation unit 12032 of the camera adapter 120 calculates the amount of data to be output to the downstream camera adapter 120d, based on the data amount acquired by the data amount acquisition unit 12031 and the data amount measured by the data amount measurement unit 12036 (S504).

Next, the camera adapter 120 determines whether the output data amount calculated in S504 exceeds the amount of data that can be output (S505). In S505, the output data amount determination unit 12033 of the camera adapter 120 compares the output data amount with the transmission band restriction amount and determines whether the output data amount exceeds it. Specifically, the camera adapter 120 determines whether the amount of data to be output to the data transmission unit 12042 exceeds a predetermined output data amount threshold. For example, assume that the effective speed of the network over which the camera system 110z transmits data to the image computing server 200 is 100 Gbps. Assume also that the cameras 112a to 112z capture at a frame rate of 100 fps and that each of the camera systems 110a to 110z performs output processing every 1/100 second. To keep the transmission from the camera system 110z within the amount of data the network can carry, the camera system 110z must keep the data amount of each output process below 1 Gbit (100 Gbps × 1/100 s). Therefore, if the own-camera data of each of the 26 camera adapters 120a to 120z is 1/26 Gbit, the transmission from the camera system 110z does not exceed the amount of data the network can carry. That is, in this example, the camera adapter 120 may determine in S505 whether the output data amount exceeds (1 + n)/26 Gbit, where n is the number of upstream camera systems.
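The arithmetic of this example can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names and the use of exact fractions are assumptions made here, while the figures (100 Gbps, 100 fps, 26 adapters) come from the example above.

```python
from fractions import Fraction

GBIT = 10**9  # bits in one gigabit

def per_output_budget_bits(link_bps, fps):
    """Bits the link can carry during one output interval of 1/fps seconds."""
    return Fraction(link_bps, fps)

def s505_threshold_bits(n_upstream, total_adapters=26, link_bps=100 * GBIT, fps=100):
    """Threshold (1 + n) / total_adapters of the per-output budget (hypothetical helper)."""
    budget = per_output_budget_bits(link_bps, fps)
    return budget * (1 + n_upstream) / total_adapters

def exceeds_threshold(output_bits, n_upstream, **kwargs):
    """S505-style check: True means the data-reducing branch (S507) would be taken."""
    return output_bits > s505_threshold_bits(n_upstream, **kwargs)
```

With these defaults, the last adapter in the chain (25 upstream systems) is allowed the full 1 Gbit per output process, and the first adapter 1/26 Gbit.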

If it is determined in S505 that the output data amount does not exceed the threshold, that is, the amount of data that can be output (NO in S505), the camera adapter 120 performs normal transmission (S506). In S506, the camera adapter 120 transmits all of the own-camera image data acquired in S501 and the other-camera image data acquired in S503 to the downstream camera adapter.

If it is determined in S505 that the output data amount exceeds the threshold, that is, the amount of data that can be output (YES in S505), the camera adapter 120 reduces the data amount and then transmits the image data (S507).

The data reduction transmission process (S507) will be described in detail with reference to the flowchart of FIG. 7. The camera adapter 120 acquires the meta information of the own-camera image data generated in S501 (S701). The camera adapter 120 then extracts the meta information of the other-camera image data acquired in S503 (S702).

The inappropriate image determination unit 12034 of the camera adapter 120 determines which of the own-camera image acquired in S701 and the other-camera images acquired in S702 is inappropriate for use in generating a virtual viewpoint image. The camera adapter 120 may determine the shooting conditions of the acquired own-camera image and other-camera images and identify an image shot under conditions that can degrade the image quality of the virtual viewpoint image. The inappropriate image determination unit 12034 may also make this determination based on the sensor information associated with each of the own-camera image and the other-camera images. Alternatively, it may make the determination based on the meta information of the own-camera image data acquired in S701 and the meta information of the other-camera image data acquired in S702. The camera adapter 120 then removes the image data of the image that is most inappropriate for generating the virtual viewpoint image from the output targets and deletes that image data, thereby restricting transmission of that captured image to the network (S703).

The camera adapter 120 recalculates the output data amount after the reduction in S703 (S704) and determines whether the output data amount calculated in S704 exceeds the amount of data that can be output (S705).

If it is determined in S705 that the output data amount does not exceed the threshold, that is, the amount of data that can be output (NO in S705), the camera adapter 120 transmits the image data remaining as output targets to the downstream camera adapter (S706). If, on the other hand, it is determined in S705 that the output data amount exceeds the threshold (YES in S705), the camera adapter 120 repeats the processing from S703.
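The loop formed by S703 to S706 can be sketched as below. This is a simplified illustration under stated assumptions: the `Frame` type and its fields are hypothetical, and "most inappropriate" is approximated by the largest blur correction amount, which is only one of the criteria the description allows.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    camera_number: str      # e.g. "112b"
    size_bits: int          # data amount of this image data
    blur_correction: float  # assumption: larger value means less suitable

def reduce_for_transmission(frames, limit_bits):
    """Drop the least suitable frame until the total fits the limit.

    Returns (frames to transmit, frames dropped), mirroring the
    S703 -> S704 -> S705 loop followed by S706.
    """
    remaining = list(frames)
    dropped = []
    # S704/S705: recompute the total and compare it with the limit
    while remaining and sum(f.size_bits for f in remaining) > limit_bits:
        # S703: remove the frame judged most inappropriate
        worst = max(remaining, key=lambda f: f.blur_correction)
        remaining.remove(worst)
        dropped.append(worst)
    return remaining, dropped  # S706 would transmit `remaining`
```

The loop terminates because each pass removes one frame; in the worst case nothing is left to transmit.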

The flow of data between the camera adapters 120 during normal transmission (S506) will be described with reference to FIG. 6.

In FIG. 6, the camera adapter 120a transmits its own-camera image data, that is, image data representing the captured image acquired from the camera 112a, to the downstream camera adapter 120b. Reference numeral 600 denotes the data transmitted from the camera adapter 120a. The data 600 contains the own-camera image data generated by the camera system 110a. It stores the camera number 601 of the camera 112a, a time code 602 indicating the capture time, a blur correction amount 603 applied by the image correction unit 12013 based on the sensor information of the sensor 114a, and an image 604 processed by the image processing unit 12021.

The camera adapter 120b acquires its own-camera image data, that is, image data representing the captured image acquired from the camera 112b. The camera adapter 120b also acquires, from the camera adapter 120a, other-camera image data representing the image captured by the camera 112a. The camera adapter 120b determines that the output data (the own-camera image data acquired from the camera 112b and the other-camera image data acquired from the camera adapter 120a) does not exceed the amount of data that can be output, and transmits the output data normally.

Reference numerals 605 and 610 denote the data transmitted from the camera adapter 120b. The data 605 is the data received from the upstream camera adapter 120a and forwarded, and the data 610 is the own-camera image data generated by the camera system 110b.

The camera adapter 120c acquires own-camera image data representing the captured image acquired from the camera 112c. The camera adapter 120c also acquires, from the camera adapter 120b, other-camera image data representing the images captured by the cameras 112a and 112b. The camera adapter 120c determines that the output data (the own-camera image data acquired from the camera 112c and the other-camera image data acquired from the camera adapter 120b) does not exceed the amount of data that can be output, and transmits the output data normally.

Reference numerals 606, 615, and 620 denote the data transmitted from the camera adapter 120c. The data 606 and 615 are the data received from the upstream camera adapter 120b and forwarded, and the data 620 is the own-camera image data generated by the camera system 110c.

As described above, during normal transmission each camera adapter 120 forwards all of the data received from the upstream camera adapter 120, together with its own-camera image data, to the downstream camera adapter without omission.
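The forwarding behaviour of FIG. 6 can be illustrated with a short sketch. The `CameraData` type is a hypothetical stand-in for data such as 600; its four fields follow the items 601 to 604 named in the description, but the field types are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CameraData:
    camera_number: str      # 601
    time_code: int          # 602, capture time (integer representation assumed)
    blur_correction: float  # 603
    image: bytes            # 604, processed image payload

def normal_transmission(received, own):
    """S506: forward everything received from upstream, then append own data."""
    return [*received, own]

# Daisy chain 120a -> 120b -> 120c with placeholder payloads
a = CameraData("112a", 1000, 0.1, b"a")
b = CameraData("112b", 1000, 0.9, b"b")
c = CameraData("112c", 1000, 0.3, b"c")
out_a = normal_transmission([], a)     # corresponds to data 600
out_b = normal_transmission(out_a, b)  # corresponds to data 605, 610
out_c = normal_transmission(out_b, c)  # corresponds to data 606, 615, 620
```

The output of each adapter grows by one entry per hop, which is why the link nearest the server is the one most likely to saturate.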

Next, the flow of data between the camera adapters 120 when the camera adapter 120c performs the data reduction transmission process of S507 will be described with reference to FIG. 8. In FIG. 8, the data transmitted by the camera adapters 120a and 120b is the same as in FIG. 6.

In FIG. 8, the camera adapter 120c acquires own-camera image data representing the captured image acquired from the camera 112c. The camera adapter 120c also acquires, from the camera adapter 120b, other-camera image data representing the images captured by the cameras 112a and 112b. When the camera adapter 120c determines that the output data (the own-camera image data acquired from the camera 112c and the other-camera image data acquired from the camera adapter 120b) exceeds the amount of data that can be output, it performs the data reduction transmission process of S507.

The camera adapter 120c determines which of the acquired own-camera image and other-camera images is inappropriate for generating a virtual viewpoint image. Here, the camera adapter 120c determines the shooting conditions of the acquired images and identifies an image shot under conditions that can degrade image quality. For example, the camera adapter 120c acquires the meta information of each image and uses it to identify an image shot while blur was occurring. From the meta information, the camera adapter 120c determines that the image data 615 captured by the camera 112b, which has the largest blur correction amount, is not suitable for generating a virtual viewpoint image. The camera adapter 120c removes the image data 615 from the transmission targets and deletes it. The camera adapter 120c then determines that the output data amount after deleting the image data 615 does not exceed the amount of data that can be output, and transmits the output data to the downstream camera adapter 120d.

Reference numerals 606 and 620 denote the data transmitted from the camera adapter 120c in FIG. 8. Because the image data 615 has been deleted, unlike in FIG. 6, the load on the transmission band of the network that carries the captured images needed to generate the virtual viewpoint image is reduced. Moreover, because the deleted image is one that is inappropriate for generating a virtual viewpoint image, the loss of image quality in the virtual viewpoint image is limited even though fewer images are transmitted.

As described above, in this embodiment, image data determined to be inappropriate for creating a virtual viewpoint image is removed from the transmission targets, regardless of whether it is an own-camera image or an other-camera image, so that the transmission band is not exceeded. In addition, because such image data is deleted only when the transmission band is under pressure, unnecessary data reduction is avoided.

Note that the sensor information is not limited to information indicating camera vibration; the sensor 114 may also sense the orientation or position of the camera.

The embodiment above has been described mainly for the case where the image processing system 100 is installed in a facility such as a stadium or a concert hall. Other examples of facilities include amusement parks, parks, racetracks, velodromes, casinos, swimming pools, skating rinks, ski resorts, and live music clubs. Events held at such facilities may take place indoors or outdoors. The facilities in this embodiment also include facilities built temporarily (for a limited period).

In S703 of FIG. 7, as described above, the inappropriate image determination unit 12034 may identify an image as inappropriate for generating a virtual viewpoint image based not only on the blur amount but also on other shooting conditions, such as whether an object is blocking the imaging region or whether the imaging region has changed. In S703, as described above, the inappropriate image determination unit 12034 may also determine whether an image is inappropriate for generating a virtual viewpoint image based on the result of image processing on the captured image. The inappropriate image determination unit 12034 may further make the determination based on the shooting parameters (aperture, shutter speed, white balance, sensitivity, and so on) set in the camera 112 that captured the image. It may identify an image captured with shooting parameters that differ from prescribed values as inappropriate for generating a virtual viewpoint image. This is because, when a virtual viewpoint image is generated, an image whose shooting parameters differ from those of the other images differs in color tone and the like, which can degrade image quality. In this case, the shooting parameters may be attached to the image data as meta information.
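A shooting-parameter check of this kind might look like the following sketch. The parameter names, the reference values, and the exact-match comparison are all illustrative assumptions; the description only says that parameters differing from prescribed values may mark an image as inappropriate.

```python
from typing import Dict, List

def deviating_parameters(params: Dict[str, object], reference: Dict[str, object]) -> List[str]:
    """Return the names of shooting parameters that differ from the prescribed values."""
    return sorted(name for name, value in reference.items() if params.get(name) != value)

def is_inappropriate(params: Dict[str, object], reference: Dict[str, object]) -> bool:
    """Treat any deviation from the prescribed values as 'inappropriate' (assumption)."""
    return bool(deviating_parameters(params, reference))

# Hypothetical prescribed values shared by all cameras
REFERENCE = {"aperture": 5.6, "shutter_speed": "1/1000", "white_balance": 5500, "iso": 800}
```

A real system would probably allow tolerances rather than exact equality; that choice is left open here.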

In S703, the inappropriate image determination unit 12034 may also determine which image is inappropriate for generating a virtual viewpoint image, and therefore a target for data reduction, based on the position of the virtual viewpoint of the virtual viewpoint image generated by the image computing server 200. It may likewise make this determination based on the position and line-of-sight direction of the virtual viewpoint specified at the controller 300 or the end user terminal 190. In this case, the inappropriate image determination unit 12034 may select, as a target for data reduction, an image of an imaging region that is not visible from the position and line-of-sight direction of the virtual viewpoint, whether that viewpoint is the one used for the generated image or the one specified. An image of an imaging region that does not cover the area visible from the virtual viewpoint contributes little to generating the virtual viewpoint image, so omitting it from transmission has little effect on the quality of the generated virtual viewpoint image.
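One way to picture the viewpoint-based selection is the 2D sketch below. It is a deliberately simplified model and not taken from the patent: each camera's imaging region is reduced to a single centre point, the virtual viewpoint is given a horizontal field of view, and all names are hypothetical.

```python
import math

def region_visible(view_pos, view_dir_deg, fov_deg, region_center):
    """True if the region centre lies within the virtual viewpoint's horizontal field of view."""
    dx = region_center[0] - view_pos[0]
    dy = region_center[1] - view_pos[1]
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angular difference folded into [-180, 180)
    diff = (bearing - view_dir_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0

def reduction_candidates(view_pos, view_dir_deg, fov_deg, regions):
    """Cameras whose imaging region is not visible from the virtual viewpoint."""
    return [cam for cam, center in regions.items()
            if not region_visible(view_pos, view_dir_deg, fov_deg, center)]

# Virtual viewpoint at the origin looking along +x with a 90 degree field of view
regions = {"112a": (10.0, 0.0), "112b": (-10.0, 0.0), "112c": (10.0, 5.0)}
candidates = reduction_candidates((0.0, 0.0), 0.0, 90.0, regions)
```

In this toy layout only the region behind the viewpoint becomes a candidate for data reduction.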

In S703 of FIG. 7, the inappropriate image determination unit 12034 may also determine that an image captured by a malfunctioning camera 112 is inappropriate for generating a virtual viewpoint image and is a target for data reduction. When the camera 112 or its lens is damaged, the inappropriate image determination unit 12034 may likewise treat an image captured by that camera 112 as inappropriate for generating a virtual viewpoint image and as a target for data reduction. In this case, the operating state or damage state of the camera 112 may be attached to the image data as meta information.

In the description above, the camera adapter 120 performs the data reduction transmission process of S507 according to the remaining capacity and usage of the network transmission band, as determined in S505. However, it may perform the data reduction transmission process of S507 regardless of the remaining capacity of the transmission band. That is, the camera adapter 120 may omit S503 to S504 and S506 and perform the processing of S507.

In the example above, the camera adapter 120 reduces the output data amount in S703 by deleting the image determined to be inappropriate. Alternatively, transmission of that captured image to the network may be restricted by reducing its data amount, for example by lowering the pixel count or resolution of the image or by compressing it, and then transmitting the reduced image.
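As a rough illustration of reducing the pixel count instead of deleting the image, the sketch below subsamples a frame stored as a list of pixel rows. The representation and the plain decimation (no filtering) are assumptions chosen for brevity; the description does not specify a method.

```python
def downsample(pixels, factor):
    """Keep every `factor`-th pixel in each dimension (simple decimation)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [row[::factor] for row in pixels[::factor]]

def pixel_count(pixels):
    """Total number of pixels in a frame stored as a list of rows."""
    return sum(len(row) for row in pixels)

# A 4x4 frame whose pixel values are just their row-major index
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
reduced = downsample(frame, 2)
```

Decimating by a factor of 2 in both dimensions cuts the pixel count to a quarter, at the cost of detail in an image that was already judged less useful.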

Although embodiments of the present invention have been described in detail above, the present invention is not limited to these embodiments, and various modifications and changes are possible within the scope of the gist of the invention as set forth in the claims.

(Other Embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

100 Image processing system
110 Camera system
112 Camera
114 Sensor
120 Camera adapter
180 Switching hub
190 End user terminal
200 Image computing server
290 Time server
300 Controller

Claims

An image transmission device that, in an image processing system for generating a virtual viewpoint image using a plurality of captured images captured by a plurality of imaging devices, transmits at least a part of the plurality of captured images for generating the virtual viewpoint image, the image transmission device comprising:
an acquisition unit configured to acquire a captured image captured by at least one of the plurality of imaging devices;
a transmission unit configured to transmit the captured image acquired by the acquisition unit;
a determination unit configured to determine whether the captured image acquired by the acquisition unit is suitable for generating the virtual viewpoint image; and
a control unit configured to restrict transmission of the captured image by the transmission unit when the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image.

The image transmission device according to claim 1, wherein the determination unit determines whether the captured image is suitable for generating the virtual viewpoint image based on a capturing condition of the captured image acquired by the acquisition unit.

The image transmission device according to claim 1 or 2, wherein the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image when it is determined that blur has occurred in the captured image acquired by the acquisition unit.

The image transmission device according to any one of claims 1 to 3, wherein the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image when it is determined that the captured image acquired by the acquisition unit contains an object blocking the imaging target.

The image transmission device according to any one of claims 1 to 4, wherein the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image when it is determined that the imaging device that captured the captured image acquired by the acquisition unit is malfunctioning or damaged.

The image transmission device according to any one of claims 1 to 5, wherein the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image when it is determined that the imaging region of the imaging device that captured the captured image acquired by the acquisition unit has changed.

The image transmission device according to any one of claims 1 to 6, wherein the determination unit determines whether the captured image is suitable for generating the virtual viewpoint image based on a shooting parameter used when the captured image acquired by the acquisition unit was captured.

The image transmission device according to any one of claims 1 to 7, wherein the determination unit determines whether the captured image is suitable for generating the virtual viewpoint image based on sensor information obtained by sensing the capturing condition at the time the captured image acquired by the acquisition unit was captured.

The image transmission device according to any one of claims 1 to 8, wherein the determination unit determines whether the captured image is suitable for generating the virtual viewpoint image based on sensor information that is the result of sensing by an infrared sensor, a gyro sensor, an acceleration sensor, a millimeter wave sensor, GPS, or an electronic compass at the time the captured image acquired by the acquisition unit was captured.

The image transmission device according to any one of claims 1 to 9, wherein the determination unit determines whether the captured image is suitable for generating the virtual viewpoint image based on the position and line-of-sight direction of the viewpoint of the virtual viewpoint image.

The image transmission device according to any one of claims 1 to 10, wherein the determination unit determines whether the captured image is suitable for generating the virtual viewpoint image based on a result of image processing on the captured image.

The image transmission device according to any one of claims 1 to 11, wherein, when the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image, the control unit restricts transmission of the captured image by the transmission unit by reducing the data amount of the captured image.

The image transmission device according to any one of the preceding claims, wherein the control unit does not cause the transmission unit to transmit the captured image when the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image.

The image transmission device according to any one of claims 1 to 13, further comprising an identification unit configured to identify the data amount of the data to be transmitted by the transmission unit, wherein the control unit restricts transmission of the captured image by the transmission unit when the data amount identified by the identification unit exceeds a threshold.

The image transmission device according to any one of claims 1 to 14, wherein the acquisition unit acquires a first captured image captured by a first imaging device of the plurality of imaging devices and a second captured image captured by a second imaging device of the plurality of imaging devices, and the determination unit determines, from the first captured image and the second captured image acquired by the acquisition unit, a captured image that is not suitable for generating the virtual viewpoint image.

The image transmission device according to any one of claims 1 to 15, wherein the transmission unit transmits the captured image acquired by the acquisition unit when the determination unit determines that the captured image is suitable for generating the virtual viewpoint image.

The image transmission device according to any one of claims 1 to 16, wherein the transmission unit transmits the captured image acquired by the acquisition unit to another image transmission device that transmits at least a part of the plurality of captured images, or to a server that generates the virtual viewpoint image.

The image transmission device according to claim 9, wherein the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image when the sensor information that is the result of sensing by the gyro sensor or the acceleration sensor indicates that the imaging device that captured the captured image is vibrating.

The image transmission device according to claim 9, wherein the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image when the sensor information that is the result of sensing by the infrared sensor or the millimeter wave sensor indicates that an object is blocking the imaging region of the imaging device that captured the captured image acquired by the acquisition unit.

The image transmission device according to claim 9, wherein the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image when the sensor information that is the result of sensing by the electronic compass indicates that the imaging direction of the imaging device that captured the captured image acquired by the acquisition unit has changed.

The image transmission device according to claim 9, wherein the determination unit determines that the captured image is not suitable for generating the virtual viewpoint image when the sensor information that is the result of sensing by the GPS indicates that the position of the imaging device that captured the captured image acquired by the acquisition unit has changed.

An image transmission method in an image processing system including a plurality of imaging devices, a plurality of image transmission devices that transmit at least a part of a plurality of captured images captured by the plurality of imaging devices, and a server that generates a virtual viewpoint image, the method comprising:
restricting transmission to the server of a captured image, among the plurality of captured images, that is not suitable for generating the virtual viewpoint image.

A program for causing a computer to function as the image transmission device according to any one of claims 1 to 16.