JP2004007284A - Video recording system, program and recording medium

Info

Publication number: JP2004007284A
Application number: JP2002160647A
Authority: JP
Inventors: Norihiko Murata; 村田　憲彦; Shin Aoki; 青木　伸
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-05-31
Filing date: 2002-05-31
Publication date: 2004-01-08
Anticipated expiration: 2022-05-31
Also published as: JP3954439B2

Abstract

[Problem] To provide a video recording system that, with a simple configuration and simple processing, acquires and records wide-range video while simultaneously acquiring and recording video of a desired scene at high resolution.
[Solution] The video recording system comprises: a first imaging means that acquires wide-angle video; a second imaging means, made up of a plurality of cameras, that synchronously acquires a plurality of videos, each capturing a different predetermined region; an identification means that identifies the correspondence between the wide-angle video and each video acquired by the second imaging means; and a recording means that records the wide-angle video, at least one of the videos acquired by the second imaging means, and the correspondence.
[Selected drawing] FIG. 2

Description

【０００１】
【発明の属する技術分野】
本発明は、広角の視野を有する撮像手段を用いて取得された広範囲のシーンの映像を記録するシステムに関するものである。具体的には監視システム、遠隔会議システム、遠隔教育システム等の用途に使用される。
【０００２】
【従来の技術】
電気通信技術の発展により、会議の様子を撮影し、取得された画像を遠隔地に伝送するテレビ会議システムが多くの企業・団体で活用されるようになった。かかるシステムの利便性をより向上させるべく、従来より会議の様子を映像として取り込むための装置及び話者のみを切り出した部分映像を伝送するためのシステムが数多く提案されている。例えば特開平５−１２２６８９号公報において、マイクから入力される音声を検出して話者を判定し、該判定結果に基づいてカメラ制御部においてカメラを自動制御し、話者を捉えるというテレビ会議システムが提案されている。しかし、話者を捉えるためにカメラを制御するのに時間がかかるという問題点がある。
【０００３】
この問題点を解決すべく、特開平１１−３３１８２７号公報において、魚眼又は超広角レンズ及び可変指向性マイクロフォンを用いたテレビカメラ装置に関し、音源位置の方向を判定し、該音源位置方向を追尾し、音源位置方向の画像を切り出して映像信号を生成するという発明が公開されている。しかし、魚眼又は超広角レンズを用いた該テレビカメラ装置を机の上などに設置する場合、一般に天井などあまり重要でないものが視野の大半を占めるのに比べ、視野の周辺部に人間の顔などの重要な被写体が存在するという問題点がある。また、これらのレンズは設計及び製造にかかるコストが高価である。
【０００４】
一方、ＭＰＥＧ　（Ｍｏｔｉｏｎ　Ｐｉｃｔｕｒｅ　Ｅｘｐｅｒｔｓ　Ｇｒｏｕｐ）に代表される近年の画像符号化技術の進化、ハードディスクドライブ　（以下ＨＤＤと略す）の大容量化及びそのデータ転送速度の高速化に伴い、長時間の映像及び音声信号をパーソナルコンピュータ　（以下ＰＣと略す）に電子情報として記録することが技術的に可能となってきた。それと共に、会議などのシーンを映像に記録し、後で記録されたシーンを振り返ることへのニーズが高まっている。しかし、上記従来技術はいずれもテレビ会議システムを指向しており、装置により取得された映像から所望の部分のみを切り出した映像信号を実時間で伝送することにより、通常の会議と同様の臨場感の高い映像を実時間で生成及び伝送することに主眼が置かれたものであった。すなわち、取得された映像をＰＣに記録したり、後にＰＣ上でその映像を閲覧するために便宜を図ったものではなかった。
【０００５】
【発明が解決しようとする課題】
本発明は、上述の問題点に鑑みてなされたものであり、その第１の目的は、簡素な構成・処理で、広範囲の映像を取得・記録すると同時に、所望のシーンの映像を高い解像度で取得・記録することを可能とならしめる映像記録システム並びに該システムの各部の処理を実行させるためのプログラム及び記録媒体を提供しようとするものである。
【０００６】
また、本発明の第２の目的は、閲覧者に対し、高い解像度を持つ所望のシーンの映像を一層容易に選択可能とならしめる映像記録システムを提供しようとするものである。
【０００７】
また、本発明の第３の目的は、ユーザに面倒な操作を強いることなく、所望のシーンの映像を高い解像度で取得・記録することを可能とならしめる映像記録システムを提供しようとするものである。
【０００８】
また、本発明の第４の目的は、取得された広範囲の映像を、更に閲覧者に観察しやすい形で表示することを可能とならしめる映像記録システムを提供しようとするものである。
【０００９】
また、本発明の第５の目的は、更に所望のシーンの映像を高い解像度で漏れなく取得・記録することを可能とならしめる映像記録システムを提供しようとするものである。
【００１０】
【課題を解決するための手段】
広角の映像を取得する第１撮像部３１と、互いに異なる所定の領域が撮影された複数の映像を同期的に取得する第２撮像部３２と、前記広角の映像及び前記第２撮像部３２により取得された映像の少なくとも一つを記録する記録部３５とを備えることにより、簡素な構成・処理で、広範囲の映像を取得・記録すると同時に、所望のシーンの映像を高い解像度で取得・記録することが可能となる。すなわち、第１の目的が達成される。
【００１１】
請求項１に記載の発明は、映像記録システムであって、
広角の映像を取得する第一の撮像手段と、
複数のカメラにより構成され、互いに異なる所定の領域が撮影された複数の映像を同期的に取得する第二の撮像手段と、
前記広角の映像及び前記第二の撮像手段により取得された映像の少なくとも一つを記録する記録手段とを有する。
【００１２】
請求項１の発明により、広角の映像を取得する第１撮像部３１と、互いに異なる所定の領域が撮影された複数の映像を同期的に取得する第２撮像部３２と、前記広角の映像及び前記第２撮像部３２により取得された映像の少なくとも一つを記録する記録部３５とを備えることにより、簡素な構成・処理で、広範囲の映像を取得・記録すると同時に、所望のシーンの映像を高い解像度で取得・記録することが可能となる。すなわち、第１の目的が達成される。
【００１３】
請求項２に記載の発明は、映像記録システムであって、広角の映像を取得する第一の撮像手段と、
複数のカメラにより構成され、互いに異なる所定の領域が撮影された複数の映像を同期的に取得する第二の撮像手段と、
前記広角の映像と前記第二の撮像手段により取得された各々の映像との対応関係を特定する特定手段と、
前記広角の映像及び前記第二の撮像手段により取得された映像の少なくとも一つ及び前記対応関係を記録する記録手段とを有する。
【００１４】
請求項３に記載の発明は、映像記録システムであって、前記第二の撮像手段を構成する複数のカメラは各々識別子が付されており、
前記第一の撮像手段は、前記複数のカメラに付された識別子を撮影範囲に含み、
前記特定手段は、前記広角の映像において含まれる識別子の撮影位置に基づき、前記対応関係を特定する。
【００１５】
請求項４に記載の発明は、映像記録システムであって、前記特定手段は、前記広角の映像と前記第二の撮像手段により取得された各々の映像との類似度に基づき、前記対応関係を特定する。
【００１６】
請求項２、３及び、４に記載の発明により、第１撮像部３１により取得された広角の映像と、第２撮像部３２により取得された各々の部分映像との対応関係を特定する特定部３６を備えることにより、高い解像度を持つ所望のシーンの映像を一層容易に選択することが可能となる。すなわち、第２の目的が達成される
請求項５に記載の発明は、映像記録システムであって、広角の映像を取得する第一の撮像手段と、
複数のカメラにより構成され、互いに異なる所定の領域が撮影された複数の映像を同期的に取得する第二の撮像手段と、
前記第二の撮像手段により取得された複数の映像より所定の映像を選択する映像選択手段と、
前記広角の映像及び前記映像選択手段により選択された所定の映像を記録する記録手段とを有する。
【００１７】
請求項６に記載の発明は、映像記録システムであって、更に、音声を入力する複数のマイクロフォンと
前記複数のマイクロフォンにより入力された音声に基づいて音源の位置又は方向を検出する音源検出手段とを有し、
前記映像選択手段は、前記音源検出手段により出力された音源の位置若しくは方向に基づいて、前記所定の映像を選択する。
【００１８】
請求項７に記載の発明は、映像記録システムであって、更に、前記広角の映像又は前記第二の撮像手段により取得された複数の映像における被写体の動きを検出する動き検出手段を有し、
前記映像選択手段は、前記動き検出手段により出力された被写体の動きに基づいて、前記所定の映像を選択する。
【００１９】
請求項５、６及び、７に記載の発明により第２撮像部３２により取得される複数の映像より、所定の映像を選択する映像選択部３９を備えることにより、ユーザに面倒な操作を強いることなく、所望のシーンの映像を高い解像度で取得・記録することが可能となる。すなわち、第３の目的が達成される。
【００２０】
請求項８に記載の発明は、映像記録システムであって、更に、前記広角の映像を変形する変形手段を有し、
前記記録手段は、前記変形手段により変形された映像及び前記第二の撮像手段により取得された映像の少なくとも一つを同時に記録する。
【００２１】
請求項８に記載の発明により、広角の映像を変形する変形部３３を備えることにより、取得された広範囲の映像を、更に閲覧者に観察しやすい形で表示することが可能となる。すなわち、第４の目的が達成される。
【００２２】
また、第２撮像部３２により取得される各々の映像が、少なくとも一の他の映像と一部の共通する領域を含むことにより、更に所望のシーンの映像を高い解像度で漏れなく取得・記録することが可能となる。すなわち、第５の目的が達成される。
【００２３】
請求項９に記載の発明は、コンピュータに、請求項２乃至８の何れか一に記載の映像記録システムの各手段に係る処理を実行させることを特徴とするプログラムである。
【００２４】
請求項１０に記載の発明は、　映像記録システムの各手段に係る処理を実行するための、コンピュータ読み取り可能なプログラムソフトウェアを記録することを特徴とする記録媒体である。
【００２５】
請求項９又は１０に記載の発明により、広角の映像を取得する第１撮像部３１と、互いに異なる所定の領域が撮影された複数の映像を同期的に取得する第２撮像部３２と、前記広角の映像及び前記第２撮像部３２により取得された映像の少なくとも一つを記録する記録部３５とを備えることにより、簡素な構成・処理で、広範囲の映像を取得・記録すると同時に、所望のシーンの映像を高い解像度で取得・記録することが可能となる。すなわち、第１の目的が達成される。
【００２６】
【発明の実施の形態】
まず、映像記録システムがどのように使用されるかの使用例について簡単に概説し、次に、映像記録システムの実施の形態を具体的に説明する。各実施の形態においては、それを構成する要素及びその動作を説明し、最後に処理の流れについて説明する。
【００２７】
First, a usage example of the video recording system is described.
【００２８】
図１は、本発明を会議場面に設置した使用例を概説する説明図である。映像記録システムは、広角の映像を取得する広角カメラ２００と、通常の画角を持つ複数のカメラ４０１−１から４０１−４より構成されたカメラアレイ４００と、会議中の音声を取得するマイクロフォン５０１と、広角カメラ２００及びカメラアレイ４００で取得された映像データ並びにマイクロフォン５０１により取得された音声データを取り込み、記録するためのサーバ３００とを有する。
【００２９】
図１に示したように、広角カメラ２００は、テーブル１に設置され、会議の参加者（話者）２−１から２−４のいる方向、例えば水平面を見渡す全周囲の画像を一括して撮像する。また、カメラアレイ４００を構成する各々のカメラ４０１−１から４０１−４は、例えばそれぞれ会議の参加者の前面に置かれ、各参加者の姿を撮影する。これらのカメラ４０１−１から４０１−４により取得される映像を、以後「部分映像」と呼ぶ。また、サーバ３００はキャビネット３に格納され、広角カメラ２００及びカメラアレイ４００からの映像データ並びにマイクロフォン５０１により取得された音声データを取得し、取得された映像データ及び音声データをＨＤＤに記録する。
【００３０】
以下の各実施の形態では、本発明の映像記録システムを、会議の撮影及びその映像の記録に適用した場合について説明する。
【００３１】
１．実施の形態１
先ず最初に、本発明の実施の形態１について説明する。
【００３２】
１．１　構成
図２は、本発明の実施の形態１に係る映像記録システムの構成を示す図である。サーバ３００には、ＵＳＢハブ３２０及びバス３１０を介して広角カメラ２００と、カメラアレイ４００とが接続され、広角の映像データ及びカメラアレイ４００により取得された少なくとも１つの部分映像データが取得・記録される。サーバ３００により記録された映像データは、サーバ３００において表示される。また、該映像データは、必要に応じてインターネットを介して配信され、該インターネットに接続されたクライアントＰＣにおいて表示される。
【００３３】
次に、上記各部の構成について説明する。
【００３４】
１．１．１　広角カメラ
図３は、実施の形態１に係る、第一の撮像手段としての、広角カメラ２００の構成を示す図である。この第一の撮像手段としての、広角カメラ２００は、所定形状の曲面を有するミラー２１１と、レンズ２１２と、絞り２１３と、ＣＣＤ（Ｃｈａｒｇｅ　ＣｏｕｐｌｅｄＤｅｖｉｃｅ）等の撮像素子２１４と、上記撮像素子２１４のタイミング制御、並びに上記撮像素子２１４により得られた映像信号に対してアナログ−デジタル変換等のデジタル化処理を行う駆動部２１５と、前記駆動部２１５により得られたデジタル信号に対してエッジ強調やγ補正等の前処理を行う前処理回路２１６と、アイリスを制御するために絞り２１３を駆動するモータ駆動部２１７とを備えている。
【００３５】
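The mirror 211 reflects the light entering the optical system so that a wide angle can be captured; here, a hyperboloidal mirror is used as the mirror with a curved surface of predetermined shape. FIG. 4 illustrates the optical path when the hyperboloidal mirror 211 of this embodiment is used, and FIG. 5 shows the wide-angle image that the hyperboloidal mirror 211 forms on the surface of the image sensor 214. As FIG. 5 shows, the image reflected from the hyperboloidal mirror 211 and captured by the image sensor 214 is donut-shaped (this donut-shaped video is hereinafter called the "donut video"). The donut video is formed on the image sensor 214, digitized by the drive unit 215, and sent through the preprocessing circuit 216 to the server 300 described later. The central part of the image in FIG. 5 shows the direction of the image sensor 214 itself and carries no important image information. The apex 218 of the hyperboloidal mirror 211 may therefore be painted black so that this part becomes black information. Depending on the mode of use, a reference line may instead be drawn on the apex 218 and used for initial settings such as focus adjustment, by driving the motor drive unit 217 when the wide-angle camera 200 starts up.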
【００３６】
上記のように、通常のカメラとミラーの組み合わせにより、安価かつ簡素な構成で広角の映像を撮影することができる。
【００３７】
１．１．２　カメラアレイ
カメラアレイ４００は少なくとも１つのカメラより構成され、各々のカメラは、前記広角カメラ２００の撮影範囲の一部のシーンを、より高い解像度で撮影する。カメラアレイ４００を構成するカメラ４０１は、図１のようにバラバラに配置されても、図６のように各々のカメラ４０１−１から４０１−３を筐体４０２に固定して配置したものであっても構わない。カメラ４０１に使用される撮像素子は、ＣＣＤ、ＣＭＯＳ　（Ｃｏｍｐｌｅｍｅｎｔａｒｙ　Ｍｅｔａｌ−Ｏｘｉｄｅ　Ｓｅｍｉｃｏｎｄｕｃｔｏｒ）型など様々な種類のものを使用することができる。該撮像素子において結像された映像信号は、カメラ内部でデジタル化された後、後述するサーバ３００に送出される。
【００３８】
上記の構成を有するカメラ４０１を少なくとも１つ用意することにより、安価かつ簡素な構成で解像度の高い部分映像を取得することができる。
【００３９】
１．１．３　サーバ
図７は、本実施の形態におけるサーバ３００の構成例を示した図である。すなわち、映像記録システム１００における各種の制御及び処理を行うＣＰＵ　（Ｃｅｎｔｒａｌ　Ｐｒｏｃｅｓｓｉｎｇ　Ｕｎｉｔ）　３０１と、ＳＤＲＡＭ　（Ｓｙｎｃｈｒｏｎｏｕｓ　Ｄｙｎａｍｉｃ　Ｒａｎｄｏｍ　Ａｃｃｅｓｓ　Ｍｅｍｏｒｙ）　３０２と、ＨＤＤ　（Ｈａｒｄ　Ｄｉｓｋ　Ｄｒｉｖｅ）　３０３と、マウス３１１等のポインティングデバイス、キーボード３１２等の各種入力インターフェース（以下Ｉ／Ｆと略す）３０４と、電源３０５と、ＣＲＴ　（Ｃａｔｈｏｄｅ　Ｒａｙ　Ｔｕｂｅ）等のディスプレイとを接続するための表示Ｉ／Ｆ３０６と、前記広角カメラ２００や前記カメラアレイ４００などの外部機器を接続するための外部Ｉ／Ｆ　３０７と、ＤＶＤ　（Ｄｉｇｉｔａｌ　Ｖｅｒｓａｔｉｌｅ　Ｄｉｓｃ）＋ＲＷドライブ等の大容量記録装置３０８とを、バス３１３を介して接続することにより構成される。
【００４０】
次に、サーバ３００の各構成部について説明する。ＣＰＵ　３０１は、ＨＤＤ　３０３に格納された所定のプログラムにしたがって、広角カメラ２００及びカメラアレイ４００からの映像取得・記録などの各種処理及び制御を行う。ＳＤＲＡＭ　３０２は、ＣＰＵ　３０１の作業領域として利用されるとともに、ＨＤＤ　３０３に格納された各処理プログラムや、Ｗｉｎｄｏｗｓ（登録商標）　ＮＴ　Ｓｅｒｖｅｒ　（米国Ｍｉｃｒｏｓｏｆｔ社の登録商標）などのＯＳ　（Ｏｐｅｒａｔｉｎｇ　Ｓｙｓｔｅｍ）の記憶領域として利用される。また、該ＨＤＤ　３０３は、取得された映像の記録するための領域としても使用される。
【００４１】
外部Ｉ／Ｆ３０７の一例として各種Ｉ／Ｆボード、ＵＳＢ　（Ｕｎｉｖｅｒｓａｌ　Ｓｅｒｉａｌ　Ｂｕｓ）、ＩＥＥＥ　１３９４、或いはＩｒＤＡ、Ｂｌｕｅｔｏｏｔｈ等の無線Ｉ／Ｆが挙げられる。前記広角の映像データ及び前記カメラアレイ４００により取得される複数の部分映像データは、前記広角カメラ２００及びカメラアレイ４００をＵＳＢ２．０のような高速シリアルインターフェース経由でサーバ３００に接続することにより、同期的に取得することが可能である。外部Ｉ／Ｆ　３０７を経由して取得されたデータは、ＨＤＤ　３０３又は大容量記録装置３０８に記録される。
【００４２】
１．２　動作
図８は、図２に示された本実施の形態に係る映像記録システムを、機能別のブロック図に書き直した図である。以下において、図８に示された各部の動作を具体的に説明する。
【００４３】
１．２．１　第１撮像部
第１撮像部３１は、上記の１．１．１に記載した広角カメラ２００により構成され、取得され且つデジタル化された広角の映像データを出力する動作を行う。
【００４４】
１．２．２　第２撮像部
The second imaging unit 32 consists of the camera array 400 described in 1.1.2 above, and outputs the acquired and digitized partial video data.
【００４５】
１．２．３　変形部
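FIG. 9 illustrates the operation of the transformation unit 33 in Embodiment 1. As FIG. 9 shows, the transformation unit 33 transforms the wide-angle video data acquired by the first imaging unit 31 into video close to the perspective image captured by an ordinary camera (hereinafter called "panoramic video"). Video from a camera that can capture a wide angle generally differs in shape from what the human eye sees and, as described above, contains large distortion, so it is preferable to transform it for convenience in later viewing. The following explains how to transform the wide-angle video data (the donut video shown in FIG. 5) into panoramic video, with reference to the method described in A. M. Bruckstein and T. J. Richardson, "Omniview Cameras with Curved Surface Mirrors", Proc. of the IEEE Workshop on Omnidirectional Vision 2000, pp. 79-84.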
【００４６】
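FIG. 10 illustrates the principle of video transformation for a camera that uses a hyperboloidal mirror. FIG. 10(a) shows an example of the operation of the transformation unit 33: the donut video is coordinate-transformed into panoramic video projected onto a cylindrical surface whose horizontal axis is the azimuth angle and whose vertical axis is the elevation angle. FIG. 10(b) shows the geometric structure of the wide-angle camera 200; the camera optics in FIG. 10(b) follow a central projection model. The variables in the figure have the following meanings.
(u, v): coordinates in the donut video
(u_0, v_0): coordinates of the center of the hyperboloidal mirror in the donut video
(θ, φ): coordinates in the panoramic video
r: distance in pixels from (u_0, v_0) to (u, v)
r_max: radius in pixels of the hyperboloidal mirror in the donut video
θ: azimuth angle
φ: elevation angle
ψ: vertex angle measured from the optical axis of the camera
F: focal point of the hyperboloidal mirror
F': focal point of the hyperboloid paired with the hyperboloidal mirror; it coincides with the optical center of the camera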
【００４７】
このとき、頂角ψと仰角φとの間に、以下の関係が成立する。
【００４８】
【数１】

ここで、
【００４９】
【数２】

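Also, φ_min is the value of the elevation angle φ that corresponds to the radius r_max on the donut video, and represents the limit of the camera's coverage in the elevation direction. The values of r_max and φ_min are generally easy to obtain.
The transformation proceeds as follows.
(i) The polar coordinates (r, θ) corresponding to the point (u, v) are obtained by solving the following equation.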
【００５０】
【数３】

（ｉｉ）　（３）式により算出されたｒに対応する頂角ψを次式により求める。
【００５１】
【数４】

ここで、
【００５２】
【数５】

であり、ψ_ｍａｘはドーナッツ映像上の半径ｒ_ｍａｘの位置及び仰角φ_ｍｉｎに対応する頂角ψの値である。ψ_ｍａｘの値は、（１）式にφ_ｍｉｎを代入することにより求めることができる。
【００５３】
（ｉｉｉ）　（４）式により算出されたψに対応する仰角φを、（１）式により求める。
【００５４】
以上の手順により、双曲面ミラーにより撮影されたドーナッツ映像における任意の点（ｕ，ｖ）を、パノラマ映像における点（θ，φ）に座標変換することができる。すなわち、ドーナッツ映像がパノラマ映像に変形される。
【００５５】
図１１は、変形部３３で使用される座標変換テーブルを説明する図である。撮影からパノラマ映像の記録を一時に行う場合、上記の変形処理に要する計算時間が問題となるため、図１１のように、上記の手順に基づいた座標変換テーブルを予め作成しておくと好適である。図１１の座標変換テーブルにおいては、各点（θ，φ）に対応するドーナッツ映像の座標（ｕ，　ｖ）を格納しておく。
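The coordinate conversion table described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's own implementation: equations (1) to (5) are not reproduced in this text, so the mapping from elevation angle to vertex angle is passed in as a hypothetical function `psi_of_phi`, and the radius is assumed to follow a central-projection model, r = r_max · tan ψ / tan ψ_max. The names `build_panorama_table` and `remap` are illustrative only.

```python
import math

def build_panorama_table(width, height, u0, v0, r_max, psi_max,
                         phi_min, phi_max, psi_of_phi):
    """Return table[row][col] = (u, v) donut coordinates for each panorama pixel.

    psi_of_phi is a caller-supplied (hypothetical) mapping from elevation
    angle to vertex angle; height must be at least 2.
    """
    table = []
    for row in range(height):
        # Rows run from phi_max (top row) down to phi_min (bottom row).
        phi = phi_max - (phi_max - phi_min) * row / (height - 1)
        psi = psi_of_phi(phi)
        # Assumed central-projection model: radius grows with tan(psi).
        r = r_max * math.tan(psi) / math.tan(psi_max)
        line = []
        for col in range(width):
            theta = 2.0 * math.pi * col / width  # column index -> azimuth
            line.append((u0 + r * math.cos(theta), v0 + r * math.sin(theta)))
        table.append(line)
    return table

def remap(donut, table):
    """Nearest-neighbour lookup; pixels that fall outside the donut become 0."""
    rows, cols = len(donut), len(donut[0])
    out = []
    for line in table:
        out_line = []
        for u, v in line:
            x, y = int(round(u)), int(round(v))
            out_line.append(donut[y][x] if 0 <= x < cols and 0 <= y < rows else 0)
        out.append(out_line)
    return out
```

Building the table once and reusing it for every frame reduces the per-frame work to one lookup per panorama pixel, which is the motivation the text gives for preparing the table in advance.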
【００５６】
以上の変形処理は、前記サーバ３００内の前記ＣＰＵ　３０１により実行される。このとき、前記ＨＤＤ　３０３には該変形処理を施すための所定のプログラムを予め格納しておく。
【００５７】
１．２．４　エンコード部
図８のエンコード部３４は、図８の前記第２撮像部３２により取得された部分映像データの少なくとも一つと、前記変形部３３により出力されたパノラマ映像データを、映像記録に適した形式にエンコードする。ここで、映像記録に適した形式は様々挙げられるが、ＭＰＥＧに代表される動画符号化フォーマットなどの形式を使用する。エンコード部３４は、映像データの取得が継続している限り常に映像をエンコードし続け、エンコードされたデータを連続して記録部３５に送信する。
【００５８】
以上のエンコード処理は、前記サーバ３００内の前記ＣＰＵ　３０１により実行される。このとき、前記ＨＤＤ　３０３にＭＰＥＧエンコードプログラムを予めインストールしておく。
【００５９】
１．２．５　記録部
記録部３５は、前記エンコード部３４によりエンコードされた少なくとも１つの部分映像データ及び前記第１撮像部３１により取得された広角の映像データを記録する。
【００６０】
記録部３５は、ＨＤＤ　３０３によりその機能を実現することができる。なお、使用の態様によっては、大容量記録装置３０８によりその機能を実現してもよい。例えば、長時間の会議や、定例会議については、保存の必要性からＤＶＤ＋ＲＷ等により構成される大容量記録装置３０８に記録し、短時間の会議など、長期の保存の必要性が低いものに関してはＨＤＤ　３０３に記録するなどの使い分けを行ってもよい。
【００６１】
２．　実施の形態２
また、一方向に曲率をもった曲面ミラーを、広角カメラ２００に使用することもできる。
【００６２】
２．１　構成
As in Embodiment 1 above, the configuration of Embodiment 2 is shown in FIG. 2.
【００６３】
以下、本実施の形態における広角カメラ２００の構成について説明する。なお、サーバ３００及びカメラアレイ４００の構成も、実施の形態１で説明した通りであるので、説明を省略する。
【００６４】
２．１．１　広角カメラ
図１２は、本実施の形態における広角カメラ２００の構成を示した図である。図１２に示すように、広角カメラ２００は、通常の画角を有するカメラ２１９と一方向に曲率をもったミラー２１１とにより構成されており、全方位を撮影することはできないが広い範囲のシーンを撮影することができる。図１３は、該ミラー２１１を使用したときのカメラ２１９に映される広角画像の様子を示した図であり、カメラ２１９の背後のシーンを撮影することができる。図１３に示したように、取り込まれる画像は、入射光の水平方向の角度と、撮影される画像の位置の横方向の座標が比例した状態で、横方向に圧縮された形状となっている。また、カメラ２１９自身の画像への写り込みを低減するように改良することも可能である。
【００６５】
上記のように、通常のカメラとミラーの組み合わせにより、安価かつ簡素な構成で広角の映像を撮影することができる。
【００６６】
２．２　動作
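As in Embodiment 1 above, the functional block diagram of this embodiment is shown in FIG. 8. The operation of each unit shown in FIG. 8 is described below. The first imaging unit 31, the second imaging unit 32, and the recording unit 35 operate as in Embodiment 1, so their description is omitted.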
【００６７】
２．２．１　変形部
本発明の実施の形態２においては、広角カメラ２００により取得された映像データを、横方向に一様に引き伸ばすだけでパノラマ映像を得ることが可能である。双曲面ミラーを使用した場合と同様に、図１１のような座標変換テーブルを作成し、パノラマ映像の各点に対応する変形前の映像の座標（ｕ，　ｖ）を格納するようにすればよい。
【００６８】
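With this wide-angle camera 200, panoramic video can also be generated and displayed on the client side that displays the video, without providing the transformation unit 33 in the video recording system 100. Suppose the wide-angle camera 200 acquires video of 352 × 240 pixels covering 180 degrees horizontally and 60 degrees vertically. Stretching the width by a factor of three, to 1056 pixels, then yields panoramic video. Suppose further that the machine name of the server 300 is "vidserv", that the wide-angle video data delivered from the video recording system is named "movie.rm" (a data format called RealVideo, described later), and that the video display terminal and the server 300 communicate over RTSP (Real Time Streaming Protocol). As FIG. 14 shows, the stretching can then be described in SMIL (Synchronized Multimedia Integration Language), a W3C (World Wide Web Consortium) Recommendation. When the size of the display region specified in the <region> tag differs from the image size of the associated video data "movie.rm", setting the fit attribute to "fill" scales the video data to the size of the display region. In other words, with the fit attribute set this way, panoramic video is displayed by specifying a display region with the desired magnification relative to the video data. This transformation runs on the client's video display terminal at the same time as the video is displayed. Because the server 300 does not have to run the transformation during recording, wide-angle video data can be recorded at low processing cost.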
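The client-side stretching described above can be illustrated by generating a small SMIL document in Python. This is only a sketch: the contents of FIG. 14 are not reproduced in this text, so the exact markup, the region id `pano`, and the function name `panorama_smil` are assumptions. Only the values "vidserv", "movie.rm", 352 × 240, the factor of three, RTSP, and fit="fill" come from the description.

```python
import xml.etree.ElementTree as ET

def panorama_smil(server, name, src_w, src_h, stretch):
    """Build a SMIL document whose region is `stretch` times wider than the source video."""
    width = src_w * stretch
    smil = ET.Element("smil")
    layout = ET.SubElement(ET.SubElement(smil, "head"), "layout")
    ET.SubElement(layout, "root-layout", width=str(width), height=str(src_h))
    # fit="fill" scales the video to the region size, ignoring its aspect ratio.
    ET.SubElement(layout, "region", id="pano", width=str(width),
                  height=str(src_h), fit="fill")
    body = ET.SubElement(smil, "body")
    ET.SubElement(body, "video", src=f"rtsp://{server}/{name}", region="pano")
    return ET.tostring(smil, encoding="unicode")

# Example with the values from the description: 352 px stretched to 1056 px.
document = panorama_smil("vidserv", "movie.rm", 352, 240, 3)
```

Because the region is three times wider than the source while the height is unchanged, a player that honours fit="fill" performs the horizontal stretch at display time.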
【００６９】
２．２．２　エンコード部
エンコード部３４の動作は、上述の実施の形態１と同様であり、前記第２撮像部３２により取得された部分映像データの少なくとも一つ及び前記変形部３３により出力されたパノラマ映像データを、それぞれ映像記録に適した形式にエンコードする。
【００７０】
なお、変形部３３が存在しない場合、エンコード部３４は、前記変形部３３により変形された広角の映像データの代わりに、前記第１撮像部３１により取得された広角の映像データをエンコードする。
【００７１】
３．実施の形態３
本発明の実施の形態３は、前記広角カメラ２００により取得された広角の映像データと、前記カメラアレイ４００により取得された各々の部分映像データとの対応関係を特定する映像記録システムに関するものである。ここでいう「対応関係」の例として、以下のものが挙げられる。
・広角カメラ２００とカメラアレイ４００を構成する各々のカメラ４０１との位置関係
・広角の映像データと各々の部分映像データとの位置関係
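If the correspondence is unknown, a request to switch partial videos during playback is not guaranteed to display the desired partial video data. One countermeasure is to install the camera array 400 so that pressing the left-arrow button of the video selection buttons 603 repeatedly switches the displayed partial video data in counterclockwise order. However, the switching order of the partial videos must then match the placement order of the cameras, which makes installing the video recording system 100 cumbersome.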
【００７２】
図１５は、クライアントが持つ映像表示端末において、上記の対応関係を利用した表示画面の一例を示す図である。図において、広角の映像６０２の下側にバー６０４が設置され、現在表示されている部分映像６０１に対応する撮影範囲が、黒色のバー６０５で示されている。また、現在表示されている部分映像６０１以外の部分映像の撮影範囲が、それぞれ灰色のバー６０６で示されている。ここで、クライアントは、マウス　（図示せず）を操作することによりカーソル６０７を動かし、所定の部分映像を示す灰色のバー６０６の上をクリックすると、サーバ３００に対して配信要求する部分映像の選択情報が送信され、サーバ３００を介して送られるカメラアレイ４００からの部分映像６０１が、該当する部分映像に切り替えられる。このように、上記の対応関係が特定されることにより、映像記録システム１００の設置作業が楽になる。また、クライアントは一層容易に所望の映像を選択することができると共に、配信された映像から、撮影対象となるシーンをより深く理解することができる。
【００７３】
本発明の実施の形態３は、このような動作を実現するための映像記録システムに関するものである。
【００７４】
３．１　構成
前述の実施の形態１と同様に、本発明の実施の形態３の構成は、図２乃至図７に示されている。
【００７５】
３．２　動作
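FIG. 16 shows the video recording system according to Embodiment 3 of the present invention as functional blocks; it adds the identification unit 36 to the block diagram of Embodiment 1 shown in FIG. 8. The operation of each unit shown in FIG. 16 is described below. The first imaging unit 31, the second imaging unit 32, the transformation unit 33, and the encoding unit 34 operate as in Embodiment 1, so their description is omitted.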
【００７６】
３．２．１　特定部
特定部３６は、前記広角カメラ２００により取得された広角の映像データと、前記カメラアレイ４００により取得された各々の部分映像データとの対応関係を特定する動作を行う。この動作を、以下に説明する。
（１）カメラアレイ４００を構成する各カメラ４０１に付された識別子を利用する方法
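FIG. 17 shows an example of the operation of the identification unit 36. As shown in FIG. 17(a), an identifier 403 is attached to each camera 401 of the camera array 400, and each camera 401 is placed where the wide-angle camera 200 can capture it. The video data acquired by the wide-angle camera 200 in this state looks like FIG. 17(b). Detecting the image coordinates at which each identifier 403 appears in this video data identifies the positional relationship between the wide-angle camera 200 and each camera 401. The identifier 403 can be, for example,
・a sticker bearing an Arabic numeral,
・a barcode,
・a color code,
・a two-dimensional barcode,
and reading such identifiers from video data is already a well-known technique in the field of pattern recognition.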
（２）広角カメラ２００及びカメラアレイ４００により取得された映像データを利用する方法
図１８は、特定部３６の別の動作例を示す図である。本動作例においては、広角の映像データと各々の部分映像データとの類似度が高い部分を検出する。
【００７７】
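The following describes the operation when template matching is used to detect the highly similar parts. First, as in FIG. 18(a), a template 608 of size (2DX + 1) × (2DY + 1) is generated from each partial video acquired by the camera array 400. Next, as in FIG. 18(b), the template 608 is moved over the wide-angle video 602, and the normalized cross-correlation value S between the template 608 and the point (m, n) in the wide-angle video 602 is calculated by the following equation.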
【００７８】
【数６】

ここで、（６）式における各記号の意味は以下の通りである。
・Ｉ_１（ｘ，　ｙ）：テンプレート上の点（ｘ，　ｙ）における濃度
・Ｉ_２（ｘ，　ｙ）：広角の映像上の点（ｘ，　ｙ）における濃度
以上の計算に基づき、正規化相互相関値Ｓが最大となる広角の映像６０２における点（ｍ，　ｎ）を求め、該点の位置に対応するカメラ４０１を特定すればよい。以上の動作を、全ての部分映像に対して実行することにより、広角カメラ２００と各々のカメラ４０１との位置関係を特定することができる。
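The search for the point (m, n) that maximizes S can be sketched in Python as follows. Equation (6) is not reproduced in this text, so the sketch assumes one common form of normalized cross-correlation (the sum of products divided by the square root of the product of the two energies, without mean subtraction); the patent's exact formula may differ. Images are plain lists of rows of intensity values, and the function names are illustrative.

```python
import math

def ncc(template, image, m, n):
    """Normalized cross-correlation of `template` centred at column m, row n of `image`."""
    dy, dx = len(template) // 2, len(template[0]) // 2
    num = t_energy = i_energy = 0.0
    for y in range(-dy, dy + 1):
        for x in range(-dx, dx + 1):
            t = template[y + dy][x + dx]   # I_1: intensity on the template
            i = image[n + y][m + x]        # I_2: intensity on the wide-angle video
            num += t * i
            t_energy += t * t
            i_energy += i * i
    denom = math.sqrt(t_energy * i_energy)
    return num / denom if denom > 0 else 0.0

def best_match(template, image):
    """Return (S, m, n) for the position where the score is largest."""
    dy, dx = len(template) // 2, len(template[0]) // 2
    best = (-1.0, None, None)
    for n in range(dy, len(image) - dy):
        for m in range(dx, len(image[0]) - dx):
            s = ncc(template, image, m, n)
            if s > best[0]:
                best = (s, m, n)
    return best
```

In practice the template taken from a partial video would first have to be resampled to the scale and geometry of the wide-angle video; that step is outside this sketch.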
【００７９】
なお、濃度の相互相関に基づいて、映像の類似度を求めると述べたが、これはあくまでも一例である。映像の色空間や輪郭など、別の特徴に基づいて映像の類似度を求めても構わない。
（３）手動で特定する方法
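FIG. 19 shows the display screen of the server 300 in Embodiment 3. This screen appears after the video recording system 100 starts up, immediately before recording begins. The user first operates the video selection buttons 603 to switch the displayed partial video 601. The screen then presents a message 609 prompting the user to enter manually the positional relationship between the currently displayed partial video 601 and the wide-angle video 602. The user operates a mouse (not shown) to move the cursor 607 and clicks a point on the wide-angle video 602 to enter that positional relationship. When the entry is complete, a cross-shaped pointer 610 is placed on the wide-angle video 602 at the position corresponding to that partial video 601. Repeating this for every partial video identifies the positional relationship between the wide-angle video 602 and each partial video 601.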
【００８０】
この方法は、映像記録の開始から終了に至るまで、広角カメラ２００及びカメラアレイ４００の配置位置が不変である場合に、特に大きな効果を奏する。これに対して、上記（１）乃至（２）の方法は、途中でカメラ４０１の配置位置を変更しても有効である。
【００８１】
以上の処理は、前記サーバ３００内の前記ＣＰＵ　３０１により実行される。このとき、前記ＨＤＤ　３０３には該特定処理を施すための所定のプログラムを予め格納しておく。
【００８２】
３．２．２　記録部
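The recording unit 35 records the wide-angle video data and at least one partial video data encoded by the encoding unit 34. It is preferable to record the correspondence identified by the identification unit 36 together with the video data, because the video display terminal can then present the display screen shown in FIG. 15. Video recording itself works as in Embodiment 1.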
【００８３】
４．実施の形態４
本発明の実施の形態４は、前記カメラアレイ４００により取得された各々の部分映像データを自動的に選択する映像記録システムに関するものである。
【００８４】
実施の形態１乃至実施の形態３は、後に映像を再生する際に、クライアントが配信を要求する部分映像を選択するためのものであった。しかし、部分映像を毎回手動で選択するのは面倒である。
【００８５】
図２０は、クライアントが持つ映像表示端末において、表示される部分映像６０１が自動的に選択される表示画面を示す図である。図のように、「ＡＵＴＯ」と書かれたチェックボックス６１１をチェックすると、部分映像６０１を自動的に選択して配信するモードに切り替えられる。これに対し、サーバ３００は、発言者などの重要なシーンが映された部分映像を自動的に選択して、広角の映像と共にクライアントに配信する。これにより、クライアントは面倒な操作無しに、配信された映像から、撮影対象となるシーンをより深く理解することができる。
【００８６】
本実施の形態４は、このような動作を実現するための映像記録システムに関するものである。
【００８７】
４．１　構成
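FIG. 21 shows the configuration of the video recording system according to Embodiment 4 of the present invention. The wide-angle camera 200, the camera array 400, and a microphone array 500 are connected to the server 300, which acquires and records the wide-angle video data, a plurality of partial video data, and a plurality of audio data. The video and audio data recorded by the server 300 are displayed and played back on the server 300. Where necessary, they are also delivered over the Internet and displayed and played back on a client PC connected to the Internet.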
【００８８】
次に、上記各部の構成について説明する。なお、広角カメラ２００、カメラアレイ４００、及びサーバ３００の構成は、実施の形態１と同様であるので、説明を省略する。
【００８９】
４．１．１　マイクアレイ
マイクアレイ５００は、少なくとも２つのマイクロフォン５０１−１、５０１−２より構成される。使用されるマイクロフォン５０１−１、５０１−２は、圧電型、容量型　（いわゆるコンデンサマイクロフォン）など様々な種類のものを使用することができる。各々のマイクロフォン５０１−１、５０１−２は、カメラ４０１と同様に、別々に離れて配置されたものであっても、各々のマイクロフォン５０１−１、５０１−２を共通の筐体に固定して配置したものであっても構わない。図２２は、実施の形態４における広角カメラ２００及びマイクアレイ５００の構成を説明する図であり、このように、広角カメラ２００とマイクアレイ５００とを１つの筐体に一体化してもよい。図２２示したように、広角カメラ２００を構成するカメラ部２０１の撮像素子２１４と、マイクアレイ５００を構成するマイクロフォン５０１−１、５０１−２とは、台座２０２に配置されている。
【００９０】
該マイクロフォン５０１−１、５０１−２において取得された音声信号は、マイクロフォン内部でデジタル化された後、サーバ３００に送出される。カメラアレイ４００と同様に、マイクアレイ５００をサーバ３００の外部Ｉ／Ｆ　３０７、具体的にはＵＳＢ２．０のような高速シリアルインターフェースを経由して接続することにより、部分映像と音声とを同期的に取得することが可能である。
【００９１】
４．２　動作
図２３は、本実施の形態４に係る映像記録システムの、機能別のブロックを示す図である。図１６に示された実施の形態３のブロック図に加えて、音声取得部３７、音源検出部３８、及び映像選択部３９を追加したものである。以下において、図２３に示された各部の動作を具体的に説明する。なお、第１撮像部３１、第２撮像部３２、変形部３３、及び特定部３６の動作は、前述の実施の形態と同様であるので、説明を省略する。
【００９２】
４．２．１　音声取得部
The audio acquisition unit 37 consists of the microphone array 500 described in 4.1.1 above, and outputs the acquired and digitized audio data.
【００９３】
４．２．２　音源検出部
音源検出部３８は、前記音声取得部３７により取得された音声データに基づき、発言者のいる位置又は方向を検出するものである。その動作例を、以下において説明する。
（１）マイクアレイ５００に入力される音声の到達時間差による方法
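This method is effective when the microphones 501 are fixed at known positions on a housing. FIG. 24 illustrates the operating principle of the sound source detection unit 38 in Embodiment 4 of the present invention. As FIG. 24 shows, when two microphones 501-1 and 501-2 (called microphone 1 and microphone 2 for convenience) are placed a distance l apart and sound arrives from the direction θ, the audio data s_1(t) output by microphone 1 and the audio data s_2(t) output by microphone 2 are related by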
【００９４】
【数７】

ｖ：音速
となり、マイク１の音声データがマイク２の音声データに対して
【００９５】
【外１】

だけ時間が進んでいることとなる。この原理を利用して、話者の音声の方向を特定する手順を説明する。
【００９６】
まず、マイク１とマイク２の音声データの到達時間差を検出する。この到達時間差は、例えばマイク１の音声データｓ_１（ｔ）とマイク２の音声データｓ_２（ｔ＋ｄｔ）との相互相関値により計算される。ここで、相互相関値Ｃ　（ｔ，　ｄｔ）は、次式により算出される。
【００９７】
【数８】

ここで、Ｎは相関窓の大きさを示す正の整数であり、（８）式は時刻ｔ以前のＮ個のサンプルを用いて積和演算が行われることを示す。このとき、Ｃ　（ｔ，　ｄｔ）を最大化するｄｔが到達時間差となる。
【００９８】
次に、マイクの間隔ｌ、到達時間差ｄｔ及び音速ｖを用いて、音声とマイクロフォンの基線とがなす角θを計算する。
【００９９】
【数９】

ここで、θの値域は０°以上１８０°以下とする。
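The procedure above (find the delay dt that maximizes the cross-correlation, then convert it to an angle) can be sketched in Python as follows. Equations (7) to (9) are not reproduced in this text, so the sketch assumes the forms suggested by the surrounding description: the correlation is a sum of products of s_1 and the dt-shifted s_2 over the N samples ending at time t, and the angle is θ = arccos(v · dt / l). The sampling rate parameter `rate`, the default sound speed of 340 m/s, and the function names are assumptions made for illustration.

```python
import math

def cross_correlation(s1, s2, t, dt, n):
    """Sum of products of s1 and the dt-shifted s2 over the n samples ending at index t."""
    total = 0.0
    for i in range(n):
        a, b = t - i, t - i + dt
        if 0 <= a < len(s1) and 0 <= b < len(s2):
            total += s1[a] * s2[b]
    return total

def arrival_angle(s1, s2, t, n, spacing, rate, speed=340.0):
    """Estimate the angle in degrees between the sound direction and the microphone baseline."""
    # Largest physically possible lag in samples (small epsilon guards float rounding).
    max_lag = int(spacing / speed * rate + 1e-9)
    best_lag = max(range(-max_lag, max_lag + 1),
                   key=lambda dt: cross_correlation(s1, s2, t, dt, n))
    cos_theta = speed * (best_lag / rate) / spacing
    # Clamp before acos so rounding cannot push the value outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))
```

The result lies between 0° and 180°, which matches the stated range of θ; the angular resolution is limited by the sampling rate, because dt can only take whole-sample values in this sketch.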
【０１００】
なお、以上の手順のみでは、マイクロフォン５０１−１、５０１−２の前側の１８０°の範囲しか方向が検出されず、音源方向が特定されない。すなわち、音源検出部３８が出力する角度θは、実際には音声の到達方向と２つのマイク間の基線とがなす角度であり、実際の音声の方向は図２５に示すように、２つのマイクの中点を頂点とする頂角θの円錐の側面上のいずれかに存在している。
【０１０１】
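To resolve this ambiguity, a correction is made using another pair of microphones that is not parallel to the pair formed by microphone 1 and microphone 2. FIG. 26 shows how four microphones 501-1, 501-2, 501-3, and 501-4 are divided into two pairs to detect the sound source direction. As FIG. 26 shows, each pair combines a microphone with the microphone farthest from it: microphone 501-1 (microphone 1) with microphone 501-2 (microphone 2), and microphone 501-3 (microphone 3) with microphone 501-4 (microphone 4).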
【０１０２】
最も距離の離れた２つのマイクの組を用いることで、音声の到達時間差が最大となり、方向検知の精度が向上する。なお、ここでは、マイクアレイ５００には４つのマイクロフォン５０１−１、５０１−２、５０１−３、５０１−４が備わっているが、３つのマイクロフォンによっても、音源方向を精度良く検出できる。図２７は、３つのマイクロフォン５０１−１、５０１−２、５０１−３によってマイクアレイ５００が構成される場合のマイクロフォンの組の採り方を説明する説明図である。図２７に示したように、マイクロフォンを正三角形に配置することにより、どのマイクの組を採用しても、精度良く音源方向を検出することができるようになる。なお、図２７に示した例では、第１の組と第２の組を採用して全方向の音源を検出できるが、補完的に第３の組を使用してもよい。
（２）指向性マイクアレイによる方法
また、限られた範囲の音声のみを入力可能な指向性マイクロフォンを利用することにより、発言者の方向を検出することも可能である。図２８は、本実施の形態４におけるマイクアレイ５００と音源方向との関係を説明する説明図である。このマイクアレイ５００は、指向性を有するマイクロフォン５０１を４つ有し、その音声の強度に基づいて音源方向を決定する。便宜的に４つのマイクロフォン５０１−１、５０１−２、５０１−３、５０１−４をマイク１〜４とする。
【０１０３】
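Suppose the sound intensities are 20 at microphone 1, 30 at microphone 2, 20 at microphone 3, and 5 at microphone 4. The sound source is then judged to lie in the direction of microphone 2. Comparing microphones 1 and 3 shows that both have the same intensity, 20, so the sound source direction is finally determined to be the direction of microphone 2 (shown as θ = 45° in the figure).
FIG. 29 illustrates another example of the operation of the sound source detection unit 38 in Embodiment 4. Suppose the sound intensities are 15 at microphone 1, 30 at microphone 2, 25 at microphone 3, and 5 at microphone 4. The initial judgment is that the sound source lies in the direction of microphone 2. Comparing microphones 1 and 3 shows that microphone 3 is stronger than microphone 1, so the sound source direction is determined to be the direction of microphone 2 shifted slightly toward microphone 3 (shown as θ = 30° in the figure). The amount of this shift can be determined in advance from the characteristics of the directional microphones.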
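The two worked examples above can be reproduced with a short Python sketch. The microphone directions are not given in the text beyond microphone 2 pointing at 45°, so the layout `[135, 45, 315, 225]` degrees used below is a hypothetical arrangement chosen so that microphones 1 and 3 are the neighbours of microphone 2. The fixed shift of 15° is likewise an assumption that matches the 45° to 30° example, since the description only says the shift is predetermined from the microphone characteristics.

```python
def source_direction(levels, mic_angles, shift=15.0):
    """Pick the loudest microphone, then nudge the estimate toward its louder neighbour."""
    k = max(range(len(levels)), key=lambda i: levels[i])
    prev_i, next_i = (k - 1) % len(levels), (k + 1) % len(levels)
    angle = mic_angles[k]
    if levels[prev_i] == levels[next_i]:
        return angle % 360.0  # neighbours tie: keep the loudest microphone's direction
    toward = prev_i if levels[prev_i] > levels[next_i] else next_i
    # Signed shortest rotation from the loudest microphone toward the louder neighbour.
    diff = (mic_angles[toward] - angle + 180.0) % 360.0 - 180.0
    return (angle + (shift if diff > 0 else -shift)) % 360.0

# Hypothetical layout: microphones 1 to 4 point at 135, 45, 315 and 225 degrees.
MIC_ANGLES = [135.0, 45.0, 315.0, 225.0]
first = source_direction([20, 30, 20, 5], MIC_ANGLES)    # neighbours equal
second = source_direction([15, 30, 25, 5], MIC_ANGLES)   # microphone 3 louder than 1
```

A real system would probably scale the shift with the intensity difference rather than use a constant; the constant is kept here only because it is the simplest reading of the examples.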
【０１０４】
以上で説明した音源検出部３８の機能は、サーバ３００におけるＣＰＵ　３０１により実行される。このとき、前記ＨＤＤ　３０３には該機能を実現するための所定のプログラムが予め格納されている。
【０１０５】
４．２．３　映像選択部
映像選択部３９は、前記特定部３６により特定された対応関係と、前記音源検出部３８により検出された発言者の位置又は方向とを用いて、記録すべき部分映像を自動的に選択するものである。
【０１０６】
図３０は、実施の形態４における映像選択部３９の動作の一例を示す図であり、Ａ〜Ｆの６人の参加者２がテーブル１を囲んで会議を開いている様子を上から眺めたものである。テーブル１の上には、広角カメラ２００及びマイクアレイ５００が設置されており、また参加者毎にカメラ４０１（図示せず）が１台設置されている。今、音源検出部３８が検出した音源の方向が、図における矢印３８１のようであったとする。このとき、映像選択部３９は、該音源の方向と、前記特定部３６により特定された広角カメラ２００と各カメラ４０１との対応関係に基づき、該音源の方向に対し最も近くに配置されたカメラ４０１を選択する。すなわち、図においては、参加者Ｅを撮影しているカメラ４０１を選択する。
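A minimal Python sketch of this selection step follows. It assumes that the identification unit has already reduced the correspondence to one azimuth per camera, expressed in the same angular frame as the detected sound direction; that representation, the camera ids, and the function name are assumptions made for illustration.

```python
def select_camera(source_deg, camera_deg):
    """Return the id of the camera whose azimuth is closest to the sound direction.

    camera_deg maps a camera id to its azimuth in degrees (hypothetical
    representation of the identified correspondence).
    """
    def gap(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)  # wrap-around angular distance
    return min(camera_deg, key=lambda cam: gap(camera_deg[cam], source_deg))

# Six participants A to F seated every 60 degrees around the table (assumed layout).
cameras = {"A": 0.0, "B": 60.0, "C": 120.0, "D": 180.0, "E": 240.0, "F": 300.0}
chosen = select_camera(250.0, cameras)
```

The wrap-around distance matters for directions near 0°/360°, where a plain absolute difference would pick a camera on the wrong side of the table.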
【０１０７】
以上で説明した映像選択部３９の機能は、サーバ３００におけるＣＰＵ　３０１により実現させることができる。このとき、前記ＨＤＤ　３０３には該機能を実現するための所定のプログラムを予め格納しておく。
【０１０８】
４．２．４　エンコード部
エンコード部３４の動作は、前記映像選択部３９により選択された部分映像データ及び前記変形部３３により出力されたパノラマ映像データを、それぞれ映像記録に適した形式にエンコードする。映像記録に適した形式は様々挙げられるが、例えば映像データに関しては、ＭＰＥＧに代表される動画符号化フォーマットなどの形式でエンコードする。また、映像データのみならず音声データをもエンコードしても良く、音声データに関してはＭＰＥＧオーディオフォーマットなどの形式でエンコードする。
【０１０９】
この他、ＭＰＥＧプログラムストリームのように、映像データと音声データを１つのファイルに収めて記録してもよい。このファイル形式を用いることで、後段の記録部３５において記録されるファイルの数が少なくなるので、記録されたファイルの管理を一層容易にすることができる。
エンコード部３４は、映像データの取得が継続している限り常に映像をエンコードし続け、エンコードされたデータを連続して記録部３５に送信する。
【０１１０】
以上のエンコード処理は、前記サーバ３００内の前記ＣＰＵ　３０１により実行される。このとき、前記ＨＤＤ　３０３にＭＰＥＧエンコードプログラムを予めインストールしておく。
【０１１１】
４．２．５　記録部
記録部３５は、前記エンコード部３４によりエンコードされた少なくとも１つの部分映像データ及び前記第１撮像部３１により取得された広角の映像データを記録する。このとき、映像データのみならず、前記特定部３６が特定した対応関係及び前記映像選択部３９が選択した部分映像データに関する情報を併せて記録すると好適である。
【０１１２】
ここで、選択された部分映像データに関する情報を記録する動作例を説明する。図３１は、部分映像データに関する情報の記録例を示した図である。図３１には、選択された部分映像が変わった時刻　（Ｔｉｍｅ）、新たなデータ名　（Ｆｉｌｅ）が記録されている。これらのデータは、テキストファイルなどの形式で、動画データや音声データと共にＨＤＤ　３０３に記録される。このように、選択された部分映像データが変わった時刻と、その時のデータ名とを随時記録しておくことによって、後に映像を再生する際に、適切な部分映像データの再生を行うことが可能となる。
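The kind of record described above can be sketched in Python as follows. The description only states that the switching time (Time) and the new data name (File) are stored in a form such as a text file, so the tab-separated layout, the time unit, and the file names in the example are assumptions.

```python
def switch_log(selections):
    """Collapse (time, file) samples into one tab-separated line per change of selection."""
    lines, current = ["Time\tFile"], None
    for time_s, name in selections:
        if name != current:          # record only the moments when the selection changes
            lines.append(f"{time_s}\t{name}")
            current = name
    return "\n".join(lines)

# Hypothetical selections sampled once per second.
log_text = switch_log([(0, "cam1.mpg"), (1, "cam1.mpg"), (2, "cam3.mpg"), (5, "cam1.mpg")])
```

On playback, a viewer could read this log and switch to the named partial video when the playback clock reaches each recorded time.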
【０１１３】
なお、ＭＰＥＧ−７のようなマルチメディア情報の内容記述標準を用いて、上記の部分映像データに関する情報を記しても良い。
【０１１４】
記録部３５は、ＨＤＤ　３０３によりその機能を実現することができる。なお、使用の態様によっては、大容量記録装置３０８によりその機能を実現してもよい。例えば、長時間の会議や、定例会議については、保存の必要性からＤＶＤ＋ＲＷ等により構成される大容量記録装置３０８に記録し、短時間の会議など、長期の保存の必要性が低いものに関してはＨＤＤ　３０３に記録するなどの使い分けを行ってもよい。
【０１１５】
５．　実施の形態５
本発明の実施の形態５は、前述の実施の形態４と同様に、前記カメラアレイ４００により取得された各々の部分映像データを自動的に選択する映像記録システムに関するものであり、カメラアレイ４００を構成する各々のカメラ４０１と、マイクアレイ５００を構成する各々のマイクロフォン５０１とを、１対１の対応関係となるよう構成したものである。ここでは「１対１の対応関係」を、「個々のカメラ４０１に対し、略一致する位置又は方角に配置されたマイクロフォン５０１が１つあること」と定義する。
【０１１６】
５．１　構成
本実施の形態における映像記録システム１００の構成は、前述の実施の形態４と同様に、図２１に示されている。
【０１１７】
次に、上図の各部の構成について説明する。なお、広角カメラ２００及びサーバ３００の構成は、前述の実施の形態１と同様であるので、説明を省略する。
【０１１８】
５．１．１　カメラアレイ及びマイクアレイ
図３２は、本発明の実施の形態５におけるカメラ４０１及びマイクロフォン５０１の外観を示す図である。図示したように、カメラ４０１とマイクロフォン５０１とは、共通の筐体５０２に一体化した構造となっている。また、マイクロフォン５０１は指向性を有し、限られた範囲の音声のみを入力可能である。この一体化されたカメラ４０１及びマイクロフォン５０１を、参加者につき１台設置する。
【０１１９】
５．２　動作
図３３は、本実施の形態に係る映像記録システムを、機能別のブロック図に書き直した図であり、図２３に示された実施の形態４のブロック図から音源検出部３８を削除し、また特定部３６から映像選択部３９への接続を削除したものである。第１撮像部３１、第２撮像部３２、変形部３３、エンコード部３４、記録部３５、特定部３６、及び音声取得部３７の動作は、前述の実施の形態４と同様である。
【０１２０】
５．２．１　映像選択部
前述の一体化されたカメラ４０１及びマイクロフォン５０１を使用することにより、各々のカメラ４０１とマイクロフォン５０１との対応関係が既知である。したがって、映像選択部３９は、最も大きな信号振幅が得られたマイクロフォン５０１に対応するカメラ４０１により取得された部分映像を選択すると良い。
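Because each microphone is paired with exactly one camera, the selection reduces to finding the loudest microphone. A minimal Python sketch, assuming the latest block of audio samples per camera/microphone unit is available as a dictionary (the ids and the function name are illustrative):

```python
def select_by_amplitude(frames):
    """Return the camera id whose paired microphone shows the largest peak amplitude.

    frames maps a camera id to the most recent audio samples from its microphone.
    """
    return max(frames, key=lambda cam: max((abs(s) for s in frames[cam]), default=0))

loudest = select_by_amplitude({
    "cam1": [0.1, -0.2],
    "cam2": [0.05, -0.9, 0.3],
    "cam3": [],
})
```

Peak amplitude is used here for brevity; averaging over a short window would be less sensitive to isolated clicks, but the description does not specify either choice.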
【０１２１】
６．実施の形態６
本実施の形態は、前記前述の実施の形態４乃至５と同様に、前記カメラアレイ４００により取得された各々の部分映像データを自動的に選択する映像記録システムに関するものである。
【０１２２】
６．１　構成
前述の実施の形態１乃至３と同様に、図２乃至図７に示される。
【０１２３】
６．２　動作
図３４は、本実施の形態に係る映像記録システムを、機能別のブロック図に書き直した図であり、図１６に示された実施の形態３のブロック図に加えて、映像選択部３９及び動き検出部４０を追加したものである。以下において、図３４に示された各部の動作を具体的に説明する。なお、第１撮像部３１、第２撮像部３２、変形部３３、エンコード部３４、記録部３５、及び特定部３６の動作は、前述の実施の形態と同様であるので、説明を省略する。
【０１２４】
６．２．１　動き検出部
動き検出部４０は、広角の映像データにおける被写体の動きを検出し、映像中の各部位における動きの特徴量を出力するものである。ここで、「動きの特徴量」とは、被写体の動きの大小を指すものとする。
【０１２５】
動画における動きの検出は、前の時刻と現在の時刻のフレーム間の差分をとる方法、オプティカルフローによる方法などの周知技術により実現可能である。これらの技術により、広角の映像データにおいて、被写体が動いた位置及びその動きの大小を検出することができる。この動作によれば、本発明を遠隔監視システムとして使用する場合、動いている被写体を捉えたカメラからの部分映像が記録されるため、好適である。
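The frame-difference approach mentioned above can be sketched in Python as follows. The sketch measures motion as the number of pixels whose intensity changes by more than a threshold between consecutive frames, counted per horizontal range of the wide-angle video. The threshold value, the use of column ranges to stand for each camera's coverage, and the names are assumptions; optical flow is not covered.

```python
def motion_by_region(prev, curr, regions, threshold=10):
    """Count pixels whose absolute frame difference exceeds `threshold`, per column range.

    prev and curr are frames given as lists of rows of intensities;
    regions maps a name to a half-open column range (x0, x1).
    """
    scores = {}
    for name, (x0, x1) in regions.items():
        scores[name] = sum(
            1
            for row_p, row_c in zip(prev, curr)
            for p, c in zip(row_p[x0:x1], row_c[x0:x1])
            if abs(c - p) > threshold
        )
    return scores

previous = [[0] * 6 for _ in range(2)]
current = [[0, 0, 0, 0, 50, 60], [0, 0, 0, 0, 0, 70]]
scores = motion_by_region(previous, current, {"cam1": (0, 3), "cam2": (3, 6)})
most_active = max(scores, key=scores.get)
```

The region with the highest count is the natural candidate for the partial video to record when the system is used for monitoring.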
【０１２６】
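When the video recording system according to Embodiment 6 is used as a remote conference system, it is preferable to detect the position or direction of the speaker automatically by detecting the lip movements of the participants. Lip movement can be detected with well-known techniques such as the one in M. Kass, A. Witkin and D. Terzopoulos, "SNAKES: Active Contour Models", ICCV, pp. 259-268 (1987). When the microphones 501 are available, as in Embodiments 4 and 5, speaker detection can be made more accurate by combining lip-movement detection with the speech intervals extracted from the audio data. For example, Japanese Unexamined Patent Application Publication No. 6-43897, filed by the present applicant, discloses a system that recognizes conversation using audio features extracted from audio data and dynamic visual features of the face extracted from video data. This makes it possible to detect the position or direction of the speaker more stably even when the audio data contains a large amount of noise other than speech.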
【０１２７】
以上で説明した動き検出部４０の機能は、広角カメラ２００の内部に実装してもよいし、またサーバ３００におけるＣＰＵ　３０１により実現させても構わない。後者の場合、前記ＨＤＤ　３０３には該機能を実現するための所定のプログラムを予め格納しておく。
【０１２８】
６．２．２　映像選択部
本実施の形態６における映像選択部３９は、前記特定部３６により特定された対応関係と、前記動き検出部４０により検出された被写体の動きの特徴量とを用いて、記録すべき部分映像を自動的に選択するものである。映像選択部３９は、まず該被写体の動きの特徴量に基づき、広角の映像データにおいて最も大きな動きが検出された画像位置を特定する。次に、特定された画像位置と、前記特定部３６により特定された広角カメラ２００と各カメラ４０１との対応関係とに基づき、前述の実施の形態４において説明したのと同様の手順により、該位置に対し最も近くに配置されたカメラ４０１を選択する。これにより、最も大きな動きが検出された被写体を撮影した部分映像を自動的に選択することができる。
【０１２９】
以上で説明した映像選択部３９の機能は、サーバ３００におけるＣＰＵ　３０１により実行される。このとき、前記ＨＤＤ　３０３には該機能を実現するための所定のプログラムを予め格納しておく。
【０１３０】
７．実施の形態７
また、前記上述の実施の形態６においては、広角の映像において被写体の動きを検出したが、前記カメラアレイ４００により取得された各々の部分映像データにおいて、被写体の動きを検出してもよい。
７．１構成
本発明の実施の形態７の構成は、前述の実施の形態１乃至３と同様に、図２乃至図７に示される。
【０１３１】
７．２　動作
図３５は、本実施の形態に係る映像記録システムを、機能別のブロック図に書き直した図であり、図３４に示された実施の形態６のブロック図において、変形部３３の代わりに第２撮像部３２が出力する映像データが動き検出部４０に入力され、また特定部３６から映像選択部３９への接続を削除したものである。以下において、図３５に示された各部の動作を具体的に説明する。なお、第１撮像部３１、第２撮像部３２、変形部３３、エンコード部３４、及び記録部３５の動作は、前述の実施の形態と同様であるので、説明を省略する。
【０１３２】
７．２．１　動き検出部
本実施の形態における動き検出部４０は、各々の部分映像データにおける被写体の動きを検出し、各部分映像データにおける動きの特徴量を出力するものである。ここで、「動きの特徴量」は、上述の実施の形態６と同様に、被写体の動きの大小を指すものとする。また、各々の部分映像データにおける被写体の動きも、上述の実施の形態６で説明した周知技術により検出する。
【０１３３】
また、本実施の形態に係る映像記録システムが、遠隔会議システムとして使用される場合、部分映像における参加者の唇の動きを検出することにより、発言者の位置又は方向を自動的に検出すると好適である。この動作も、上述の実施の形態６で説明した周知技術により実現可能である。また、本実施の形態では、カメラ４０１により各参加者の顔が大きく撮影されるので、上述の実施の形態６に比較して、より安定的に参加者の唇の動きを検出することができる。
【０１３４】
以上で説明した動き検出部４０の機能は、カメラ４０１の内部に実装してもよいし、またサーバ３００におけるＣＰＵ　３０１により実現させても構わない。後者の場合、前記ＨＤＤ　３０３には該機能を実現するための所定のプログラムを予め格納しておく。
【０１３５】
７．２．２　映像選択部
本実施の形態における前記映像選択部３９は、前記動き検出部４０により検出された、部分映像における被写体の動きに基づき、記録すべき部分映像を自動的に選択するものである。具体的には、各々の部分映像における被写体の動きの特徴量から、最も大きな動きが検出された部分映像を特定し、これを記録すべき部分映像として自動的に選択する。ここで、本実施の形態は、特定部３６を必ずしも必要としないので、上述の実施の形態６に比較して、より簡単な構成・処理で適切な部分映像を選択することができる。
【０１３６】
以上で説明した映像選択部３９の機能は、サーバ３００におけるＣＰＵ　３０１により実行される。このとき、前記ＨＤＤ　３０３には該機能を実現するための所定のプログラムを予め格納しておく。
【０１３７】
７．３　その他
なお、上述の実施の形態６又は本実施の形態においては、カメラアレイ４００を構成する各々のカメラ４０１が、他のカメラと一部共通する撮影領域を含むと好適である。図３６（ａ）は、各々のカメラが互いに共通する撮影領域を含まない場合における映像表示端末の画面を示す図である。図３６に示すように、参加者Ａが席を立って移動している時、前記映像選択部３９は、該参加者Ａに最も近い撮影領域を含む部分映像（図中、黒色のバー６０５で示されたもの）を自動的に選択する。しかし、該参加者Ａがいずれのカメラ４０１においても撮影されない場所に移動した場合には、重要な被写体が何も写されていない部分映像が選択されてしまう。このように、移動中の被写体を連続的に追跡して映した部分映像を記録できないという問題が生ずる。
【０１３８】
そこで、図３６（ｂ）に示すように、各々のカメラが互いに共通する撮影領域を含むよう配置すれば、この問題を解決することができる。図中、斜線で示されたバーは、２つ以上のカメラ４０１で重複して撮影されている範囲を示す。図６のように、カメラアレイ４００を、各々のカメラ４０１を筐体４０２に固定して構成する場合には、互いの撮影範囲が一部重複するように各々のカメラ４０１を固定するとよい。
【０１３９】
８．　実施の形態８
なお、本発明に係る映像記録システム１００は、ＰＣによりその機能を実現させることができる。この場合は上記各部を実現するソフトウェアをハードディスクに格納し、適宜処理プログラムを実行させることによりその機能を実現させることができる。
【０１４０】
９．実施の形態９
また、上記プログラムを、ＣＤ−ＲＯＭのような記録媒体に格納することができる。図３７に示されるように、該プログラムを格納したＣＤ−ＲＯＭ　３０８をＰＣに装着し、適宜該プログラムを実行させることによりその機能を実現させることができる。なお、該プログラムを格納する記録媒体としては、上記ＣＤ−ＲＯＭ　３０８に限られず、例えばＤＶＤ−ＲＯＭ等の別の媒体であってもよいことはいうまでもない。
【０１４１】
以上の各実施の形態は、本発明のほんの一例を説明したにすぎず、本発明の権利範囲を上記実施の形態の通りに限定・縮小すべきではない。例えば、各実施の形態において、広角カメラ２００、カメラアレイ４００、及びマイクアレイ５００が、ＵＳＢハブに接続されるという構成例を用いて説明したが、これらの接続形態は上記説明に限定されるものではない。例えば、ＰＣＩバス、ＩＥＥＥ　１３９４、Ｂｌｕｅｔｏｏｔｈなどの別のインターフェースを使用しても構わない。
【０１４２】
また、広角カメラ２００に使用されるミラー２１１として、双曲面ミラー及び一方向に曲率をもった曲面ミラーを実施の形態に挙げたが、放物面ミラーや円錐ミラーなど、上記以外の形態であってもかまわない。
【０１４３】
また、第１撮像部３１の説明において、広角カメラ２００においてデジタル化された映像データを出力すると説明したが、広角カメラ２００がアナログの映像信号を出力するものであっても構わない。この場合、該広角カメラ２００と、アナログ映像信号に対してデジタル化処理を施すビデオキャプチャボードとを組み合わせることにより、デジタル形式の映像データを出力することができる。すなわち、上記実施の形態で説明した第１撮像部３１と同様の動作を実現することができる。
【０１４４】
また、第２撮像部３２の説明において、カメラアレイ４００を構成する各々のカメラ４０１においてデジタル化された部分映像データを出力すると説明したが、これらのカメラ４０１がそれぞれアナログの映像信号を出力するものであっても構わない。この場合、これらのカメラ４０１と、多チャンネルのアナログ映像信号に対してデジタル化処理を施すビデオキャプチャボードとを組み合わせることにより、デジタル形式の部分映像データを出力することができる。すなわち、上記実施の形態で説明した第２撮像部３２と同様の動作を実現することができる。
【０１４５】
また、音声取得部３７の説明において、マイクアレイ５００を構成する各々のマイクロフォン５０１においてデジタル化された音声データを出力すると説明したが、これらのマイクロフォン５０１がそれぞれアナログの音声信号を出力するものであっても構わない。この場合、これらのマイクロフォン５０１と、多チャンネルのアナログ音声信号に対してデジタル化処理を施すオーディオキャプチャボードとを組み合わせることにより、デジタル形式の音声データを出力することができる。すなわち、上記実施の形態で説明した音声取得部３７と同様の動作を実現することができる。
【０１４６】
また、エンコード部３４及び記録部３５が同一のサーバ３００に実装されると説明したが、サーバ３００とは別個にエンコード用ＰＣを設置しても構わない。この場合、エンコードされたデータは、電気通信回線を経由して、該エンコード用ＰＣからサーバ３００に転送される。
【０１４７】
また、エンコード部３４の動作説明においては、サーバ３００内のＣＰＵ　３０１によりＭＰＥＧエンコード処理を行うと説明したが、エンコード部３４の構成はこれに限定されない。例えば、ＭＰＥＧエンコードＩＣを内蔵したビデオキャプチャボードを用いて、映像データをＭＰＥＧ形式に変換しても構わない。音声データに関しても同様である。
【０１４８】
また、動き検出部４０の動作の説明において、「動きの特徴量」は被写体の動きの大小を指すと述べたが、例えば被写体の移動軌跡の形状など、別のものであっても構わない。
【０１４９】
【発明の効果】
本発明によれば、広角の映像を取得する第１撮像部３１と、互いに異なる所定の領域が撮影された複数の映像を同期的に取得する第２撮像部３２と、前記広角の映像及び前記第２撮像部３２により取得された映像の少なくとも一つを記録する記録部３５とを備えることにより、簡素な構成・処理で、広範囲の映像を取得・記録すると同時に、所望のシーンの映像を高い解像度で取得・記録することが可能となる。すなわち、第１の目的が達成される。
【０１５０】
更に、本発明によれば、第１撮像部３１により取得された広角の映像と、第２撮像部３２により取得された各々の部分映像との対応関係を特定する特定部３６を備えることにより、高い解像度を持つ所望のシーンの映像を一層容易に選択することが可能となる。すなわち、第２の目的が達成される。
【０１５１】
更に、本発明によれば、第２撮像部３２により取得される複数の映像より、所定の映像を選択する映像選択部３９を備えることにより、ユーザに面倒な操作を強いることなく、所望のシーンの映像を高い解像度で取得・記録することが可能となる。すなわち、第３の目的が達成される。
【０１５２】
更に、本発明によれば、広角の映像を変形する変形部３３を備えることにより、取得された広範囲の映像を、更に閲覧者に観察しやすい形で表示することが可能となる。すなわち、第４の目的が達成される。
【０１５３】
更に、本発明によれば、第２撮像部３２により取得される各々の映像が、少なくとも一の他の映像と一部の共通する領域を含むことにより、更に所望のシーンの映像を高い解像度で漏れなく取得・記録することが可能となる。すなわち、第５の目的が達成される。
【図面の簡単な説明】
【図１】本発明に係る映像記録システムの使用例を示す図である。
【図２】実施の形態１に係る映像記録システムの構成を示す図である。
【図３】実施の形態１に係る広角カメラ２００の構成を示す図である。
【図４】実施の形態１に係る広角カメラ２００の構造を示す図である。
【図５】図４に示された広角カメラ２００により撮影される映像を示す図である。
【図６】実施の形態１に係るカメラアレイ４００の一例を示す図である。
【図７】実施の形態１に係るサーバ３００の構成を示す図である。
【図８】実施の形態１に係る動作を示すブロック図である。
【図９】実施の形態１における変形部３３の動作を説明する図である。
【図１０】変形部３３における原理を説明する図である。
【図１１】変形部３３で使用される座標変換テーブルを説明する図である。
【図１２】実施の形態２に係る広角カメラ２００の構成を示す図である。
【図１３】図１２に示された広角カメラ２００により撮影される映像を示す図である。
【図１４】実施の形態２における変形部３３の動作を映像表示と同時に実現する例を示す図である。
【図１５】実施の形態３に係る映像配信システムを使用した場合の映像表示端末の表示画面の一例を示す図である。
【図１６】実施の形態３に係る動作を示すブロック図である。
【図１７】実施の形態３に係る特定部３６の動作の一例を示す図である。
【図１８】実施の形態３に係る特定部３６の動作の一例を示す図である。
【図１９】実施の形態３に係るサーバ３００の表示画面の一例を示す図である。
【図２０】実施の形態４に係る映像記録システムを使用した場合の映像表示端末の表示画面の一例を示す図である。
【図２１】実施の形態４に係る映像記録システムの構成を示す図である。
【図２２】実施の形態４における広角カメラ２００及びマイクアレイ５００の構成を説明する図である。
【図２３】実施の形態４に係る動作を示すブロック図である。
【図２４】実施の形態４における音源検出部３８の動作原理を説明する図である。
【図２５】実施の形態４における音源検出部３８の問題を説明する図である。
【図２６】実施の形態４におけるマイクロフォン５０１の配置例を説明する図である。
【図２７】実施の形態４におけるマイクロフォン５０１の別の配置例を説明する図である。
【図２８】実施の形態４における音源検出部３８の動作を説明する図である。
【図２９】実施の形態４における音源検出部３８の動作を説明する図である。
【図３０】実施の形態４における映像選択部３９の動作を説明する図である。
【図３１】実施の形態４における映像選択部３９が出力するデータを説明する図である。
【図３２】実施の形態５におけるカメラ４０１の構成を示す図である。
【図３３】実施の形態５に係る動作を示すブロック図である。
【図３４】実施の形態６に係る動作を示すブロック図である。
【図３５】実施の形態７に係る動作を示すブロック図である。
【図３６】実施の形態６及び７に係る映像配信システムを使用した場合の問題を示す図である。
【図３７】実施の形態９に係る構成例を示す図である。
【符号の説明】
１　テーブル
２　参加者
３　キャビネット
３１　第１撮像部
３２　第２撮像部
３３　変形部
３４　エンコード部
３５　記録部
３６　特定部
３７　音声取得部
３８　音源検出部
３９　映像選択部
４０　動き検出部
２００　広角カメラ
２１１　ミラー
２１２　レンズ
２１３　絞り
２１４　撮像素子
２１５　駆動部
２１６　前処理回路
２１７　モータ駆動部
３００　サーバ
３１０　バス
３２０　ＵＳＢハブ
３３０　インターネット
３５０　クライアントＰＣ
４００　カメラアレイ
４０１−１から４０１−４　カメラ
５００　マイクアレイ
５０１　マイクロフォン
６００　表示用ウィンドウ[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a system for recording images of a wide range of scenes obtained by using an imaging unit having a wide-angle field of view. Specifically, it is used for applications such as a monitoring system, a remote conference system, and a remote education system.
[0002]
[Prior art]
With the development of telecommunication technology, videoconferencing systems that take pictures of conferences and transmit the acquired images to remote locations have been used by many companies and organizations. In order to further improve the convenience of such a system, there have conventionally been proposed many devices for capturing a state of a conference as a video and a system for transmitting a partial video obtained by cutting out only a speaker. For example, in Japanese Patent Application Laid-Open No. 5-122689, a video conference system in which a speaker input is detected to determine a speaker, and a camera control unit automatically controls a camera based on the determination result to capture the speaker. Has been proposed. However, there is a problem that it takes time to control the camera to capture the speaker.
[0003]
To solve this problem, Japanese Unexamined Patent Application Publication No. 11-331827 discloses an invention relating to a television camera apparatus using a fisheye or super-wide-angle lens and a variable directivity microphone, which determines the direction of a sound source position, tracks that direction, and cuts out the image in that direction to generate a video signal. However, when a television camera apparatus using a fisheye or super-wide-angle lens is installed on a desk or the like, unimportant objects such as the ceiling generally occupy most of the field of view, while important subjects such as human faces lie at the periphery of the field of view. Also, these lenses are expensive to design and manufacture.
[0004]
On the other hand, with the recent evolution of image coding technology represented by MPEG (Moving Picture Experts Group), the increase in the capacity of hard disk drives (hereinafter abbreviated as HDDs), and the increase in data transfer speed, it has become technically possible to record long video and audio signals on a personal computer (hereinafter abbreviated as PC) as electronic information. At the same time, there is a growing need to record a scene such as a conference as video and to review the recorded scene later. However, the above prior arts are all directed to video conference systems: they transmit in real time a video signal obtained by cutting out only a desired portion of the video captured by the device, and their main focus is on generating and transmitting high-quality video in real time so as to achieve a sense of presence similar to that of an ordinary conference. That is, they are not intended for recording the acquired video on a PC or for browsing the video on the PC later.
[0005]
[Problems to be solved by the invention]
The present invention has been made in view of the above-described problems. A first object of the present invention is to provide a video recording system that, with a simple configuration and simple processing, acquires and records a wide range of video and at the same time acquires and records video of a desired scene at high resolution, and to provide a program and a recording medium for executing the processing of each unit of the system.
[0006]
A second object of the present invention is to provide a video recording system that enables a viewer to more easily select a video of a desired scene having a high resolution.
[0007]
A third object of the present invention is to provide a video recording system capable of acquiring and recording a video of a desired scene at a high resolution without forcing a user to perform a troublesome operation.
[0008]
Further, a fourth object of the present invention is to provide a video recording system capable of displaying an acquired wide-range video in a form that can be more easily observed by a viewer.
[0009]
Further, a fifth object of the present invention is to provide a video recording system capable of acquiring and recording a video of a desired scene at a high resolution without omission.
[0010]
[Means for Solving the Problems]
By providing a first imaging unit 31 for acquiring a wide-angle image, a second imaging unit 32 for synchronously acquiring a plurality of images in which mutually different predetermined regions are photographed, and a recording unit 35 for recording at least one of the wide-angle image and the images acquired by the second imaging unit 32, a wide range of images can be acquired and recorded with a simple configuration and processing, and at the same time an image of a desired scene can be acquired and recorded at a high resolution. That is, the first object is achieved.
[0011]
The invention according to claim 1 is a video recording system,
First imaging means for acquiring a wide-angle image,
A second imaging unit configured by a plurality of cameras and synchronously acquiring a plurality of videos in which predetermined regions different from each other are captured,
Recording means for recording at least one of the wide-angle image and the image acquired by the second imaging means.
[0012]
According to the first aspect of the present invention, by providing the first imaging unit 31 for acquiring a wide-angle image, the second imaging unit 32 for synchronously acquiring a plurality of images in which mutually different predetermined regions are photographed, and the recording unit 35 for recording at least one of the wide-angle image and the images acquired by the second imaging unit 32, a wide range of images can be acquired and recorded with a simple configuration and processing, and at the same time an image of a desired scene can be acquired and recorded at a high resolution. That is, the first object is achieved.
[0013]
The invention according to claim 2 is a video recording system, wherein a first imaging unit that acquires a wide-angle video,
A second imaging unit configured by a plurality of cameras and synchronously acquiring a plurality of videos in which predetermined regions different from each other are captured,
Identification means for identifying the correspondence between the wide-angle image and each image acquired by the second imaging means,
Recording means for recording at least one of the wide-angle image and the image acquired by the second imaging means and the correspondence.
[0014]
The invention according to claim 3 is a video recording system, wherein a plurality of cameras constituting the second imaging means are each provided with an identifier,
The first imaging unit includes an identifier attached to the plurality of cameras in an imaging range,
The specifying unit specifies the correspondence based on a shooting position of an identifier included in the wide-angle image.
[0015]
The invention according to claim 4 is a video recording system, wherein the specifying unit determines the correspondence based on a similarity between the wide-angle video and each video acquired by the second imaging unit. Identify.
[0016]
According to the inventions described in claims 2, 3 and 4, providing the specifying unit 36, which specifies the correspondence between the wide-angle image acquired by the first imaging unit 31 and each of the partial images acquired by the second imaging unit 32, makes it possible to more easily select a high-resolution image of a desired scene. That is, the second object is achieved.
The invention according to claim 5 is a video recording system, wherein a first imaging unit that obtains a wide-angle video,
A second imaging unit configured by a plurality of cameras and synchronously acquiring a plurality of videos in which predetermined regions different from each other are captured,
Video selection means for selecting a predetermined video from the plurality of videos obtained by the second imaging means,
Recording means for recording the wide-angle image and a predetermined image selected by the image selecting means.
[0017]
The invention according to claim 6 is a video recording system, further comprising a plurality of microphones for inputting audio.
Sound source detecting means for detecting a position or a direction of a sound source based on sounds input by the plurality of microphones,
The image selection means selects the predetermined image based on the position or direction of the sound source output by the sound source detection means.
[0018]
The invention according to claim 7 is a video recording system, further comprising a motion detection unit that detects a motion of a subject in the wide-angle video or a plurality of videos acquired by the second imaging unit,
The video selection unit selects the predetermined video based on the motion of the subject output by the motion detection unit.
[0019]
According to the inventions set forth in claims 5, 6, and 7, providing the video selecting unit 39, which selects a predetermined video from the plurality of videos acquired by the second imaging unit 32, makes it possible to acquire and record a video of a desired scene at a high resolution without forcing the user to perform a troublesome operation. That is, the third object is achieved.
[0020]
The invention according to claim 8 is an image recording system, further comprising a deformation unit configured to deform the wide-angle image,
The recording unit simultaneously records at least one of the image transformed by the transformation unit and the image acquired by the second imaging unit.
[0021]
According to the eighth aspect of the present invention, it is possible to display the acquired wide-range image in a form that can be more easily observed by a viewer by providing the deformation unit 33 that deforms the wide-angle image. That is, the fourth object is achieved.
[0022]
In addition, since each image acquired by the second imaging unit 32 shares part of its area with at least one other image, an image of a desired scene can be acquired and recorded at a high resolution without omission. That is, the fifth object is achieved.
[0023]
According to a ninth aspect of the present invention, there is provided a program for causing a computer to execute processing relating to each means of the video recording system according to any one of the second to eighth aspects.
[0024]
According to a tenth aspect of the present invention, there is provided a recording medium for recording computer-readable program software for executing processing relating to each means of a video recording system.
[0025]
According to the invention as set forth in claim 9 or 10, by providing the first imaging unit 31 for acquiring a wide-angle image, the second imaging unit 32 for synchronously acquiring a plurality of images of mutually different predetermined regions, and the recording unit 35 for recording at least one of the wide-angle image and the images acquired by the second imaging unit 32, a wide range of images can be acquired and recorded with a simple configuration and processing, and at the same time a video of a desired scene can be acquired and recorded at a high resolution. That is, the first object is achieved.
[0026]
BEST MODE FOR CARRYING OUT THE INVENTION
First, a brief description will be given of a usage example of how the video recording system is used, and then, an embodiment of the video recording system will be specifically described. In each embodiment, the constituent elements and their operations will be described, and finally, the flow of processing will be described.
[0027]
First, a usage example of the video distribution system will be described.
[0028]
FIG. 1 is an explanatory diagram outlining a use example in which the present invention is installed in a conference scene. The video recording system includes a wide-angle camera 200 for obtaining a wide-angle video, a camera array 400 including a plurality of cameras 401-1 to 401-4 having a normal angle of view, a microphone 501 for obtaining audio during the conference, and a server 300 for capturing and recording the video data acquired by the wide-angle camera 200 and the camera array 400 and the audio data acquired by the microphone 501.
[0029]
As shown in FIG. 1, the wide-angle camera 200 is installed on the table 1 and captures, in a single image, the directions in which the conference participants (speakers) 2-1 to 2-4 are located, for example the entire circumference of the horizontal plane. In addition, each of the cameras 401-1 to 401-4 included in the camera array 400 is placed, for example, in front of a participant of the conference and photographs that participant. The images acquired by these cameras 401-1 to 401-4 are hereinafter referred to as "partial images". The server 300 is housed in the cabinet 3; it acquires the video data from the wide-angle camera 200 and the camera array 400 and the audio data from the microphone 501, and records the acquired video data and audio data on the HDD.
[0030]
In the following embodiments, a case will be described in which the video recording system of the present invention is applied to shooting of a conference and recording of the video.
[0031]
1. Embodiment 1
First, a first embodiment of the present invention will be described.
[0032]
1.1 Configuration
FIG. 2 is a diagram showing a configuration of the video recording system according to Embodiment 1 of the present invention. The wide-angle camera 200 and the camera array 400 are connected to the server 300 via the USB hub 320 and the bus 310, and the wide-angle video data and at least one partial video data obtained by the camera array 400 are acquired and recorded. The video data recorded by the server 300 is displayed on the server 300. The video data is also distributed via the Internet as needed and displayed on a client PC connected to the Internet.
[0033]
Next, the configuration of each of the above units will be described.
[0034]
1.1.1 Wide-angle camera
FIG. 3 is a diagram illustrating a configuration of a wide-angle camera 200 as a first imaging unit according to the first embodiment. The wide-angle camera 200 as the first imaging unit includes a mirror 211 having a curved surface of a predetermined shape, a lens 212, an aperture 213, an imaging element 214 such as a CCD (Charge Coupled Device), a drive unit 215 that controls the timing of the imaging element 214 and performs digitizing processing such as analog-digital conversion on the video signal obtained by the imaging element 214, a preprocessing circuit 216 that preprocesses the digitized signal, and a motor drive unit 217 that drives the aperture 213 to control the iris.
[0035]
The mirror 211 is for enabling wide-angle imaging by reflecting light incident on the optical system. Here, a hyperboloid mirror is used as a mirror having a curved surface of a predetermined shape. FIG. 4 is a diagram illustrating an optical path when the hyperboloid mirror 211 of the present embodiment is used. FIG. 5 is a diagram illustrating a state of a wide-angle image formed on the surface of the image sensor 214 by the hyperboloid mirror 211 of the present embodiment. As shown in FIG. 5, the image reflected by the hyperboloid mirror 211 and taken into the image sensor 214 has a donut shape (this donut-shaped image is hereinafter referred to as a “donut image”). The donut image is formed on the image sensor 214, digitized by the driving unit 215, and sent to a server 300 described later via a preprocessing circuit 216. The center in FIG. 4 reflects the direction of the image sensor 214, which is unimportant image information. Therefore, the top 218 of the hyperboloid mirror 211 may be painted black to be used as black information. Note that, depending on the mode of use, a reference line may be drawn on the crown 218 and the motor drive unit 217 may be driven when the wide-angle camera 200 starts up, so that it may be used for initial settings such as focus adjustment.
[0036]
As described above, by combining a normal camera and a mirror, a wide-angle image can be captured with a low-cost and simple configuration.
[0037]
1.1.2 Camera array
The camera array 400 includes at least one camera, and each camera captures a part of the scene within the capturing range of the wide-angle camera 200 at a higher resolution. Although the cameras 401 constituting the camera array 400 are arranged separately in FIG. 1, each of the cameras 401-1 to 401-3 may instead be fixedly arranged on the housing 402 as shown in FIG. As the image sensor used for the camera 401, various types such as a CCD or a CMOS (Complementary Metal-Oxide Semiconductor) sensor can be used. The video signal formed by the image sensor is digitized inside the camera and then sent to a server 300 described later.
[0038]
By preparing at least one camera 401 having the above configuration, it is possible to acquire a high-resolution partial video with a simple and inexpensive configuration.
[0039]
1.1.3 Server
FIG. 7 is a diagram illustrating a configuration example of the server 300 according to the present embodiment. The server 300 comprises a CPU (Central Processing Unit) 301 that performs various controls and processes in the video recording system 100, an SDRAM (Synchronous Dynamic Random Access Memory) 302, an HDD (Hard Disk Drive) 303, various input interfaces (hereinafter abbreviated as I/F) 304 for a pointing device such as a mouse 311 and for a keyboard 312, a power supply 305, a display I/F 306 for connecting a display such as a CRT (Cathode Ray Tube), an external I/F 307 for connecting external devices such as the wide-angle camera 200 and the camera array 400, and a large-capacity recording device 308 such as a DVD (Digital Versatile Disc)+RW drive, all connected via a bus 313.
[0040]
Next, each component of the server 300 will be described. The CPU 301 performs various processes and controls, such as acquisition and recording of images from the wide-angle camera 200 and the camera array 400, according to a predetermined program stored in the HDD 303. The SDRAM 302 is used as a work area of the CPU 301 and as a storage area for the various processing programs stored in the HDD 303 and for an OS (Operating System) such as Windows (registered trademark) NT Server (registered trademark of Microsoft Corporation, USA). The HDD 303 is also used as an area for recording the acquired video.
[0041]
Examples of the external I/F 307 include various I/F boards, USB (Universal Serial Bus), IEEE 1394, and wireless I/Fs such as IrDA and Bluetooth. By connecting the wide-angle camera 200 and the camera array 400 to the server 300 via a high-speed serial interface such as USB 2.0, the wide-angle video data and the plurality of partial video data acquired by the camera array 400 can be obtained synchronously. Data obtained via the external I/F 307 is recorded on the HDD 303 or the large-capacity recording device 308.
[0042]
1.2 Operation
FIG. 8 is a diagram in which the video recording system according to the present embodiment shown in FIG. 2 is rewritten into a block diagram for each function. Hereinafter, the operation of each unit shown in FIG. 8 will be specifically described.
[0043]
1.2.1 First imaging unit
The first imaging unit 31 is configured by the wide-angle camera 200 described in 1.1.1 above, and performs an operation of outputting acquired and digitized wide-angle video data.
[0044]
1.2.2 Second imaging unit
The second imaging unit 32 is configured by the camera array 400 described in 1.1.2 above, and performs an operation of outputting acquired and digitized partial video data.
[0045]
1.2.3 Deformation unit
FIG. 9 is a diagram illustrating the operation of the deformation unit 33 according to the first embodiment. As the figure shows, the deformation unit 33 transforms the wide-angle video data acquired by the first imaging unit 31 into an image close to a perspective-projection image captured by a normal camera (hereinafter referred to as a panoramic image). In general, an image obtained by a camera capable of wide-angle capture differs in shape from what the human eye perceives and contains large distortion, so it is preferable to apply a deformation process to make later viewing easier. The following describes a method of transforming wide-angle image data (the donut image shown in FIG. 6) into a panoramic image, as described in the literature (A. M. Bruckstein and T. J. Richardson: "Omniview Cameras with Curved Surface Mirrors", Proc.).
[0046]
FIG. 10 is a diagram for explaining the principle of image deformation in a camera using a hyperboloid mirror. FIG. 10A shows an example of the operation of the deformation unit 33, in which a donut image is coordinate-converted into a panoramic image displayed on a cylindrical surface with the azimuth on the horizontal axis and the elevation angle on the vertical axis. FIG. 10B is a diagram showing the geometric structure of the wide-angle camera 200; the optical system of the camera in FIG. 10B is a central projection model. The variables in the figure have the following meanings.
(u, v): coordinates in the donut image
(u_0, v_0): coordinates of the center of the hyperboloid mirror in the donut image
(θ, φ): coordinates in the panoramic image
r: distance in pixels from (u_0, v_0) to (u, v)
r_max: radius in pixels of the hyperboloid mirror in the donut image
θ: azimuth angle
φ: elevation angle
ψ: vertical angle from the optical axis of the camera
F: focus of the hyperboloid mirror
F': focus of the hyperboloid paired with the hyperboloid mirror; it coincides with the optical center of the camera
[0047]
At this time, the following relationship holds between the vertical angle ψ and the elevation angle φ.
[0048]
(Equation 1)

where
[0049]
(Equation 2)

Here, φ_min is the value of the elevation angle φ corresponding to the radius r_max on the donut image, and represents the imaging limit of the camera in the elevation direction. In general, the values of r_max and φ_min are easy to determine.
The deformation procedure is as follows.
(i) The polar coordinates (r, θ) corresponding to the point (u, v) are obtained by solving the following equation.
[0050]
(Equation 3)

(ii) The vertical angle ψ corresponding to r calculated by equation (3) is obtained by the following equation.
[0051]
(Equation 4)

where
[0052]
(Equation 5)

and ψ_max is the value of the vertical angle ψ corresponding to the radius r_max on the donut image, that is, to the elevation angle φ_min. ψ_max can be obtained by substituting φ_min into equation (1).
[0053]
(iii) The elevation angle φ corresponding to ψ calculated by equation (4) is obtained from equation (1).
[0054]
According to the above procedure, an arbitrary point (u, v) in the donut image captured by the hyperboloid mirror can be coordinate-converted to a point (θ, φ) in the panoramic image. That is, the donut image is transformed into a panoramic image.
[0055]
FIG. 11 is a diagram illustrating a coordinate conversion table used in the deformation unit 33. When a panoramic video is to be recorded in real time as it is captured, the calculation time required for the above-described deformation processing becomes a problem. Therefore, it is preferable to create a coordinate conversion table in advance based on the above procedure, as shown in the figure. The coordinate conversion table of FIG. 11 stores the coordinates (u, v) of the donut image corresponding to each point (θ, φ).
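As an illustration only, a table of this kind could be precomputed along the following lines. Equations (1) to (5) are not reproduced in this text, so the sketch takes the radial mapping from elevation angle to pixel radius as a caller-supplied function; `radius_of_phi`, `linear_radius`, the polar convention u = u_0 + r cos θ, v = v_0 + r sin θ, and all numeric values are assumptions made for the example, not part of the original disclosure.

```python
import math


def build_conversion_table(width, height, center, phi_range, radius_of_phi):
    """Return table[row][col] = (u, v) donut coordinates for each panorama pixel.

    Columns sweep the azimuth theta over [0, 2*pi); rows sweep the elevation
    phi from phi_max (top row) down to phi_min (bottom row).
    radius_of_phi is a caller-supplied stand-in for equations (1) and (4).
    """
    u0, v0 = center
    phi_min, phi_max = phi_range
    table = []
    for row in range(height):
        phi = phi_max - (phi_max - phi_min) * row / (height - 1)
        r = radius_of_phi(phi)
        line = []
        for col in range(width):
            theta = 2.0 * math.pi * col / width
            # Assumed polar convention: (r, theta) back to image (u, v).
            line.append((u0 + r * math.cos(theta), v0 + r * math.sin(theta)))
        table.append(line)
    return table


def linear_radius(phi, phi_min=-0.5, phi_max=0.5, r_max=100.0, r_min=20.0):
    """Hypothetical placeholder: r falls linearly from r_max at phi_min."""
    t = (phi - phi_min) / (phi_max - phi_min)
    return r_max - t * (r_max - r_min)


# Example: a 360 x 50 panorama whose donut image is centered at (160, 120).
table = build_conversion_table(360, 50, (160.0, 120.0), (-0.5, 0.5), linear_radius)
```

Once the table exists, each frame can be deformed by a plain lookup per panorama pixel, which is the reason the text gives for preparing it in advance.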
[0056]
The above deformation processing is executed by the CPU 301 in the server 300. A predetermined program for performing the deformation processing is stored in the HDD 303 in advance.
[0057]
1.2.4 Encoding unit
The encoding unit 34 of FIG. 8 encodes at least one of the partial video data acquired by the second imaging unit 32 of FIG. 8 and the panoramic video data output by the transformation unit 33 into a format suitable for video recording. I do. Here, there are various formats suitable for video recording, and a format such as a moving image encoding format represented by MPEG is used. The encoding unit 34 always encodes the video as long as the acquisition of the video data is continued, and continuously transmits the encoded data to the recording unit 35.
[0058]
The above encoding process is executed by the CPU 301 in the server 300. At this time, an MPEG encoding program is installed in the HDD 303 in advance.
[0059]
1.2.5 Recording unit
The recording unit 35 records at least one piece of partial video data encoded by the encoding unit 34 and wide-angle video data acquired by the first imaging unit 31.
[0060]
The function of the recording unit 35 can be realized by the HDD 303. Depending on the mode of use, the function may instead be realized by the large-capacity recording device 308. The two may also be used selectively: for example, a long meeting or a regular meeting that needs to be archived is recorded on the large-capacity recording device 308 composed of a DVD+RW or the like, while other data is recorded on the HDD 303.
[0061]
2. Embodiment 2
Further, a curved mirror having a curvature in one direction can be used for the wide-angle camera 200.
[0062]
2.1 Configuration
The configuration of the second embodiment is shown in FIG. 2, as in the first embodiment.
[0063]
Hereinafter, the configuration of wide-angle camera 200 according to the present embodiment will be described. The configurations of the server 300 and the camera array 400 are also the same as those described in the first embodiment, and a description thereof will not be repeated.
[0064]
2.1.1 Wide-angle camera
FIG. 12 is a diagram illustrating a configuration of the wide-angle camera 200 according to the present embodiment. As shown in FIG. 12, the wide-angle camera 200 includes a camera 219 having a normal angle of view and a mirror 211 having a curvature in one direction; it cannot capture images in all directions, but it can capture a wide range of a scene. FIG. 13 is a diagram showing a wide-angle image projected onto the camera 219 when the mirror 211 is used; a scene behind the camera 219 can be photographed. As shown in FIG. 13, the captured image is compressed horizontally, with the horizontal angle of the incident light proportional to the horizontal coordinate of its position in the captured image. It is also possible to devise the arrangement so that the camera 219 itself is reflected less in the image.
[0065]
As described above, by combining a normal camera and a mirror, a wide-angle image can be captured with a low-cost and simple configuration.
[0066]
2.2 Operation
A block diagram of each function according to the present embodiment is shown in FIG. 8, as in the first embodiment. Hereinafter, the operation of each unit shown in FIG. 8 will be specifically described. Note that the operations of the first imaging unit 31, the second imaging unit 32, and the recording unit 35 are the same as those in the first embodiment, and a description thereof will be omitted.
[0067]
2.2.1 Deformation unit
In the second embodiment of the present invention, a panoramic video can be obtained simply by stretching the video data acquired by the wide-angle camera 200 uniformly in the horizontal direction. As in the case of the hyperboloid mirror, a coordinate conversion table as shown in FIG. 11 may be created to store the coordinates (u, v) of the image before deformation corresponding to each point of the panoramic image.
[0068]
Further, when the wide-angle camera 200 of this embodiment is used, a panoramic video can be generated and displayed on the client side that displays the video, without providing the video recording system 100 with the deformation unit 33. Assume that the wide-angle camera 200 acquires an image of 352 × 240 pixels with a horizontal imaging range of 180 degrees and a vertical imaging range of 60 degrees. In this case, a panoramic image is obtained by stretching the horizontal length threefold, that is, to 1056 pixels. Assume also that the machine name of the server 300 is "vidserv", that the wide-angle video data distributed from the video recording system is named "movie.rm" (a data format called RealVideo, described later), and that the protocol used for communication between the video display terminal and the server 300 is RTSP (Real Time Streaming Protocol). The stretching process can then be described, as shown in FIG. 14, using SMIL (Synchronized Multimedia Integrated Language), a recommendation of the World Wide Web Consortium (W3C). As shown in FIG. 14, when the size of the display area specified in the <region> tag differs from the image size of the associated video data "movie.rm", setting the fit attribute to "fill" causes the video data to be scaled to the size of the display area. That is, a panoramic video can be displayed by specifying, with this fit attribute value, a display area that gives the desired enlargement ratio for the video data. The deformation processing is thus executed at the same time as the image is displayed on the client's video display terminal. This eliminates the need for the server 300 to execute the deformation processing when recording video, so wide-angle video data can be recorded at a small processing cost.
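FIG. 14 is not reproduced in this text. As a rough sketch of what such a description might look like, the following generates a SMIL document whose region is three times as wide as the source video and uses fit="fill". The URL form "rtsp://vidserv/movie.rm" is assembled from the machine name and data name given above, and the region id "panorama" and the function name are assumptions for illustration, not the patent's actual listing.

```python
import xml.etree.ElementTree as ET


def build_panorama_smil(src_url, src_width, src_height, stretch):
    """Build a SMIL document whose region is `stretch` times wider than the video."""
    width = src_width * stretch
    smil = ET.Element("smil")
    layout = ET.SubElement(ET.SubElement(smil, "head"), "layout")
    ET.SubElement(layout, "root-layout", width=str(width), height=str(src_height))
    # fit="fill" asks the player to scale the video to the region's size.
    ET.SubElement(layout, "region", id="panorama", width=str(width),
                  height=str(src_height), fit="fill")
    body = ET.SubElement(smil, "body")
    ET.SubElement(body, "video", src=src_url, region="panorama")
    return ET.tostring(smil, encoding="unicode")


# 352 x 240 source stretched threefold horizontally, as in the example above.
smil_text = build_panorama_smil("rtsp://vidserv/movie.rm", 352, 240, 3)
```

Because the scaling is declared in the layout rather than applied to the recorded data, the server can store the 352 × 240 video unchanged.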
[0069]
2.2.2 Encoding unit
The operation of the encoding unit 34 is the same as in the first embodiment described above: at least one of the partial video data acquired by the second imaging unit 32 and the panoramic video data output by the deformation unit 33 are encoded into a format suitable for video recording.
[0070]
When the deforming unit 33 does not exist, the encoding unit 34 encodes the wide-angle video data obtained by the first imaging unit 31 instead of the wide-angle video data deformed by the deforming unit 33.
[0071]
3. Embodiment 3
The third embodiment of the present invention relates to a video recording system that specifies a correspondence between wide-angle video data acquired by the wide-angle camera 200 and the respective partial video data acquired by the camera array 400. Examples of the "correspondence" here include the following.
・ The positional relationship between the wide-angle camera 200 and each camera 401 constituting the camera array 400
・ The positional relationship between the wide-angle video data and each partial video data
If the above-mentioned correspondence is unknown, there is no guarantee that desired partial video data will be displayed even if a request to switch the partial video is made at the time of reproducing the video. In order to solve this problem, countermeasures such as installing the camera array 400 such that the partial video data to be reproduced is switched counterclockwise when the left arrow buttons of the video selection button 603 are sequentially pressed are considered. However, there is a problem in that the order of switching the partial images and the order of arranging the cameras must correspond to each other, so that the installation work of the video recording system 100 is complicated.
[0072]
FIG. 15 is a diagram illustrating an example of a display screen that uses the above-described correspondence on the client's video display terminal. In the figure, a bar 604 is provided below the wide-angle image 602, and the shooting range corresponding to the currently displayed partial image 601 is indicated by a black bar 605. The shooting ranges of the other partial images are indicated by gray bars 606. When the client operates a mouse (not shown) to move the cursor 607 and clicks the gray bar 606 indicating a given partial image, information selecting the partial image to be delivered is transmitted to the server 300, and the partial video 601 from the camera array 400 transmitted via the server 300 is switched to the corresponding partial video. Specifying the above-described correspondence makes the installation work of the video recording system 100 easier. In addition, the client can select a desired video more easily and can gain a deeper understanding of the captured scene from the distributed video.
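To make the use of the recorded correspondence concrete, the sketch below maps a click on the bar to the partial video whose shooting range contains it. The representation of the correspondence as azimuth ranges in degrees, the function name, the 360-degree bar, and the sample ranges are all assumptions for illustration; the source does not specify a data format here.

```python
def select_partial_video(click_x, bar_width, ranges, fov_deg=360.0):
    """Return the id of the partial video whose recorded range contains the click.

    ranges maps a camera id to (start_deg, end_deg) on the wide-angle image;
    a range with start > end is treated as wrapping past the right edge.
    """
    angle = (click_x / bar_width) * fov_deg
    for camera_id, (start, end) in ranges.items():
        if start <= end:
            hit = start <= angle < end
        else:
            hit = angle >= start or angle < end
        if hit:
            return camera_id
    return None  # The click fell outside every recorded shooting range.


# Hypothetical correspondence for four cameras around a table.
ranges = {"401-1": (330.0, 30.0), "401-2": (60.0, 120.0),
          "401-3": (150.0, 210.0), "401-4": (240.0, 300.0)}
selected = select_partial_video(264, 1056, ranges)
```

With such a lookup, the switching order no longer depends on how the cameras happen to be arranged, which is the installation benefit the text describes.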
[0073]
Embodiment 3 of the present invention relates to a video recording system for realizing such an operation.
[0074]
3.1 Configuration
As in the first embodiment, the configuration of the third embodiment of the present invention is shown in FIGS.
[0075]
3.2 Operation
FIG. 16 is a diagram showing the video recording system according to the third embodiment of the present invention by functional blocks. Compared with the block diagram of the first embodiment, a specifying unit 36 is added. Hereinafter, the operation of each unit illustrated in FIG. 16 will be specifically described. Note that the operations of the first imaging unit 31, the second imaging unit 32, the deforming unit 33, and the encoding unit 34 are the same as those in the first embodiment, and a description thereof will be omitted.
[0076]
3.2.1 Specifying unit
The specifying unit 36 performs an operation of specifying the correspondence between the wide-angle video data acquired by the wide-angle camera 200 and each of the partial video data acquired by the camera array 400. This operation will be described below.
(1) A method of using an identifier assigned to each camera 401 constituting the camera array 400
FIG. 17 is a diagram illustrating another operation example of the specifying unit 36. As shown in FIG. 17A, an identifier 403 is assigned to each camera 401 constituting the camera array 400, and the cameras 401 are arranged at positions where the wide-angle camera 200 can capture them. In this state, the video data acquired by the wide-angle camera 200 is as shown in FIG. In this video data, the positional relationship between the wide-angle camera 200 and each camera 401 can be specified by detecting the image coordinates where the identifier 403 is projected. Here, the identifier 403 includes:
・ a sticker bearing Arabic numerals,
・ a barcode,
・ a color code,
・ a 2D barcode,
for example. Reading these identifiers from video data is a well-known technique in the field of pattern recognition.
(2) Method of using video data acquired by wide-angle camera 200 and camera array 400
FIG. 18 is a diagram illustrating another operation example of the specifying unit 36. In this operation example, a portion having a high similarity between wide-angle video data and each of the partial video data is detected.
[0077]
Here, an operation when template matching is used as a means for detecting a portion having a high similarity will be described. First, as shown in FIG. 18A, a template 608 having a size of (2DX + 1) × (2DY + 1) is generated from each partial image acquired by the camera array 400. Next, as shown in FIG. 18B, the template 608 is moved over the wide-angle image 602, and the normalized cross-correlation value S at the point (m, n) is calculated based on the following equation.
[0078]
(Equation 6)

Here, the meaning of each symbol in the equation (6) is as follows.
・ I₁(x, y): density at point (x, y) on the template
・ I₂(x, y): density at point (x, y) on wide-angle video
Based on the above calculation, the point (m, n) in the wide-angle image 602 at which the normalized cross-correlation value S is maximum may be obtained, and the camera 401 corresponding to the position of the point may be specified. By performing the above operation on all partial images, the positional relationship between the wide-angle camera 200 and each camera 401 can be specified.
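The search described above can be sketched as follows. This is only a minimal illustration: equation (6) is not reproduced in this text, so the sketch assumes the common form of normalized cross-correlation without mean subtraction, S = ΣΣ I₁(x, y) I₂(m + x, n + y) / sqrt(ΣΣ I₁(x, y)² · ΣΣ I₂(m + x, n + y)²), which may differ in detail from the equation of the specification. The function names `ncc_at` and `best_match` and the representation of images as nested lists indexed `[row][column]` are hypothetical.

```python
import math


def ncc_at(template, image, m, n):
    """Normalized cross-correlation between the template and the image patch centred at (m, n).

    Assumed form (no mean subtraction); images are lists of rows of densities.
    """
    dy = len(template) // 2      # template height is 2*DY + 1
    dx = len(template[0]) // 2   # template width is 2*DX + 1
    num = sum_t = sum_i = 0.0
    for y in range(-dy, dy + 1):
        for x in range(-dx, dx + 1):
            t = template[y + dy][x + dx]
            i = image[n + y][m + x]
            num += t * i
            sum_t += t * t
            sum_i += i * i
    denom = math.sqrt(sum_t * sum_i)
    return num / denom if denom > 0 else 0.0


def best_match(template, image):
    """Return ((m, n), S) for the point of the wide-angle image where S is largest."""
    dy = len(template) // 2
    dx = len(template[0]) // 2
    height, width = len(image), len(image[0])
    best_pos, best_score = None, -1.0
    # Only visit points where the whole template fits inside the image.
    for n in range(dy, height - dy):
        for m in range(dx, width - dx):
            score = ncc_at(template, image, m, n)
            if score > best_score:
                best_pos, best_score = (m, n), score
    return best_pos, best_score
```

Running `best_match` once per partial image yields one point (m, n) per camera 401, which is the correspondence that the specifying unit 36 is described as producing.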
[0079]
Although it has been described that the similarity of an image is obtained based on the cross-correlation of density, this is merely an example. The similarity of an image may be obtained based on another characteristic such as a color space or an outline of the image.
(3) Manual identification method
FIG. 19 is a diagram showing a display screen of the server 300 in the third embodiment. This display screen appears just before the video recording system 100 is started and video recording begins. The user first operates the video selection button 603 to switch the displayed partial video 601. Then, a message 609 prompting manual input of the positional relationship between the currently displayed partial image 601 and the wide-angle image 602 is presented on the display screen. At this time, the user operates the mouse (not shown) to move the cursor 607 and clicks a predetermined point on the wide-angle image 602 to manually input the positional relationship. When the manual input is completed, a cross-shaped pointer 610 is displayed at the position corresponding to the partial image 601 in the wide-angle image 602. By performing the above operation for all the partial images, the positional relationship between the wide-angle image 602 and each of the partial images 601 can be specified.
[0080]
This method is particularly effective when the arrangement positions of the wide-angle camera 200 and the camera array 400 are unchanged from the start to the end of video recording. On the other hand, the methods (1) and (2) are effective even if the arrangement position of the camera 401 is changed on the way.
[0081]
The above processing is executed by the CPU 301 in the server 300. At this time, a predetermined program for performing the specific processing is stored in the HDD 303 in advance.
[0082]
3.2.2 Recording unit
The recording unit 35 records the wide-angle video data and at least one partial video data encoded by the encoding unit 34. At this time, it is preferable to record not only the video data but also the correspondence specified by the specifying unit 36, because the display screen shown in FIG. 15 can then be presented on the video display terminal. The video recording operation is the same as in the first embodiment.
[0083]
4. Embodiment 4
The fourth embodiment of the present invention relates to a video recording system for automatically selecting each partial video data acquired by the camera array 400.
[0084]
In the first to third embodiments, the client selects a partial video to be requested for distribution when the video is reproduced later. However, it is troublesome to manually select a partial image every time.
[0085]
FIG. 20 is a diagram showing a display screen on which a partial video 601 to be displayed is automatically selected on the video display terminal of the client. As shown in the figure, when the check box 611 labeled “AUTO” is checked, the mode is switched to a mode in which the partial video 601 is automatically selected and distributed. In this mode, the server 300 automatically selects a partial video in which an important scene such as a speaker is shown, and distributes the partial video to the client together with the wide-angle video. Thereby, the client can understand the captured scene more deeply from the distributed video without troublesome operation.
[0086]
Embodiment 4 relates to a video recording system for realizing such an operation.
[0087]
4.1 Configuration
FIG. 21 shows a configuration of a video recording system according to Embodiment 4 of the present invention. A wide-angle camera 200, a camera array 400, and a microphone array 500 are connected to the server 300, and wide-angle video data, a plurality of partial video data, and a plurality of audio data are obtained and recorded. The video data and the audio data recorded by the server 300 are displayed and reproduced in the server 300. The video data and the audio data are distributed via the Internet as necessary, and are displayed and reproduced on a client PC connected to the Internet.
[0088]
Next, the configuration of each of the above units will be described. Note that the configurations of the wide-angle camera 200, the camera array 400, and the server 300 are the same as those in the first embodiment, and a description thereof will not be repeated.
[0089]
4.1.1 Microphone array
The microphone array 500 includes at least two microphones 501-1 and 501-2. Various types of microphones, such as a piezoelectric type and a capacitive type (so-called condenser microphone), can be used as the microphones 501-1 and 501-2. Like the cameras 401, the microphones 501-1 and 501-2 may be arranged separately, or they may be fixed to a common housing. FIG. 22 is a diagram illustrating the configuration of the wide-angle camera 200 and the microphone array 500 according to the fourth embodiment. As illustrated, the wide-angle camera 200 and the microphone array 500 may be integrated into one housing. As shown in FIG. 22, the imaging element 214 of the camera unit 201 constituting the wide-angle camera 200 and the microphones 501-1 and 501-2 constituting the microphone array 500 are arranged on the pedestal 202.
[0090]
The audio signals acquired by the microphones 501-1 and 501-2 are digitized inside the microphones and then transmitted to the server 300. As with the camera array 400, connecting the microphone array 500 via the external I/F 307 of the server 300, specifically via a high-speed serial interface such as USB 2.0, makes it possible to acquire the partial video and the audio synchronously.
[0091]
4.2 Operation
FIG. 23 is a diagram illustrating functional blocks of the video recording system according to the fourth embodiment. In addition to the block diagram of the third embodiment shown in FIG. 16, a sound acquisition unit 37, a sound source detection unit 38, and a video selection unit 39 are added. The operation of each unit shown in FIG. 23 will be specifically described below. Note that the operations of the first imaging unit 31, the second imaging unit 32, the deforming unit 33, and the specifying unit 36 are the same as those in the above-described embodiment, and thus description thereof will be omitted.
[0092]
4.2.1 Voice Acquisition Unit
The voice acquisition unit 37 outputs the voice data acquired and digitized by the microphone array 500 described in 4.1.1.
[0093]
4.2.2 Sound source detector
The sound source detection unit 38 detects the position or direction of the speaker based on the audio data acquired by the audio acquisition unit 37. An example of the operation will be described below.
(1) Method based on arrival time difference of voice input to microphone array 500
This method is effective when a plurality of microphones 501 are fixed at known positions on a certain housing. FIG. 24 is a diagram for explaining the operation principle of the sound source detection unit 38 according to the fourth embodiment of the present invention. As shown in FIG. 24, when two microphones 501-1 and 501-2 (hereinafter referred to as microphone 1 and microphone 2 for convenience) are arranged at an interval l and sound arrives from the direction θ, the relationship between the audio data s₁(t) output from the microphone 1 and the audio data s₂(t) output from the microphone 2 is
[0094]
(Equation 7)

v: speed of sound
That is, the audio data of the microphone 1 is advanced relative to the audio data of the microphone 2 by the time
[0095]
l cos θ / v.
A procedure for specifying the direction of a speaker's voice using this principle will be described.
[0096]
First, the arrival time difference between the audio data of the microphone 1 and the microphone 2 is detected. This arrival time difference is obtained, for example, from the cross-correlation value between the audio data s₁(t) of the microphone 1 and the audio data s₂(t + dt) of the microphone 2. Here, the cross-correlation value C(t, dt) is calculated by the following equation.
[0097]
(Equation 8)

Here, N is a positive integer indicating the size of the correlation window, and equation (8) indicates that the product-sum operation is performed using N samples before time t. At this time, dt that maximizes C (t, dt) is the arrival time difference.
[0098]
Next, the angle θ between the voice and the baseline of the microphone is calculated using the microphone interval l, the arrival time difference dt, and the sound velocity v.
[0099]
(Equation 9)

Here, the value range of θ is from 0 ° to 180 °.
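The two steps above (finding the arrival time difference, then converting it to an angle) can be sketched as follows. Equations (8) and (9) are not reproduced in this text, so the sketch assumes C(t, dt) = Σ s₁(t − i) · s₂(t − i + dt) over the N most recent samples and θ = arccos(v · dt / l), which is consistent with the stated 0° to 180° range but may differ in detail from the specification. The function names, the sample-index representation of time, and the default sound speed of 343 m/s are hypothetical.

```python
import math


def cross_correlation(s1, s2, t, dt, n):
    """Assumed C(t, dt): product-sum of s1[k] and s2[k + dt] over the n samples up to index t."""
    total = 0.0
    for i in range(n):
        k = t - i
        if 0 <= k < len(s1) and 0 <= k + dt < len(s2):
            total += s1[k] * s2[k + dt]
    return total


def arrival_lag(s1, s2, t, n, max_lag):
    """Return the lag dt (in samples) that maximizes C(t, dt) within +/- max_lag."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda dt: cross_correlation(s1, s2, t, dt, n))


def arrival_angle_deg(lag_samples, sample_rate, spacing_m, speed_of_sound=343.0):
    """Convert a lag to the angle between the arrival direction and the microphone baseline."""
    dt = lag_samples / sample_rate
    # Clamp so that measurement noise cannot push the argument outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, speed_of_sound * dt / spacing_m))
    return math.degrees(math.acos(cos_theta))
```

Under this sign convention a positive lag means that microphone 1 receives the sound first, which gives an angle below 90°; a lag of zero gives exactly the broadside direction of 90°.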
[0100]
Note that the above procedure alone detects the direction only within the 180° range on the front side of the microphones 501-1 and 501-2 and does not uniquely specify the sound source direction. That is, the angle θ output by the sound source detection unit 38 is actually the angle between the arrival direction of the sound and the baseline connecting the two microphones, and the actual sound source lies somewhere on the side surface of a cone with vertex angle θ whose vertex is the midpoint between the two microphones, as shown in the figure.
[0101]
In order to solve this problem, the correction is performed using another set of microphones that is not parallel to the set including the microphone 1 and the microphone 2. FIG. 26 is an explanatory diagram showing how the microphones 501-1, 501-2, 501-3, and 501-4 are grouped into two sets to detect the sound source direction. As shown in FIG. 26, each set pairs a given microphone 501-1 or 501-3 (microphone 1 or microphone 3) with the microphone 501-2 or 501-4 (microphone 2 or microphone 4, respectively) that is furthest from it.
[0102]
By pairing the two microphones furthest from each other, the difference in arrival time of the voice is maximized, and the accuracy of the direction detection is improved. Note that, here, the microphone array 500 includes four microphones 501-1, 501-2, 501-3, and 501-4, but the sound source direction can also be detected accurately with three microphones. FIG. 27 is an explanatory diagram illustrating how to choose the sets of microphones when the microphone array 500 is configured by three microphones 501-1, 501-2, and 501-3. As shown in FIG. 27, by arranging the microphones in an equilateral triangle, the direction of the sound source can be detected with high accuracy regardless of which microphone set is employed. In the example shown in FIG. 27, sound sources in all directions can be detected using the first set and the second set, but the third set may be used complementarily.
(2) Directional microphone array method
It is also possible to detect the direction of the speaker by using directional microphones, each of which picks up sound only from a limited range. FIG. 28 is an explanatory diagram illustrating the relationship between the microphone array 500 and the sound source direction according to the fourth embodiment. The microphone array 500 has four microphones 501 having directivity, and determines the sound source direction based on the intensity of the sound. For convenience, the four microphones 501-1, 501-2, 501-3, and 501-4 are referred to as microphones 1 to 4.
[0103]
Now, it is assumed that the sound intensity is 20 for the microphone 1, 30 for the microphone 2, 20 for the microphone 3, and 5 for the microphone 4. In this case, it is determined that there is a sound source in the direction of the microphone 2. When the intensities of the microphones 1 and 3 are compared with each other, both have the same value 20, so that the sound source direction is finally determined to be the direction of the microphone 2 (direction indicated by θ = 45° in the figure).
FIG. 29 is a diagram illustrating another example of the operation of the sound source detection unit 38 according to Embodiment 4. It is assumed that the sound intensity is 15 for the microphone 1, 30 for the microphone 2, 25 for the microphone 3, and 5 for the microphone 4. In this case, it is initially determined that there is a sound source in the direction of the microphone 2. Comparing the intensities of the microphone 1 and the microphone 3, the intensity of the microphone 3 is larger than that of the microphone 1, so the sound source direction is determined to be a direction shifted slightly from the direction of the microphone 2 toward the microphone 3 (the direction indicated by θ = 30° in the figure). The amount of this shift may be determined in advance according to the characteristics of the directional microphone.
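The two intensity examples can be reproduced with a short sketch. It is only an illustration under stated assumptions: the microphones are assumed to be listed in order around the array so that index neighbours are physical neighbours, the pointing angles of 135°, 45°, 315°, and 225° for microphones 1 to 4 are a hypothetical layout chosen to agree with the 45° and 30° values in the text, and the fixed shift of 15° stands in for the amount that the text says may be determined in advance. The function name `directional_source_angle` is also hypothetical.

```python
def directional_source_angle(intensities, mic_angles_deg, shift_deg=15.0):
    """Estimate the source direction from directional-microphone intensities.

    Picks the loudest microphone, then shifts the estimate by a fixed amount
    toward whichever neighbouring microphone is louder (no shift on a tie).
    """
    count = len(intensities)
    loudest = max(range(count), key=lambda i: intensities[i])
    left, right = (loudest - 1) % count, (loudest + 1) % count
    angle = mic_angles_deg[loudest]
    if intensities[left] == intensities[right]:
        return angle % 360.0
    toward = left if intensities[left] > intensities[right] else right
    # Signed shortest angular difference from the loudest microphone to the louder neighbour.
    diff = (mic_angles_deg[toward] - angle + 180.0) % 360.0 - 180.0
    step = shift_deg if diff > 0 else -shift_deg
    return (angle + step) % 360.0


# Hypothetical pointing directions of microphones 1 to 4.
ASSUMED_MIC_ANGLES = [135.0, 45.0, 315.0, 225.0]
```

With the assumed layout, the intensities 20, 30, 20, 5 give 45° and the intensities 15, 30, 25, 5 give 30°. A real system would more likely scale the shift with the intensity ratio than use a constant.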
[0104]
The function of the sound source detection unit 38 described above is executed by the CPU 301 in the server 300. At this time, a predetermined program for realizing the function is stored in the HDD 303 in advance.
[0105]
4.2.3 Image selection section
The video selecting unit 39 automatically selects a partial video to be recorded by using the correspondence specified by the specifying unit 36 and the position or direction of the speaker detected by the sound source detecting unit 38.
[0106]
FIG. 30 is a diagram illustrating an example of the operation of the video selection unit 39 according to the fourth embodiment. It shows, viewed from above, a state in which six participants A to F are holding a conference around the table 1. On the table 1, a wide-angle camera 200 and a microphone array 500 are installed, and one camera 401 (not shown) is installed for each participant. Now, it is assumed that the direction of the sound source detected by the sound source detection unit 38 is as indicated by an arrow 381 in the figure. At this time, based on the direction of the sound source and the correspondence between the wide-angle camera 200 and each camera 401 specified by the specifying unit 36, the video selection unit 39 selects the camera 401 closest to the direction of the sound source. That is, in the figure, the camera 401 that is shooting the participant E is selected.
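The selection rule can be sketched as a nearest-direction search. This is a minimal sketch, assuming that the correspondence from the specifying unit 36 has already been converted into one direction per camera, expressed in degrees in the same frame as the detected sound source direction; the function names and the example directions are hypothetical and are not taken from FIG. 30.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def select_camera(source_deg, camera_directions):
    """Return the id of the camera whose direction is closest to the sound source.

    camera_directions maps a camera id to its direction in degrees.
    """
    return min(camera_directions,
               key=lambda cam: angular_distance(source_deg, camera_directions[cam]))


# Hypothetical layout: six cameras, one per participant, spaced evenly around the table.
EXAMPLE_DIRECTIONS = {"A": 0.0, "B": 60.0, "C": 120.0, "D": 180.0, "E": 240.0, "F": 300.0}
```

For instance, `select_camera(250.0, EXAMPLE_DIRECTIONS)` picks the camera facing participant E in this hypothetical layout, and the wrap-around handling ensures that a source at 350° maps to the camera at 0° rather than the one at 300°.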
[0107]
The functions of the video selection unit 39 described above can be realized by the CPU 301 in the server 300. At this time, a predetermined program for realizing the function is stored in the HDD 303 in advance.
[0108]
4.2.4 Encoding unit
The encoding unit 34 encodes the partial video data selected by the video selection unit 39 and the panoramic video data output by the transformation unit 33 into a format suitable for video recording. There are various formats suitable for video recording. For example, video data is encoded in a moving image encoding format such as MPEG. Audio data may be encoded as well as video data, for example in the MPEG audio format.
[0109]
In addition, video data and audio data may be stored in one file and recorded like an MPEG program stream. By using this file format, the number of files recorded in the recording unit 35 at the subsequent stage is reduced, so that the management of the recorded files can be further facilitated.
The encoding unit 34 always encodes the video as long as the acquisition of the video data is continued, and continuously transmits the encoded data to the recording unit 35.
[0110]
The above encoding process is executed by the CPU 301 in the server 300. At this time, an MPEG encoding program is installed in the HDD 303 in advance.
[0111]
4.2.5 Recording unit
The recording unit 35 records at least one piece of partial video data encoded by the encoding unit 34 and wide-angle video data acquired by the first imaging unit 31. At this time, it is preferable to record not only the video data but also the information on the correspondence specified by the specifying unit 36 and the information on the partial video data selected by the video selecting unit 39.
[0112]
Here, an operation example of recording information on the selected partial video data will be described. FIG. 31 is a diagram illustrating a recording example of information regarding partial video data. In FIG. 31, the time (Time) at which the selected partial video changed and the new data name (File) are recorded. These data are recorded in the HDD 303 together with moving image data and audio data in a format such as a text file. By recording the time at which the selected partial video data changed and the data name at that time in this way, the appropriate partial video data can be reproduced when the video is later played back.
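How such a record could be used at playback can be sketched as follows. FIG. 31 is not reproduced in this text, so the line format assumed here (a time in seconds, whitespace, then the data name) and the file names in the usage note are hypothetical; only the pairing of a switch time with a new data name comes from the description.

```python
import bisect


def parse_switch_log(text):
    """Parse lines of the assumed form '<time in seconds> <data name>' into sorted (time, name) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        time_str, name = line.split(maxsplit=1)
        entries.append((float(time_str), name))
    entries.sort()
    return entries


def file_at(entries, t):
    """Return the data name that was selected at playback time t, or None before the first entry."""
    times = [entry[0] for entry in entries]
    idx = bisect.bisect_right(times, t) - 1
    return entries[idx][1] if idx >= 0 else None
```

Given a log such as `"0.0 camera1.mpg\n12.5 camera4.mpg\n30.0 camera2.mpg"`, a player seeking to 15 seconds would look up `file_at(entries, 15.0)` and switch to the second entry.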
[0113]
The information on the partial video data may be described using a multimedia information description standard such as MPEG-7.
[0114]
The function of the recording unit 35 can be realized by the HDD 303. Note that the function may be realized by the large-capacity recording device 308 depending on the mode of use. For example, the two may be used selectively: a long meeting or a regular meeting, which needs to be archived, is recorded on the large-capacity recording device 308 such as a DVD+RW, while other data is recorded on the HDD 303.
[0115]
5. Embodiment 5
The fifth embodiment of the present invention relates to a video recording system for automatically selecting each of the partial video data acquired by the camera array 400, as in the fourth embodiment described above. In this embodiment, each of the cameras 401 of the camera array 400 and each of the microphones 501 of the microphone array 500 are configured to have a one-to-one correspondence. Here, “one-to-one correspondence” means that one microphone 501 is arranged at a position or in a direction substantially corresponding to each camera 401.
[0116]
5.1 Configuration
The configuration of the video recording system 100 according to the present embodiment is shown in FIG. 21 as in the above-described fourth embodiment.
[0117]
Next, the configuration of each unit in the above figure will be described. Note that the configurations of the wide-angle camera 200 and the server 300 are the same as those of the first embodiment described above, and a description thereof is omitted.
[0118]
5.1.1 Camera array and microphone array
FIG. 32 is a diagram illustrating the appearance of the camera 401 and the microphone 501 according to Embodiment 5 of the present invention. As illustrated, the camera 401 and the microphone 501 have a structure integrated with a common housing 502. The microphone 501 has directivity, and can input only a limited range of sound. One camera 401 and one microphone 501 are installed for each participant.
[0119]
5.2 Operation
FIG. 33 is a diagram in which the video recording system according to the present embodiment is redrawn as a block diagram for each function. The sound source detection unit 38 is deleted from the block diagram of the fourth embodiment shown in FIG. 23, and the connection from the specifying unit 36 to the video selecting unit 39 is also deleted. The operations of the first imaging unit 31, the second imaging unit 32, the transforming unit 33, the encoding unit 34, the recording unit 35, the specifying unit 36, and the audio acquisition unit 37 are the same as those in the fourth embodiment.
[0120]
5.2.1 Image selection unit
By using the above-described integrated camera 401 and microphone 501, the correspondence between each camera 401 and microphone 501 is known. Therefore, it is preferable that the video selection unit 39 selects the partial video acquired by the camera 401 corresponding to the microphone 501 from which the largest signal amplitude was obtained.
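This selection rule amounts to taking the camera whose paired microphone is loudest. A minimal sketch follows, assuming that the amplitude of each microphone is measured as the peak absolute sample value over a short recent window; the function name `select_by_amplitude`, the window, and the camera identifiers are hypothetical.

```python
def select_by_amplitude(mic_samples):
    """Return the id of the camera whose paired microphone shows the largest peak amplitude.

    mic_samples maps a camera id to the recent samples of the microphone
    integrated with that camera.
    """
    def peak(samples):
        # Peak absolute value; an empty window counts as silence.
        return max((abs(s) for s in samples), default=0.0)

    return max(mic_samples, key=lambda cam: peak(mic_samples[cam]))
```

In practice some smoothing or a minimum hold time would probably be needed so that the selected video does not flicker between cameras on short noises, but the description does not specify this.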
[0121]
6. Embodiment 6
The present embodiment relates to a video recording system for automatically selecting each partial video data acquired by the camera array 400, as in the above-described fourth and fifth embodiments.
[0122]
6.1 Configuration
The configuration of the sixth embodiment of the present invention is shown in FIGS. 2 to 7, as in the first to third embodiments.
[0123]
6.2 Operation
FIG. 34 is a diagram in which the video recording system according to the present embodiment is redrawn as a block diagram for each function. Relative to the block diagram of the third embodiment shown in FIG. 16, a motion detection unit 40 is added. The operation of each unit shown in FIG. 34 will be specifically described below. The operations of the first imaging unit 31, the second imaging unit 32, the transforming unit 33, the encoding unit 34, the recording unit 35, and the specifying unit 36 are the same as those in the above-described embodiment, and thus the description is omitted.
[0124]
6.2.1 Motion detection unit
The motion detection unit 40 detects the motion of the subject in the wide-angle video data, and outputs a feature amount of the motion at each part in the video. Here, the “motion feature amount” indicates the magnitude of the motion of the subject.
[0125]
Detection of motion in a moving image can be realized by a known technique such as a method of obtaining a difference between a frame at a previous time and a frame at the current time, and a method using an optical flow. With these techniques, it is possible to detect the position where the subject has moved and the magnitude of the movement in wide-angle video data. According to this operation, when the present invention is used as a remote monitoring system, a partial image from a camera capturing a moving subject is recorded, which is preferable.
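The frame-difference approach mentioned above can be sketched as follows. This is a minimal sketch, assuming grayscale frames stored as nested lists indexed `[row][column]` and a motion feature amount defined as the sum of absolute differences within each block; the block size and the function names `motion_map` and `strongest_motion` are hypothetical choices, not details from the specification.

```python
def motion_map(prev_frame, curr_frame, block=2):
    """Return {(x, y) of block origin: sum of absolute frame differences in that block}."""
    height, width = len(curr_frame), len(curr_frame[0])
    result = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            total = 0
            for y in range(by, min(by + block, height)):
                for x in range(bx, min(bx + block, width)):
                    total += abs(curr_frame[y][x] - prev_frame[y][x])
            result[(bx, by)] = total
    return result


def strongest_motion(prev_frame, curr_frame, block=2):
    """Return the origin (x, y) of the block with the largest motion feature amount."""
    features = motion_map(prev_frame, curr_frame, block)
    return max(features, key=features.get)
```

The block origin returned by `strongest_motion` plays the role of the image position at which the largest motion is detected; a thresholded variant would be needed to ignore sensor noise when nothing in the scene moves.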
[0126]
When the video recording system according to the sixth embodiment is used as a remote conference system, it is preferable to automatically detect the position or direction of the speaker by detecting the movement of the lips of the participant. The detection of the movement of the lips can be realized by a well-known technique such as that described in the literature (M. Kass, A. Witkin and D. Terzopoulos: “Snakes: Active Contour Models”, ICCV, pp. 259-268 (1987)). When the microphone 501 can be used as in the fourth and fifth embodiments, the lip movement may be detected together with the extraction result of the utterance section based on the voice data, thereby improving the detection accuracy of the speaker. For example, Japanese Patent Application Laid-Open No. 6-43897 filed by the present applicant discloses a system for recognizing a conversation using audio features extracted from audio data and dynamic visual features of a face extracted from video data. This operation makes it possible to detect the position or direction of the speaker more stably even when the voice data contains a large amount of noise other than speech.
[0127]
The function of the motion detection unit 40 described above may be implemented inside the wide-angle camera 200, or may be realized by the CPU 301 in the server 300. In the latter case, a predetermined program for realizing the function is stored in the HDD 303 in advance.
[0128]
6.2.2 Image selection unit
The video selecting unit 39 according to the sixth embodiment automatically selects a partial video to be recorded by using the correspondence specified by the specifying unit 36 and the feature amount of the motion of the subject detected by the motion detecting unit 40. The video selecting unit 39 first specifies the image position where the largest motion is detected in the wide-angle video data based on the feature amount of the motion of the subject. Next, based on the specified image position and the correspondence between the wide-angle camera 200 and each camera 401 specified by the specifying unit 36, the camera 401 located closest to that position is selected by the same procedure as that described in the fourth embodiment. As a result, it is possible to automatically select a partial image of the subject in which the largest motion is detected.
[0129]
The functions of the video selection unit 39 described above are executed by the CPU 301 in the server 300. At this time, a predetermined program for realizing the function is stored in the HDD 303 in advance.
[0130]
7. Embodiment 7
Further, in the above-described sixth embodiment, the motion of the subject is detected in the wide-angle video. However, the motion of the subject may be detected in each of the partial video data acquired by the camera array 400.
7.1 Configuration
The configuration of the seventh embodiment of the present invention is shown in FIGS. 2 to 7 as in the first to third embodiments.
[0131]
7.2 Operation
FIG. 35 is a diagram in which the video recording system according to the present embodiment is redrawn as a block diagram for each function. Relative to the block diagram of the sixth embodiment shown in FIG. 34, the video data output from the second imaging unit 32 is input to the motion detection unit 40, and the connection from the specifying unit 36 to the video selection unit 39 is deleted. The operation of each unit shown in FIG. 35 will be specifically described below. Note that the operations of the first imaging unit 31, the second imaging unit 32, the deforming unit 33, the encoding unit 34, and the recording unit 35 are the same as those in the above-described embodiment, and thus the description is omitted.
[0132]
7.2.1 Motion detection unit
The motion detecting section 40 in the present embodiment detects the motion of the subject in each partial video data and outputs the feature amount of the motion in each partial video data. Here, the “motion feature amount” indicates the magnitude of the motion of the subject, as in the sixth embodiment. Further, the movement of the subject in each of the partial video data is also detected by the well-known technique described in the sixth embodiment.
[0133]
Further, when the video recording system according to the present embodiment is used as a remote conference system, it is preferable to automatically detect the position or direction of the speaker by detecting the movement of the lips of the participant in the partial video. This operation can also be realized by the well-known technique described in the sixth embodiment. Further, in the present embodiment, since the face of each participant is photographed large by the camera 401, the movement of the lips of the participant can be detected more stably than in the above-described sixth embodiment.
[0134]
The function of the motion detection unit 40 described above may be implemented inside the camera 401 or may be realized by the CPU 301 in the server 300. In the latter case, a predetermined program for realizing the function is stored in the HDD 303 in advance.
[0135]
7.2.2 Image selection unit
The video selecting unit 39 in the present embodiment automatically selects a partial video to be recorded based on the motion of the subject in the partial video detected by the motion detecting unit 40. Specifically, a partial video in which the largest motion is detected is specified from the feature amount of the motion of the subject in each partial video, and this is automatically selected as a partial video to be recorded. Here, the present embodiment does not necessarily require the specifying unit 36, so that an appropriate partial image can be selected with a simpler configuration and processing than in the above-described sixth embodiment.
[0136]
The functions of the video selection unit 39 described above are executed by the CPU 301 in the server 300. At this time, a predetermined program for realizing the function is stored in the HDD 303 in advance.
[0137]
7.3 Other
In the above-described sixth embodiment or the present embodiment, it is preferable that each camera 401 included in the camera array 400 has a shooting area that partially overlaps those of other cameras. FIG. 36A is a diagram illustrating a screen of the video display terminal in a case where the cameras have no common shooting area. As shown in the figure, when the participant A is standing and moving, the video selection unit 39 automatically selects the partial image whose shooting area is closest to the participant A (indicated by the black bar 605 in the figure). However, when the participant A moves to a place that none of the cameras 401 shoots, a partial video in which no important subject is captured is selected. Thus, a problem arises in that a partial image that continuously tracks the moving subject cannot be recorded.
[0138]
Therefore, as shown in FIG. 36B, this problem can be solved by arranging the cameras so that their shooting areas partially overlap. In the drawing, the hatched bar indicates a range photographed by two or more cameras 401. As shown in FIG. 6, when the camera array 400 is configured by fixing each of the cameras 401 to the housing 402, each of the cameras 401 may be fixed so that the respective photographing ranges partially overlap.
[0139]
8. Embodiment 8
The function of the video recording system 100 according to the present invention can be realized by a PC. In this case, the functions can be realized by storing software for realizing the above-described units on a hard disk and executing a processing program as appropriate.
[0140]
9. Embodiment 9
Further, the program can be stored in a recording medium such as a CD-ROM. As shown in FIG. 37, the function can be realized by mounting the CD-ROM 308 storing the program on a PC and executing the program as appropriate. Note that the recording medium for storing the program is not limited to the CD-ROM 308, but may be another medium such as a DVD-ROM.
[0141]
Each of the above embodiments is merely an example of the present invention, and the scope of the present invention should not be limited or narrowed to these embodiments. For example, each embodiment has been described using a configuration in which the wide-angle camera 200, the camera array 400, and the microphone array 500 are connected to a USB hub, but the connection form is not limited to this. Another interface such as a PCI bus, IEEE 1394, or Bluetooth may be used.
[0142]
Further, as the mirror 211 used in the wide-angle camera 200, a hyperboloid mirror and a curved mirror having a curvature in one direction have been described in the embodiments, but mirrors of other forms, such as a parabolic mirror or a conical mirror, may also be used.
[0143]
In the description of the first imaging unit 31, it has been described that the wide-angle camera 200 outputs digitized video data. However, the wide-angle camera 200 may output an analog video signal. In this case, digital image data can be output by combining the wide-angle camera 200 and a video capture board that digitizes analog video signals. That is, the same operation as the first imaging unit 31 described in the above embodiment can be realized.
[0144]
In the description of the second imaging unit 32, it has been described that each of the cameras 401 included in the camera array 400 outputs digitized partial video data. However, each of the cameras 401 may instead output an analog video signal. In this case, by combining these cameras 401 with a video capture board that digitizes multi-channel analog video signals, digital partial video data can be output. That is, the same operation as the second imaging unit 32 described in the above embodiments can be realized.
[0145]
Further, in the description of the audio acquisition unit 37, it has been described that digitized audio data is output from each of the microphones 501 constituting the microphone array 500. However, each of these microphones 501 may instead output an analog audio signal. In this case, by combining these microphones 501 with an audio capture board that digitizes multi-channel analog audio signals, digital audio data can be output. That is, the same operation as that of the audio acquisition unit 37 described in the above embodiments can be realized.
[0146]
In addition, although the encoding unit 34 and the recording unit 35 have been described as being mounted on the same server 300, an encoding PC may be installed separately from the server 300. In this case, the encoded data is transferred from the encoding PC to the server 300 via a telecommunication line.
[0147]
In the description of the operation of the encoding unit 34, it has been described that the MPEG encoding process is performed by the CPU 301 in the server 300, but the configuration of the encoding unit 34 is not limited to this. For example, video data may be converted to the MPEG format using a video capture board having a built-in MPEG encoding IC. The same applies to audio data.
[0148]
In addition, in the description of the operation of the motion detection unit 40, the “motion feature amount” indicates the magnitude of the motion of the subject, but it may be another feature, such as the shape of the movement locus of the subject.
[0149]
[Effects of the Invention]
According to the present invention, by providing the first imaging unit 31 that acquires a wide-angle image, the second imaging unit 32 that synchronously acquires a plurality of images in which predetermined regions different from each other are captured, and the recording unit 35 that records at least one of the wide-angle image and the images acquired by the second imaging unit 32, a wide range of video can be acquired and recorded with a simple configuration and processing, and at the same time video of a desired scene can be acquired and recorded at a high resolution. That is, the first object is achieved.
[0150]
Furthermore, according to the present invention, by including the specifying unit 36 that specifies the correspondence between the wide-angle image acquired by the first imaging unit 31 and each of the partial images acquired by the second imaging unit 32, it becomes possible to select a high-resolution video of a desired scene more easily. That is, the second object is achieved.
[0151]
Further, according to the present invention, by providing the video selecting unit 39 for selecting a predetermined video from a plurality of videos acquired by the second imaging unit 32, a video of a desired scene can be acquired and recorded at a high resolution without forcing the user to perform a troublesome operation. That is, the third object is achieved.
[0152]
Furthermore, according to the present invention, by providing the deforming unit 33 that deforms a wide-angle image, it is possible to display the acquired wide-range image in a form that can be more easily observed by a viewer. That is, the fourth object is achieved.
[0153]
Further, according to the present invention, each video acquired by the second imaging unit 32 shares a partially common area with at least one other video, so that a video of a desired scene can be acquired and recorded at a high resolution without omission. That is, the fifth object is achieved.
[Brief description of the drawings]
FIG. 1 is a diagram showing an example of use of a video recording system according to the present invention.
FIG. 2 is a diagram showing a configuration of a video recording system according to the first embodiment.
FIG. 3 is a diagram showing a configuration of a wide-angle camera 200 according to Embodiment 1.
FIG. 4 is a diagram showing a structure of a wide-angle camera 200 according to the first embodiment.
FIG. 5 is a diagram showing an image captured by the wide-angle camera 200 shown in FIG.
FIG. 6 is a diagram showing an example of a camera array 400 according to the first embodiment.
FIG. 7 is a diagram showing a configuration of a server 300 according to the first embodiment.
FIG. 8 is a block diagram showing an operation according to the first embodiment.
FIG. 9 is a diagram for explaining the operation of the deformation unit 33 in the first embodiment.
FIG. 10 is a diagram for explaining the principle of the deformation unit 33.
FIG. 11 is a diagram illustrating a coordinate conversion table used in the deformation unit 33.
FIG. 12 is a diagram showing a configuration of a wide-angle camera 200 according to Embodiment 2.
FIG. 13 is a diagram showing an image captured by the wide-angle camera 200 shown in FIG.
FIG. 14 is a diagram showing an example in which the operation of the deformation unit 33 according to the second embodiment is realized simultaneously with the display of an image.
FIG. 15 is a diagram showing an example of a display screen of a video display terminal when the video distribution system according to Embodiment 3 is used.
FIG. 16 is a block diagram showing an operation according to the third embodiment.
FIG. 17 is a diagram illustrating an example of an operation of the specifying unit 36 according to the third embodiment.
FIG. 18 is a diagram illustrating an example of an operation of the specifying unit 36 according to the third embodiment.
FIG. 19 is a diagram showing an example of a display screen of a server 300 according to Embodiment 3.
FIG. 20 is a diagram showing an example of a display screen of a video display terminal when the video recording system according to Embodiment 4 is used.
FIG. 21 is a diagram showing a configuration of a video recording system according to a fourth embodiment.
FIG. 22 is a diagram illustrating a configuration of a wide-angle camera 200 and a microphone array 500 according to the fourth embodiment.
FIG. 23 is a block diagram showing an operation according to the fourth embodiment.
FIG. 24 is a diagram illustrating an operation principle of a sound source detection unit 38 according to the fourth embodiment.
FIG. 25 is a diagram illustrating a problem of the sound source detection unit 38 according to the fourth embodiment.
FIG. 26 is a diagram illustrating an arrangement example of a microphone 501 in Embodiment 4.
FIG. 27 is a diagram illustrating another arrangement example of the microphone 501 according to the fourth embodiment.
FIG. 28 is a diagram illustrating an operation of a sound source detection unit 38 according to the fourth embodiment.
FIG. 29 is a diagram illustrating an operation of the sound source detection unit 38 according to the fourth embodiment.
FIG. 30 is a diagram illustrating an operation of a video selection unit 39 according to the fourth embodiment.
FIG. 31 is a diagram illustrating data output by a video selection unit 39 according to the fourth embodiment.
FIG. 32 is a diagram illustrating a configuration of a camera 401 according to the fifth embodiment.
FIG. 33 is a block diagram showing an operation according to the fifth embodiment.
FIG. 34 is a block diagram showing an operation according to the sixth embodiment.
FIG. 35 is a block diagram showing an operation according to the seventh embodiment.
FIG. 36 is a diagram showing a problem when the video distribution systems according to Embodiments 6 and 7 are used.
FIG. 37 is a diagram showing a configuration example according to the ninth embodiment.
[Explanation of symbols]
1 table
2 Participants
3 cabinets
31 First imaging unit
32 Second imaging unit
33 Deformation unit
34 Encoding unit
35 Recording unit
36 Specifying unit
37 Audio acquisition unit
38 Sound source detection unit
39 Video selection unit
40 Motion detection unit
200 wide-angle camera
211 mirror
212 lens
213 aperture
214 Image sensor
215 Drive unit
216 Preprocessing circuit
217 Motor drive unit
300 Server
310 bus
320 USB hub
330 Internet
350 Client PC
400 camera array
401-1 to 401-4 camera
500 microphone array
501 microphone
600 Display window

Claims

A video recording system comprising:
first imaging means for acquiring a wide-angle image;
second imaging means configured by a plurality of cameras and synchronously acquiring a plurality of videos in which predetermined regions different from each other are captured; and
recording means for recording at least one of the wide-angle image and the images acquired by the second imaging means.

A video recording system comprising:
first imaging means for acquiring a wide-angle image;
second imaging means configured by a plurality of cameras and synchronously acquiring a plurality of videos in which predetermined regions different from each other are captured;
identification means for identifying the correspondence between the wide-angle image and each image acquired by the second imaging means; and
recording means for recording at least one of the wide-angle image and the images acquired by the second imaging means, together with the correspondence.

3. The video recording system according to claim 2, wherein
an identifier is attached to each of the plurality of cameras constituting the second imaging means,
the first imaging means includes the identifiers attached to the plurality of cameras in its imaging range, and
the identification means identifies the correspondence based on the positions at which the identifiers appear in the wide-angle image.

The video recording system according to claim 2, wherein the identification means identifies the correspondence based on a similarity between the wide-angle image and each image acquired by the second imaging means.

A video recording system comprising:
first imaging means for acquiring a wide-angle image;
second imaging means configured by a plurality of cameras and synchronously acquiring a plurality of videos in which predetermined regions different from each other are captured;
video selection means for selecting a predetermined video from the plurality of videos acquired by the second imaging means; and
recording means for recording the wide-angle image and the predetermined video selected by the video selection means.

6. The video recording system according to claim 5, further comprising a plurality of microphones for inputting sound and sound source detection means for detecting the position or direction of a sound source based on the sound input by the plurality of microphones,
wherein the video selection means selects the predetermined video based on the position or direction of the sound source output by the sound source detection means.

7. The video recording system according to claim 5, further comprising motion detection means for detecting the motion of a subject in the wide-angle image or in the plurality of videos acquired by the second imaging means,
wherein the video selection means selects the predetermined video based on the motion of the subject output by the motion detection means.

The video recording system according to claim 2, further comprising deforming means for deforming the wide-angle image,
wherein the recording means simultaneously records at least one of the image deformed by the deforming means and the images acquired by the second imaging means.

A non-transitory computer-readable storage medium storing a program for causing a computer to execute a process related to each unit of the video recording system according to claim 2.

9. A recording medium for recording computer-readable program software for executing a process related to each means of the video recording system according to any one of claims 2 to 8.