JP2005151584A - Transmission processing apparatus and method

Info

Publication number: JP2005151584A
Application number: JP2004337892A
Authority: JP
Inventors: Atsushi Kumagai; 篤熊谷; Hiroaki Sato; 宏明佐藤; Tomoaki Kawai; 智明河合
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1996-01-08
Filing date: 2004-11-22
Publication date: 2005-06-09

Abstract

Problem: To eliminate the need to provide video/audio encoding means in each communication terminal.
Solution: A switch 12 selects one of a plurality of video/audio inputs. The video capture 16 of the transmission processing device 14 digitizes the selected video signal. The video encoder 18 compresses and encodes the output of the video capture 16 according to Motion JPEG, MPEG and H.261. The selector 20 selects, from the compressed video data of the respective encoding methods, the compressed video data to be transmitted and supplies it to the communication buffer 22. The audio encoder 24 digitizes and encodes the selected audio signal and supplies it to the communication buffer 22. The encoded video data and encoded audio data in the communication buffer 22 are read out to the network 30. The control circuit 26 controls the devices 12 to 24 according to control commands received via the network 30.
Selected drawing: Figure 1

Description

The present invention relates to a transmission processing apparatus and method for selectively outputting video/audio signals from a plurality of video/audio sources to a network, and more specifically to a transmission processing apparatus and method suited to computer networks.

Communication terminals in which video and audio input/output devices such as a camera, a video monitor, a microphone and a speaker are connected to a computer (a personal computer or a workstation) so that video and audio can be exchanged between terminals are well known. They are already in general use, for example as terminals for television conferencing or video conferencing.

The video signal input from the camera and the audio signal input from the microphone are each digitized, encoded by a predetermined method, and output to a network such as a local area network or a wide area network. Encoded video and audio signals arriving from the network are decoded and output from the video monitor and the speaker, respectively. The microphone and the speaker are combined with an echo canceller to avoid echo caused by sound from the speaker feeding back into the microphone. Some units are configured as speakerphones.

Conventionally, such video conference systems were built from dedicated communication terminals. With improvements in computer processing power, however, it has become possible to hold a television conference or video conference using the computer on each person's office desktop as the communication terminal.

Applications that transmit video and/or audio to remote locations include not only video conferencing but also remote monitoring systems that observe conditions at various places.

In video transmission, the video capture device that digitizes the camera's output video signal and the video compression device that performs compression encoding are costly. If every person's computer is equipped with them, the cost-effectiveness is poor in view of how low their utilization is.

Moreover, in applications such as remote monitoring, the image of any one location does not necessarily need to be updated at the high frame rate that a video compression device can deliver (for example, encoding 30 frames per second); one frame per second may be enough. In such usage, a video compression device capable of 30 frames per second is over-specified.
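The mismatch described above can be put in numbers. The following is a minimal sketch, not taken from the patent: the function name and the optional `switch_overhead_s` parameter (time lost each time the input is switched) are assumptions introduced for illustration.

```python
from fractions import Fraction
import math

def max_shared_sources(encoder_fps, per_source_fps, switch_overhead_s=0):
    """Upper bound on the number of sources one encoder could serve.

    encoder_fps: frames per second the encoder can process.
    per_source_fps: update rate each source actually needs.
    switch_overhead_s: assumed time lost per frame when switching inputs.
    """
    # Time spent on one frame: encoding time plus the assumed switching loss.
    time_per_frame = Fraction(1, encoder_fps) + Fraction(switch_overhead_s)
    # Each source consumes per_source_fps * time_per_frame of every second.
    return math.floor(1 / (per_source_fps * time_per_frame))

# A 30 frames/s encoder feeding 1 frame/s views, ignoring switching time.
ideal = max_shared_sources(30, 1)
# The same encoder if each switch is assumed to cost 1/15 s.
with_overhead = max_shared_sources(30, 1, Fraction(1, 15))
```

Under these assumptions the bound drops from 30 sources to 10 once switching time is counted, so the switching speed matters as much as the encoder's raw frame rate.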

In view of these problems, an object of the present invention is to provide a transmission processing apparatus and method that have a simple configuration and good cost-effectiveness.

A transmission processing apparatus according to the present invention selects one of a plurality of inputs for at least one of video and audio and transmits it to a network, and is characterized by comprising: switching means for switching among a plurality of analog input signals in accordance with a given instruction; A/D conversion means for digitizing the analog signal output from the switching means; encoding means for compressing and encoding the digital output of the A/D conversion means; and output means for outputting the data encoded by the encoding means to the network.

A transmission processing apparatus according to the present invention selects one of a plurality of inputs for at least one of video and audio and transmits it to a network, and is characterized by comprising: switching means for switching among a plurality of inputs received via a network; encoding means for compressing and encoding the signal from the switching means; and output means for outputting the encoded output of the encoding means to the network.

A transmission processing apparatus according to the present invention is characterized by comprising: a switch that selects any plurality of signals from a plurality of analog input signals; combining means that compresses the plurality of analog signals selected by the switch on the time axis and combines them; A/D conversion means for digitizing the analog signal output from the combining means; encoding means for compressing and encoding the digital signal output from the A/D conversion means; and control means for controlling communication and the switch.

A transmission processing apparatus according to the present invention is characterized by comprising: a switch that selects any plurality of signals from a plurality of analog input signals; a plurality of A/D conversion means, each digitizing one of the analog signals selected by the switch; a plurality of encoding means, each compressing and encoding one of the digital signals output from the A/D conversion means; and control means for controlling communication and the switch.

A transmission processing method according to the present invention selects one of a plurality of inputs for at least one of video and audio and transmits it to a network, and is characterized by switching among a plurality of analog input signals in accordance with a given instruction, digitizing the selected signal by A/D conversion means, compressing and encoding the digital output, and outputting the encoded data to the network.

A transmission processing method according to the present invention selects one of a plurality of inputs for at least one of video and audio and transmits it to a network, and is characterized by switching among a plurality of inputs received via a network, compressing and encoding the selected input, and outputting the encoded output to the network.

A transmission processing method according to the present invention is characterized by selecting any plurality of signals from a plurality of analog input signals, compressing the selected analog signals on the time axis and combining them, digitizing the combined analog signal, and compressing and encoding the digitized combined signal.

A transmission processing method according to the present invention is characterized by selecting any plurality of signals from a plurality of analog input signals, digitizing each of the selected analog signals, and compressing and encoding each digital signal.

According to the present invention, the resources for video/audio transmission processing can be shared by a plurality of terminals, which raises the utilization of those resources and improves cost-effectiveness.

By providing a plurality of compression encoding means and selecting among them on the basis of information exchanged with the communication destination and/or the state of the communication path, video/audio information can be transmitted with communication quality appropriate to the performance and circumstances of the destination.

Incorporating the apparatus into a video conference system allows the system to be built at lower cost. Centralizing the video/audio encoding means makes the video conference terminals inexpensive.

Further, according to the present invention, using a switch that selects any plurality of signals from a plurality of input signals allows the capture device, the encoder and similar modules to be shared, which raises their utilization and improves cost-effectiveness. In addition, digitizing and compression-encoding a plurality of input signals simultaneously reduces the time lost to switching.

Further, according to the present invention, the compression encoding means is selected on the basis of information exchanged with the communication destination and the state of the communication path, so information can be transmitted with communication quality appropriate to the performance and circumstances of the destination.
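One way to picture the selection of a compression encoding means is the sketch below. It is only an illustration: the patent does not specify a negotiation procedure, and `choose_codec`, the preference ordering, and the caller-supplied `path_ok` check are all hypothetical names.

```python
def choose_codec(sender_preference, receiver_supported, path_ok=lambda codec: True):
    """Pick the first codec the sender prefers that the receiver can decode.

    sender_preference: codec names in the sender's order of preference.
    receiver_supported: set of codec names the destination reports it can decode.
    path_ok: hypothetical predicate saying whether the current communication
             path can carry a given codec's output.
    Returns the chosen codec name, or None if nothing qualifies.
    """
    for codec in sender_preference:
        if codec in receiver_supported and path_ok(codec):
            return codec
    return None

# The three methods named in the description, in an assumed preference order.
preference = ["mpeg", "h261", "mjpeg"]
chosen = choose_codec(preference, {"h261", "mjpeg"})
chosen_on_limited_path = choose_codec(
    preference, {"h261", "mjpeg"}, path_ok=lambda c: c == "mjpeg"
)
```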

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a schematic block diagram of a first embodiment of the present invention. The video/audio transmission processing apparatus 10 of this embodiment comprises a switch 12, which selects the video/audio signals to be processed for transmission from a plurality of (four in this embodiment) video/audio inputs, and a transmission processing device 14, which processes the video/audio signals selected by the switch 12 for transmission.

The video capture 16 in the transmission processing device 14 digitizes the video signal selected by the switch 12. The video encoder 18 has an encoding module for each of the Motion JPEG, MPEG and H.261 encoding methods, and compresses and encodes the output signal of the video capture 16 according to Motion JPEG, MPEG and H.261. The selector 20 selects, from the compressed video data of the respective encoding methods, the compressed video data to be transmitted and supplies the selected data to the communication buffer 22. Meanwhile, the audio encoder 24 digitizes and encodes the audio signal selected by the switch 12 and supplies it to the communication buffer 22. The encoded video data and encoded audio data written to the communication buffer 22 are read out to the network 30 in an appropriate format and at an appropriate rate.

The control circuit 26 controls the switch 12, the video capture 16, the video encoder 18, the selector 20 and the audio encoder 24 according to control commands received via the network 30.

FIG. 2 is a flowchart showing the operation of the embodiment shown in FIG. 1. First, the apparatus waits for an event such as a transmission request, a transmission end request, or a transmission start (S1). When a video/audio transmission request is accepted (S2), a new transmission process is started (S3). When a video/audio transmission end request is accepted (S4), the designated video/audio transmission process is ended (S5). When the transmission scheduler generates an event that starts a video/audio transmission operation (S6), acquisition of image data from the camera and transmission are started (S7).

This embodiment has two kinds of transmission requests. The first is a request for high-frame-rate video data communication, as in a video conference. When the requested frame rate exceeds the switching speed of the switch 12 or the transmission processing device 14, all transmission operation must be dedicated to that request in order to satisfy it. Such a request is called an exclusive processing request. The second is a request that does not need a high frame rate, as in monitoring while switching among cameras. Such a request is called a non-exclusive processing request.

A plurality of non-exclusive processing requests can be accepted at once; the transmission scheduler selects the signals from the designated video/audio sources in turn and transmits them. In practice, the transmission scheduler is implemented with timer events.
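Serving several non-exclusive requests "in turn" can be sketched as a simple round robin. This is an assumption about the ordering: the patent only says the sources are selected in order, and `round_robin` is a hypothetical helper in which each list position stands for one timer event.

```python
from itertools import cycle, islice

def round_robin(sources, ticks):
    """Return which source would be served at each of `ticks` timer events."""
    # cycle() repeats the registered sources endlessly; islice() takes the
    # first `ticks` entries, one per timer event.
    return list(islice(cycle(sources), ticks))

# Three cameras registered by non-exclusive requests, five timer events.
order = round_robin([1, 3, 4], 5)
```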

The control circuit 26 first enters the event wait state (S1). When a communication arrives from outside or a timer event of the transmission scheduler occurs, processing proceeds to S2, S4 or S6 according to the content of the communication and the type of event.

If the received message is a new transmission request, processing proceeds to the video/audio transmission request acceptance step (S2), where the content of the message is read. The transmission start processing (S3) is then executed according to the message content.

If the received message is a request to end transmission, processing proceeds to the audio/video transmission end request acceptance step (S4), where the content of the message is read. The transmission end processing (S5) is then executed according to the message content.

When a timer event occurs, the transmission processing process is restarted and the control circuit returns to the event wait state (S1). The started transmission processing process acquires video data from the camera, transmits it, and sets the timer (S7).
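The event dispatch of FIG. 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the event dictionaries, the `kind` strings, and the handler table are assumptions, and the handlers here only record which step they stand for.

```python
from collections import deque

def dispatch(event, handlers):
    """Route one event to the step sequence described for FIG. 2."""
    kind = event["kind"]
    if kind == "send_request":      # S2, then S3
        return handlers["start"](event)
    if kind == "end_request":       # S4, then S5
        return handlers["end"](event)
    if kind == "timer":             # S6, then S7
        return handlers["capture_and_send"](event)
    raise ValueError(f"unknown event kind: {kind!r}")

def run_event_loop(events, handlers):
    """Process queued events one by one; S1 is the wait between iterations."""
    queue = deque(events)
    log = []
    while queue:
        log.append(dispatch(queue.popleft(), handlers))
    return log

# Placeholder handlers that just name the step they represent.
handlers = {
    "start": lambda e: ("S3", e["source"]),
    "end": lambda e: ("S5", e["source"]),
    "capture_and_send": lambda e: ("S7", e["source"]),
}
trace = run_event_loop(
    [
        {"kind": "send_request", "source": 1},
        {"kind": "timer", "source": 1},
        {"kind": "end_request", "source": 1},
    ],
    handlers,
)
```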

FIG. 3 is a detailed flowchart of the transmission start processing (S3) of FIG. 2. The control circuit 26 first determines whether an exclusive processing request is already in effect (S11). If one is (S11), no other request can be accepted, so the start request is rejected (S15). If none is (S11), it determines whether the request about to be started is itself an exclusive processing request (S12). For a non-exclusive processing request (S12), a transmission request processing process is added to the transmission scheduler, a timer for starting it is set, and the processing ends (S13). For an exclusive processing request (S12), which cannot coexist with other requests, the processes of the existing requests are suspended (S14); a transmission request processing process is then added to the transmission scheduler, a timer for starting it is set, and the processing ends (S13).
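The admission logic of FIG. 3 can be sketched as below. The `Request` and `Scheduler` classes and the return strings are hypothetical; setting the start timer in S13 is left out to keep the example short.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    source: int
    exclusive: bool = False
    suspended: bool = False

@dataclass
class Scheduler:
    requests: list = field(default_factory=list)

    def exclusive_active(self):
        # True if an exclusive request is currently being served.
        return any(r.exclusive and not r.suspended for r in self.requests)

def start_transmission(sched, new):
    # S11: is an exclusive processing request already in effect?
    if sched.exclusive_active():
        return "rejected"                 # S15
    # S12: is the new request itself exclusive?
    if new.exclusive:
        for r in sched.requests:          # S14: suspend the existing requests
            r.suspended = True
    sched.requests.append(new)            # S13: register with the scheduler
    return "accepted"

sched = Scheduler()
first = start_transmission(sched, Request(1))                  # monitoring camera
second = start_transmission(sched, Request(2, exclusive=True))  # video conference
third = start_transmission(sched, Request(3))                  # arrives too late
```

In this run the first request is suspended when the exclusive one arrives, and the third is rejected because the exclusive request is still active.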

FIG. 4 is a detailed flowchart of the transmission end processing (S5) of FIG. 2. The control circuit 26 first refers to the transmission scheduler to determine whether the process designated for termination actually exists (S21). If it does not (S21), an error message is returned (S24). If it does (S21), an end signal is sent to the corresponding transmission request processing process and the process is deleted from the transmission scheduler (S22). The control circuit then determines whether the terminated process was due to an exclusive processing request (S23). If so (S23), the processes that it had suspended are returned to a restartable state by reference to the transmission scheduler, and the processing ends (S25).
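The termination logic of FIG. 4 can be sketched in the same spirit. Requests are represented here as plain dictionaries with assumed keys `source`, `exclusive` and `suspended`; the function name and return strings are hypothetical.

```python
def end_transmission(requests, source):
    """Remove the request for `source` and resume others if it was exclusive."""
    # S21: does the designated process exist in the scheduler?
    target = next((r for r in requests if r["source"] == source), None)
    if target is None:
        return "error: no such request"   # S24
    requests.remove(target)               # S22: signal the process and delete it
    if target["exclusive"]:               # S23
        for r in requests:                # S25: make suspended processes restartable
            r["suspended"] = False
    return "ended"

requests = [
    {"source": 1, "exclusive": False, "suspended": True},
    {"source": 2, "exclusive": True, "suspended": False},
]
missing = end_transmission(requests, 9)
ended = end_transmission(requests, 2)
```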

The content of the step that acquires video data from the camera and transmits it (S7) depends on whether the request is an exclusive processing request. FIG. 5 shows S7 of FIG. 2 in detail for a non-exclusive processing request. First, the control circuit 26 sets the switch 12 to input the video of the requested camera (S31), sets the video capture 16 to produce the requested video size (S32), sets the selector 20 to select, from the three outputs of the video encoder 18, the encoded data for the requested compression method (S33), and sets the encoding parameters of the video encoder 18 (S34).

The camera video is then captured and the data transmitted. That is, the video capture 16 digitizes the video signal from the switch 12 (S35), and the video encoder 18 compresses and encodes the digitized video signal (S36). The encoded video data of the designated encoding method is selected by the selector 20 and stored in the communication buffer 22 (S37). At the same time, the audio encoder 24 digitizes and encodes the selected audio signal and stores it in the communication buffer 22. The encoded video data and encoded audio data stored in the communication buffer 22 are read out to the network 30 at an appropriate rate and in an appropriate format and transmitted to the requester (or a designated party) (S38). After transmission, the timer is set to the next transmission time and the processing ends.
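The data path of S31 to S38 can be sketched as follows. Everything concrete here is a stand-in: `zlib` replaces the real Motion JPEG, MPEG and H.261 modules, `bytes()` replaces A/D conversion, a Python list replaces the network, and the audio path is omitted.

```python
import zlib

# Hypothetical stand-ins for the three encoding modules of the video encoder 18.
# Each tags its output with one byte so the chosen method is recognizable.
ENCODERS = {
    "mjpeg": lambda raw: b"J" + zlib.compress(raw, 1),
    "mpeg": lambda raw: b"M" + zlib.compress(raw, 6),
    "h261": lambda raw: b"H" + zlib.compress(raw, 9),
}

def send_one_frame(inputs, camera, codec, network):
    analog = inputs[camera]          # S31: switch 12 selects the requested camera
    digital = bytes(analog)          # S35: video capture 16 digitizes (stand-in)
    # S36: all three modules encode, as in the embodiment described here.
    outputs = {name: enc(digital) for name, enc in ENCODERS.items()}
    buffered = outputs[codec]        # S33/S37: selector 20 feeds communication buffer 22
    network.append(buffered)         # S38: buffer contents are read out to the network
    return buffered

network = []
frame = send_one_frame({0: [1, 2, 3] * 10, 1: [9] * 30}, camera=1, codec="h261",
                       network=network)
```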

In this embodiment, all three encoding modules in the video encoder 18 perform encoding, but the apparatus may instead be arranged so that only the module for the selected encoding method operates.

FIGS. 6 and 7 show S7 of FIG. 2 in detail for an exclusive processing request. FIG. 6 shows the initialization, and FIG. 7 shows the flow of acquiring video data from the camera and transmitting it. For an exclusive processing request, the initialization settings corresponding to S31 to S34 of the non-exclusive case are performed only the first time (and whenever the transmission request parameters change); after that, only data acquisition and transmission are repeated.

As the initialization, shown in FIG. 6, the control circuit 26 first sets the switch 12 to input the video of the requested camera (S41), sets the video capture 16 to produce the requested video size (S42), sets the selector 20 to select, from the three outputs of the video encoder 18, the encoded data for the requested compression method (S43), and sets the encoding parameters of the video encoder 18 (S44). These steps are the same as S31 to S34 of FIG. 5.

After this initialization, acquisition of camera video and transmission are repeated. That is, the video capture 16 digitizes the video signal from the switch 12 (S51), and the video encoder 18 compresses and encodes the digitized video signal (S52). The encoded video data of the designated encoding method is selected by the selector 20 and stored in the communication buffer 22 (S53). At the same time, the audio encoder 24 digitizes and encodes the selected audio signal and stores it in the communication buffer 22. The encoded video data and encoded audio data stored in the communication buffer 22 are read out to the network 30 at an appropriate rate and in an appropriate format and transmitted to the requester (or a designated party) (S54). After transmission, the timer is set to the next transmission time and the processing ends.
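The difference from the non-exclusive case, configuring once and then only capturing and sending, can be sketched like this. The `Pipeline` class, its counters, and the parameter dictionaries are hypothetical and exist only to make the control flow visible.

```python
class Pipeline:
    """Counts how often each phase runs; the real devices are not modeled."""

    def __init__(self):
        self.params = None
        self.config_calls = 0
        self.frames_sent = 0

    def configure(self, params):       # S41 to S44 (FIG. 6)
        self.params = dict(params)
        self.config_calls += 1

    def capture_and_send(self):        # S51 to S54 (FIG. 7)
        self.frames_sent += 1

def run_exclusive(pipeline, param_stream):
    """param_stream yields the requested parameters for each frame period."""
    for params in param_stream:
        # Initialize on the first pass, and again only if the request changed.
        if params != pipeline.params:
            pipeline.configure(params)
        pipeline.capture_and_send()

pipeline = Pipeline()
full_rate = {"camera": 0, "fps": 30}
half_rate = {"camera": 0, "fps": 15}
run_exclusive(pipeline, [full_rate, full_rate, full_rate, half_rate, half_rate])
```

Five frames are sent but the devices are configured only twice, once at the start and once when the requested rate changes.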

The description above covers sending the input signal from one camera or microphone to one destination. When several terminals request the input signal from the same camera or microphone, it is evident from ordinary network technology that the signal can be sent to all of those destinations together.
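Sending one source to several requesters amounts to encoding once and delivering the same payload to each destination. The sketch below assumes caller-supplied `encode` and `send` functions; the patent does not say whether delivery uses repeated unicast or network-level multicast, so this shows only the encode-once idea.

```python
def broadcast(frame, destinations, encode, send):
    """Encode `frame` a single time and deliver it to every destination."""
    payload = encode(frame)            # the shared encoder runs once
    for dest in destinations:
        send(dest, payload)            # the same payload goes to each requester
    return payload

encode_calls = []
delivered = []

def counting_encode(frame):
    encode_calls.append(frame)
    return frame[::-1]                 # placeholder "encoding": reverse the bytes

broadcast(b"abc", ["40a", "40b", "other"], counting_encode,
          lambda dest, data: delivered.append((dest, data)))
```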

When a request is made to change parameters such as the size or rate of the transmitted data, the initialization of FIG. 6 is executed again and the operation of FIG. 7 then continues. If the parameter change turns an exclusive processing request into a non-exclusive one, the suspended processes are returned to a restartable state.

The efficiency of data transmission can also be improved by separating the steps up to encoding the video/audio data and storing it in the communication buffer 22 (S31 to S37 of FIG. 5 and S51 to S53 of FIG. 7) from the transmission of the data stored in the communication buffer 22 (S38 of FIG. 5 and S54 of FIG. 7), running them as two processes (an encoding process and a transmission process), and providing two buffers. While the transmission process executing S38 or S54 outputs the data in one buffer to the network, the encoding process stores the data to be sent next in the other buffer. This technique is itself well known in parallel processing.
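The two-buffer arrangement can be sketched as below. Note the substitutions: Python threads stand in for the two processes, two queues hand the buffers back and forth, and `encode` and `send` are caller-supplied placeholders rather than anything defined in the patent.

```python
import queue
import threading

def run_double_buffered(frames, encode, send):
    free = queue.Queue()     # buffers the encoding side may fill
    ready = queue.Queue()    # filled buffers waiting for the sending side
    for _ in range(2):       # exactly two buffers, as in the description
        free.put(bytearray())

    def encoder():
        for frame in frames:
            buf = free.get()           # blocks while both buffers are in use
            buf[:] = encode(frame)
            ready.put(buf)
        ready.put(None)                # end-of-stream marker

    def sender():
        while (buf := ready.get()) is not None:
            send(bytes(buf))
            free.put(buf)              # hand the buffer back for refilling

    threads = [threading.Thread(target=encoder), threading.Thread(target=sender)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

sent = []
run_double_buffered([b"a", b"b", b"c"], lambda f: f.upper(), sent.append)
```

Because each buffer moves through a FIFO queue with a single producer and a single consumer, frames leave in the order they were encoded, and the encoder can fill one buffer while the other is being sent.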

FIG. 8 is a schematic diagram of the network configuration of a video conference system that uses the embodiment shown in FIG. 1. The camera outputs and microphone outputs of two terminals 40a and 40b are input to the switch 12 of the video transmission processing apparatus 10 shown in FIG. 1. The terminal 40a is based on a workstation 42a and the terminal 40b on a personal computer 42b; cameras 44a and 44b, microphones 46a and 46b, video monitors 48a and 48b, and speakers 50a and 50b are connected to them, respectively. The microphones 46a, 46b and the speakers 50a, 50b are configured, for example, as speakerphones.

As one example, the terminal 40a connects to the network 30 via a high-speed network such as FDDI, and the workstation 42a has, for example, hardware for decoding encoded video/audio signals. The terminal 40b connects to the network 30 via an Ethernet 52 and a router 54, and the computer 42b has software for decoding encoded video/audio signals. The network 30 to which the output of the video transmission processing apparatus 10 connects is assumed here to be a backbone network or a high-speed network close to one.

The way a video conference proceeds in this environment is described next. It is assumed that the video conference takes place between the terminal 40a and another terminal that is not shown.

In a video conference, the user's own image and the image of the other party (there may be several) are displayed on the monitor, and the other party's voice is output from the speaker of the user's terminal. Taking the terminal 40a as the local terminal, this is achieved by starting reception processing for the local terminal's video and reception processing for the remote terminal's video and audio.

Reception of the local terminal's video and reception of the remote terminal's video and audio work in almost the same way, so the description below takes reception of the remote terminal's video and audio as the example.

FIG. 9 shows the relationship between the video/audio reception processing that runs on the local terminal and the data transmission processing of the video/audio transmission processing apparatus to which the remote side's camera output and microphone output are input (hereinafter, the remote video/audio transmission processing apparatus). S61 to S67 are the video/audio reception processing running on the local terminal, and S68 to S70 are the data transmission processing of the remote video/audio transmission processing apparatus.

First, reception processing starts on the local terminal. The window-system modules and other components needed for video display and audio output are initialized (S61), and a buffer for receiving data is prepared (S62). A port for reception is opened and becomes ready to accept connections (S63). The local terminal requests the remote video/audio transmission processing apparatus to transmit the camera output and microphone output of the remote terminal (S64). On receiving this request, the remote video/audio transmission processing apparatus, if it can accept the request, performs the initial settings for transmission and requests a communication connection to the reception port of its counterpart, the terminal 40a (S68). The local terminal (terminal 40a) thereby establishes a communication connection with the remote video/audio transmission processing apparatus (S65).

相手側映像音声送信処理装置は、映像及び音声データを獲得及び符号化して通信バッファに格納し（Ｓ６９）、通信バッファに格納されるデータを通信相手（端末４０ａ）に送信する（Ｓ７０）。 The partner video / audio transmission processing apparatus acquires and encodes the video and audio data, stores them in the communication buffer (S69), and transmits the data stored in the communication buffer to the communication partner (terminal 40a) (S70).

自端末（端末４０ａ）は、符号化されたデータを受信し（Ｓ６６）、これを復号化して、映像を映像モニタ４８ａのウィンドウに表示し、音声をスピーカ５０ａから出力する（Ｓ６７）。 The own terminal (terminal 40a) receives the encoded data (S66), decodes it, displays the video on the window of the video monitor 48a, and outputs the sound from the speaker 50a (S67).

その後、相手側映像音声送信処理装置は、Ｓ６９とＳ７０を繰り返し、端末４０ａも、Ｓ６６とＳ６７を繰り返す。これにより、映像及び音声が連続的に転送され、再生される。 Thereafter, the partner video / audio transmission processing apparatus repeats S69 and S70, and the terminal 40a also repeats S66 and S67. As a result, video and audio are continuously transferred and reproduced.

カメラの映像出力及びマイク出力はアナログ信号のレベルで分岐されコンピュータ４２ａ，４２ｂと映像音声送信処理装置１０の両方に入力されるように構成しても良い。その場合、自分の映像を受信する必要はなくなるが、カメラ入力をディジタル化するビデオ・キャプチャ機能がコンピュータ４２ａ，４２ｂに必要になる。 The video output and the microphone output of the camera may be branched at an analog signal level and input to both the computers 42a and 42b and the video / audio transmission processing device 10. In this case, it is not necessary to receive own video, but the video capture function for digitizing the camera input is required for the computers 42a and 42b.

尚、カメラ４４ａ，４４ｂ及びマイク４６ａ，４６ｂの出力を無線により映像音声送信処理装置１０に送信することで、配線の負担を無くすことができる。 Note that the burden of wiring can be eliminated by wirelessly transmitting the outputs of the cameras 44a and 44b and the microphones 46a and 46b to the video / audio transmission processing apparatus 10.

このようにして、多数のカメラ・マイクの映像音声情報を伝送する環境を低コストで実現できる。また、受信端末の性能や機能に応じて、異なる圧縮符号化方式や適切な送信パラメータを用いて映像音声データを送受信できる。 In this way, an environment for transmitting video / audio information of a large number of cameras and microphones can be realized at low cost. Also, video / audio data can be transmitted / received using different compression encoding methods and appropriate transmission parameters according to the performance and function of the receiving terminal.

音声信号の処理を省いて、映像信号のみを送信処理するようにしてもよい。図１０は、音声信号の処理を省略した映像送信処理装置の概略構成ブロック図を示す。図１に示す実施例から、スイッチ１２の音声入力を無くし、送信処理装置１４の音声エンコーダ２４を除去した構成になっている。映像を用いた監視などの用途には、音声情報が不要な場合もあり、図１０に示すように構成することで、よりコスト効果比の良いシステムとすることができる。図１１は、図１０に示す映像送信処理装置を用いた遠隔監視システムの構成例であり、図８に示す構成から、音声の入出力装置を取り除いたものになっている。 The audio signal processing may be omitted so that only the video signal is transmitted. FIG. 10 shows a schematic block diagram of a video transmission processing apparatus in which audio signal processing is omitted. Compared with the embodiment shown in FIG. 1, the audio input of the switch 12 is eliminated and the audio encoder 24 of the transmission processing device 14 is removed. Applications such as video monitoring may not need audio information, and the configuration shown in FIG. 10 yields a more cost-effective system. FIG. 11 is a configuration example of a remote monitoring system using the video transmission processing apparatus shown in FIG. 10, obtained by removing the audio input / output devices from the configuration shown in FIG. 8.

このように構成することにより、より安価な構成で、監視のような目的に最適のシステムを安価に構成できる。 With this configuration, a system well suited to purposes such as monitoring can be built at lower cost.

図８に示す構成をワイド・エリア・ネットワーク（ＷＡＮ）に拡張できることは明らかである。例えば、図１２に示すように、一般にビデオ・ゲートウエイなどと呼ばれる映像音声ＷＡＮ交換機６０を組みあわせればよい。 It is clear that the configuration shown in FIG. 8 can be extended to a wide area network (WAN). For example, as shown in FIG. 12, a video / audio WAN switch 60 generally called a video gateway or the like may be combined.

このように構成することにより、ＩＳＤＮなどの公衆通信回線網を介して外部のネットワークとの間でデータ交換できる。 With this configuration, data can be exchanged with an external network via a public communication line network such as ISDN.

上述の実施例では、映像信号を取り込むビデオ・キャプチャ及び映像と音声のエンコーダを共用することにより、そのためのコストを低減できるが、一般に、スイッチ１２としてあまり高速のものを利用できない（高価になる、大型になる。）ので、高いフレーム・レートを要求される用途では、複数の信号を実質的にも同時に扱うことができない。 In the above-described embodiments, sharing the video capture unit that takes in the video signal and the video and audio encoders reduces cost. In general, however, a very fast switch cannot be used as the switch 12 (it would be expensive and bulky), so in applications that demand a high frame rate, a plurality of signals cannot be handled substantially simultaneously.

以下に、この問題点を解決した実施例を説明する。図１３は、本発明の第３実施例の概略構成ブロック図を示す。１１０は映像音声送信サーバ、１１２は８つの映像音声入力から任意の４つの映像音声信号を選択し、選択された４つの映像音声信号の４つの出力ポートから任意に出力できるマトリクス・スイッチ、１１４は、マトリクス・スイッチ１１２からの４つの映像音声信号の１つを選択するか、又は、４つの映像音声信号を時間軸圧縮し、映像信号については４画面のマルチ画構成に合成し、音声信号については時間軸で１／４に圧縮して合成する画面分割ユニットである。 An embodiment that solves this problem is described below. FIG. 13 shows a schematic block diagram of a third embodiment of the present invention. Reference numeral 110 denotes a video / audio transmission server. Reference numeral 112 denotes a matrix switch that selects any four video / audio signals from eight video / audio inputs and can output the four selected signals from any of its four output ports. Reference numeral 114 denotes a screen division unit that either selects one of the four video / audio signals from the matrix switch 112, or time-axis-compresses the four video / audio signals, combining the video signals into a four-picture multi-screen layout and combining the audio signals after compressing each to 1/4 on the time axis.

映像音声送信サーバ１１０は、画面分割ユニット１１４から出力される映像信号を取り込み、ディジタル化する映像キャプチャ装置１１６と、映像キャプチャ装置１１６の出力映像データを圧縮符号化する映像エンコーダ１１８と、画面分割ユニット１１４から出力される音声信号を取り込み、ディジタル化する音声キャプチャ装置１２０と、音声キャプチャ装置１２０の出力音声データを圧縮符号化する音声エンコーダ１２２と、エンコーダ１１８，１２２の符号化データ及び送信すべき情報を一時格納する通信バッファ１２４と、映像音声送信サーバ１１０の全体、マトリクス・スイッチ１１２及び画面分割ユニット１１４を制御する制御回路１２６からなる。 The video / audio transmission server 110 comprises: a video capture device 116 that captures and digitizes the video signal output from the screen division unit 114; a video encoder 118 that compresses and encodes the output video data of the video capture device 116; an audio capture device 120 that captures and digitizes the audio signal output from the screen division unit 114; an audio encoder 122 that compresses and encodes the output audio data of the audio capture device 120; a communication buffer 124 that temporarily stores the encoded data from the encoders 118 and 122 and the information to be transmitted; and a control circuit 126 that controls the video / audio transmission server 110 as a whole, the matrix switch 112, and the screen division unit 114.

画面分割ユニット１１４は、図１４に示すように分割モードと選択モードの２つの動作モードを具備する。分割モードは４つの入力映像を水平及び垂直で１／２に縮小して１画面に合成するモードである。このとき、画面分割ユニット１１４は４つの入力音声を時間軸で圧縮及び合成して出力する。選択モードは、４つの入力映像音声信号の内の任意の１つを選択して出力するモードである。このような機能を有する装置として、例えば、ソニー株式会社製ＹＳ−Ｑ４３０などがある。 As shown in FIG. 14, the screen division unit 114 has two operation modes: a division mode and a selection mode. The division mode is a mode in which four input images are reduced in half horizontally and vertically and combined into one screen. At this time, the screen division unit 114 compresses and synthesizes the four input sounds on the time axis and outputs them. The selection mode is a mode for selecting and outputting any one of the four input video / audio signals. As an apparatus having such a function, for example, there is YS-Q430 manufactured by Sony Corporation.
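The video side of the division mode can be illustrated with a small sketch. It assumes that pictures are lists of pixel rows and that each picture is halved by keeping every other pixel, which is only one possible reduction method; the names `halve`, `quad_split`, and `solid` are hypothetical, and the time-axis compression of the audio is not modeled.

```python
def halve(picture):
    """Reduce a picture to 1/2 horizontally and vertically by keeping every other pixel."""
    return [row[::2] for row in picture[::2]]


def quad_split(p0, p1, p2, p3):
    """Combine four equally sized pictures into one picture of the same size."""
    a, b, c, d = (halve(p) for p in (p0, p1, p2, p3))
    top = [left + right for left, right in zip(a, b)]      # upper two quadrants
    bottom = [left + right for left, right in zip(c, d)]   # lower two quadrants
    return top + bottom


def solid(value, size=4):
    """Hypothetical test picture: a size x size block filled with one pixel value."""
    return [[value] * size for _ in range(size)]


combined = quad_split(solid(1), solid(2), solid(3), solid(4))
for row in combined:
    print(row)
```

Each input ends up in one quadrant of the combined picture, so a single capture carries all four sources at a quarter of the original resolution.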

エンコーダ１１８は、例えば、図１５に示すように、それぞれ異なる圧縮符号化方式（例えば、ＭｏｔｉｏｎＪＰＥＧ、ＭＰＥＧ及びＩＴＵ−Ｔ勧告Ｈ．２６１）に対応する複数のエンコーダ１３０，１３２，１３４を設け、制御回路１２６の制御下で、スイッチ１３６が、入力映像データを何れか指定のエンコーダ１３０，１３２，１３４に供給するようにした構成であってもよい。音声エンコーダ１２２についても、同様である。 As shown in FIG. 15, for example, the encoder 118 may include a plurality of encoders 130, 132, and 134 corresponding to different compression encoding methods (for example, Motion JPEG, MPEG, and ITU-T Recommendation H.261), with a switch 136 that, under the control of the control circuit 126, supplies the input video data to whichever of the encoders 130, 132, and 134 is designated. The same applies to the audio encoder 122.

制御回路１２６は外部からの送信要求の内容に応じて、マトリクス・スイッチ１１２及び画面分割ユニット１１４を設定すると共に、映像キャプチャ装置１１６、映像エンコーダ１１８、音声キャプチャ装置１２０及び音声エンコーダ１２２を制御して、適切な送信データを生成し、通信バッファ１２４を介して外部ネットワークに送信させる。 The control circuit 126 sets the matrix switch 112 and the screen division unit 114 according to the content of the transmission request from the outside, and controls the video capture device 116, the video encoder 118, the audio capture device 120, and the audio encoder 122. Appropriate transmission data is generated and transmitted to the external network via the communication buffer 124.

図１６は、図１３の実施例の動作を示す流れ図である。先ず、送信要求、送信終了要求及び送信起動などのイベントの待ち状態になる（Ｓ１０１）。映像／音声送信要求を受け付けると（Ｓ１０２）、新たな送信処理を開始する（Ｓ１０３）。映像／音声送信終了要求を受け付けると（Ｓ１０４）、指定された映像／音声の送信処理を終了する（Ｓ１０５）。送信スケジューラにより映像音声送信動作を起動するイベントが発生すると（Ｓ１０６）、カメラ／マイクからの映像／音声の獲得と送信処理を開始する（Ｓ１０７）。 FIG. 16 is a flowchart showing the operation of the embodiment of FIG. First, it waits for events such as a transmission request, a transmission end request, and transmission activation (S101). When a video / audio transmission request is received (S102), a new transmission process is started (S103). When the video / audio transmission end request is received (S104), the designated video / audio transmission processing is ended (S105). When an event for starting the video / audio transmission operation is generated by the transmission scheduler (S106), acquisition of video / audio from the camera / microphone and transmission processing are started (S107).

この実施例でも、送信要求には占有的処理要求と非占有的処理要求の２種類がある。占有的処理要求は、全ての送信動作を特定のデータで占有させることにより高いフレーム・レートでのデータ通信を可能にする送信要求である。映像音声送信サーバ１１０に要求するフレーム・レートがマトリクス・スイッチ１１２の切り替え速度を上回る場合、占有的処理要求にする必要がある。非占有的処理要求は、高いフレーム・レートを割り当てずに、複数の映像音声信号を送信する送信要求である。非占有的処理要求は、複数受け付けることができ、送信スケジューラにより順に起動される。送信スケジューラは、実際にはタイマ・イベントにより実現される。 Also in this embodiment, there are two types of transmission requests: exclusive processing requests and non-occupying processing requests. The exclusive processing request is a transmission request that enables data communication at a high frame rate by occupying all transmission operations with specific data. When the frame rate requested to the video / audio transmission server 110 exceeds the switching speed of the matrix switch 112, it is necessary to make an exclusive processing request. The non-occupying processing request is a transmission request for transmitting a plurality of video / audio signals without assigning a high frame rate. A plurality of non-occupied processing requests can be accepted and are sequentially activated by the transmission scheduler. The transmission scheduler is actually realized by a timer event.

各送信要求は、さらに２種類に分けられる。第１は、テレビ会議などを行うために、解像度よりも映像のフレーム・レートを優先したデータ送信要求である。このときには、画面分割ユニット１１４の動作モードを分割モードに設定し、複数画面をマトリクス・スイッチ１１２の切り替えなしに取り込み、符号化することで実現される。第２は、物体の細部の観察などを行なうためにフレーム・レートよりも解像度を優先したデータ送信要求である。このときは、画面分割ユニット１１４の動作モードを選択モードに設定し、マトリクス・スイッチ１１２により、送信したい映像音声信号を選択する。 Each transmission request is further divided into two types. The first is a data transmission request in which a video frame rate is given priority over resolution in order to perform a video conference or the like. At this time, the operation mode of the screen division unit 114 is set to the division mode, and a plurality of screens are fetched without switching the matrix switch 112 and encoded. The second is a data transmission request that gives priority to resolution over frame rate in order to observe details of an object. At this time, the operation mode of the screen division unit 114 is set to the selection mode, and the video / audio signal to be transmitted is selected by the matrix switch 112.
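The two classifications described above (exclusive or non-occupying, and frame rate priority or resolution priority) can be summarized in a short sketch. The function name `plan_request`, its parameters, and the idea of expressing the switching speed of the matrix switch 112 as switches per second are assumptions made for illustration.

```python
def plan_request(requested_fps, switch_rate_per_sec, priority):
    """Return (request kind, screen division unit mode) for one transmission request."""
    if priority not in ("frame_rate", "resolution"):
        raise ValueError("priority must be 'frame_rate' or 'resolution'")
    # A frame rate above the switching speed of the matrix switch
    # has to be made as an exclusive processing request.
    kind = "exclusive" if requested_fps > switch_rate_per_sec else "non_occupying"
    # Frame rate priority uses the division mode; resolution priority uses the selection mode.
    mode = "division" if priority == "frame_rate" else "selection"
    return kind, mode


print(plan_request(30, 5, "resolution"))
print(plan_request(2, 5, "frame_rate"))
```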

制御回路１２６は、まず、イベント待ち状態（Ｓ１０１）になる。外部からの通信又は送信スケジューラのタイマ・イベントが発生すると、通信内容及びイベントの種類に応じて、Ｓ１０２、Ｓ１０４又はＳ１０６の何れかに進む。 First, the control circuit 126 enters an event waiting state (S101). When an external communication or transmission scheduler timer event occurs, the process proceeds to S102, S104, or S106 depending on the communication content and the type of event.

受信したメッセージが新たな送信要求である場合、映像／音声送信要求受け付け処理（Ｓ１０２）に進み、メッセージの内容を読み出す。次いで、メッセージ内容に従って送信開始処理（Ｓ１０３）を実行する。 If the received message is a new transmission request, the process proceeds to a video / audio transmission request acceptance process (S102), and the content of the message is read. Next, a transmission start process (S103) is executed according to the message content.

受信したメッセージが送信終了の要求である場合、音声／映像送信終了要求の受け付け処理（Ｓ１０４）に進み、メッセージの内容を読み出す。次いで、メッセージの内容に従って送信終了処理（Ｓ１０５）を実行する。 If the received message is a transmission end request, the process proceeds to an audio / video transmission end request acceptance process (S104), and the content of the message is read. Next, transmission end processing (S105) is executed according to the content of the message.

タイマ・イベントが発生した場合には、送信処理プロセスを再起動し、イベント待ち状態（Ｓ１０１）に戻る。起動された送信処理プロセスが、カメラ／マイクからの映像／音声の獲得と送信の処理を実行し、再びタイマを設定する（Ｓ１０７）。 If a timer event has occurred, the transmission process is restarted and the process returns to the event waiting state (S101). The activated transmission process executes the process of acquiring and transmitting video / audio from the camera / microphone, and sets the timer again (S107).
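The event handling of FIG. 16 amounts to a dispatch loop. The following is a minimal sketch of that loop, assuming events arrive as `(kind, payload)` tuples on a queue; the event names and the handler dictionary are hypothetical, and the loop returns when the queue is empty, whereas a real server would keep waiting in S101.

```python
import queue


def run_event_loop(events, handlers):
    """Dispatch queued events to handlers and return a trace of the results."""
    trace = []
    while True:
        try:
            kind, payload = events.get_nowait()     # S101: wait for an event
        except queue.Empty:
            return trace
        if kind == "send_request":                  # S102, then S103
            trace.append(handlers["start"](payload))
        elif kind == "end_request":                 # S104, then S105
            trace.append(handlers["end"](payload))
        elif kind == "timer":                       # S106, then S107
            trace.append(handlers["capture_and_send"](payload))


events = queue.Queue()
for event in [("send_request", "cam1"), ("timer", "cam1"), ("end_request", "cam1")]:
    events.put(event)

handlers = {
    "start": lambda cam: f"start {cam}",
    "end": lambda cam: f"end {cam}",
    "capture_and_send": lambda cam: f"send {cam}",
}
trace = run_event_loop(events, handlers)
print(trace)
```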

図１７は、図１６の送信開始処理Ｓ１０３の詳細なフローチャートを示す。制御回路１２６は、まず、占有的処理要求が既にあるか否かを判定する（Ｓ１１１）。占有的処理要求がある場合（Ｓ１１１）、他の要求を受けつけられないので、当該開始要求を拒否する（Ｓ１１５）。占有的処理要求が無い場合（Ｓ１１１）、開始しようとしている処理自体が占有的処理要求であるか否かを判定する（Ｓ１１２）。非占有的処理要求の場合（Ｓ１１２）、送信スケジューラに、解像度優先かフレームレート優先かに応じた送信要求プロセスを追加し、起動のためのタイマを設定して終了する（Ｓ１１３）。占有的処理要求の場合（Ｓ１１２）、他の要求と共存できないので、既存の要求の処理プロセスを一時停止させ（Ｓ１１４）、その後、送信スケジューラに送信要求処理プロセスを追加し、起動のためのタイマを設定して終了する（Ｓ１１３）。 FIG. 17 shows a detailed flowchart of the transmission start process S103 of FIG. 16. The control circuit 126 first determines whether an exclusive processing request already exists (S111). If one exists (S111), no other request can be accepted, so the start request is rejected (S115). If none exists (S111), the control circuit determines whether the process about to start is itself an exclusive processing request (S112). For a non-occupying processing request (S112), a transmission request process corresponding to resolution priority or frame rate priority is added to the transmission scheduler, a timer for activation is set, and the process ends (S113). For an exclusive processing request (S112), which cannot coexist with other requests, the processes of the existing requests are suspended (S114); the transmission request process is then added to the transmission scheduler, a timer for activation is set, and the process ends (S113).
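The decision flow of FIG. 17 can be sketched as follows. The `Scheduler` class and its fields are hypothetical stand-ins for the transmission scheduler, and setting the activation timer is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    active: list = field(default_factory=list)      # requests currently scheduled
    suspended: list = field(default_factory=list)   # requests paused by an exclusive request
    exclusive: object = None                         # id of the exclusive request, if any


def start_transmission(sched, request_id, exclusive):
    """Accept or reject a new transmission request, following S111 to S115."""
    if sched.exclusive is not None:            # S111: an exclusive request already exists
        return "rejected"                      # S115
    if exclusive:                              # S112: the new request is itself exclusive
        sched.suspended.extend(sched.active)   # S114: suspend the existing requests
        sched.active.clear()
        sched.exclusive = request_id
    sched.active.append(request_id)            # S113: add to the scheduler (timer omitted)
    return "accepted"


sched = Scheduler()
print(start_transmission(sched, "a", exclusive=False))
print(start_transmission(sched, "b", exclusive=True))
print(start_transmission(sched, "c", exclusive=False))
```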

非占有的処理要求の処理プロセスを送信スケジューラに追加する手順を、図１８に示す。まず、処理要求を受け取り（Ｓ１２１）、受け取った処理要求が要求するカメラと同じカメラに対する送信処理要求が既に送信スケジューラに存在するかどうかを調べる（Ｓ１２２）。同じカメラに対する送信処理要求が存在すれば（Ｓ１２２）、その送信処理プロセスに、今受け取った送信要求の送信先を追加して終了する（Ｓ１２３）。同じカメラに対する送信処理要求が存在しない場合（Ｓ１２２）、Ｓ１２１で受け取った処理要求が解像度優先かフレーム・レート優先かを判断する（Ｓ１２４）。解像度優先の場合（Ｓ１２４）、新たに起動のためのタイマを設定して終了する（Ｓ１２５）。フレーム・レート優先であれば（Ｓ１２４）、フレームレート優先処理要求の待ち行列に空きが存在するかどうかを、送信スケジューラを参照して調べ（Ｓ１２６）、空きが存在しなければ（Ｓ１２６）、新たに起動のためのタイマを設定して終了し（Ｓ１２５）、空きがあれば（Ｓ１２６）、空いている場所に処理を追加して終了する（Ｓ１２７）。 FIG. 18 shows the procedure for adding the process of a non-occupying processing request to the transmission scheduler. First, a processing request is received (S121), and the transmission scheduler is checked for an existing transmission processing request for the same camera as the one requested (S122). If such a request exists (S122), the destination of the newly received request is added to that transmission process, and the procedure ends (S123). If no such request exists (S122), it is determined whether the request received in S121 is resolution priority or frame rate priority (S124). For resolution priority (S124), a new timer for activation is set and the procedure ends (S125). For frame rate priority (S124), the transmission scheduler is consulted to check whether the queue of frame rate priority processing requests has a free slot (S126). If there is no free slot (S126), a new timer for activation is set and the procedure ends (S125); if there is one (S126), the process is added to the free slot and the procedure ends (S127).

このような手順により、図１９に例示するような送信要求列に対し、送信スケジューラの送信スケジュールは、図２０に示すようになる。 With this procedure, the transmission schedule of the transmission scheduler is as shown in FIG. 20 for the transmission request sequence illustrated in FIG.
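The procedure of FIG. 18 can be sketched as below. The data layout is an assumption: the schedule is a list of timer-driven entries, each holding a mode and a list of jobs, and a division-mode entry is taken to hold at most four jobs because the screen division unit combines four pictures. The name `add_request` and the return values are hypothetical.

```python
QUAD_SLOTS = 4  # assumption: one division-mode capture carries up to four pictures


def add_request(schedule, camera, destination, priority):
    """Add a non-occupying request to the schedule, following S121 to S127."""
    for entry in schedule:                                   # S122: same camera already scheduled?
        for job in entry["jobs"]:
            if job["camera"] == camera:
                job["destinations"].append(destination)      # S123: only add the destination
                return "shared"
    job = {"camera": camera, "destinations": [destination]}
    if priority == "frame_rate":                             # S124
        for entry in schedule:                               # S126: look for a free slot
            if entry["mode"] == "division" and len(entry["jobs"]) < QUAD_SLOTS:
                entry["jobs"].append(job)                    # S127: join the existing capture
                return "joined"
        schedule.append({"mode": "division", "jobs": [job]}) # S125: new timer entry
        return "new"
    schedule.append({"mode": "selection", "jobs": [job]})    # S125: new timer entry
    return "new"


schedule = []
print(add_request(schedule, "cam1", "hostA", "frame_rate"))
print(add_request(schedule, "cam2", "hostB", "frame_rate"))
print(add_request(schedule, "cam1", "hostC", "resolution"))
print(add_request(schedule, "cam3", "hostD", "resolution"))
```

Sharing a job between destinations (S123) means one capture and one encoding pass can serve several receivers of the same camera.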

図２１は、図１６の送信終了処理Ｓ１０５の詳細なフローチャートである。制御回路１２６は、まず、指定された終了すべき処理が実際に存在するかどうかを、送信スケジューラを参照して判定する（Ｓ１３１）。終了指定された処理が無い場合（Ｓ１３１）、エラー・メッセージを返す（Ｓ１３４）。終了指定された処理が存在する場合（Ｓ１３１）、対応する送信要求処理プロセスに終了信号を送り、送信スケジューラからそのプロセスを削除する（Ｓ１３２）。その後、終了したプロセスが占有的処理要求によるものかどうかを判定し（Ｓ１３３）、占有的処理要求によるものであれば（Ｓ１３３）、そのプロセスにより一時停止状態にされたプロセスを送信スケジューラを参照して再起動可能な状態に戻し、終了する（Ｓ１３５）。 FIG. 21 is a detailed flowchart of the transmission end process S105 of FIG. 16. The control circuit 126 first consults the transmission scheduler to determine whether the process designated for termination actually exists (S131). If it does not exist (S131), an error message is returned (S134). If it exists (S131), an end signal is sent to the corresponding transmission request process, and that process is deleted from the transmission scheduler (S132). It is then determined whether the terminated process originated from an exclusive processing request (S133). If so (S133), the processes that it had suspended are returned to a restartable state by referring to the transmission scheduler, and the procedure ends (S135).
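The end process of FIG. 21 can be sketched in the same spirit. The scheduler state is modeled as a plain dictionary with hypothetical keys `active`, `suspended`, and `exclusive`; sending the end signal to a real process is not modeled.

```python
def end_transmission(state, request_id):
    """Terminate a scheduled request, following S131 to S135."""
    if request_id not in state["active"]:             # S131: does the request exist?
        return "error: no such request"               # S134
    state["active"].remove(request_id)                # S132: delete it from the scheduler
    if state["exclusive"] == request_id:              # S133: was it an exclusive request?
        state["exclusive"] = None
        state["active"].extend(state["suspended"])    # S135: make suspended requests restartable
        state["suspended"].clear()
    return "ended"


state = {"active": ["b"], "suspended": ["a"], "exclusive": "b"}
print(end_transmission(state, "x"))
print(end_transmission(state, "b"))
print(state)
```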

図１６のＳ１０７、即ち、カメラ／マイクからの映像／音声データの獲得と送信処理の内容は、処理要求が占有的処理要求であるか否かによって異なる。 The contents of the processing of S107 in FIG. 16, that is, acquisition of video / audio data from the camera / microphone and transmission processing differ depending on whether or not the processing request is an exclusive processing request.

図２２は、図１６のＳ１０７の詳細であって、非占有的処理要求の場合のフローチャートを示す。先ず、制御回路１２６は、画面分割ユニット１１４の動作モードを、要求が解像度優先であれば選択モードに、要求がフレーム・レート優先であれば分割モードにそれぞれ設定する（Ｓ１４１）。要求されたカメラ／マイクの映像／音声を取り込むように、マトリクス・スイッチ１１２を設定し（Ｓ１４２）、要求された映像サイズ及び音質になるように映像キャプチャ装置１１６及び音声キャプチャ装置１２０のパラメータを設定する（Ｓ１４３）。映像キャプチャ装置１１６及び音声キャプチャ装置１２０は映像／音声データを取り込んでディジタル化する（Ｓ１４４）。 FIG. 22 is a detailed flowchart of S107 in FIG. 16 and shows a flowchart in the case of a non-occupying process request. First, the control circuit 126 sets the operation mode of the screen division unit 114 to the selection mode if the request is resolution priority, and to the division mode if the request is frame rate priority (S141). The matrix switch 112 is set so as to capture the requested video / audio of the camera / microphone (S142), and the parameters of the video capture device 116 and the audio capture device 120 are set so as to obtain the requested video size and sound quality. (S143). The video capture device 116 and the audio capture device 120 capture and digitize the video / audio data (S144).

映像エンコーダ１１８及び音声エンコーダ１２２に圧縮符号化パラメータ（符号化方式と圧縮率等のパラメータ、更には符号化する範囲）を設定し（Ｓ１４５）、キャプチャ装置１１６，１２０の出力をその条件で圧縮符号化させる（Ｓ１４６）。圧縮符号化条件の符号化する範囲は、例えば、送信要求が解像度優先である場合には画面全体、フレーム・レート優先の場合には、画面分割ユニット１１４によって分割された画面の内、未だ符号化されていない領域の一つである。符号化された映像／音声データは通信バッファ１２４に一時格納され（Ｓ１４７）、通信バッファ１２４からネットワークを介して要求元（又は指定の相手）に送出される（Ｓ１４８）。この送信後、送信スケジューラに次の送信要求処理プロセスを追加する。キャプチャされた全ての領域が送信され終えるまで、Ｓ１４５〜Ｓ１４８を繰り返す（Ｓ１４９）。 Compression encoding parameters (the encoding method, the compression ratio and similar parameters, and the range to be encoded) are set in the video encoder 118 and the audio encoder 122 (S145), and the outputs of the capture devices 116 and 120 are compressed and encoded under those conditions (S146). The range to be encoded is, for example, the entire screen when the transmission request is resolution priority, and one of the not-yet-encoded regions of the screen divided by the screen division unit 114 when it is frame rate priority. The encoded video / audio data is temporarily stored in the communication buffer 124 (S147) and is sent from the communication buffer 124 over the network to the request source (or a designated destination) (S148). After this transmission, the next transmission request process is added to the transmission scheduler. S145 to S148 are repeated until all captured regions have been transmitted (S149).

各入力ごと及び分割された各領域ごとに符号化パラメータを設定するので（Ｓ１４５）、各領域を異なった符号化条件で符号化できる。また、エンコーダ１１８，１２２が複数の圧縮符号化方式に対応している場合には、各入力ごとに異なった圧縮符号化方式で符号化することも可能になる。 Since an encoding parameter is set for each input and for each divided area (S145), each area can be encoded under different encoding conditions. Further, when the encoders 118 and 122 are compatible with a plurality of compression encoding systems, it is possible to encode with different compression encoding systems for each input.
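Setting parameters per region can be illustrated with a sketch in which `zlib` stands in for the video encoder and the compression level stands in for the encoding condition. This substitution is purely illustrative; the source names Motion JPEG, MPEG, and H.261, none of which is implemented here, and the function and region names are hypothetical.

```python
import zlib


def encode_regions(regions, levels, default_level=6):
    """Encode each region with its own setting and collect the results in a buffer.

    regions: {region name: raw bytes}; levels: {region name: zlib level 0-9}.
    """
    comm_buffer = []
    for name, raw in regions.items():                   # repeat for every region (S149)
        level = levels.get(name, default_level)         # S145: per-region parameter
        payload = zlib.compress(raw, level)             # S146: encode under that condition
        comm_buffer.append((name, level, payload))      # S147: store in the buffer
    return comm_buffer


regions = {"upper_left": b"A" * 200, "upper_right": b"B" * 200}
buffer_contents = encode_regions(regions, {"upper_left": 9, "upper_right": 1})
for name, level, payload in buffer_contents:
    print(name, level, len(payload))
```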

図２３及び図２４は、図１６のＳ１０７の詳細であって、占有的処理要求の場合の流れを示す。図２３は初期化処理であり、図２４は、カメラ／マイクからの映像／音声データの獲得と送信処理の流れを示す。占有的処理要求の場合、図２２のＳ１４１乃至Ｓ１４３、及びＳ１４５に相当する初期化設定操作が１回目（及び送信要求パラメータの変更時）にのみ行なわれ、その後はデータ獲得と送信動作のみが繰り返される。 FIGS. 23 and 24 show the details of S107 in FIG. 16 for an exclusive processing request. FIG. 23 shows the initialization process, and FIG. 24 shows the flow of acquiring video / audio data from the camera / microphone and transmitting it. For an exclusive processing request, the initialization settings corresponding to S141 to S143 and S145 of FIG. 22 are performed only the first time (and when the transmission request parameters change); thereafter only data acquisition and transmission are repeated.

図２３を説明する。まず、画面分割ユニット１１４を選択モードに設定する（Ｓ１５１）。このように設定することにより、ある特定のカメラからの信号を高速に送信できる。次いで、要求されたカメラ／マイクの映像／音声が入力されるようにマトリクス・スイッチ１１２を設定する（Ｓ１５２）。映像キャプチャ装置１１６及び音声キャプチャ装置１２０に送信要求の内容に応じたパラメータを設定し（Ｓ１５３）、映像エンコーダ１１８及び音声エンコーダ１２２にも送信要求の内容に応じたパラメータを設定する（Ｓ１５４）。なお、Ｓ１５１で画面分割ユニット１１４を分割モードに設定すれば、複数のカメラ（図１４の例では、最大４箇所）からの映像を高速に送信できる。 FIG. 23 will be described. First, the screen division unit 114 is set to the selection mode (S151). This setting allows the signal from one particular camera to be transmitted at high speed. Next, the matrix switch 112 is set so that the requested camera / microphone video / audio is input (S152). Parameters according to the content of the transmission request are set in the video capture device 116 and the audio capture device 120 (S153), and likewise in the video encoder 118 and the audio encoder 122 (S154). Note that if the screen division unit 114 is set to the division mode in S151, video from a plurality of cameras (up to four in the example of FIG. 14) can be transmitted at high speed.

図２４を説明する。映像キャプチャ装置１１６及び音声キャプチャ装置１２０は、画面分割ユニット１１４から出力される映像信号及び音声信号をそれぞれ取り込んでディジタル化する（Ｓ１６１）。映像エンコーダ１１８及び音声エンコーダ１２２はそれぞれ映像キャプチャ装置１１６及び音声キャプチャ装置１２０の出力データを圧縮符号化し（Ｓ１６２）、符号化データを通信バッファ１２４に格納する（Ｓ１６３）。通信バッファ１２４に格納されたデータは、所定のレート及びフォーマットでネットワークを介して送信要求元（又は指定の相手）に送信される（Ｓ１６４）。送信後、タイマを次の送信時期に設定して、終了する。 FIG. 24 will be described. The video capture device 116 and the audio capture device 120 each capture and digitize the video signal and the audio signal output from the screen division unit 114 (S161). The video encoder 118 and the audio encoder 122 compress and encode the output data of the video capture device 116 and the audio capture device 120, respectively (S162), and store the encoded data in the communication buffer 124 (S163). The data stored in the communication buffer 124 is transmitted to the transmission request source (or designated partner) via the network at a predetermined rate and format (S164). After transmission, the timer is set to the next transmission time and the process ends.

図２２及び図２４に示す送信処理プロセスで映像／音声データの符号化及び通信バッファ１２４への格納のステップ（Ｓ１４５〜Ｓ１４７及びＳ１６１〜Ｓ１６３）と、通信バッファ１２４に格納されているデータの送信（Ｓ１４８及びＳ１６４）を別のプロセスとすると共に２つのバッファを用意し、Ｓ１４８及びＳ１６４を実行する送信プロセスが一方のバッファのデータを転送している間に、符号化プロセスが次に送信すべきデータを他方のバッファに格納するようにして、通信バッファの書き込みと読み出しを同時に実行し、データ送信の効率を向上させることができる。 In the transmission processes shown in FIGS. 22 and 24, the steps of encoding the video / audio data and storing it in the communication buffer 124 (S145 to S147 and S161 to S163) and the transmission of the data stored in the communication buffer 124 (S148 and S164) may be run as separate processes with two buffers. While the transmission process executing S148 and S164 transfers the data in one buffer, the encoding process stores the data to be transmitted next in the other buffer. The communication buffer is thus written and read at the same time, which improves the efficiency of data transmission.
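The two-buffer arrangement can be sketched with a thread for the encoding side and two queues that pass the buffers back and forth. Threads are used here in place of separate processes for brevity, and the function name and its `encode` and `send` callbacks are hypothetical.

```python
import queue
import threading


def run_double_buffered(frames, encode, send):
    """Encode into one buffer while the other buffer is being sent."""
    free, filled = queue.Queue(), queue.Queue()
    for _ in range(2):                     # the two communication buffers
        free.put(bytearray())

    def encoder():
        for frame in frames:
            buf = free.get()               # wait for a buffer that is not being sent
            buf[:] = encode(frame)         # encode and store
            filled.put(buf)
        filled.put(None)                   # signal that no more data will come

    worker = threading.Thread(target=encoder)
    worker.start()
    while (buf := filled.get()) is not None:
        send(bytes(buf))                   # transmit the filled buffer
        free.put(buf)                      # hand the buffer back to the encoder
    worker.join()


sent = []
run_double_buffered([b"f0", b"f1", b"f2"], encode=lambda f: f.upper(), send=sent.append)
print(sent)
```

Because only two buffers circulate, the encoder can run at most one frame ahead of the sender, which bounds memory use while still overlapping the two steps.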

図２５は、図１３に示す実施例を利用するテレビ会議システムのネットワーク構成の模式図を示す。１４０は、図１３に示す映像音声送信サーバ１１０、マトリクス・スイッチ１１２及び画面分割ユニット１１４からなる映像音声送信処理装置であり、ローカル・エリア・ネットワーク（ＬＡＮ）、ワイド・エリア・ネットワーク、一般公衆電話回線、その他のネットワークなどからなる通信ネットワーク１４２に接続している。 FIG. 25 shows a schematic diagram of the network configuration of the video conference system using the embodiment shown in FIG. Reference numeral 140 denotes a video / audio transmission processing apparatus including the video / audio transmission server 110, the matrix switch 112, and the screen division unit 114 shown in FIG. 13, and includes a local area network (LAN), a wide area network, and a general public telephone. It is connected to a communication network 142 including a line and other networks.

映像音声送信処理装置１４０のマトリクス・スイッチ１１２の入力には、個人の机上又はサイドに配置される２台の端末１４４ａ，１４４ｂのカメラ出力及びマイク出力が接続されている。端末１４４ａはワークステーション１４６ａをベースとし、端末１４４ｂはパーソナル・コンピュータ１４６ｂをベースとしており、それぞれ、カメラ１４８ａ，１４８ｂ、マイク１５０ａ，１５０ｂ、映像モニタ１５２ａ，１５２ｂ及びスピーカ１５４ａ，１５４ｂが接続されている。マイク１５０ａ，１５０ｂとスピーカ１５４ａ，１５４ｂは、例えばスピーカ・フォン構成になっている。 The input of the matrix switch 112 of the video / audio transmission processing device 140 is connected to the camera output and microphone output of two terminals 144a and 144b arranged on a personal desk or side. The terminal 144a is based on the workstation 146a, and the terminal 144b is based on the personal computer 146b, to which cameras 148a and 148b, microphones 150a and 150b, video monitors 152a and 152b, and speakers 154a and 154b are connected, respectively. The microphones 150a and 150b and the speakers 154a and 154b have a speaker phone configuration, for example.

一例として、端末１４４ａは、ＦＤＤＩなどの高速ネットワークを介してネットワーク１４２に接続し、ワークステーション１４６ａは例えば、映像／音声の符号化信号を復号化するハードウェアを具備する。また、端末１４４ｂは、イーサネット（登録商標）１５６及びルータ１５８を介してネットワーク１４２に接続し、コンピュータ１４６ｂは、映像／音声の符号化信号を復号化するソフトウエアを具備する。ここでは、映像音声送信処理装置１４０の出力が接続するネットワーク１４２は、基幹ネットワーク又はそれに近い高速ネットワークであるとする。 As an example, the terminal 144a is connected to the network 142 via a high-speed network such as FDDI, and the workstation 146a includes, for example, hardware for decoding encoded video / audio signals. The terminal 144b is connected to the network 142 via the Ethernet (registered trademark) 156 and the router 158, and the computer 146b includes software for decoding encoded video / audio signals. Here, the network 142 to which the output of the video / audio transmission processing device 140 is connected is assumed to be a backbone network or a high-speed network close to one.

この環境下でテレビ会議の行なわれる様子を説明する。なお、端末１４４ａと図示しない別の端末との間でテレビ会議が行なわれるとする。 A state where a video conference is held in this environment will be described. It is assumed that a video conference is performed between the terminal 144a and another terminal (not shown).

テレビ会議では、自分の画像と通信相手（複数人も可）の画像をモニタ上に表示し、相手の声を自分の端末のスピーカから出力する。端末１４４ａを自端末とすると、これは、自端末映像の受信処理を起動すると共に、相手端末の映像及び音声の受信処理を起動することで、実現される。 In a video conference, the user's image and the image of a communication partner (or a plurality of persons) are displayed on a monitor, and the other's voice is output from the speaker of his / her terminal. When the terminal 144a is the own terminal, this is realized by starting the reception processing of the own terminal video and starting the reception processing of the video and audio of the partner terminal.

図２６は、自端末上で動作する映像音声受信処理と、相手側のカメラ出力及びマイク出力が入力する映像音声送信処理装置（以下、相手側映像音声送信処理装置という。）のデータ送信処理との関係を示す。Ｓ１７１〜Ｓ１７７は自端末上で動作する映像音声の受信処理を示し、Ｓ１７８〜Ｓ１８０は、相手側映像音声送信処理装置のデータ送信処理を示す。 FIG. 26 shows the relationship between the video / audio reception process running on the own terminal and the data transmission process of the video / audio transmission processing apparatus that receives the counterpart's camera output and microphone output (hereinafter referred to as the counterpart video / audio transmission processing apparatus). S171 to S177 indicate the video / audio reception process running on the own terminal, and S178 to S180 indicate the data transmission process of the counterpart video / audio transmission processing apparatus.

まず、自端末上で受信処理が起動する。映像表示及び音声出力に必要なウィンドウ・システム上のモジュールなどが初期設定され（Ｓ１７１）、データ受信のためのバッファが用意される（Ｓ１７２）。受信のためのポートが開かれ、受け付け可能状態になる（Ｓ１７３）。相手側映像音声送信処理装置に、相手端末のカメラ出力及びマイク出力の送信を要求する（Ｓ１７４）。相手側映像音声送信処理装置はこの要求を受けると、もし受け入れられるならば、送信の初期設定を実行し、相手端末（端末１４４ａ）側の受信ポートへの通信コネクションを要求する（Ｓ１７８）。これにより、自端末（端末１４４ａ）は、相手側映像音声送信処理装置との間に通信コネクションを確立する（Ｓ１７５）。 First, reception processing is started on the own terminal. Modules on the window system necessary for video display and audio output are initialized (S171), and a buffer for receiving data is prepared (S172). A port for reception is opened, and reception is enabled (S173). The other party video / audio transmission processing apparatus is requested to transmit the camera output and microphone output of the other party terminal (S174). Upon receiving this request, the counterpart video / audio transmission processing apparatus, if accepted, executes an initial setting for transmission and requests a communication connection to the reception port on the counterpart terminal (terminal 144a) side (S178). As a result, the terminal (terminal 144a) establishes a communication connection with the partner video / audio transmission processor (S175).

相手側映像音声送信処理装置は、映像及び音声データを獲得及び符号化して通信バッファに格納し（Ｓ１７９）、通信バッファに格納されるデータを通信相手（端末１４４ａ）に送信する（Ｓ１８０）。 The counterpart video / audio transmission processing apparatus acquires and encodes video and audio data, stores them in the communication buffer (S179), and transmits the data stored in the communication buffer to the communication partner (terminal 144a) (S180).

自端末（端末１４４ａ）は、符号化されたデータを受信し（Ｓ１７６）、これを復号化して、映像を映像モニタ１５２ａのウィンドウに表示し、音声をスピーカ１５４ａから出力する（Ｓ１７７）。 The own terminal (terminal 144a) receives the encoded data (S176), decodes it, displays the video on the window of the video monitor 152a, and outputs the sound from the speaker 154a (S177).

その後、相手側映像音声送信処理装置は、Ｓ１７９とＳ１８０を繰り返し、端末１４４ａも、Ｓ１７６とＳ１７７を繰り返す。これにより、映像及び音声が連続的に転送され、再生される。 Thereafter, the partner video / audio transmission processing apparatus repeats S179 and S180, and the terminal 144a also repeats S176 and S177. As a result, video and audio are continuously transferred and reproduced.
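The exchange of FIG. 26 can be sketched with loopback TCP sockets. Several points are assumptions made for illustration: the transmission request (S174) is modeled simply by starting the sender thread with the receiver's port number, `zlib` stands in for the video / audio codec, and a 4-byte length prefix frames each block of data. None of these details comes from the source.

```python
import socket
import struct
import threading
import zlib


def send_frames(port, frames):
    """Sender side: connect back to the receiver and stream encoded data."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as conn:   # S178
        for frame in frames:
            payload = zlib.compress(frame)                                    # S179
            conn.sendall(struct.pack("!I", len(payload)) + payload)           # S180


def recv_exact(conn, size):
    """Read exactly size bytes from the connection."""
    data = b""
    while len(data) < size:
        chunk = conn.recv(size - len(data))
        if not chunk:
            raise EOFError("connection closed early")
        data += chunk
    return data


def receive_frames(frames_to_request):
    """Receiver side: open a port, request transmission, then receive and decode."""
    with socket.socket() as server:
        server.settimeout(5)
        server.bind(("127.0.0.1", 0))                                         # S173
        server.listen(1)
        port = server.getsockname()[1]
        sender = threading.Thread(target=send_frames, args=(port, frames_to_request))
        sender.start()                                                        # S174
        conn, _ = server.accept()                                             # S175
        decoded = []
        with conn:
            conn.settimeout(5)
            for _ in frames_to_request:
                (size,) = struct.unpack("!I", recv_exact(conn, 4))            # S176
                decoded.append(zlib.decompress(recv_exact(conn, size)))       # S177
        sender.join()
    return decoded


received = receive_frames([b"video" * 20, b"audio" * 10])
print([len(item) for item in received])
```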

カメラの映像出力及びマイク出力はアナログ信号のレベルで分岐されコンピュータ１４６ａ，１４６ｂと映像音声送信処理装置１４０の両方に入力されるように構成しても良い。その場合、自分の映像を受信する必要はなくなるが、カメラ入力をディジタル化するビデオ・キャプチャ機能がコンピュータ１４６ａ，１４６ｂに必要になる。 The video output and microphone output of the camera may be branched at an analog signal level and input to both the computers 146a and 146b and the video / audio transmission processing device 140. In this case, it is not necessary to receive own video, but the video capture function for digitizing the camera input is required for the computers 146a and 146b.

尚、カメラ１４８ａ，１４８ｂ及びマイク１５０ａ，１５０ｂの出力を無線により映像音声送信処理装置１４０に送信することで、配線の負担を無くすことができる。 Note that the burden of wiring can be eliminated by wirelessly transmitting the outputs of the cameras 148a and 148b and the microphones 150a and 150b to the video / audio transmission processing device 140.

以上説明したように構成することで、複数のカメラ及びマイクから４つ以内の複数のソースからの映像音声情報を自由に選択して送信できるようになる。 With the configuration described above, video / audio information from up to four sources can be freely selected from among a plurality of cameras and microphones and transmitted.

図２７は、図１３に示す実施例の変更例の概略構成ブロック図を示す。１６０はマトリクス・スイッチ１１２と同様の８入力・４出力のマトリクス・スイッチ、１６２ａ，１６２ｂ，１６２ｃ，１６２ｄは、マトリクス・スイッチ１６０の４出力の各映像／音声出力信号を取り込み、符号化してネットワークに送出する映像音声送信サーバ、１６４は、外部からの制御信号に従い、映像音声送信サーバ１６２ａ〜ｄ及びマトリクス・スイッチ１６０を制御する制御装置である。 FIG. 27 shows a schematic block diagram of a modification of the embodiment shown in FIG. 13. Reference numeral 160 denotes an 8-input, 4-output matrix switch similar to the matrix switch 112. Reference numerals 162a, 162b, 162c, and 162d denote video / audio transmission servers, each of which captures one of the four video / audio output signals of the matrix switch 160, encodes it, and sends it to the network. Reference numeral 164 denotes a control device that controls the video / audio transmission servers 162a to 162d and the matrix switch 160 in accordance with external control signals.

The video/audio transmission servers 162a to 162d comprise capture devices 166a to 166d that capture the video/audio output signals of the matrix switch 160, encoders 168a to 168d that compress and encode the outputs of the capture devices 166a to 166d, and communication buffers 170a to 170d that temporarily store the outputs of the encoders 168a to 168d before they are sent to the network.
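The capture, encode, and buffer stages of one transmission server can be pictured with a minimal Python sketch. Everything named here is hypothetical: the class `TransmissionServer`, the 8-bit quantization used as a stand-in for the capture device, and `zlib` used as a stand-in for the encoder (the patent refers to Motion JPEG, MPEG, and H.261, none of which is implemented here).

```python
import zlib
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TransmissionServer:
    """Illustrative model of one server (162a to 162d): capture -> encode -> buffer."""
    name: str
    # Stands in for the communication buffer (170a to 170d).
    buffer: deque = field(default_factory=deque)

    def capture(self, analog_samples):
        # Stand-in for the capture device: quantize samples in [0.0, 1.0] to 8 bits.
        return bytes(min(255, max(0, int(s * 255))) for s in analog_samples)

    def encode(self, raw: bytes) -> bytes:
        # Stand-in for the encoder; zlib replaces a real video/audio codec.
        return zlib.compress(raw)

    def process(self, analog_samples):
        # Capture, encode, and hold the result until the network reads it out.
        self.buffer.append(self.encode(self.capture(analog_samples)))

    def send(self):
        # Drain the buffer in first-in, first-out order.
        while self.buffer:
            yield self.buffer.popleft()


server = TransmissionServer("162a")
server.process([0.0, 0.5, 1.0])
packets = list(server.send())
```

Because each server object owns its own buffer and stages, four such objects can run side by side without sharing state, which is the independence the text attributes to the servers.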

Each of the video/audio transmission servers 162a to 162d captures an output of the matrix switch 160, encodes it, and outputs it to the network independently of the other servers. Each stream of video/audio information can therefore be transmitted with high quality. In applications that require high-frame-rate, high-resolution signals between multiple sites, using a plurality of encoders as in this embodiment distributes the load and allows a plurality of signals to be processed for transmission simultaneously, at a better cost-effectiveness ratio than installing a capture device and an encoder at each terminal.

In the embodiment shown in FIG. 27, the video/audio transmission server 162a to 162d to which a given transmission processing request is assigned is determined as follows. The side issuing the transmission request (the receiving side) refers to the transmission schedulers of the video/audio transmission servers 162a to 162d through the control device 164 of the video/audio transmission processing apparatus on the side receiving the request (the transmitting side). Based on what it finds, the receiving side decides which of the servers 162a to 162d should perform the transmission processing, and it issues the transmission command through the control device 164, designating that server.

Alternatively, when the receiving side requests transmission processing from the control device 164 on the transmitting side without designating a server, the control device 164 may refer to the transmission schedulers of the video/audio transmission servers 162a to 162d and automatically assign the processing to an appropriate server according to load, capability, and the like. In this case, the receiving side is spared the burden of referring to the transmission schedulers and designating a video/audio transmission server.
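The two assignment modes, designation by the receiving side and automatic assignment by the control device, can be sketched as follows. This is only an illustration under stated assumptions: the names `ServerSchedule` and `Controller`, the `capacity` field, and the "lowest load ratio wins" policy are all hypothetical, since the text says only that load, capability, and the like are taken into account.

```python
from dataclasses import dataclass, field


@dataclass
class ServerSchedule:
    """Hypothetical view of one server's transmission scheduler."""
    name: str
    capacity: int  # assumed measure of capability: max concurrent processes
    processes: list = field(default_factory=list)

    @property
    def load(self) -> float:
        return len(self.processes) / self.capacity


class Controller:
    """Hypothetical stand-in for the control device 164."""

    def __init__(self, servers):
        self.servers = {s.name: s for s in servers}

    def schedules(self):
        # What the receiving side sees when it consults the schedulers.
        return {name: list(s.processes) for name, s in self.servers.items()}

    def request(self, request_id, server_name=None):
        if server_name is None:
            # Automatic mode: pick the least loaded server with spare capacity.
            candidates = [s for s in self.servers.values()
                          if len(s.processes) < s.capacity]
            if not candidates:
                raise RuntimeError("no server has spare capacity")
            server = min(candidates, key=lambda s: s.load)
        else:
            # Designated mode: the receiving side has already chosen.
            server = self.servers[server_name]
        server.processes.append(request_id)
        return server.name


controller = Controller([ServerSchedule("162a", 2), ServerSchedule("162b", 4)])
first = controller.request("r1", server_name="162a")  # receiver designates
second = controller.request("r2")                     # controller decides
```

In designated mode the receiving side would first call `schedules()` and choose for itself; in automatic mode it passes no server name and the controller applies its own policy.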

いずれにしても、どの送信処理要求をどの映像音声送信サーバ１６２ａ〜１６２ｄに割り当てるかが決定されると、各映像音声送信サーバ１６２ａ〜１６２ｄの送信スケジューラに、割り振られた送信処理が追加される。制御装置１６４は、各映像音声送信サーバ１６２ａ〜１６２ｄの送信スケジューラを参照して、マトリクス・スイッチ１６０を適切に設定する。各映像音声送信サーバ１６２ａ〜１６２ｄは、設定されたマトリクス・スイッチ１６０の各出力を取り込み、圧縮符号化してネットワークに送出する。 In any case, when it is determined which transmission processing request is allocated to which video / audio transmission server 162a to 162d, the allocated transmission processing is added to the transmission scheduler of each video / audio transmission server 162a to 162d. The control device 164 appropriately sets the matrix switch 160 with reference to the transmission scheduler of each of the video / audio transmission servers 162a to 162d. Each of the audio / video transmission servers 162a to 162d takes in each output of the set matrix switch 160, compresses and encodes it, and sends it to the network.
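How the control device might derive switch settings from the schedulers can be sketched in Python. The sketch assumes, beyond what the text states, that each server is wired to one fixed output of the switch and that the entry at the head of a server's schedule determines which input that output carries. `MatrixSwitch` and `configure_switch` are hypothetical names.

```python
class MatrixSwitch:
    """Illustrative 8-input, 4-output switch modelled on matrix switch 160."""

    def __init__(self, n_inputs=8, n_outputs=4):
        self.n_inputs = n_inputs
        # routes[output] holds the connected input index, or None if idle.
        self.routes = [None] * n_outputs

    def connect(self, output, source):
        if not 0 <= source < self.n_inputs:
            raise ValueError(f"input {source} out of range")
        self.routes[output] = source


def configure_switch(switch, schedules):
    """Set each output from the head of the matching server's schedule.

    schedules is a list indexed by switch output; each element is a list
    of (request_id, source_input) tuples for the server on that output.
    """
    for output, entries in enumerate(schedules):
        if entries:
            _, source = entries[0]
            switch.connect(output, source)
        else:
            switch.routes[output] = None  # nothing scheduled: leave idle


switch = MatrixSwitch()
configure_switch(switch, [[("r1", 5)], [], [("r2", 0), ("r3", 7)], []])
```

After this call, output 0 carries input 5, output 2 carries input 0, and the other two outputs are idle.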

図２８は、図２７に示す映像音声送信処理装置を使用したテレビ会議システムの概略構成ブロック図を示す。１８０は図２７に図示した映像音声送信処理装置である。図２５と同じ構成要素には同じ符号を付してある。このように構成することにより、コスト効果比を低下させずに、負荷を分散し、複数の信号を同時にディジタル化及び圧縮符号化することができる。 FIG. 28 shows a schematic block diagram of a video conference system using the video / audio transmission processing apparatus shown in FIG. Reference numeral 180 denotes the video / audio transmission processing apparatus shown in FIG. The same components as those in FIG. 25 are denoted by the same reference numerals. With this configuration, it is possible to distribute the load and simultaneously digitize and compress and encode a plurality of signals without reducing the cost-effectiveness ratio.

The present invention may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, or a printer) or to an apparatus consisting of a single device (for example, a copier or a facsimile machine).

The technical scope of the present invention also includes a configuration in which software program code for realizing the functions of the embodiments described above is supplied to a computer in an apparatus or system connected to various devices, so as to operate those devices to realize the functions of the embodiments, and in which the computer (CPU or MPU) of the apparatus or system operates the devices in accordance with the stored program. In this case, the software program code itself realizes the functions of the embodiments described above, and the program code itself and the means for supplying it to the computer, for example a storage medium storing the program code, constitute the present invention.

かかるプログラムコードを格納する記憶媒体としては、例えばフレキシブルディスク、ハードディスク、光ディスク、光磁気ディスク、ＣＤ−ＲＯＭ、磁気テープ、不揮発性のメモリカード及びＲＯＭ等を用いることが出来る。 As a storage medium for storing the program code, for example, a flexible disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a magnetic tape, a nonvolatile memory card, a ROM, and the like can be used.

Needless to say, the program code is included in the embodiments of the present invention not only when the functions of the embodiments described above are realized by the computer executing the supplied program code, but also when they are realized by the program code in cooperation with an OS (operating system) running on the computer, other application software, or the like.

Furthermore, needless to say, the present invention also includes a case in which the supplied program code is stored in a memory provided on a function expansion board of the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on the function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the embodiments described above are realized by that processing.

Brief description of the drawings

FIG. 1 is a schematic block diagram of the first embodiment of the present invention.
FIG. 2 is a flowchart of the basic operation of the embodiment shown in FIG. 1.
FIG. 3 is a detailed flowchart of S3 in FIG. 2.
FIG. 4 is a detailed flowchart of S5 in FIG. 2.
FIG. 5 is a flowchart of the transmission processing for a non-exclusive processing request.
FIG. 6 is a flowchart of the initialization of the transmission processing for an exclusive processing request.
FIG. 7 is a flowchart of the data transmission processing for an exclusive processing request.
FIG. 8 is a schematic block diagram of a video conference system incorporating the embodiment shown in FIG. 1.
FIG. 9 is a flowchart of the video/audio transmission processing of the partner video/audio transmission processing apparatus and the video/audio reception processing of the local terminal.
FIG. 10 is a schematic block diagram of the second embodiment of the present invention.
FIG. 11 is a schematic block diagram of a remote monitoring system incorporating the embodiment shown in FIG. 10.
FIG. 12 is a schematic block diagram of a modification of the system shown in FIG. 8.
FIG. 13 is a block diagram of a first embodiment of the present invention that uses a matrix switch.
FIG. 14 is an explanatory diagram of the operation modes of the screen division unit 114.
FIG. 15 is a schematic block diagram of a configuration example of the video encoder 118.
FIG. 16 is a flowchart showing the operation of the embodiment of FIG. 13.
FIG. 17 is a flowchart of the transmission start processing S103 in FIG. 16.
FIG. 18 is a flowchart of the procedure for adding the process for a non-exclusive processing request to the transmission scheduler.
FIG. 19 is an example of a transmission request sequence.
FIG. 20 is a transmission schedule for the transmission request sequence shown in FIG. 19.
FIG. 21 is a detailed flowchart of the transmission end processing S105 in FIG. 16.
FIG. 22 is a detailed flowchart of S107 in FIG. 16 for a non-exclusive processing request.
FIG. 23 is a flowchart of the initialization processing of S107 in FIG. 16 for an exclusive processing request.
FIG. 24 is a flowchart of the data transmission processing of S107 in FIG. 16 for an exclusive processing request.
FIG. 25 is a schematic block diagram of a video conference system using the embodiment shown in FIG. 13.
FIG. 26 is a flowchart of the video/audio transmission processing of the partner video/audio transmission processing apparatus and the video/audio reception processing of the local terminal in the embodiment shown in FIG. 13.
FIG. 27 is a schematic block diagram of a second embodiment of the present invention that uses a matrix switch.
FIG. 28 is a schematic block diagram of a video conference system using the embodiment shown in FIG. 27.

Explanation of symbols

10: Video/audio transmission processing device
12: Switch
14: Transmission processing device
16: Video capture
18: Video encoder
20: Selector
22: Communication buffer
24: Audio encoder
26: Control circuit
30: Network
40a, 40b: Terminal
42a: Workstation
42b: Personal computer
44a, 44b: Camera
46a, 46b: Microphone
48a, 48b: Video monitor
50a, 50b: Speaker
52: Ethernet
54: Router
110: Video/audio transmission server
112: Matrix switch
114: Screen division unit
116: Video capture device
118: Video encoder
120: Audio capture device
122: Audio encoder
124: Communication buffer
126: Control circuit
130, 132, 134: Encoder
136: Switch
140: Video/audio transmission processing device
142: Communication network
144a, 144b: Terminal
146a: Workstation
146b: Personal computer
148a, 148b: Camera
150a, 150b: Microphone
152a, 152b: Video monitor
154a, 154b: Speaker
156: Ethernet (registered trademark)
158: Router
160: Matrix switch
162a, 162b, 162c, 162d: Video/audio transmission server
164: Control device
166a to 166d: Capture device
168a to 168d: Encoder
170a to 170d: Communication buffer

Claims

A transmission processing apparatus that selects one of a plurality of inputs for at least one of video and audio and transmits the selected input to a network, comprising:
Switching means for switching a plurality of analog input signals according to a given instruction;
A / D conversion means for digitizing an analog signal output from the switching means;
Encoding means for compressing and encoding the digital output of the A / D conversion means; and
Output means for outputting data encoded by the encoding means to the network.

The transmission processing apparatus according to claim 1, wherein the encoding means includes a plurality of compression encoding means, and the control means selects a compression encoding means to be used based on information exchange with a communication destination and the condition of the communication path.

The transmission processing apparatus according to claim 1, wherein the encoding unit includes a video encoding unit and an audio encoding unit.

The transmission processing apparatus according to claim 1, further comprising generating means for generating the instruction.

A transmission processing apparatus that selects one of a plurality of inputs for at least one of video and audio and transmits it to a network, comprising:
Switching means for switching a plurality of inputs via a network;
Encoding means for compressing and encoding the signal from the switching means; and
Output processing means for outputting the output encoded by the encoding means to a network.

A switch for selecting an arbitrary plurality of signals from a plurality of analog input signals;
Combining means for compressing and combining a plurality of analog signals selected by the switch on the time axis;
A / D conversion means for digitizing an analog signal output from the synthesis means;
Encoding means for compressing and encoding a digital signal output from the A / D conversion means;
A transmission processing apparatus comprising communication and control means for controlling the switch.

The transmission processing apparatus according to claim 6, wherein the encoding unit includes a plurality of compression encoding units, and the compression encoding unit to be used is selected based on information exchange with a communication destination and a situation of a communication path.

The transmission processing apparatus according to claim 7, wherein information transmitted and received is video information and audio information.

A switch for selecting an arbitrary plurality of signals from a plurality of analog input signals;
A plurality of A / D conversion means for digitizing each of a plurality of analog signals selected by the switch;
A plurality of encoding means for compressing and encoding each of the digital signals output from the plurality of A / D conversion means;
A transmission processing apparatus comprising communication and control means for controlling the switch.

A transmission processing method for selecting one of a plurality of inputs for at least one of video and audio and transmitting the selected input to a network, comprising:
Switching a plurality of analog input signals according to a given instruction;
Digitizing the switched analog signal by A / D conversion means;
Compressing and encoding the digital output; and
Outputting the encoded data to the network.

The transmission processing method according to claim 10, wherein a compression encoding method to be used is selected from a plurality of compression encoding methods based on information exchange with a communication destination and a state of a communication path.

The transmission processing method according to claim 10 or 11, further comprising video encoding means and audio encoding means.

The transmission processing method according to claim 10, further comprising generating means for generating the instruction.

A transmission processing method for selecting one of a plurality of inputs for at least one of video and audio and transmitting it to a network,
Switching between multiple inputs over the network, compression encoding,
A transmission processing method characterized by outputting an encoded output to a network.

Select any multiple signals from multiple analog input signals,
Compress and synthesize selected analog signals on the time axis,
Digitize the synthesized analog signal,
A transmission processing method characterized by compression-coding a digitized composite signal.

The transmission processing method according to claim 16, wherein a compression encoding method to be used is selected from a plurality of compression encoding methods based on information exchange with a communication destination and a state of a communication path.

The transmission processing method according to claim 16, wherein the information transmitted and received is video information and audio information.

Select any multiple signals from multiple analog input signals,
Digitize each of the selected analog signals,
A transmission processing method characterized by compression-coding each digital signal.