JP2004110615A

JP2004110615A - Sign Language Interpretation System

Info

Publication number: JP2004110615A
Application number: JP2002274423A
Authority: JP
Inventors: Shinji Nagaoka; 長　岡　伸　治; Tsutomu Nakamura; 中　村　　　勉; Takashi Nishimura; 西　村　　　隆
Original assignee: FTC KK
Current assignee: FTC KK
Priority date: 2002-09-20
Filing date: 2002-09-20
Publication date: 2004-04-08

Abstract

【課題】互いに異なる手話を使用するろう者同士、あるいはろう者と聴者同士でスムーズに会話を行うことができる手話通訳システムを提供する。
【解決手段】本発明に係る手話通訳システムは、ろう者または聴者が使用する少なくとも一台の端末装置１と、この端末装置１との間で通信媒体２を介して動画像及び音声データの送受信を行う手話通訳センター３とを備え、手話通訳センター３は、サーバー４と複数の手話通訳装置５とを有する。複数の手話言語を通訳できる複数の手話通訳者を手話通訳センター３に常駐しておき、互いに異なるろう者同士、あるいはろう者と聴者同士で会話を行う必要が生じると、手話の動画像を端末装置１から手話通訳装置５に送信して手話通訳者が手話の通訳を行い、その結果を音声または手話で端末装置１に送信するため、手話言語が理解できない者同士で、違和感なく、かつほぼリアルタイムで会話を行うことができる。
【選択図】　図１
[Problem] To provide a sign language interpreting system that enables smooth conversation between deaf people who use different sign languages, or between a deaf person and a hearing person.
[Solution] The sign language interpreting system according to the present invention comprises at least one terminal device 1 used by a deaf person or a hearing person, and a sign language interpreting center 3 that exchanges moving images and audio data with the terminal device 1 via a communication medium 2. The sign language interpreting center 3 includes a server 4 and a plurality of sign language interpreting devices 5. A plurality of sign language interpreters able to interpret a plurality of sign languages are stationed at the sign language interpreting center 3. When deaf people who use different sign languages, or a deaf person and a hearing person, need to converse, a moving image of the signing is transmitted from the terminal device 1 to a sign language interpreting device 5, the sign language interpreter interprets it, and the result is transmitted to the terminal device 1 as voice or sign language. People who do not understand each other's sign language can therefore converse naturally and in almost real time.
[Selection diagram] Fig. 1

Description

【０００１】
【発明の属する技術分野】
本発明は、互いに異なる種類の手話を扱うろう者同士の会話、あるいは、ろう者と聴者同士の会話を手助けする手話通訳システムに関する。
【０００２】
【従来の技術】
日本で広く使用されている手話には、大別すると二種類がある。一つは、日本手話と呼ばれるものであり、昔からろう者によって代々受け継がれてきた伝統的なものである。この日本手話は、日本語とは全く異なる言語である。もう一つは、日本語対応手話と呼ばれるものであり、日本語そのものを手話によって表現しようとするものである。日本語対応手話は日本手話よりも歴史が浅く、近代になって作られたものである。
【０００３】
日本手話は先天的ろう者が使用することが多いのに対し、日本語対応手話は後天的ろう者が使用することが多い。
【０００４】
このように、国内に複数種類の手話が並存するのは外国においても同じであり、例えば、米国には、アメリカ手話と英語対応手話が存在する。
【０００５】
【発明が解決しようとする課題】
日本手話は、日本語と文法が異なるため、日本手話をマスターするのは、外国語をマスターするのと同じくらい困難である。この点、日本語対応手話は、単語さえマスターできれば、すぐにでも話しができる。
【０００６】
日本手話を話すろう者は、日本語を理解できる者であれば、日本語対応言語を理解しやすいが、後天的ろう者の多くは日本語対応手話しか理解できない。このため、日本手話を話す者と日本語対応手話を話す者との会話でも、通訳を必要とする場合がある。
【０００７】
また、現在の日本社会で手話として教えているものは日本語対応手話であることが多く、二種類の手話が存在することすら、十分に認知されていない。日本手話をマスターするのは難しいが、両親がろう者である聴者の子供は、日本手話と日本語の双方を完全にマスターできる環境にあり、日本手話と日本語対応手話の双方を理解できる手話通訳者になる可能性が高い。
【０００８】
本発明は、このような点に鑑みてなされたものであり、その目的は、互いに異なる手話を使用するろう者同士、あるいはろう者と聴者同士でスムーズに会話を行うことができる手話通訳システムを提供することにある。
【０００９】
【課題を解決するための手段】
上述した課題を解決するために、本発明は、複数の手話通訳者がそれぞれ使用する複数の手話通訳装置を有するセンターと、前記センターとの間で音声及び動画像データの通信を行う少なくとも一台の端末装置と、を備え、前記端末装置は、ろう者の顔の表情と手の動きを動画像として取り込む第１画像取込み部と、話者の音声を取り込む第１音声取込み部と、取り込んだ動画像および音声を前記センターに伝送する第１通信部と、前記手話通訳装置から伝送されてきた動画像を第１表示装置に映し出す第１表示処理部と、を有し、前記複数の手話通訳装置のそれぞれは、前記端末装置から伝送されてきた動画像を第２表示装置に映し出す第２表示処理部と、前記第２表示装置に映し出された動画像に含まれる手話情報を前記手話通訳者が通訳して得られた音声を取り込む第２音声処理部と、手話通訳者の顔の表情と手の動きを動画像として取り込む第２画像取込み部と、取り込んだ音声及び動画像を前記端末装置に伝送する第２通信部と、を有する。
【００１０】
本発明では、端末装置と手話通訳装置との間で、音声及び動画像データを送受信し、互いに異なる手話言語を扱うろう者同士、あるいはろう者と聴者同士で、ほぼリアルタイムに会話を行えるようにする。
【００１１】
【発明の実施の形態】
以下、本発明に係る手話通訳システムについて、図面を参照しながら具体的に説明する。
【００１２】
図１は本発明に係る手話通訳システムの一実施形態の全体構成を示すブロック図である。図１の手話通訳システムは、ろう者または聴者が使用する少なくとも一台の端末装置１と、この端末装置１との間で通信媒体２を介して動画像及び音声データの送受信を行う手話通訳センター３とを備えている。
【００１３】
通信媒体２の種類は特に問わないが、３５万画素以上の解像度の動画（ストリーミング画像）を２０コマ／秒以上で安定に伝送可能な通信媒体２が望ましい。この種の通信媒体２の具体例としては、ＦＴＴＨやＡＤＳＬなどが考えられる。
【００１４】
図１の手話通訳システムは、互いに異なる手話言語を扱うろう者同士、あるいはろう者と聴者同士の会話を行うことを念頭に置いている。
【００１５】
図１の手話通訳センター３は、サーバー４と複数の手話通訳装置５とを有する。手話通訳装置５は、手話通訳者が使用するものであり、複数種類の手話言語に迅速に対応できるように複数設けられ、複数の手話通訳者が同時に複数のろう者に応対できるようにしている。
【００１６】
端末装置１には、ろう者や聴者が手軽に持ち運べるようにした携帯型のものと、聴者がろう者と会話する必要のある場所（役所や病院など）に設置される据置型のものとがある。
【００１７】
例えば、ろう者同士で会話を行う場合は、各ろう者が所持している携帯型の端末装置１をそれぞれ利用して、手話通訳センター３を介して会話を行う。また、役所などで、聴者がろう者と会話する必要が生じた場合には、据置型の端末装置１のある場所にろう者を案内して、この端末装置１を利用して聴者は手話通訳センター３を介してろう者と会話する。
【００１８】
図２は携帯型の端末装置１の一例を示す斜視図、図３は据置型の端末装置１の一例を示す斜視図である。これら端末装置１は、手話を行うろう者の顔の表情と手の動きを動画像として取り込み可能な撮像部６（例えば、ＣＣＤカメラ）と、手話通訳センター３から送られてきた動画像及び音声を再生する表示部７及びスピーカ８と、手話通訳センター３との間で文字情報や制御情報の送受信を行うためのキーボード９とを有する。
【００１９】
ろう者は耳が不自由なため、着信音が鳴っても着信に気がつかない。そこで、本実施形態では、ろう者に対して図３及び図４に示すような腕時計型のバイブレータ１０を持たせている。手話通訳センター３から端末装置１に着信信号が届くと、端末装置１内の着信信号送信部は無線にてバイブレータ１０を駆動する。また、バイブレータ１０だけで着信を知らせるだけでなく、端末装置１のランプを点灯させてもよい。
【００２０】
図４は端末装置１の内部構成の一例を示す詳細ブロック図である。図示のように、端末装置１は、手話通訳センター３とのデータ通信を制御する通信処理部２１と、手話通訳センター３からの受信データを復調する復調部２２と、手話通訳センター３への送信データを変調する変調部２３と、手話通訳センター３からの着信を検出する着信検出部２４と、バイブレータ１０に着信を報知する着信信号送信部２５と、手話通訳センター３からの受信データに含まれる音声データの処理を行う音声処理部２６と、音声の再生を行うスピーカ８と、手話通訳センター３からの受信データに含まれる画像データの処理を行う画像処理部２８と、画像を表示する表示部７と、手話通訳センター３からの警告信号を検出する警告検出部２９と、動画像のコマ落ちを検出するコマ落ち検出部３０と、ろう者の顔の表情及び手の動きを撮像する撮像部６と、撮像画像の処理を行う画像処理部３１と、ろう者または聴者の音声を取り込むマイク３２と、取り込んだ音声の処理を行う音声処理部３３と、手話言語の選択を行う手話言語選択部３４と、手話通訳装置５に対してコマ落ちを警告する警告出力部３５と、を有する。
【００２１】
スピーカ８とマイク３２は、電話機に設けられている受話器をそのまま利用してもよいし、あるいは受話器とは別に設けられてもよい。
【００２２】
図５は手話通訳センター３内の手話通訳装置５の内部構成の一例を示す詳細ブロック図である。図示のように、手話通訳装置５は、端末装置１とのデータ通信を制御する通信処理部４１と、端末装置１からの受信データを復調する復調部４２と、端末装置１への送信データを変調する変調部４３と、端末装置１からの受信データに含まれる音声データの処理を行う音声処理部４４と、音声の再生を行うスピーカ４５と、端末装置１からの受信データに含まれる画像データの処理を行う画像処理部４６と、画像を表示する表示部４７と、端末装置１からの警告信号を検出する警告検出部４８と、動画像のコマ落ちを検出するコマ落ち検出部４９と、端末装置１が選択した手話言語を検出する手話言語検出部５０と、手話通訳者の顔の表情及び手の動きを撮像する撮像部５１と、撮像画像の処理を行う画像処理部５２と、手話通訳者の音声を取り込むマイク５３と、取り込んだ音声の処理を行う音声処理部５４と、手話通訳装置５に対してコマ落ちを警告する警告出力部５５と、を有する。
【００２３】
端末装置１を使用するろう者または聴者は、手話を開始する前に、手話言語選択部３４により手話言語を選択する。この手話言語選択部３４は、端末装置１に設けられたボタンやキーボード９操作等によりろう者や聴者が入力した手話言語を選択する。選択された手話言語は、手話通訳センター３に送られ、手話通訳装置５内の手話言語検出部５０にて検出される。
【００２４】
手話通訳センター３には、複数の手話言語それぞれを通訳する複数の手話翻訳者が常駐しており、端末装置１で選択された手話言語を通訳可能な手話翻訳者を選定する。なお、端末装置１が手話言語を選択したときに、その手話言語を通訳できる手話通訳者がすぐに応対できない場合には、その旨を端末装置１に返信するのが望ましい。
【００２５】
手話通訳センター３に常駐する手話翻訳者は、日本手話と日本語対応手話を通訳できる者だけでなく、外国人のろう者にも対応できるように、主要な外国の手話言語を通訳できる手話翻訳者を常駐させるのが望ましい。
【００２６】
また、手話通訳センター３に手話通訳者を常駐させる代わりに、各手話通訳者の自宅等をネットワークで相互に接続して、手話通訳センター３からネットワークを介して連絡を受けた手話通訳者が手話通訳を行ってもよい。この場合、手話通訳センター３から手話通訳者にろう者の動画像を送って手話通訳者が通訳を行ってもよいし、手話通訳センター３からの指示により手話通訳者がろう者と直接通信を行って動画像を受信してもよい。
【００２７】
これにより、種々の手話言語の通訳ができる多数の手話通訳者をネットワーク化することができ、数多くのろう者に対して同時に通訳サービスを行うことができ、利用価値が向上する。
【００２８】
端末装置１と手話通訳センター３とは、高速の通信媒体２でデータ通信を行うが、時間帯等により通信トラフィックが発生し、条件が悪いと高速データ通信ができない場合もある。この場合、手話の様子を表す動画像がコマ落ちしてしまい、手話通訳者やろう者が内容を理解できないおそれがある。
【００２９】
このような問題に対処するために、本実施形態では、端末装置１と手話通訳装置５の双方に、コマ落ち検出部３０，４９と、警告出力部３５，５５と、警告検出部２９，４８とを設けている。コマ落ち検出部３０，４９で動画像のコマ落ちを検出すると、警告出力部３５，５５から警告信号を送信する。また、警告検出部２９，４８で警告信号が検出されると、手話を行うろう者や手話通訳者に対してコマ落ちが起こった旨の警告を行い、手話の速度を落とすように促す。警告の具体的手法としては、例えば、端末装置１に設けられている専用のランプを点灯または点滅させる。あるいは、端末装置１の表示装置７にコマ落ちが起きた旨を表示してもよい。
【００３０】
手話の速度が遅くなれば、秒当たりのコマ数を遅くしてもコマ落ちは発生しなくなるため、秒当たりのコマ数を減らすことができ、これにより、動画像のデータ量を削減できる。
【００３１】
コマ落ちが起きた場合の他の対処方法として、手話に関係のない背景画像を削除して動画像のデータ量を削減してもよい。手話に必要な画像は、手話者の顔の表情と手の動きであり、それ以外の背景画像は手話の理解にはあまり重要ではない。そこで、背景画像をカットして（より具体的には、背景画像を無地単色に設定して）、動画像データのデータ量を削減する。
【００３２】
コマ落ちが起きた場合のもう一つの対処方法として、動画像の画素数を減らすか、画面サイズを縮小してデータ量を削減してもよい。ただし、画素数を減らすすと、画像が粗くなり、また、画面サイズを縮小すると、画面が見づらくなり、いずれにしても顔の表情や手の動きを把握しづらくなる。したがって、極端に画素数を減らしたり、画面サイズを縮小するのは望ましくない。
【００３３】
上記のようなコマ落ち対策を施すと、単位時間当たりの情報伝送量が当然に少なくなる。情報伝送量が減っても常時接続の環境にあれば特に金銭的な問題は起きないが、ＩＳＤＮ回線等の通信時間に応じて課金される環境では、コマ落ちが起きると通信時間がより長くなるため、ユーザの金銭的な負担が重くなる。このため、このような環境では、通信時間ではなく、実際に送信した情報伝送量に応じて課金するのが望ましい。例えば、携帯電話などで採用されているパケット量に応じて課金する課金システムが望ましい。
【００３４】
ところで、動画像や音声データを伝送する手法として、高速の伝送速度が得られる可能性があるが伝送速度が保証されないベストエフォート型の伝送手法と、最高伝送速度が制限される代わりに最低伝送速度を保証する伝送手法とがあるが、本実施形態は、ブロードバンド回線を利用し、かつコマ落ち対策も施しているため、できればベストエフォート型の伝送手法を採用するのが望ましい。
【００３５】
互いに異なる手話言語を扱うろう者同士が手話通訳センター３を介して会話を行う場合、ろう者の電話番号やＩＰアドレス等（以下、総称して識別情報と呼ぶ）を会話のたびに入力するのはろう者の負担が大きい。そこで、手話通訳センター３にろう者の識別情報を一括して登録しておき、各ろう者は手話通訳センター３から他のろう者の識別情報を取得できるようにするのが望ましい。
【００３６】
図６は手話通訳センター３内のサーバー４が複数のろう者の識別情報を管理する例を示すサーバー４のブロック図である。図６のサーバー４は、ろう者の識別情報を格納する識別情報格納部６１と、ろう者の識別情報を更新する識別情報更新部６２と、要求のあったろう者の識別情報を提供する識別情報提供部６３とを有する。
【００３７】
ろう者Ａが他のろう者Ｂと手話通訳センター３を介して会話を行う場合は、ろう者Ｂの氏名等を手話通訳センター３に送信すれば、手話通訳センター３内のサーバー４がろう者Ｂの識別情報をろう者Ａに提供するか、あるいはろう者Ａとろう者Ｂとを自動的にネットワーク接続する。
【００３８】
ろう者は、言葉を発しないため、他人の氏名を覚えるのが苦手であるという一般的な傾向がある。このため、サーバ４の識別情報格納部６１に登録されている情報をろう者に提供する場合には、図７に示すように、登録されているろう者の上半身を写した静止画像と、そのろう者の特徴的な文字情報（例えば、住んでいる地域名、ニックネーム、趣味など）とを組にして、検索を行ったろう者に提供するのが望ましい。
【００３９】
これにより、ろう者は、自分が会話を行いたい相手を静止画像と文字情報で視覚的に把握でき、通話相手を記憶に留め易くなる。
【００４０】
また、ろう者が識別情報格納部６１に登録されている情報を検索する際も、氏名だけでなく、地域名やニックネームなどで検索できるようにするのが望ましい。
【００４１】
例えば、ろう者Ａがろう者Ｂと会話を行う目的で、ろう者Ｂに電話をかけた場合、ろう者Ｂの端末装置１の画面には、着信時に図７に示すような文字情報付きのろう者Ａの静止画像を表示する。ここで、ろう者Ｂが端末装置１の特定のボタンを押すと、ろう者Ｂ自身の静止画像をろう者Ａに伝送する。これにより、ろう者Ａはろう者Ｂが応答したことを視覚的に確認できる。
【００４２】
上述した実施形態では、異なる手話言語を扱うろう者同士、あるいはろう者と聴者が手話通訳センター３を介して会話を行う例を説明したが、本システムは、同種の手話言語を扱うろう者同士の会話にも利用可能である。この場合、手話通訳センター３を利用しないことになるが、従来のテレビ電話システムに比べて、本システムは、ブロードバンド回線を利用しつつ、コマ落ち対策を施しているため、実際に会って会話しているのと変わらない使い勝手で、違和感なく会話を行うことができる。
【００４３】
このように、本実施形態では、複数の手話言語を通訳できる複数の手話通訳者を手話通訳センター３に常駐しておき、互いに異なるろう者同士、あるいはろう者と聴者同士で会話を行う必要が生じると、手話の動画像を端末装置１から手話通訳装置５に送信して手話通訳者が手話の通訳を行い、その結果を音声または手話で端末装置１に送信するため、手話言語が理解できない者同士で、違和感なく、かつほぼリアルタイムで会話を行うことができる。
【００４４】
特に、本実施形態によれば、ろう者間、あるいはろう者と聴者間のコミュニケーションをより緊密に図ることができ、ろう者に対する差別や、手話言語の違いによるコミュニケーションの欠如も解消される可能性が大きい。
【００４５】
また、端末装置１と手話通訳装置５との間でのデータ通信速度が何らかの事情で遅くなってコマ落ちが発生すると、その旨を動画像の送信側に報知するようにしたため、手話の速度を遅くして秒当たりのコマ数を減らすことにより、動画像のデータ量を削減することができる。
【００４６】
上述した実施形態において、端末装置１と手話通訳装置５との間の通信は、インターネットを介して行ってもよいし、電話会社やプロバイダの専用回線を介して行ってもよい。
【００４７】
また、端末装置１自体が必ずしも手話通訳装置５との通信機能を持っていなくてもよく、例えば端末装置１に携帯電話を接続して、この携帯電話を介して手話通訳装置５とデータ通信を行ってもよい。あるいは、端末装置１が無線ＬＡＮ機能を持っていれば、ルーターや無線アクセスポイントを介して手話通訳装置５とデータ通信を行ってもよい。
【００４８】
【発明の効果】
以上詳細に説明したように、本発明によれば、端末装置と手話通訳装置との間で、音声及び動画像データを送受信できるようにしたため、互いに異なる手話言語を扱うろう者同士、あるいはろう者と聴者同士で、ほぼリアルタイムに会話を行うことができる。特に、種々の手話言語を通訳できる複数の手話通訳者をセンターに常駐させることにより、色々な手話言語を扱うろう者との間で通訳を行うことができる。
【図面の簡単な説明】
【図１】本発明に係る手話通訳システムの一実施形態の全体構成を示すブロック図。
【図２】携帯型の端末装置の一例を示す斜視図。
【図３】据置型の端末装置の一例を示す斜視図。
【図４】端末装置の内部構成の一例を示す詳細ブロック図。
【図５】手話通訳センター内の手話通訳装置の内部構成の一例を示す詳細ブロック図。
【図６】手話通訳センター内のサーバーが複数のろう者の識別情報を管理する例を示すサーバーのブロック図。
【図７】登録されているろう者の上半身を写した静止画像と、そのろう者の特徴的な文字情報とを組にして、検索を行ったろう者に提供する例を示す図。
【符号の説明】
１　端末装置
２　通信媒体
３　手話通訳センター
４　サーバー
５　手話通訳装置
６，５１　撮像部
７，４７　表示部
８，４５　スピーカ
９　キーボード
１０　バイブレータ
２１，４１　通信処理部
２２，４２　復調部
２３，４３　変調部
２４　着信検出部
２５　着信信号送信部
２６，３３，４４，５４　音声処理部
２８，３１，４６，５２　画像処理部
２９，４８　警告検出部
３０，４９　コマ落ち検出部
３２，５３　マイク
３４　手話言語選択部
[0001]
[Technical field of the invention]
The present invention relates to a sign language interpreting system that assists conversation between deaf people who use different kinds of sign language, or between deaf people and hearing people.
[0002]
[Prior art]
Sign languages widely used in Japan fall broadly into two types. One is Japanese Sign Language, a traditional language handed down among deaf people for generations. Japanese Sign Language is a completely different language from Japanese. The other is Japanese-language sign language (Signed Japanese), which expresses Japanese itself in signs. Japanese-language sign language has a shorter history than Japanese Sign Language and was created in modern times.
[0003]
Japanese Sign Language is often used by people who have been deaf from birth, whereas Japanese-language sign language is often used by people who became deaf later in life.
[0004]
The coexistence of several types of sign language within one country is not unique to Japan. In the United States, for example, there are American Sign Language and English-language sign language.
[0005]
[Problems to be solved by the invention]
Because Japanese Sign Language has a grammar different from that of Japanese, mastering it is as difficult as mastering a foreign language. Japanese-language sign language, by contrast, can be used for conversation as soon as its vocabulary has been learned.
[0006]
Deaf people who use Japanese Sign Language can easily understand Japanese-language sign language if they understand Japanese, but many people who became deaf later in life understand only Japanese-language sign language. For this reason, an interpreter may be needed even in a conversation between a Japanese Sign Language user and a Japanese-language sign language user.
[0007]
In addition, what is taught as sign language in Japanese society today is often Japanese-language sign language, and even the existence of two types of sign language is not widely recognized. Japanese Sign Language is difficult to master, but hearing children of deaf parents grow up in an environment where they can fully master both Japanese Sign Language and Japanese, and they are likely to become sign language interpreters who understand both Japanese Sign Language and Japanese-language sign language.
[0008]
The present invention has been made in view of the above, and its object is to provide a sign language interpreting system that enables smooth conversation between deaf people who use different sign languages, or between deaf people and hearing people.
[0009]
[Means for Solving the Problems]
To solve the above problem, the present invention comprises a center having a plurality of sign language interpreting devices used respectively by a plurality of sign language interpreters, and at least one terminal device that communicates voice and moving image data with the center. The terminal device includes a first image capturing unit that captures the facial expression and hand movements of a deaf person as a moving image, a first voice capturing unit that captures a speaker's voice, a first communication unit that transmits the captured moving image and voice to the center, and a first display processing unit that displays a moving image transmitted from the sign language interpreting device on a first display device. Each of the plurality of sign language interpreting devices includes a second display processing unit that displays the moving image transmitted from the terminal device on a second display device, a second voice processing unit that captures the voice obtained when the sign language interpreter interprets the sign language information contained in the moving image displayed on the second display device, a second image capturing unit that captures the facial expression and hand movements of the sign language interpreter as a moving image, and a second communication unit that transmits the captured voice and moving image to the terminal device.
[0010]
In the present invention, voice and moving image data are transmitted and received between the terminal device and the sign language interpreting device, so that deaf people who use different sign languages, or deaf people and hearing people, can converse in almost real time.
[0011]
[Embodiments of the invention]
Hereinafter, the sign language interpreting system according to the present invention will be specifically described with reference to the drawings.
[0012]
FIG. 1 is a block diagram showing the overall configuration of an embodiment of the sign language interpreting system according to the present invention. The sign language interpreting system of FIG. 1 comprises at least one terminal device 1 used by a deaf person or a hearing person, and a sign language interpreting center 3 that exchanges moving images and audio data with the terminal device 1 via a communication medium 2.
[0013]
Although the type of the communication medium 2 is not particularly limited, the communication medium 2 capable of stably transmitting a moving image (streaming image) having a resolution of 350,000 pixels or more at 20 frames / second or more is desirable. Specific examples of this type of communication medium 2 include FTTH and ADSL.
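As a rough illustration of what the figures in this paragraph imply for the communication medium 2, the following sketch estimates the bitrate of a 350,000-pixel moving image at 20 frames per second. The 24 bits per pixel and the 100:1 compression ratio are assumptions chosen for illustration only; the specification does not state a colour depth or a codec.

```python
def raw_bitrate_bps(pixels: int, fps: float, bits_per_pixel: int = 24) -> float:
    """Uncompressed video bitrate in bits per second."""
    return pixels * bits_per_pixel * fps


def compressed_bitrate_bps(pixels: int, fps: float, compression_ratio: float,
                           bits_per_pixel: int = 24) -> float:
    """Bitrate after an assumed compression ratio (100 means 100:1)."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_bitrate_bps(pixels, fps, bits_per_pixel) / compression_ratio


def link_is_sufficient(link_bps: float, required_bps: float) -> bool:
    """True if the link can carry the required bitrate."""
    return link_bps >= required_bps


raw = raw_bitrate_bps(350_000, 20)
needed = compressed_bitrate_bps(350_000, 20, compression_ratio=100)
print(f"raw: {raw / 1e6:.0f} Mbit/s, compressed: {needed / 1e6:.2f} Mbit/s")
```

Under these assumptions the raw stream is far beyond a consumer line, while the compressed stream is on the order of a few Mbit/s, which is consistent with the paragraph naming FTTH and ADSL as candidate media.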
[0014]
The sign language interpreting system of FIG. 1 is designed with conversations between deaf people who use different sign languages, or between deaf people and hearing people, in mind.
[0015]
The sign language interpreting center 3 of FIG. 1 includes a server 4 and a plurality of sign language interpreting devices 5. The sign language interpreting devices 5 are used by the sign language interpreters. A plurality of them are provided so that multiple types of sign language can be handled promptly and so that several sign language interpreters can serve several deaf people at the same time.
[0016]
The terminal device 1 comes in two forms: a portable type that a deaf or hearing person can easily carry, and a stationary type installed in places where hearing people need to talk with deaf people (such as government offices and hospitals).
[0017]
For example, when deaf people converse with each other, each uses his or her own portable terminal device 1, and the conversation takes place via the sign language interpreting center 3. When a hearing person needs to talk with a deaf person at a government office or the like, the deaf person is guided to the place where the stationary terminal device 1 is installed, and the hearing person uses that terminal device 1 to talk with the deaf person via the sign language interpreting center 3.
[0018]
FIG. 2 is a perspective view showing an example of the portable terminal device 1, and FIG. 3 is a perspective view showing an example of the stationary terminal device 1. These terminal devices 1 each have an imaging unit 6 (for example, a CCD camera) capable of capturing the facial expression and hand movements of a signing deaf person as a moving image, a display unit 7 and a speaker 8 that reproduce the moving image and audio sent from the sign language interpreting center 3, and a keyboard 9 for exchanging character information and control information with the sign language interpreting center 3.
[0019]
Because deaf people cannot hear, they do not notice an incoming call even when the ringtone sounds. In the present embodiment, therefore, the deaf person is given a wristwatch-type vibrator 10 as shown in FIGS. 3 and 4. When an incoming call signal from the sign language interpreting center 3 reaches the terminal device 1, the incoming call signal transmitting unit in the terminal device 1 drives the vibrator 10 wirelessly. In addition to signalling the incoming call with the vibrator 10, a lamp on the terminal device 1 may also be lit.
[0020]
FIG. 4 is a detailed block diagram showing an example of the internal configuration of the terminal device 1. As illustrated, the terminal device 1 includes: a communication processing unit 21 that controls data communication with the sign language interpreting center 3; a demodulation unit 22 that demodulates data received from the sign language interpreting center 3; a modulation unit 23 that modulates data to be transmitted to the sign language interpreting center 3; an incoming call detection unit 24 that detects an incoming call from the sign language interpreting center 3; an incoming call signal transmitting unit 25 that notifies the vibrator 10 of the incoming call; an audio processing unit 26 that processes audio data contained in the data received from the sign language interpreting center 3; a speaker 8 that reproduces audio; an image processing unit 28 that processes image data contained in the data received from the sign language interpreting center 3; a display unit 7 that displays images; a warning detection unit 29 that detects a warning signal from the sign language interpreting center 3; a dropped frame detection unit 30 that detects dropped frames in the moving image; an imaging unit 6 that captures the facial expression and hand movements of the deaf person; an image processing unit 31 that processes the captured image; a microphone 32 that captures the voice of the deaf or hearing person; an audio processing unit 33 that processes the captured voice; a sign language selection unit 34 that selects the sign language; and a warning output unit 35 that warns the sign language interpreting device 5 of dropped frames.
[0021]
The speaker 8 and the microphone 32 may use the receiver provided on the telephone as it is, or may be provided separately from the receiver.
[0022]
FIG. 5 is a detailed block diagram showing an example of the internal configuration of the sign language interpreting device 5 in the sign language interpreting center 3. As illustrated, the sign language interpreting device 5 includes: a communication processing unit 41 that controls data communication with the terminal device 1; a demodulation unit 42 that demodulates data received from the terminal device 1; a modulation unit 43 that modulates data to be transmitted to the terminal device 1; an audio processing unit 44 that processes audio data contained in the data received from the terminal device 1; a speaker 45 that reproduces audio; an image processing unit 46 that processes image data contained in the data received from the terminal device 1; a display unit 47 that displays images; a warning detection unit 48 that detects a warning signal from the terminal device 1; a dropped frame detection unit 49 that detects dropped frames in the moving image; a sign language detection unit 50 that detects the sign language selected at the terminal device 1; an imaging unit 51 that captures the facial expression and hand movements of the sign language interpreter; an image processing unit 52 that processes the captured image; a microphone 53 that captures the voice of the sign language interpreter; an audio processing unit 54 that processes the captured voice; and a warning output unit 55 that warns the sign language interpreting device 5 of dropped frames.
[0023]
A deaf person or a hearing person using the terminal device 1 selects a sign language with the sign language selection unit 34 before starting to sign. The sign language selection unit 34 selects the sign language entered by the deaf or hearing person through buttons on the terminal device 1, operation of the keyboard 9, or the like. The selected sign language is sent to the sign language interpreting center 3 and detected by the sign language detection unit 50 in the sign language interpreting device 5.
[0024]
A plurality of sign language translators, each able to interpret one or more of a plurality of sign languages, are stationed at the sign language interpreting center 3, and a translator able to interpret the sign language selected at the terminal device 1 is chosen. If no sign language interpreter able to interpret the selected sign language can respond immediately, it is desirable to reply to the terminal device 1 to that effect.
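The selection step described here can be sketched as a lookup over a pool of interpreters. This is a minimal sketch only: the `Interpreter` record, the `busy` flag, the language codes, and the reply dictionary are hypothetical names introduced for illustration and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interpreter:
    name: str
    languages: frozenset      # sign languages this person can interpret
    busy: bool = False        # True while serving another terminal


def select_interpreter(pool: List[Interpreter], language: str) -> Optional[Interpreter]:
    """Return the first free interpreter who handles the language, else None."""
    for interpreter in pool:
        if language in interpreter.languages and not interpreter.busy:
            return interpreter
    return None


def handle_request(pool: List[Interpreter], language: str) -> dict:
    """Assign an interpreter, or build the 'not available' reply for the terminal."""
    chosen = select_interpreter(pool, language)
    if chosen is None:
        return {"status": "unavailable", "language": language}
    chosen.busy = True
    return {"status": "connected", "interpreter": chosen.name}


pool = [
    Interpreter("A", frozenset({"JSL"})),
    Interpreter("B", frozenset({"JSL", "ASL"})),
]
print(handle_request(pool, "ASL"))
```

Returning an explicit "unavailable" reply, rather than queueing silently, mirrors the paragraph's recommendation that the terminal device 1 be told when no suitable interpreter can respond immediately.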
[0025]
It is desirable that the sign language translators stationed at the sign language interpreting center 3 include not only those who can interpret Japanese Sign Language and Japanese-language sign language, but also translators who can interpret the major foreign sign languages, so that deaf people from other countries can also be served.
[0026]
Instead of stationing sign language interpreters at the sign language interpreting center 3, the homes or other locations of the sign language interpreters may be interconnected by a network, and a sign language interpreter contacted by the sign language interpreting center 3 via the network may perform the interpretation. In this case, the sign language interpreting center 3 may forward the moving image of the deaf person to the sign language interpreter, or the sign language interpreter may, on instruction from the sign language interpreting center 3, communicate directly with the deaf person and receive the moving image.
[0027]
As a result, a large number of sign language interpreters able to interpret various sign languages can be networked, and interpreting services can be provided to many deaf people simultaneously, which increases the value of the system.
[0028]
The terminal device 1 and the sign language interpreting center 3 exchange data over the high-speed communication medium 2. However, traffic congestion may occur depending on the time of day, and under poor conditions high-speed data communication may not be possible. In that case the moving image of the signing drops frames, and the sign language interpreter or the deaf person may be unable to understand the content.
[0029]
To deal with this problem, in the present embodiment both the terminal device 1 and the sign language interpreting device 5 are provided with dropped frame detection units 30, 49, warning output units 35, 55, and warning detection units 29, 48. When the dropped frame detection units 30, 49 detect dropped frames in the moving image, the warning output units 35, 55 transmit a warning signal. When the warning detection units 29, 48 detect a warning signal, the deaf person or sign language interpreter who is signing is warned that frames have been dropped and is prompted to sign more slowly. As a specific warning method, for example, a dedicated lamp provided on the terminal device 1 is lit or made to blink. Alternatively, a message that frames have been dropped may be shown on the display device 7 of the terminal device 1.
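The detect-and-warn loop in this paragraph might look like the following on the receiving side. The specification does not say how dropped frames are detected; this sketch assumes each frame carries an increasing sequence number and treats a gap as a drop. The class name, the `FRAME_DROP` message tag, and the helper function are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


class FrameDropDetector:
    """Counts frames missing between consecutive received sequence numbers."""

    def __init__(self) -> None:
        self._last_seq: Optional[int] = None

    def observe(self, seq: int) -> int:
        """Return how many frames were skipped before this one (0 if none)."""
        if self._last_seq is None or seq <= self._last_seq:
            missing = 0  # first frame, duplicate, or late arrival
        else:
            missing = seq - self._last_seq - 1
        if self._last_seq is None or seq > self._last_seq:
            self._last_seq = seq
        return missing


def warnings_for(received: Iterable[int]) -> List[Tuple[str, int]]:
    """Warning messages the receiver would send back to the signing side."""
    detector = FrameDropDetector()
    warnings = []
    for seq in received:
        missing = detector.observe(seq)
        if missing > 0:
            warnings.append(("FRAME_DROP", missing))
    return warnings


print(warnings_for([0, 1, 2, 5, 6, 6, 9]))
```

On the sending side, receipt of such a message would correspond to the warning detection units 29, 48 lighting the lamp or showing the on-screen notice.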
[0030]
When the signing is slower, frames are no longer dropped even at a lower number of frames per second, so the frame rate can be reduced, which in turn reduces the data amount of the moving image.
[0031]
As another method for coping with dropped frames, a background image irrelevant to sign language may be deleted to reduce the data amount of the moving image. The images required for sign language are the facial expression of the signer and the movement of the hands, and the other background images are not so important for understanding sign language. Therefore, the background image is cut (more specifically, the background image is set to a solid color) to reduce the amount of moving image data.
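The effect of setting the background to a solid colour can be illustrated with a toy grayscale frame. In this sketch the mask marking the signer is simply given; how the signer is separated from the background is not described in the specification and is outside the sketch. `zlib` stands in for a video codec purely to show that a uniform background compresses better, and all function names are hypothetical.

```python
import random
import zlib
from typing import List

Frame = List[List[int]]   # rows of 0-255 grayscale pixel values
Mask = List[List[bool]]   # True where the signer's face and hands are


def blank_background(frame: Frame, mask: Mask, fill: int = 0) -> Frame:
    """Keep pixels under the mask and set all others to one solid value."""
    return [[px if keep else fill for px, keep in zip(row, mask_row)]
            for row, mask_row in zip(frame, mask)]


def compressed_size(frame: Frame) -> int:
    """Size in bytes of the zlib-compressed frame (stand-in for a codec)."""
    return len(zlib.compress(bytes(px for row in frame for px in row)))


def noisy_frame(size: int, seed: int = 0) -> Frame:
    """Square test frame filled with reproducible random pixel values."""
    rng = random.Random(seed)
    return [[rng.randrange(256) for _ in range(size)] for _ in range(size)]


def centered_mask(size: int, keep: int) -> Mask:
    """Mask that keeps a keep-by-keep square in the middle of the frame."""
    lo = (size - keep) // 2
    hi = lo + keep
    return [[lo <= y < hi and lo <= x < hi for x in range(size)]
            for y in range(size)]


frame = noisy_frame(64)
blanked = blank_background(frame, centered_mask(64, 16))
print(compressed_size(frame), compressed_size(blanked))
```

The random frame is a worst case for compression, so the gap shown here is larger than a real scene would give; the direction of the effect is the point.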
[0032]
As another method for coping with dropped frames, the number of pixels of a moving image may be reduced, or the screen size may be reduced to reduce the amount of data. However, if the number of pixels is reduced, the image becomes coarse, and if the screen size is reduced, the screen becomes difficult to see, and in any case, it becomes difficult to grasp facial expressions and hand movements. Therefore, it is not desirable to extremely reduce the number of pixels or reduce the screen size.
[0033]
If the above-described countermeasures against dropped frames are taken, the amount of information transmitted per unit time naturally decreases. Even if the amount of information transmission is reduced, there is no particular financial problem in an environment of constant connection, but in an environment where charges are made according to the communication time of an ISDN line or the like, the communication time becomes longer if dropped frames occur. Therefore, the user's financial burden increases. For this reason, in such an environment, it is desirable to charge according to the actually transmitted information transmission amount instead of the communication time. For example, a charging system that charges according to the amount of packets employed in mobile phones and the like is desirable.
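The billing argument can be made concrete with a small comparison. The rates, the one-minute billing unit, and the 128-byte packet size below are illustrative assumptions, not values from the specification.

```python
import math


def time_based_charge(duration_s: float, rate_per_minute: int) -> int:
    """Charge by connection time, rounded up to whole minutes."""
    return math.ceil(duration_s / 60) * rate_per_minute


def volume_based_charge(bytes_sent: int, rate_per_packet: int,
                        packet_bytes: int = 128) -> int:
    """Charge by data actually sent, rounded up to whole packets."""
    return math.ceil(bytes_sent / packet_bytes) * rate_per_packet


# The same 1,200,000 bytes of signing, sent in 60 s at the full frame rate
# or in 120 s after the signer slows down and the frame rate is halved.
payload = 1_200_000
for seconds in (60, 120):
    print(seconds,
          time_based_charge(seconds, rate_per_minute=10),
          volume_based_charge(payload, rate_per_packet=1))
```

With time-based charging the slower conversation costs twice as much, while the volume-based charge is unchanged, which is the reason the paragraph prefers charging by the amount of information transmitted.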
[0034]
There are two approaches to transmitting moving image and audio data: a best-effort approach, which may achieve a high transmission rate but does not guarantee it, and an approach that guarantees a minimum transmission rate in exchange for a cap on the maximum rate. Because the present embodiment uses a broadband line and also takes measures against dropped frames, it is desirable to adopt the best-effort approach where possible.
[0035]
When deaf people who use different sign languages converse via the sign language interpreting center 3, having to enter a deaf person's telephone number, IP address, or the like (hereinafter collectively called identification information) for every conversation is a heavy burden on the deaf person. It is therefore desirable to register the identification information of deaf people collectively at the sign language interpreting center 3 so that each deaf person can obtain the identification information of other deaf people from the center.
[0036]
FIG. 6 is a block diagram of the server 4 showing an example in which the server 4 in the sign language interpreting center 3 manages the identification information of a plurality of deaf people. The server 4 of FIG. 6 includes an identification information storage unit 61 that stores the identification information of deaf people, an identification information updating unit 62 that updates that information, and an identification information providing unit 63 that provides the identification information of a deaf person when it is requested.
[0037]
When deaf person A wishes to converse with another deaf person B via the sign language interpreting center 3, deaf person A transmits deaf person B's name or the like to the sign language interpreting center 3. The server 4 in the sign language interpreting center 3 then either provides deaf person B's identification information to deaf person A or automatically establishes a network connection between deaf person A and deaf person B.
[0038]
Because deaf people do not use spoken words, they generally tend to have difficulty remembering other people's names. For this reason, when information registered in the identification information storage unit 61 of the server 4 is provided to a deaf person, it is desirable, as shown in FIG. 7, to pair a still image of the registered deaf person's upper body with characteristic character information about that person (for example, the name of the area where the person lives, a nickname, or hobbies) and to provide the pair to the deaf person who performed the search.
[0039]
Thereby, the deaf person can visually grasp the partner with whom he or she wants to have a conversation with the still image and the character information, and can easily remember the partner of the call.
[0040]
Also, when a deaf person searches for information registered in the identification information storage unit 61, it is desirable to be able to search not only by name but also by area name or nickname.
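The storage, updating, and providing roles of the server 4, together with search by name, area, or nickname, can be sketched as a small in-memory directory. The field names, the example values, and the choice to key entries by name are assumptions made for brevity; a real directory would need unique identifiers and persistent storage.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    name: str
    area: str
    nickname: str
    contact: str   # telephone number or IP address (the identification information)
    photo: str     # reference to the upper-body still image


class IdentificationDirectory:
    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def register(self, entry: Entry) -> None:
        """Store an entry (role of the storage unit 61)."""
        self._entries[entry.name] = entry

    def update(self, name: str, **changes: str) -> None:
        """Change fields of an existing entry (role of the updating unit 62)."""
        entry = self._entries[name]
        for key, value in changes.items():
            if not hasattr(entry, key):
                raise AttributeError(f"unknown field: {key}")
            setattr(entry, key, value)

    def search(self, keyword: str) -> List[Entry]:
        """Match the keyword against name, area, or nickname (role of unit 63)."""
        kw = keyword.lower()
        return [e for e in self._entries.values()
                if kw in e.name.lower()
                or kw in e.area.lower()
                or kw in e.nickname.lower()]


directory = IdentificationDirectory()
directory.register(Entry("Person B", "Osaka", "Bee", "192.0.2.10", "b.jpg"))
print([e.photo for e in directory.search("osaka")])
```

Each search result carries the photo reference alongside the text fields, so the terminal can show the still image and character information together as the preceding paragraphs recommend.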
[0041]
For example, when deaf person A calls deaf person B in order to converse, the screen of deaf person B's terminal device 1 displays, on arrival of the call, a still image of deaf person A with character information as shown in FIG. 7. When deaf person B then presses a specific button on the terminal device 1, a still image of deaf person B is transmitted to deaf person A. Deaf person A can thus confirm visually that deaf person B has answered.
[0042]
In the embodiment described above, deaf people who use different sign languages, or a deaf person and a hearing person, converse via the sign language interpreting center 3, but the system can also be used for conversation between deaf people who use the same sign language. In that case the sign language interpreting center 3 is not used. Even so, compared with a conventional videophone system, this system uses a broadband line and takes measures against dropped frames, so the users can converse naturally, with an ease of use no different from meeting and talking in person.
[0043]
As described above, in the present embodiment a plurality of sign language interpreters able to interpret a plurality of sign languages are stationed at the sign language interpreting center 3. When deaf people who use different sign languages, or a deaf person and a hearing person, need to converse, a moving image of the signing is transmitted from the terminal device 1 to the sign language interpreting device 5, the sign language interpreter interprets it, and the result is transmitted to the terminal device 1 as voice or sign language. People who do not understand each other's sign language can therefore converse naturally and in almost real time.
[0044]
In particular, the present embodiment allows closer communication among deaf people and between deaf people and hearing people, and is highly likely to eliminate discrimination against deaf people and the lack of communication caused by differences in sign language.
[0045]
Also, when the data communication speed between the terminal device 1 and the sign language interpreting device 5 falls for some reason and frames are dropped, the transmitting side of the moving image is notified of this. The transmitting side can then reduce the data amount of the moving image, for example by reducing the number of frames per second in accordance with the lower speed.
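A minimal sketch of this notify-and-reduce behavior is given below. The 80% tolerance, the halving rule, and the class and method names are assumptions made for illustration; the specification does not state how dropped frames are detected or by how much the frame rate is lowered.

```python
from dataclasses import dataclass


@dataclass
class DroppedFrameDetector:
    """Receiver-side check. The tolerance value is an assumption."""
    expected_fps: float
    tolerance: float = 0.8  # warn when fewer than 80% of the expected frames arrive

    def check(self, frames_received: int, window_seconds: float) -> bool:
        """Return True when a warning should be sent to the transmitting side."""
        actual_fps = frames_received / window_seconds
        return actual_fps < self.expected_fps * self.tolerance


@dataclass
class Sender:
    """Transmitting side of the moving image."""
    fps: float
    min_fps: float = 5.0  # assumed floor so that signing stays legible

    def on_warning(self) -> None:
        """Halve the frame rate, but never go below min_fps."""
        self.fps = max(self.min_fps, self.fps / 2)


detector = DroppedFrameDetector(expected_fps=30)
sender = Sender(fps=30)

# 40 frames in 2 seconds is 20 fps, below the 24 fps warning threshold.
if detector.check(frames_received=40, window_seconds=2.0):
    sender.on_warning()
print(sender.fps)
```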
[0046]
In the above-described embodiment, the communication between the terminal device 1 and the sign language interpreting device 5 may be performed via the Internet, or may be performed via a dedicated line of a telephone company or a provider.
[0047]
In addition, the terminal device 1 itself does not necessarily have to have a function for communicating with the sign language interpreting device 5. For example, a mobile phone may be connected to the terminal device 1, and data communication with the sign language interpreting device 5 may be performed via the mobile phone. Alternatively, if the terminal device 1 has a wireless LAN function, data communication with the sign language interpreting device 5 may be performed via a router or a wireless access point.
[0048]
[Effect of the invention]
As described in detail above, according to the present invention, voice and moving image data can be transmitted and received between the terminal device and the sign language interpreting device, so deaf people who use mutually different sign languages, or a deaf person and a listener, can converse in almost real time. In particular, because a plurality of sign language interpreters capable of interpreting various sign languages are stationed at the center, interpretation can be provided for deaf people who use various sign languages.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an entire configuration of an embodiment of a sign language interpreting system according to the present invention.
FIG. 2 is a perspective view showing an example of a portable terminal device.
FIG. 3 is a perspective view showing an example of a stationary terminal device.
FIG. 4 is a detailed block diagram showing an example of an internal configuration of the terminal device.
FIG. 5 is a detailed block diagram showing an example of an internal configuration of a sign language interpreting apparatus in a sign language interpreting center.
FIG. 6 is a block diagram of a server showing an example in which a server in a sign language interpreting center manages identification information of a plurality of deaf people.
FIG. 7 is a diagram illustrating an example in which a registered still image of the upper body of a deaf person and character information characteristic of the deaf person are combined and provided to a deaf person who has performed a search.
[Explanation of symbols]
1 Terminal device
2 Communication medium
3 Sign language interpreting center
4 Server
5 Sign language interpreting device
6, 51 Imaging unit
7, 47 Display unit
8, 45 Speaker
9 Keyboard
10 Vibrator
21, 41 Communication processing unit
22, 42 Demodulation unit
23, 43 Modulation unit
24 Incoming call detection unit
25 Incoming signal transmission unit
26, 33, 44, 54 Voice processing unit
28, 31, 46, 52 Image processing unit
29, 48 Warning detection unit
30, 49 Dropped frame detection unit
32, 53 Microphone
34 Sign language selection unit

Claims

A sign language interpreting system comprising:
a center having a plurality of sign language interpreting devices used by a plurality of sign language interpreters; and
at least one terminal device that communicates voice and moving image data with the center,
wherein the terminal device comprises:
a first image capturing unit that captures the facial expression and hand movements of a deaf person as a moving image;
a first voice capturing unit that captures the voice of a speaker;
a first communication unit that transmits the captured moving image and voice to the center; and
a first display processing unit that displays a moving image transmitted from the sign language interpreting device on a first display device, and
each of the plurality of sign language interpreting devices comprises:
a second display processing unit that displays a moving image transmitted from the terminal device on a second display device;
a second voice processing unit that captures the voice of the sign language interpreter interpreting the sign language information contained in the moving image displayed on the second display device;
a second image capturing unit that captures the facial expression and hand movements of the sign language interpreter as a moving image; and
a second communication unit that transmits the captured voice and moving image to the terminal device.

The sign language interpreting system according to claim 1, wherein
the terminal device comprises:
a first frame skip detection unit that detects whether the transmission speed of a moving image transmitted from the second communication unit is equal to or lower than a predetermined speed limit; and
a first warning unit that generates a warning signal to the sign language interpreting device that transmitted the moving image when the first frame skip detection unit detects that the transmission speed has become equal to or lower than the speed limit, and
each of the plurality of sign language interpreting devices comprises:
a second frame skip detection unit that detects whether the transmission speed of a moving image transmitted from the first communication unit has fallen below a predetermined speed limit; and
a second warning unit that generates a warning signal to the terminal device that transmitted the moving image when the second frame skip detection unit detects that the transmission speed has fallen below the speed limit.

The sign language interpreting system according to claim 2, wherein
the terminal device, upon receiving the warning signal from the second warning unit, instructs the deaf person to reduce the speed of signing, and
the sign language interpreting device, upon receiving the warning signal from the first warning unit, instructs the sign language interpreter to reduce the speed of signing.

The sign language interpreting system according to claim 2 or 3, wherein
the terminal device, upon receiving the warning signal from the second warning unit, reduces the number of frames per second of the moving image transmitted to the sign language interpreting device according to the transmission speed of the moving image, and
the sign language interpreting device, upon receiving the warning signal from the first warning unit, reduces the number of frames per second of the moving image transmitted to the terminal device according to the transmission speed of the moving image.
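One way to read "according to the transmission speed" is to pick the highest frame rate that the measured link can carry. The sketch below assumes a fixed, known frame size and the clamping limits shown; neither the function name nor these parameters come from the claims.

```python
def frames_per_second(link_bps: float, bits_per_frame: float,
                      max_fps: int = 30, min_fps: int = 1) -> int:
    """Largest whole frame rate that fits the measured link speed, clamped to [min_fps, max_fps]."""
    affordable = int(link_bps // bits_per_frame)
    return max(min_fps, min(max_fps, affordable))


# Assumed figures: 100 kbit per encoded frame on links of various speeds.
print(frames_per_second(2_000_000, 100_000))   # slow link
print(frames_per_second(10_000_000, 100_000))  # fast link, capped at max_fps
```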

The sign language interpreting system according to claim 3 or 4, wherein
the terminal device, upon receiving the warning signal from the second warning unit, reduces the data amount of the moving image transmitted to the sign language interpreting device by simplifying the background image other than the facial expression and hand movements of the deaf person contained in that moving image, and
the sign language interpreting device, upon receiving the warning signal from the first warning unit, reduces the data amount of the moving image transmitted to the terminal device by simplifying the background image other than the facial expression and hand movements of the sign language interpreter contained in that moving image.

The sign language interpreting system according to claim 5, wherein
the terminal device, upon receiving the warning signal from the second warning unit, sets the background image other than the facial expression and hand movements of the deaf person contained in the moving image transmitted to the sign language interpreting device to a plain single-color image, and
the sign language interpreting device, upon receiving the warning signal from the first warning unit, sets the background image other than the facial expression and hand movements of the sign language interpreter contained in the moving image transmitted to the terminal device to a plain single-color image.
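Replacing the background with a single color can be sketched as a per-pixel mask operation. The sketch assumes that a boolean mask marking face and hand pixels is already available; how such a mask is obtained is not described here, and the function name and fill color are illustrative choices.

```python
def flatten_background(frame, keep_mask, fill=(0, 0, 255)):
    """Return a copy of `frame` in which pixels outside `keep_mask` become `fill`.

    frame:     rows of (r, g, b) tuples
    keep_mask: rows of booleans, True for face and hand pixels
    """
    return [
        [pixel if keep else fill for pixel, keep in zip(row, mask_row)]
        for row, mask_row in zip(frame, keep_mask)
    ]


# Tiny 2x2 example frame with one "hand" pixel at the top right.
frame = [[(10, 20, 30), (200, 180, 160)],
         [(12, 22, 32), (14, 24, 34)]]
mask = [[False, True],
        [False, False]]
out = flatten_background(frame, mask)
print(out)
```

A uniform background tends to compress well under most video codecs, which is the presumed reason this reduces the data amount.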

The sign language interpreting system according to claim 2, wherein
the terminal device, upon receiving the warning signal from the second warning unit, reduces the resolution of the moving image, thereby reducing the number of frames per second of the moving image transmitted to the sign language interpreting device, and
the sign language interpreting device, upon receiving the warning signal from the first warning unit, reduces the resolution of the moving image, thereby reducing the number of frames per second of the moving image transmitted to the terminal device.
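As an illustration of the resolution reduction only, the following sketch keeps every second pixel in each direction, which cuts the pixel count to a quarter. Nearest-neighbour subsampling and the factor of 2 are assumptions; the claim does not specify a scaling method.

```python
def downsample(frame, factor=2):
    """Keep every `factor`-th pixel in both directions (nearest-neighbour subsampling)."""
    return [row[::factor] for row in frame[::factor]]


# 4x4 test frame whose "pixels" record their own (row, column) position.
frame = [[(y, x) for x in range(4)] for y in range(4)]
small = downsample(frame)
print(len(small), "x", len(small[0]))
```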

The sign language interpreting system according to any one of claims 1 to 7, wherein the terminal device has an incoming call notification unit that, upon receiving a moving image or voice from the sign language interpreting device, notifies an incoming call notification device held by the deaf person of the incoming call.

The sign language interpreting system according to claim 8, wherein
the incoming call notification device is a wristwatch-type vibrator having a wireless function, and
the incoming call notification unit notifies the incoming call by vibrating the vibrator with a wireless signal.

The sign language interpreting system according to any one of claims 1 to 9, wherein the center comprises:
a communication information storage unit that stores information necessary for communication with each of the terminal devices;
a communication information updating unit that updates the information stored in the communication information storage unit; and
a communication information providing unit that provides the information stored in the communication information storage unit to the terminal device as needed.

The sign language interpreting system according to claim 10, wherein
the communication information storage unit stores, for each of a plurality of deaf people, a still image of the deaf person, a name, and characteristic identification information,
the communication information providing unit, in response to an instruction from the terminal device, provides the terminal device with the still image, name, and characteristic identification information of any deaf person stored in the communication information storage unit, and
the first display processing unit of the terminal device displays the still image, name, and characteristic identification information of the deaf person provided from the communication information providing unit on the same screen of the first display device.
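The storage, updating, and providing units of the center can be pictured as a small keyed directory. This is a sketch under assumptions: the `Entry` fields beyond still image, name, and characteristic information, the `address` field, and the user identifiers are hypothetical and do not appear in the claims.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    """Information registered for one deaf person."""
    name: str
    still_image: bytes
    characteristics: str
    address: str  # hypothetical: whatever a terminal needs in order to place a call


class CommunicationInfoStore:
    """Combines the storage, updating, and providing roles in one class for brevity."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def update(self, user_id: str, entry: Entry) -> None:
        """Register a new entry or overwrite an existing one."""
        self._entries[user_id] = entry

    def provide(self, user_id: str) -> Optional[Entry]:
        """Return the entry for a terminal to display, or None if not registered."""
        return self._entries.get(user_id)


store = CommunicationInfoStore()
store.update("b", Entry("Deaf person B", b"<image bytes>", "short hair, glasses", "terminal-b"))
found = store.provide("b")
print(found.name, "|", found.characteristics)
```

The terminal would then show the returned still image, name, and characteristic information together on one screen.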