JP4034025B2

JP4034025B2 - Video equipment with learning function

Info

Publication number: JP4034025B2
Application number: JP2000078669A
Authority: JP
Inventors: 原旭劉; 相錫李
Original assignee: LG Electronics Inc
Current assignee: LG Electronics Inc
Priority date: 1999-03-22
Filing date: 2000-03-21
Publication date: 2008-01-16
Anticipated expiration: 2020-03-21
Also published as: KR20000062159A; CN1267863A; CN100353358C; JP2000298493A; KR100686085B1

Description

【０００１】
【発明の属する技術分野】
本発明は映像機器に関するもので、特に、学習機能を有する映像機器およびその制御方法に関する。
最近、各種の映像機器、そのうち最も代表的な例としてテレビジョン装置などは、本来の目的である映像および音声方法の外に、ユーザの多様な欲求を充足させるための種々の付加機能を追加して、製品の競争力を向上させるための努力が為されている。学習機能も上述の種々の付加機能中の１つであって、各種の映像機器に適用されている。
【０００２】
【従来の技術】
従来の技術による学習機能を有する映像機器の一例としてのテレビジョン装置は図９に示すように、アンテナを介して受信されるＣＶＢＳ（Composite Video Blanking Signal)、即ち、複合映像信号を選局するためのチューナ１、チューナ１を介して選局された複合映像信号のうち放送信号をＹ／Ｕ／Ｖ信号に分離するＹ／Ｃ分離部２、チューナ１を介して選局された放送信号から同期信号を分離する同期信号分離部３、同期信号分離部３から分離された同期信号に従ってＯＳＤまたはキャプションデータの表示を制御するマイクロコンピュータ４と、動作プログラムなどを記憶させておくためのＥＥＰＲＯＭ（Electrically Erasable ＆ Programmable Read-Only Memory）５と、マイクロコンピュータ４のスイッチング信号によりＯＳＤまたはキャプションデータを選択して出力させるビデオスイッチ６と、映像処理および偏向処理を行なう映像／偏向処理部７と、表示部としてのＣＰＴ（Color Picture Tube）８と、チューナ１を介して選局された放送信号のうちキャプション情報を処理するキャプション処理部９と、学習レベル別に所定の単語データを記憶している学習用データベース１１と、キャプション処理部９と学習用データベース１１間のデータ交換のためのインターフェース部１０とを含んで構成されている。
【０００３】
ここで、キャプション処理部９は、同期信号分離部３から出力された同期信号に従って、チューナ１により選局された放送信号からキャプション情報を抽出するデータスライサ（Date Slicer）１２と、キャプション情報処理を制御するキャプション制御部１４と、キャプション情報を復号化するキャプションデコーダ１３と、英語，ハングル語および日本語等の言語に対応するフォントとフォント処理プログラム等を格納するフォントＲＯＭ／プログラムＲＯＭ／データＲＡＭ１５から構成されている。
【０００４】
このように構成された従来技術に係る映像機器における学習機能を実行する動作について以下に説明する。
ユーザが学習機能を「オン」にすると、マイクロコンピュータ４はユーザからの学習機能の設定内容、即ちユーザが設定した学習レベル、スタート番号または表示位置などに対応するデータのアドレスを指定してキャプション制御部１４に制御信号を供給し、該当するデータを学習用データベース１１から読み込む。
したがって、キャプション制御部１４は、フォントＲＯＭ／プログラムＲＯＭ／データＲＡＭ１５から設定されたデータに対応するフォントをマッチングさせて学習データを読み込み、読み込んだ学習データをインターフェース部１０を介してビデオスイッチ６に入力させる。
【０００５】
次いで、マイクロコンピュータ４はビデオスイッチ６に制御信号を出力してスイッチをオンさせて、該当する学習用データが映像／偏向処理部７に出力されるようにし、該当する学習用データである単語が、映像／偏向処理部７を経由してＣＰＴ８上に出力される。
【０００６】
一方、放送信号に含まれてチューナ１により受信された音声信号は、トーン調節部１６およびアンプ１７を経由して信号処理され、該当する映像に同期させられてスピーカ１８を介して音声として出力される。
【０００７】
しかし、このような従来の技術による学習機能を有する映像機器は所定の単語だけが画面上に表示され、その単語に対応する音声が出力されないため、単語に該当する発音を聞くことはできない。したがって、ユーザの学習意欲を満足させることができないことは勿論として、ユーザが学習する能率を低下させるという問題があった。
【０００８】
【発明が解決しようとする課題】
本発明は、上述した従来の問題点を解決するために為されたものであって、学習用単語を画面上に表示すると同時に、その単語に対応する音声をその単語に同期して出力させることにより、学習能率を高めることができるようにした学習機能を有する映像機器およびその制御方法を提供することを目的としている。
【０００９】
【課題を解決するための手段】
上記目的を達成するために、本発明の第１の基本構成に係る学習機能を有する映像機器は、発音記号を含む学習データを記憶する学習用データベースと、表示用の画面に画像データとして表示するために、放送信号に含まれる字幕データおよび前記学習用データベースに記憶された前記学習データを字幕処理する字幕処理部と、予め記憶された発音記号別のデジタル音声データを用いて、前記学習データの中の単語に対応するアクセントの音声を合成すると共に、前記学習データに含まれる発音記号のアクセントに対応する出力音声の周波数を変調するために対応するアクセントに合うクロックパルスを発生させる複数の発振器を含む音声発生部と、ユーザから音声学習の提供が要求されたときに、前記学習用データベースから該当する前記学習データを読み込んで、ユーザから要求された設定内容に当たる学習データおよびこれに対応する音声が出力されるように前記字幕処理部および前記音声発生部を制御する制御部と、を備えることを特徴とする。
また、上記第１の基本構成に係る学習機能を有する映像機器において、前記音声発生部は、各発音記号別の音声データを記憶している発音記号音声記憶ＲＯＭと、前記制御部の制御信号に従って前記発振器により生成されるクロックパルスの中の１つを選択して出力するための周波数スイッチと、前記制御部により前記発音記号音声記憶ＲＯＭから選択出力される音声データを、前記周波数スイッチにより選択出力されたクロックパルスに従って合成する音声合成部と、前記音声合成部の出力をアナログに変換して、オーディオスイッチに出力するためのＤ／Ａ変換器と、を備えるようにしても良い。
【００１０】
また、本発明の第２の基本構成に係る学習機能を有する映像機器は、発音記号を含む学習データを記憶する学習用データベースと、表示用の画面に画像データとして表示するために、放送信号に含まれる字幕データおよび前記学習用データベースに記憶された前記学習データを字幕処理する字幕処理部と、予め記憶された発音記号別のデジタル音声データを用いて、前記学習データの中の単語に対応するアクセントの音声を合成すると共に、前記学習データに含まれる発音記号のアクセントに対応する出力音声の周波数を変調するために対応するアクセントに合うクロックパルスを発生させる複数の発振器を含む音声発生部と、ユーザにより音声学習の提供が要求されたときに、前記学習用データベースから該当する前記学習データを読み込み、要求された設定内容に当たる学習データおよびこれに対応する音声が出力されるように前記字幕処理部および前記音声発生部を制御する制御部と、前記字幕処理部の字幕信号または前記学習データを入力すると共に、画面上に映像機器を制御するためのＯＳＤ信号を入力して、前記制御部の切換信号に従って前記字幕信号、前記学習データ、及び前記ＯＳＤ信号を選択的に出力させるビデオスイッチと、前記ビデオスイッチからの出力と映像信号を入力すると共に、前記出力および映像信号を画面上に表示させるように映像信号処理する映像処理部と、前記音声発生部からの出力と、音声信号をそれぞれ入力すると共に、前記制御部の切換信号に従って前記ビデオスイッチからの出力および前記映像信号を選択的に出力するオーディオスイッチと、前記オーディオスイッチの出力がスピーカを介して出力可能になるように音声信号処理する音声信号処理部と、を備えることを特徴とする。
【００１６】
以下、添付の図面を参照しながら本発明の第１および第２実施形態に係る学習機能を有する映像機器およびその制御方法について詳細に説明する。
（第１実施形態）
本発明のよる学習機能を有する映像機器は、図２に示すように、学習用データベース１１に記憶された学習データ、例えば英単語に当たる音声を合成して出力する音声発生部１９と、マイクロコンピュータ４の制御信号に従って音声発生部１９から出力された音声と放送信号に含まれた音声の中から何れか１つを選択して出力するオーディオスイッチ２８と、を除いては、従来の技術の構成と同一であるので、同一構成要素に対しては従来技術と同一符号を付し、重複説明を省略する。
【００１７】
ここで、音声発生部１９は各英単語の発音記号別音声データを記憶する発音記号音声記憶ＲＯＭ２０と、学習データの発音記号に表れたアクセントに合うよう出力音声の周波数を変調するために、アクセント別クロックパルスを発生させる第１，第２，第３発振器２１、２２、２３と、キャプション制御部１４の制御信号に従って前記第１ないし第３発振器２１ないし２３のクロックパルスのうちから１つを選択して出力するための周波数スイッチ２４と、前記キャプション制御部１４により発音記号音声記憶ＲＯＭ２０から選択出力される音声データを、前記周波数スイッチ２４により選択出力されたクロックパルスに対応するアクセントを付与して合成する音声合成部２５と、前記音声合成部２５の出力をアナログに変換するためのＤ／Ａ変換器２６およびＤ／Ａ変換器２６の出力を緩衝させて、前記オーディオスイッチ２８に出力するためのバッファ２７とから構成されている。
【００１８】
そして、前記発音記号音声記憶ＲＯＭ２０には、図２に示すような発音記号別音声データが、該当する音声データの検索を容易にするためのインデックスデータと共に、図３に示すような形式により記憶されている。
【００１９】
このように構成された本発明の第１実施形態に係る映像機器における学習機能を実行する方法について、図１を参照しながら以下に説明する。
ユーザによって学習モードが設定され、音声学習モードが設定されない場合は（Ｓ２２）、ユーザの設定内容に対応する学習データ、つまり、意味、発音記号などが含まれた英単語が前記学習用データベース１１からインターフェース部１０を経由してキャプション制御部１４へ伝送され、キャプション制御部１４の制御によってキャプション処理され、ビデオスイッチ６を経由してＣＰＴ８の画面上に表示される（Ｓ２３）。
【００２０】
一方、ユーザが音声学習モードを設定すると（Ｓ２２）、マイクロコンピュータ４の制御信号に従ってキャプション制御部１４から前記学習データに含まれた各英単語の発音記号を読み込み、その発音記号の各音素別ディジタル音声データを前記発音記号音声記憶ＲＯＭ２０から読み込んで音声合成部２５に供給する（Ｓ２４）。
そして、キャプション制御部１４は、マイクロコンピュータ４の制御に基づいて、その英単語の映像データに同期させて、英単語の発音記号に対応するアクセントを表現して音声合成部２５を介して出力するように前記周波数スイッチ２４を制御する（Ｓ２５）。
【００２１】
即ち、図１において、キャプション制御部１４が前記周波数スイッチ２４に制御信号を印加して、対応するアクセントに合うクロックパルスを音声合成部２５に印加し、音声合成部２５は順次入力される各発音記号音素別音声データが第１および第２アクセントを表現できるように、第１ないし第３発振器２１、２２、２３から供給されるクロックパルスに従い音声データを合成して出力する。
このとき、出力される英単語音声データにアクセントを表現する他の方法として、出力される音声データのレベルを調節し各音素別ボリュームレベルを変化させることにより、第１および第２アクセントを表現することもできる。
【００２２】
次いで、音声合成部２５から出力されたディジタル音声データがＤ／Ａ変換器２６を介してアナログ音声データに変換され、バッファ２７を経由してオーディオスイッチ２８に入力される。そして、オーディオスイッチ２８はマイクロコンピュータ４の制御信号に従って前記バッファ２７から出力された音声データを出力する。
【００２３】
次いで、前記オーディオスイッチ２８から出力された音声データはトーン調節部１６によりその音質が調整され、アンプ１７により増幅されて、スピーカ１８を介して前記画面上に表示される英単語に同期して出力される。したがって、ユーザは英単語およびそれに当たる発音を同時に聴取することができる。また、本発明は音声学習機能をテレビジョン装置に適用した例であって、セットトップボックスを含む各種の映像機器に基本的なキャプション機能や学習機能を連携させて構成することにより容易に適用することができる。
【００２４】
（第２実施形態）
本発明の第２実施形態による学習機能を有する映像機器としてのテレビジョン受像装置は、図５に示すように、所定の単語データを記憶するための単語データベース１２０と、外部の映像信号に含まれたキャプションデータまたは前記単語データベース１２０に記憶された単語データをキャプション処理するための字幕処理部１１０と、前記字幕処理部１１０と単語データベース１２０のデータ交換のためのインターフェース部１３０と、前記単語データを発音可能な文字列に変換して分析し、韻律処理された音を合成するための音声発生部１４０と、ユーザの音声学習機能の要求時に、単語データベース１２０から該当単語データを読み込み前記字幕処理部１１０を介してキャプション処理されるようにし、前記キャプション処理された単語データに対応する音声が出力されるように前記音声発生部１４０を制御するマイクロコンピュータ７０と、前記マイクロコンピュータ７０の動作プログラムなどを記憶するためのＥＥＰＲＯＭ６０と、前記字幕処理部１１０から出力された映像信号と前記マイクロコンピュータ７０から出力された映像信号または外部の映像信号を画面上に表示可能なように信号処理する映像処理部３０と、前記映像処理部３０により処理された信号出力を表示するＣＰＴ４０と、前記音声発生部１４０の出力と外部の音声信号を前記マイクロコンピュータ７０の制御によって選択的に出力させるオーディオスイッチ１００と、前記オーディオスイッチ１００の出力をスピーカ９０により出力可能なように信号処理する音声処理部８０と、を含んで構成されている。
【００２５】
このとき、字幕処理部１１０は外部の映像信号からキャプション情報を抽出するためのデータスライサ１１１と、前記キャプション情報を復号化するためのキャプションデコーダ１１２と、単語を表現するためのフォントとフォント処理プログラムおよび現在画面上に表示されるキャプション情報の中のユーザの記憶命令に対応するキャプション情報を記憶するためのメモリ部１１４と、前記マイクロコンピュータ７０の制御によって所定のキャプション情報を前記メモリ部１１４に記憶したり、メモリ部１１４から読み込んで前記音声発生部１４０へ伝送し、データスライサ１１１およびキャプションデコーダ１１２の動作を制御するためのキャプション制御部１１３とから構成されている。
【００２６】
そして、音声発生部１４０は各発音別特徴パラメータを記憶するための音声データベース１４１と、前記単語データを発音可能な文字列に変換して分析し、韻律処理してこれに対応する特徴パラメータを前記音声データベース１４１から読み込んでディジタル音声信号を生成するための音声プロセッサ１４２と、前記音声プロセッサ１４２から生成したディジタル音声信号をアナログ音声信号に変換するためのＤ／Ａ変換部１４３およびＤ／Ａ変換部１４３からの出力を緩衝するためのバッファ１４４とから構成されている。
【００２７】
このとき、音声プロセッサ１４２は、図６に示すように、単語データのうちの数字、アルファベット、略字、特殊記号などを発音可能な文字列に定型化するための文字列定型化部１４２−１と、前記定型化した文字列を分析して句および節の境界点を検出し、多重発音単語の発音を設定するための文字列分析部１４２−２と、前記文字列分析部１４２−２から出力された文字列で音節間の連結による発音の変動時、変動した発音に合うように発音記号を変換処理するための発音記号処理部１４２−３と、前記発音記号処理部１４２−３から出力された文字列に長さ、強さおよび抑揚の韻律を付与するための韻律処理部１４２−４と、前記韻律処理部１４２−４から出力された文字列の発音に当たる特徴パラメータを前記音声データベース１４１から読み込んで、それに従うディジタル音声信号を合成するための音声生成部１４２−５とから構成されている。
【００２８】
また映像処理部３０は、アンテナを介して受信される放送信号を選局するためのチューナ３１と、チューナ３１により選局されたＣＶＢＳ（Composite Video Blanking Signal）、即ち複合映像信号のうち放送信号をＹ／Ｕ／Ｖ信号に分離するＹ／Ｃ分離部３２と、チューナ３１により選局された放送信号から同期信号を分離する同期信号分離部３３と、映像処理および偏向処理を行なう映像／偏向処理部３４と、から構成されている。そして、音声処理部８０はトーン調節部８１とアンプ８２とを含んで構成されている。
【００２９】
以下、このように構成された本発明の第２実施形態に係る映像機器の学習機能を実行する方法を図７および図８のフローチャートを参照して説明する。
まず、ユーザは学習機能を行なうために、テレビジョン装置のキーまたはリモコンを操作して、学習機能「オン」命令を入力する。すると、マイクロコンピュータ７０は、図７に示すように、学習機能が「オン」となっているか否かを判断する（Ｓ４１）。
【００３０】
次いで、前記の判断結果（Ｓ４１）、学習機能が「オン」されていれば、ユーザによって設定された学習設定値、つまり、学習レベル／スタート位置／表示位置などを把握する（Ｓ４２）。そして、前記学習設定値に当たるアドレスの単語データを単語データベース１２０から読み込む（Ｓ４３）。
【００３１】
次いで、マイクロコンピュータ７０は字幕処理部１１０を制御して、前記単語データを該当フォントに合うようにキャプション処理し、映像処理部３０を介して映像処理して、ＣＰＴ４０を介して表示されるようにする（Ｓ４４）。これと同時にマイクロコンピュータ７０は音声発生部１４０が前記キャプション制御部１１０を介して単語データを伝送してもらえるようにする。これによって、音声発生部１４０は前記単語データを発音可能な文字列に変換して分析した後、韻律処理する（Ｓ４５）。
そして、音声発生部１４０は韻律処理された文字列を音声データベース１４１に記憶された該当発音別特徴パラメータを用いて音声合成し、スピーカに出力する（Ｓ４６）。
【００３２】
このとき音声発生部１４０の詳細な動作を以下に説明する。
即ち、文字列定型化部１４２−１が前記単語データのうち数字、アルファベット、略字、特殊記号を発音可能な文字列に定型化する。
【００３３】
次いで、文字列分析部１４２−２は前記定型化した文字列を分析して句および節の境界点を検出し、二つ以上の発音を有する単語の適正発音を選定する。
【００３４】
そして、発音記号処理部１４２−３が前記文字列分析部１４２−２から出力された文字列で音節間の連結による発音の変動時、変動した発音に合うように発音記号を校正する。
【００３５】
次いで韻律処理部１４２−４は前記発音記号処理部１４２−３から出力された文字列に長さと強さおよび抑揚などの韻律を付与する。
【００３６】
そして、音生成部１４２−５が前記韻律処理された文字列とその発音に当たる特徴パラメータ、つまり、周波数、帯域幅およびエネルギー情報を合成してデジタル音声データを生成する。
【００３７】
次いで、前記デジタル音声データはＤ／Ａ変換器１４３を介してアナログ音声に変換し、バッファ１４４を介してスピーカ９０に出力されるのである。
【００３８】
このように音声出力動作が完了すると、前記画面上に表示された単語の表示周期が設定周期に到達しているかを判断して（Ｓ４７）、到達していれば単語のスタート番号を増加させ（Ｓ４８）、表示された単語番号が終了設定番号であるか否かを判断する（Ｓ４９）。
【００３９】
次いで、前記の判断結果（Ｓ４９）、単語番号が終了設定番号であれば、最初のスタート番号に復帰して（Ｓ５０）、前記段階（Ｓ４３）に復帰する。
【００４０】
一方、前記の判断結果（Ｓ４１）、学習機能が「オン」されていない場合は、学習機能「オフ」命令が入力されているかを判断して（Ｓ５１）、学習機能「オフ」命令が入力されていれば、学習設定値および現在の進行状態を記憶し（Ｓ５２）、単語および音声出力を中止させる（Ｓ５３）。また、学習機能「オフ」命令が入力されていなければ、対応する指令を処理する（Ｓ５４）。
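[0041]
In addition, the present invention can store a displayed word when needed and reproduce it together with the corresponding voice at a desired time. This is described below with reference to FIG. 8.
First, whenever desired, the user can operate a key of the video device, for example a television receiver, or a key of the remote controller to input a word storage command or a word reproduction command.
Accordingly, as shown in FIG. 8, the microcomputer 70 determines whether a word storage command has been input (S61) and, if so, stores the corresponding word in the memory unit 114 of the caption processing unit 110 (S62).
[0042]
On the other hand, if the determination (S61) shows that no word storage command has been input, the microcomputer 70 determines whether a word reproduction command has been input (S63) and, if so, controls the caption control unit 113 to read the words stored in the memory unit 114 (S64).
[0043]
Next, the microcomputer 70 controls the caption processing unit 110 so that the word data is caption-processed to match the corresponding font, video-processed by the video processing unit 30, and displayed on the screen of the CPT 40 (S65).
At the same time, the voice generation unit 140 receives the word data via the caption control unit 113 and, as described with reference to FIG. 7, converts the word data into a pronounceable character string, analyzes it, and then performs prosodic processing (S66). The voice generation unit 140 then synthesizes speech from the prosody-processed character string using the corresponding pronunciation-specific feature parameters stored in the voice database 141, and outputs it through the speaker (S67).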
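The word storage and reproduction flow of FIG. 8 (S61 to S64) can be sketched roughly as follows. This is only an illustrative sketch: the function name, the command strings, and the use of a Python list in place of the memory unit 114 are assumptions and do not come from the patent.

```python
def handle_word_command(memory, command, current_word=None):
    """Dispatch a word command; `memory` is a list standing in for memory unit 114.

    Returns the list of words that should then be displayed and spoken.
    """
    if command == "store":            # S61: word storage command received
        memory.append(current_word)   # S62: keep the word currently on screen
        return []
    if command == "replay":           # S63: word reproduction command received
        return list(memory)           # S64: read back every stored word
    return []                         # any other command: nothing to replay
```

Each word returned for "replay" would then pass through caption display (S65) and speech synthesis (S66, S67).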
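[0044]
In the first and second embodiments described above, English was used as the example language to be learned. However, the present invention is not limited to this. It can also be applied to learning Korean or Japanese, and to any other language, such as Chinese, Russian, Arabic, Spanish, French, or German.
[0045]
The first and second embodiments described above are examples in which the learning function of the present invention is applied to a television apparatus. However, the present invention is not limited to this and can easily be applied to other video devices, including set-top boxes, by linking the basic caption function with the learning function.
[0046]
[Effects of the Invention]
The video device having a learning function and the control method thereof according to the present invention output a word to be learned, such as an English word, together with the corresponding voice. This increases the user's learning efficiency and motivation to learn, and differentiates the product from others, which improves its competitiveness in the market, among other effects.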
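[Brief Description of the Drawings]
[FIG. 1] A block diagram showing the configuration of a video device having a learning function according to the first embodiment of the present invention.
[FIG. 2] An explanatory diagram showing the phonetic symbol table of the video device according to the first embodiment of the present invention.
[FIG. 3] An explanatory diagram showing the storage format of the phonetic symbol voice data and index data according to the first embodiment of the present invention.
[FIG. 4] A flowchart showing a control method of the video device having a learning function according to the first embodiment of the present invention.
[FIG. 5] A block diagram showing the configuration of a video device having a learning function according to the second embodiment of the present invention.
[FIG. 6] A block diagram showing the detailed configuration of the speech processor of FIG. 5.
[FIG. 7] A flowchart showing a control method of the video device having a learning function according to the second embodiment of the present invention.
[FIG. 8] A flowchart showing the word storage and reproduction method in the control method of the video device having a learning function according to the second embodiment of the present invention.
[FIG. 9] A block diagram showing the configuration of a conventional video device having a learning function.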
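[Explanation of Reference Signs]
4 Microcomputer
6 Video switch
8 CPT
9 Caption processing unit
11 Learning database
14 Caption control unit
16 Tone adjusting unit
18 Speaker
19 Voice generation unit
20 Phonetic symbol voice storage ROM
21 First oscillator
22 Second oscillator
23 Third oscillator
24 Frequency switch
25 Voice synthesizer
28 Audio switch
40 CPT
50 Video switch
70 Microcomputer
80 Audio processing unit
81 Tone adjustment unit
90 Speaker
110 Caption processing unit
113 Caption control unit
120 Word database
140 Voice generation unit
141 Voice database
142 Speech processor

[0001]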
BACKGROUND OF THE INVENTION
The present invention relates to a video device, and more particularly to a video device having a learning function and a control method thereof.
Recently, efforts have been made to improve the competitiveness of various kinds of video equipment, of which the television is the most representative example, by adding functions that satisfy diverse user needs beyond the original purpose of reproducing video and audio. The learning function is one of these additional functions and is applied to various video devices.
[0002]
[Prior art]
As shown in FIG. 9, a television apparatus, as an example of conventional video equipment having a learning function, includes: a tuner 1 for selecting a CVBS (Composite Video Blanking Signal), that is, a composite video signal, received via an antenna; a Y/C separation unit 2 that separates the broadcast signal in the composite video signal selected via the tuner 1 into Y/U/V signals; a synchronization signal separation unit 3 that separates a synchronization signal from the broadcast signal selected via the tuner 1; a microcomputer 4 that controls the display of OSD or caption data in accordance with the synchronization signal separated by the synchronization signal separation unit 3; an EEPROM (Electrically Erasable & Programmable Read-Only Memory) 5 for storing operation programs and the like; a video switch 6 that selects and outputs OSD or caption data according to a switching signal from the microcomputer 4; a video/deflection processing unit 7 that performs video processing and deflection processing; a CPT (Color Picture Tube) 8 as a display unit; a caption processing unit 9 that processes caption information in the broadcast signal selected via the tuner 1; a learning database 11 that stores predetermined word data for each learning level; and an interface unit 10 for data exchange between the caption processing unit 9 and the learning database 11.
[0003]
Here, the caption processing unit 9 includes: a data slicer 12 that extracts caption information from the broadcast signal selected by the tuner 1 in accordance with the synchronization signal output from the synchronization signal separation unit 3; a caption control unit 14 that controls caption information processing; a caption decoder 13 that decodes the caption information; and a font ROM/program ROM/data RAM 15 that stores fonts corresponding to languages such as English, Korean, and Japanese, font processing programs, and the like.
[0004]
An operation for executing the learning function in the video equipment according to the related art configured as described above will be described below.
When the user turns on the learning function, the microcomputer 4 specifies the address of the data corresponding to the learning function settings entered by the user, that is, the learning level, start number, display position, and so on, supplies a control signal to the caption control unit 14, and reads the corresponding data from the learning database 11.
The caption control unit 14 then matches the fonts from the font ROM/program ROM/data RAM 15 to the specified data to read the learning data, and inputs the read learning data to the video switch 6 via the interface unit 10.
[0005]
Next, the microcomputer 4 outputs a control signal to the video switch 6 to turn the switch on so that the learning data is output to the video/deflection processing unit 7, and the word constituting the learning data is displayed on the CPT 8 via the video/deflection processing unit 7.
[0006]
Meanwhile, the audio signal contained in the broadcast signal and received by the tuner 1 is processed through the tone adjusting unit 16 and the amplifier 17, and is output as sound through the speaker 18 in synchronization with the corresponding video.
[0007]
However, conventional video equipment having such a learning function displays only the predetermined word on the screen and does not output the corresponding sound, so the user cannot hear how the word is pronounced. As a result, it fails to satisfy the user's desire to learn and also lowers the user's learning efficiency.
[0008]
[Problems to be solved by the invention]
The present invention has been made to solve the above problems of the prior art. Its object is to provide a video device having a learning function, and a control method thereof, that increase learning efficiency by displaying a learning word on the screen and simultaneously outputting the sound corresponding to the word in synchronization with it.
[0009]
[Means for Solving the Problems]
In order to achieve the above object, a video device having a learning function according to a first basic configuration of the present invention comprises: a learning database that stores learning data including phonetic symbols; a caption processing unit that performs caption processing on caption data included in a broadcast signal and on the learning data stored in the learning database, for display as image data on a display screen; a voice generation unit that synthesizes the voice of the accent corresponding to a word in the learning data using prestored digital voice data for each phonetic symbol, and that includes a plurality of oscillators for generating clock pulses matching each accent in order to modulate the frequency of the output voice according to the accent of the phonetic symbols included in the learning data; and a control unit that, when the user requests voice learning, reads the corresponding learning data from the learning database and controls the caption processing unit and the voice generation unit so that the learning data matching the settings requested by the user and the voice corresponding to it are output.
In the video device according to the first basic configuration, the voice generation unit may include: a phonetic symbol voice storage ROM that stores voice data for each phonetic symbol; a frequency switch for selecting and outputting one of the clock pulses generated by the oscillators according to a control signal from the control unit; a voice synthesizer that synthesizes the voice data selected and output from the phonetic symbol voice storage ROM by the control unit according to the clock pulse selected and output by the frequency switch; and a D/A converter for converting the output of the voice synthesizer to analog and outputting it to an audio switch.
[0010]
A video device having a learning function according to a second basic configuration of the present invention comprises: a learning database that stores learning data including phonetic symbols; a caption processing unit that performs caption processing on caption data included in a broadcast signal and on the learning data stored in the learning database, for display as image data on a display screen; a voice generation unit that synthesizes the voice of the accent corresponding to a word in the learning data using prestored digital voice data for each phonetic symbol, and that includes a plurality of oscillators for generating clock pulses matching each accent in order to modulate the frequency of the output voice according to the accent of the phonetic symbols included in the learning data; a control unit that, when the user requests voice learning, reads the corresponding learning data from the learning database and controls the caption processing unit and the voice generation unit so that the learning data matching the requested settings and the corresponding voice are output; a video switch that receives the caption signal of the caption processing unit or the learning data, together with an OSD signal for controlling the video device on the screen, and selectively outputs the caption signal, the learning data, and the OSD signal according to a switching signal from the control unit; a video processing unit that receives the output of the video switch and a video signal and processes them so that they are displayed on the screen; an audio switch that receives the output of the voice generation unit and an audio signal, and selectively outputs the output from the video switch and the video signal according to a switching signal from the control unit; and an audio signal processing unit that processes the output of the audio switch so that it can be output through a speaker.
[0016]
Hereinafter, a video apparatus having a learning function and a control method thereof according to first and second embodiments of the present invention will be described in detail with reference to the accompanying drawings.
(First embodiment)
As shown in FIG. 1, the video device having a learning function according to the present invention has the same configuration as the prior art except for a voice generation unit 19, which synthesizes and outputs voice corresponding to the learning data stored in the learning database 11, for example English words, and an audio switch 28, which selects and outputs either the voice output from the voice generation unit 19 or the sound included in the broadcast signal according to a control signal from the microcomputer 4. The same components are therefore denoted by the same reference numerals as in the prior art, and redundant description is omitted.
[0017]
Here, the voice generation unit 19 includes: a phonetic symbol voice storage ROM 20 that stores voice data for each phonetic symbol of the English words; first, second, and third oscillators 21, 22, and 23 that generate accent-specific clock pulses for modulating the frequency of the output voice to match the accent indicated in the phonetic symbols of the learning data; a frequency switch 24 for selecting and outputting one of the clock pulses of the first to third oscillators 21 to 23 according to a control signal from the caption control unit 14; a voice synthesizer 25 that synthesizes the voice data selected and output from the phonetic symbol voice storage ROM 20 by the caption control unit 14, giving it the accent corresponding to the clock pulse selected and output by the frequency switch 24; a D/A converter 26 for converting the output of the voice synthesizer 25 to analog; and a buffer 27 for buffering the output of the D/A converter 26 and outputting it to the audio switch 28.
[0018]
The phonetic symbol voice storage ROM 20 stores the voice data for each phonetic symbol shown in FIG. 2, together with index data that facilitates searching for the corresponding voice data, in the format shown in FIG. 3.
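To make the idea of index data stored alongside per-symbol voice data concrete, the following sketch packs phoneme samples into a single ROM image and looks them up through a leading index. The actual format of FIG. 3 is not reproduced in the text, so the layout here (a 16-bit entry count, then entries of symbol ID, offset, and length) and the function names are purely hypothetical.

```python
import struct

# Hypothetical index entry: symbol_id (uint16), offset (uint32), length (uint16).
INDEX_ENTRY = struct.Struct("<HIH")


def build_rom(phonemes):
    """Pack {symbol_id: sample_bytes} into one ROM image with a leading index."""
    header_size = 2 + INDEX_ENTRY.size * len(phonemes)
    index, data, offset = b"", b"", header_size
    for symbol_id, samples in sorted(phonemes.items()):
        index += INDEX_ENTRY.pack(symbol_id, offset, len(samples))
        data += samples
        offset += len(samples)
    return struct.pack("<H", len(phonemes)) + index + data


def read_phoneme(rom, symbol_id):
    """Scan the index for symbol_id and return its sample bytes."""
    (count,) = struct.unpack_from("<H", rom, 0)
    for i in range(count):
        sid, offset, length = INDEX_ENTRY.unpack_from(rom, 2 + i * INDEX_ENTRY.size)
        if sid == symbol_id:
            return rom[offset:offset + length]
    raise KeyError(symbol_id)
```

With such an index, the controller only needs the symbol ID of each phoneme to locate its samples, without scanning the sample data itself.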
[0019]
A method for executing the learning function in the video device according to the first embodiment, configured as described above, is described below with reference to FIG. 4.
When the user has set the learning mode but not the voice learning mode (S22), the learning data corresponding to the user's settings, that is, English words with their meanings, phonetic symbols, and so on, is transmitted from the learning database 11 to the caption control unit 14 via the interface unit 10, caption-processed under the control of the caption control unit 14, and displayed on the screen of the CPT 8 via the video switch 6 (S23).
[0020]
On the other hand, when the user sets the voice learning mode (S22), the phonetic symbols of each English word included in the learning data are read from the caption control unit 14 according to a control signal from the microcomputer 4, and the digital voice data for each phoneme of those phonetic symbols is read from the phonetic symbol voice storage ROM 20 and supplied to the voice synthesizer 25 (S24).
Then, under the control of the microcomputer 4, the caption control unit 14 controls the frequency switch 24 so that the voice is output through the voice synthesizer 25 in synchronization with the video data of the English word, with the accent indicated by the word's phonetic symbols (S25).
[0021]
That is, in FIG. 1, the caption control unit 14 applies a control signal to the frequency switch 24 so that a clock pulse matching the corresponding accent is applied to the voice synthesizer 25. The voice synthesizer 25 then synthesizes and outputs the voice data according to the clock pulses supplied from the first to third oscillators 21, 22, and 23, so that the sequentially input voice data for each phoneme of the phonetic symbols can express the first and second accents.
As another way of expressing accent in the output English word voice data, the first and second accents can also be expressed by adjusting the level of the output voice data to vary the volume level of each phoneme.
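The two accent mechanisms described here, selecting a per-accent clock and scaling the per-phoneme level, can be sketched as below. The clock rates, gain values, accent numbering (0 for unaccented, 1 for the first accent, 2 for the second accent), and function names are all assumptions made for illustration; the source gives no numeric values.

```python
# Hypothetical clock rates in Hz for the three oscillators.
OSCILLATOR_HZ = {0: 8000, 1: 8800, 2: 8400}


def select_clock(accent_level):
    """Frequency switch: pick one oscillator output for the current phoneme."""
    return OSCILLATOR_HZ[accent_level]


def synthesize(phonemes):
    """Tag each stored sample with the clock it is read out at.

    phonemes: list of (samples, accent_level) pairs.
    Reading samples out at a faster clock raises the pitch of that phoneme.
    """
    stream = []
    for samples, accent in phonemes:
        clock = select_clock(accent)
        stream.extend((s, clock) for s in samples)
    return stream


def duration_seconds(stream):
    """Playback time when each sample is held for one period of its clock."""
    return sum(1.0 / clock for _, clock in stream)


def apply_volume_accent(samples, accent_level, gains=(1.0, 1.5, 1.25)):
    """Alternative from the text: express accent by scaling each phoneme's level."""
    return [s * gains[accent_level] for s in samples]
```

For example, `synthesize([([1, 2], 0), ([3], 1)])` reads the first phoneme at the unaccented clock and the second at the first-accent clock, so the accented phoneme plays back slightly faster and higher.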
[0022]
Next, the digital voice data output from the voice synthesizer 25 is converted into analog voice data via the D / A converter 26 and input to the audio switch 28 via the buffer 27. The audio switch 28 outputs the audio data output from the buffer 27 in accordance with a control signal from the microcomputer 4.
[0023]
Next, the tone adjusting unit 16 adjusts the tone quality of the voice data output from the audio switch 28, the amplifier 17 amplifies it, and it is output through the speaker 18 in synchronization with the English word displayed on the screen. The user can therefore view the English word and hear its pronunciation at the same time. Although this embodiment applies the voice learning function to a television apparatus, the invention can easily be applied to various video devices, including set-top boxes, by linking the basic caption function with the learning function.
[0024]
(Second Embodiment)
As shown in FIG. 5, a television receiver as a video device having a learning function according to the second embodiment of the present invention includes: a word database 120 for storing predetermined word data; a caption processing unit 110 for caption-processing the caption data included in an external video signal or the word data stored in the word database 120; an interface unit 130 for data exchange between the caption processing unit 110 and the word database 120; a voice generation unit 140 for converting the word data into a pronounceable character string, analyzing it, and synthesizing prosody-processed sound; a microcomputer 70 that, when the user requests the voice learning function, reads the corresponding word data from the word database 120, has it caption-processed by the caption processing unit 110, and controls the voice generation unit 140 so that the voice corresponding to the caption-processed word data is output; an EEPROM 60 for storing the operation program of the microcomputer 70 and the like; a video processing unit 30 that processes the video signal output from the caption processing unit 110 and the video signal output from the microcomputer 70, or an external video signal, so that they can be displayed on the screen; a CPT 40 that displays the signal processed by the video processing unit 30; an audio switch 100 that selectively outputs the output of the voice generation unit 140 or an external audio signal under the control of the microcomputer 70; and an audio processing unit 80 that processes the output of the audio switch 100 so that it can be output by the speaker 90.
[0025]
Here, the caption processing unit 110 includes: a data slicer 111 for extracting caption information from an external video signal; a caption decoder 112 for decoding the caption information; a memory unit 114 for storing the fonts used to render words, a font processing program, and, from the caption information currently displayed on the screen, the caption information that the user has commanded to be stored; and a caption control unit 113 that, under the control of the microcomputer 70, stores predetermined caption information in the memory unit 114 or reads it from the memory unit 114 and transmits it to the voice generation unit 140, and that controls the operation of the data slicer 111 and the caption decoder 112.
[0026]
The voice generation unit 140 includes: a voice database 141 for storing feature parameters for each pronunciation; a speech processor 142 that converts the word data into a pronounceable character string, analyzes it, performs prosodic processing, and reads the corresponding feature parameters from the voice database 141 to generate a digital audio signal; a D/A converter 143 for converting the digital audio signal generated by the speech processor 142 into an analog audio signal; and a buffer 144 for buffering the output of the D/A converter 143.
[0027]
As shown in FIG. 6, the speech processor 142 includes: a character string standardization unit 142-1 for standardizing numbers, letters, abbreviations, special symbols, and the like in the word data into a pronounceable character string; a character string analysis unit 142-2 for analyzing the standardized character string to detect the boundaries of phrases and clauses and for setting the pronunciation of words that have multiple pronunciations; a phonetic symbol processing unit 142-3 for converting the phonetic symbols to match the changed pronunciation when the pronunciation of the character string output from the character string analysis unit 142-2 changes because of the linking between syllables; a prosody processing unit 142-4 for giving the character string output from the phonetic symbol processing unit 142-3 a prosody of length, strength, and intonation; and a sound generation unit 142-5 for reading from the voice database 141 the feature parameters corresponding to the pronunciation of the character string output from the prosody processing unit 142-4 and synthesizing the corresponding digital audio signal.
[0028]
The video processing unit 30 includes: a tuner 31 for selecting a broadcast signal received via an antenna; a Y/C separation unit 32 that separates the broadcast signal in the CVBS (Composite Video Blanking Signal), that is, the composite video signal, selected by the tuner 31 into Y/U/V signals; a synchronization signal separation unit 33 that separates the synchronization signal from the broadcast signal selected by the tuner 31; and a video/deflection processing unit 34 that performs video processing and deflection processing. The audio processing unit 80 includes a tone adjustment unit 81 and an amplifier 82.
[0029]
Hereinafter, a method of executing the learning function of the video device according to the second embodiment of the present invention, configured as described above, will be described with reference to the flowcharts of FIGS. 7 and 8.
First, in order to perform a learning function, the user operates a key of the television device or a remote controller to input a learning function “ON” command. Then, as shown in FIG. 7, the microcomputer 70 determines whether or not the learning function is “ON” (S41).
[0030]
Next, if the determination result (S41) indicates that the learning function is "ON", the learning setting value set by the user, that is, the learning level / start position / display position, etc. is grasped (S42). Then, the word data of the address corresponding to the learning set value is read from the word database 120 (S43).
[0031]
Next, the microcomputer 70 controls the caption processing unit 110 so that the word data is caption-processed to match the corresponding font, video-processed by the video processing unit 30, and displayed on the CPT 40 (S44). At the same time, the microcomputer 70 arranges for the voice generation unit 140 to receive the word data via the caption control unit 113. The voice generation unit 140 then converts the word data into a pronounceable character string, analyzes it, and performs prosodic processing (S45).
Then, the voice generation unit 140 synthesizes the character string subjected to prosodic processing using the corresponding pronunciation-specific feature parameters stored in the voice database 141 and outputs the synthesized voice to the speaker (S46).
[0032]
The detailed operation of the voice generation unit 140 is as follows.
First, the character string standardization unit 142-1 standardizes the numbers, letters, abbreviations, and special symbols in the word data into a character string that can be pronounced.
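As a rough illustration of this standardization step, the sketch below rewrites digits, abbreviations, and symbols as pronounceable words. The expansion tables, the digit-by-digit reading of numbers, and the function name are assumptions for this example only; the patent does not specify how unit 142-1 is implemented.

```python
import re

# Hypothetical expansion tables; a real device would hold far larger ones.
ABBREVIATIONS = {"Dr.": "doctor", "TV": "tee vee"}
SYMBOLS = {"&": "and", "%": "percent"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def normalize(text):
    """Rewrite digits, abbreviations, and symbols as pronounceable words."""
    words = []
    # Tokens: a word with an optional trailing period, a digit run, or one symbol.
    for token in re.findall(r"[A-Za-z]+\.?|\d+|\S", text):
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif token.isdecimal():
            words.extend(DIGITS[int(d)] for d in token)  # read digit by digit
        elif token in SYMBOLS:
            words.append(SYMBOLS[token])
        else:
            words.append(token.lower())
    return " ".join(words)
```

The output of such a step is plain text that the later analysis and prosody stages can treat uniformly.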
[0033]
Next, the character string analysis unit 142-2 analyzes the stylized character string to detect boundary points between phrases and clauses, and selects a proper pronunciation of a word having two or more pronunciations.
[0034]
Then, when the pronunciation of the character string output from the character string analysis unit 142-2 changes because of the connection between syllables, the phonetic symbol processing unit 142-3 corrects the phonetic symbols so that they match the changed pronunciation.
[0035]
Next, the prosody processing unit 142-4 assigns prosody, such as length, strength, and intonation, to the character string output from the phonetic symbol processing unit 142-3.
[0036]
Then, the sound generation unit 142-5 generates digital voice data by synthesizing the prosody-processed character string with the feature parameters corresponding to its pronunciation, that is, frequency, bandwidth, and energy information.
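To make the role of the frequency, bandwidth, and energy parameters concrete, the sketch below generates samples with a two-pole resonator driven by an impulse train. This is a generic formant-style technique chosen for illustration, not necessarily the method used by the sound generation unit 142-5; the function names, the pitch, and the sample rate are all assumptions.

```python
import math


def resonator(excitation, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator: peak near freq_hz, width set by bandwidth_hz."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in excitation:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(formants, energy, pitch_hz=100.0, duration_s=0.1,
               sample_rate=8000):
    """Mix one resonator per (frequency, bandwidth) pair, scaled to energy.

    formants: list of (freq_hz, bandwidth_hz) pairs (assumed parameter form).
    energy:   target peak amplitude of the returned samples.
    """
    n = round(duration_s * sample_rate)
    period = int(sample_rate / pitch_hz)
    # Impulse train as a crude stand-in for a voiced excitation source.
    excitation = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    mix = [0.0] * n
    for freq_hz, bandwidth_hz in formants:
        filtered = resonator(excitation, freq_hz, bandwidth_hz, sample_rate)
        for i, sample in enumerate(filtered):
            mix[i] += sample
    peak = max(abs(s) for s in mix) or 1.0
    return [energy * s / peak for s in mix]
```

Calling `synthesize([(700.0, 100.0)], energy=0.5)` returns 800 floating-point samples whose spectrum is concentrated around 700 Hz. In a real device such samples would be quantized before being passed to a D/A converter.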
[0037]
Next, the digital voice data is converted into an analog audio signal by the D/A converter 143 and output to the speaker 90 via the buffer 144.
[0038]
When the voice output operation is completed in this way, it is determined whether the display time of the word displayed on the screen has reached the set period (S47). If it has, the start number of the word is incremented (S48), and it is determined whether the displayed word number is the end setting number (S49).
[0039]
Next, if the determination (S49) shows that the word number is the end setting number, the word number is reset to the initial start number (S50), and the process returns to step S43.
[0040]
On the other hand, if the determination (S41) shows that the learning function is not “ON”, it is determined whether a learning function “OFF” command has been input (S51). If so, the learning setting values and the current progress state are stored (S52), and the word and voice output is stopped (S53). If the learning function “OFF” command has not been input, the command that was input is processed (S54).
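The word-cycling part of FIG. 7 (steps S43 to S50) can be sketched as follows. This is a minimal model under stated assumptions: the `LearningSettings` fields, the dictionary-based word database, and the literal reading in which the number wraps as soon as it equals the end setting number are all illustrative choices, not details confirmed by the description.

```python
from dataclasses import dataclass


@dataclass
class LearningSettings:
    """Hypothetical container for the user's learning setting values (S42)."""
    level: str
    start_number: int  # index of the first word to show
    end_number: int    # number at which the cycle wraps (checked in S49)


def learning_cycle(word_db, settings, periods):
    """Yield the word shown during each display period (S43 to S50)."""
    words = word_db[settings.level]
    number = settings.start_number
    for _ in range(periods):
        yield words[number]                 # S43-S46: read, display, speak
        number += 1                         # S48: advance the word number
        if number == settings.end_number:   # S49: end setting number reached?
            number = settings.start_number  # S50: back to the start number


# Usage with a made-up word database.
word_db = {"beginner": ["apple", "book", "cat", "dog"]}
settings = LearningSettings(level="beginner", start_number=0, end_number=3)
shown = list(learning_cycle(word_db, settings, periods=5))
```

Here `shown` is `["apple", "book", "cat", "apple", "book"]`. Saving `settings` and the current `number` when the “OFF” command arrives (S52) would allow the cycle to resume later from the same position.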
[0041]
In addition, the present invention can store displayed words when necessary and reproduce them together with the corresponding sound at a desired time. This will be described below with reference to FIG.
Whenever desired, the user can input a word storage command or a word reproduction command by operating a key on the video equipment, for example a television receiver, or a key on the remote controller.
As shown in FIG. 8, the microcomputer 70 determines whether a word storage command has been input (S61). If so, the microcomputer 70 stores the corresponding word in the memory unit 114 of the caption processing unit 110 (S62).
[0042]
On the other hand, if the determination result (S61) indicates that no word storage command is input, it is determined whether a word reproduction command is input (S63). When the word reproduction command is input, the caption control unit 113 is controlled to read the word stored in the memory unit 114 (S64).
[0043]
Next, the caption processing unit 110 is controlled so that the word data is caption-processed to match the corresponding font, and the video processing unit 30 performs video processing so that the word data is displayed on the screen of the CPT 40 (S65).
At the same time, the word data is transmitted to the voice generation unit 140 via the caption control unit 113. The voice generation unit 140 converts the word data into a pronounceable character string and analyzes it, as described with reference to FIG. Prosodic processing is then performed (S66). The voice generation unit 140 then synthesizes the prosody-processed character string using the corresponding pronunciation-specific feature parameters stored in the voice database 141 and outputs the synthesized voice through the speaker (S67).
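The store and reproduce flow of FIG. 8 amounts to a small command dispatcher. The sketch below is one possible model: the class name, the command strings, the callback interface, and the decision to skip duplicate words are assumptions added for illustration.

```python
from typing import Callable, List


class WordRecorder:
    """Hypothetical stand-in for memory unit 114 and the S61-S67 dispatch."""

    def __init__(self, display: Callable[[str], None],
                 speak: Callable[[str], None]) -> None:
        self._memory: List[str] = []
        self._display = display
        self._speak = speak

    def handle(self, command: str, current_word: str = "") -> None:
        if command == "store":        # S61: storage command input?
            # S62: keep the word (skipping duplicates is an assumption).
            if current_word and current_word not in self._memory:
                self._memory.append(current_word)
        elif command == "replay":     # S63: reproduction command input?
            for word in self._memory:  # S64: read the stored words
                self._display(word)    # S65: caption and display
                self._speak(word)      # S66-S67: synthesize and output voice


# Usage: plain lists stand in for the screen and the speaker.
shown: List[str] = []
spoken: List[str] = []
recorder = WordRecorder(display=shown.append, speak=spoken.append)
recorder.handle("store", "apple")
recorder.handle("store", "book")
recorder.handle("replay")
```

After the replay command, both `shown` and `spoken` contain `["apple", "book"]`, mirroring the simultaneous display and voice output described above.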
[0044]
In the video equipment having the learning function according to the first and second embodiments described above, English has been used as the example language to be learned. However, the present invention is not limited to this: it can equally be applied to learning Korean or Japanese, and to other languages such as Chinese, Russian, Arabic, Spanish, French, and German.
[0045]
The first and second embodiments described above are examples in which the learning function is applied to a television apparatus. However, the present invention is not limited to this. It can easily be applied to various other video devices that basically have a caption function, including a set-top box, by combining that caption function with the learning function.
[0046]
[Effects of the invention]
The video equipment having the learning function and the control method thereof according to the present invention output a word to be learned, such as an English word, together with the corresponding voice. This increases the user's learning efficiency and stimulates the motivation to learn, and it can also improve the product's competitiveness in the market by differentiating it from other products.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a video device having a learning function according to a first embodiment of the present invention.
FIG. 2 is an explanatory diagram showing a phonetic symbol table of the video equipment according to the first embodiment of the present invention.
FIG. 3 is an explanatory diagram showing a storage format of phonetic symbol voice data and index data according to the first embodiment of the present invention.
FIG. 4 is a flowchart showing a method for controlling a video device having a learning function according to the first embodiment of the present invention.
FIG. 5 is a block diagram showing a configuration of a video device having a learning function according to the second embodiment of the present invention.
FIG. 6 is a block diagram showing a detailed configuration of the audio processor of FIG. 5.
FIG. 7 is a flowchart showing a method for controlling a video device having a learning function according to the second embodiment of the present invention.
FIG. 8 is a flowchart showing a word storage and reproduction method in the control method of the video equipment having a learning function according to the second embodiment of the present invention.
FIG. 9 is a block diagram showing a configuration of a video device having a learning function according to a conventional technique.
[Explanation of symbols]
4 Microcomputer
6 Video switch
8 CPT
9 Caption processing unit
11 Learning interface
14 Caption control unit
16 Tone adjustment unit
18 Speaker
19 Sound generation unit
20 Phonetic symbol voice storage ROM
21 First oscillator
22 Second oscillator
23 Third oscillator
24 Frequency switch
25 Speech synthesizer
28 Audio switch
40 CPT
50 Video switch
70 Microcomputer
80 Audio processing unit
81 Tone adjustment unit
90 Speaker
110 Caption processing unit
113 Caption control unit
120 Word database
140 Voice generation unit
141 Voice database
142 Audio processor

Claims

A learning database for storing learning data including phonetic symbols;
A caption processing unit that performs caption processing on caption data included in a broadcast signal and on the learning data stored in the learning database so that they are displayed on a display screen;
A sound generation unit that uses prestored digital voice data for each phonetic symbol to synthesize accented speech corresponding to a word in the learning data, and that includes a plurality of oscillators generating clock pulses matched to the respective accents in order to modulate the frequency of the output voice according to the accent of the phonetic symbol included in the learning data;
A control unit that, when the user requests voice learning, reads the corresponding learning data from the learning database and controls the caption processing unit and the sound generation unit so that the learning data corresponding to the settings requested by the user and the corresponding voice are output;
A video apparatus having a learning function.

The sound generation unit comprises:
A phonetic symbol voice storage ROM that stores voice data for each phonetic symbol,
A frequency switch for selecting and outputting one of the clock pulses generated by the oscillators in accordance with a control signal from the control unit,
A voice synthesizer that synthesizes voice data selected and output from the phonetic symbol voice storage ROM by the control unit according to the clock pulse selected and output by the frequency switch;
A D/A converter for converting the output of the speech synthesizer into analog and outputting it to an audio switch;
The video equipment having a learning function according to claim 1, comprising:

A learning database for storing learning data including phonetic symbols;
A caption processing unit that performs caption processing on caption data included in a broadcast signal and on the learning data stored in the learning database so that they are displayed on a display screen;
A sound generation unit that uses prestored digital voice data for each phonetic symbol to synthesize accented speech corresponding to a word in the learning data, and that includes a plurality of oscillators generating clock pulses matched to the respective accents in order to modulate the frequency of the output voice according to the accent of the phonetic symbol included in the learning data;
A control unit that, when the user requests voice learning, reads the corresponding learning data from the learning database and controls the caption processing unit and the sound generation unit so that the learning data corresponding to the requested settings and the corresponding voice are output;
A video switch that receives the caption signal or the learning data from the caption processing unit and an OSD signal for controlling the video equipment on the screen, and that selectively outputs the caption signal, the learning data, or the OSD signal according to a switching signal from the control unit;
A video processing unit that receives the output of the video switch and a video signal, and processes them so that they are displayed on a screen;
An audio switch that receives the output of the sound generation unit and an audio signal, and selectively outputs the output of the sound generation unit or the audio signal according to a switching signal from the control unit;
An audio signal processing unit for processing an audio signal so that an output of the audio switch can be output via a speaker;
A video apparatus having a learning function.