JP2006054517A

JP2006054517A - Information presenting apparatus, method, and program

Info

Publication number: JP2006054517A
Application number: JP2004232760A
Authority: JP
Inventors: Makoto Shibata; 誠柴田; Masanobu Osumi; 正信大隅
Original assignee: Bank of Tokyo Mitsubishi Ltd
Current assignee: MUFG Bank Ltd
Priority date: 2004-08-09
Filing date: 2004-08-09
Publication date: 2006-02-23

Abstract

PROBLEM TO BE SOLVED: To enable a user to readily and surely view video or listen to audio desired by the user among video or audio that is broadcast.

SOLUTION: A signal processing section 16 converts a received TV signal of a particular channel during broadcasting into a compressed moving picture file and sequentially records the file to a moving picture file DB 22 of an HDD 20. Further, a speech recognition section 18 sequentially receives a speech signal included in the received TV signal, converts the speech signal into character data through speech recognition, and sequentially records the character data to a character data DB 23. A character data searching section 34 searches for character data in which a searching phrase is included as a character string and, when the corresponding character data are extracted, displays on a display apparatus 30, as information on a moving picture to be reproduced, the date and time and the reception channel at the point in time when the speech signal corresponding to the character data was received. Notifying a moving picture reproduction section 28 of this information allows the display apparatus 30 and a loudspeaker 32 to reproduce moving pictures for several minutes before and after the timing at which the searching phrase was uttered as speech.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to an information presentation apparatus, method, and program, and more particularly to an information presentation apparatus having a function of recording a broadcast video signal or audio signal, an information presentation method applicable to that apparatus, and an information presentation program for causing a computer to function as that apparatus.

When reserving the recording of a desired television program on a television broadcast program recording device (VTR), the usual method is to key in the various items of information needed for the reservation, namely the channel, the date (month and day), the recording start time, and the recording end time. This method is cumbersome to operate and prone to input errors, and various recording reservation methods have therefore been proposed.

For example, Patent Document 1 discloses a technique that acquires program information on broadcast programs together with viewing history information on the user's past program viewing, learns the user's program preferences from the program information and the viewing history information, uses the learning result to search the programs scheduled for future broadcast for programs the user is likely to view, determines as recording candidates those programs that are highly likely to be viewed and do not overlap one another, and automatically reserves recording of the candidates. Programs matching the user's preferences are thus reserved automatically, without any complicated reservation operation.

In this connection, Patent Document 2 discloses a technique in which a key sound source is registered in advance via a keyboard or a microphone, the audio signal included in a television signal is compared with the key sound source, and an audio signal storage unit and a video signal storage unit are controlled based on a detection signal indicating a match between the audio signal and the key sound source. This realizes digest recording, in which only desired scenes such as program highlight scenes are automatically extracted and recorded.

Patent Document 3 discloses a technique that stores an input video signal, calculates from the video signal feature values in a form suitable for searching (for example, the average color value of each block obtained by dividing a representative frame into four, or keywords selected from text data obtained by speech analysis of the audio signal), and records the feature values in a video feature value storage device. When a calculated feature value is similar to a feature value of another video signal that has already been recorded, video-related information indicating that these feature values are related is recorded in a video-related information storage device. When a user viewing a video signal requests viewing of related video, the related video signal is displayed on a reproduction display device.

JP-A-11-345446
JP-A-09-009199
JP-A-2000-308017

However, the programs a user wishes to view are not always selected according to fixed criteria such as the user's preferences; the user may wish to view a program that does not match his or her usual preferences. For example, when a major incident suddenly occurs, a user who learns of it through fragmentary reports and becomes interested is likely to want to view a program reporting on the incident after returning home that day, in order to check the details. The techniques described in Patent Documents 1 and 2, by contrast, both reserve recording based on information set in advance (the result learned from viewing history information, or a key sound source registered in advance). Consequently, when a program the user wants to view arises unexpectedly in this way, its recording cannot be reserved.

Techniques that allow a user to set a recording reservation from outside the home have also been proposed. When a major incident occurs as described above, however, the program schedule is often changed, for example by hastily broadcasting a special program that reports the incident in detail, so that broadcasts follow a schedule different from the program guide distributed in advance. Because it is not easy for the user to learn the changed schedule, it is difficult, even with such techniques, to set a recording reservation so that a program broadcast at short notice following a schedule change is recorded correctly.

The technique described in Patent Document 3 can present the desired video when a user who is viewing a certain video wishes to view video related to it. It cannot do so, however, when the desired video is unrelated to the video being viewed, or when the user is not viewing any video. To view a certain video, the user must therefore first find and view that video or a video related to it, which makes the technique very inconvenient to use. Moreover, the video a user wishes to view differs greatly from user to user: on learning of a major incident, for example, one user may wish to view a program reporting the incident, while another user's concern is the share price movement of a company involved in the incident. The technique of Patent Document 3 determines the relation between video signals based on automatically calculated feature values, so the video retrieved and extracted as related video may not include the video the user wishes to view. If the search conditions are relaxed to avoid this (for example, by recording a larger number of keywords as feature values of each video and retrieving every video for which any one of the keywords is recorded as a feature value), a large number of videos other than the one the user wishes to view are retrieved and extracted as related video.

The above problems are not limited to viewing video broadcast by television and the like; they can arise equally when listening to audio broadcast by radio and the like.

The present invention has been made in view of the above facts, and its object is to provide an information presentation apparatus, an information presentation method, and an information presentation program that enable a user to easily and reliably view or listen to the video or audio the user desires from among broadcast video or audio.

To achieve the above object, the information presentation apparatus according to the invention of claim 1 comprises: receiving means for receiving a video signal or audio signal being broadcast; recording means for recording the video signal or audio signal received by the receiving means on a recording medium; speech recognition means for converting the audio signal included in the received video signal, or the received audio signal, into character information by speech recognition; character information recording means for recording the character information obtained by the speech recognition means on a recording medium together with information associating it with the video signal or audio signal recorded on the recording medium; search means for searching the character information recorded on the recording medium by the character information recording means for a search target phrase specified by a user; and presentation means which, when the search by the search means extracts character information containing the search target phrase, either presents the video signal or audio signal corresponding to that character information by reading it from the recording medium and reproducing it, or presents information capable of identifying the video signal or audio signal corresponding to the character information containing the search target phrase.

In the invention of claim 1, a video signal or audio signal being broadcast is received by the receiving means, and the received video signal or audio signal is recorded on a recording medium by the recording means. An analog or digital television signal, for example, can be used as the video signal being broadcast. In that case the recording means is preferably configured, as described in claim 2 for example, to convert the television signal received by the receiving means into a compressed digital video signal before recording it on the recording medium. Recording a compressed video signal reduces the amount of video data written to the recording medium and hence the recording capacity needed. A radio signal, for example, can be used as the audio signal being broadcast. In that case too, the recording means is preferably configured to convert the received radio signal into a compressed digital audio signal before recording it on the recording medium. The recording medium is preferably a randomly accessible medium such as a hard disk.

The invention of claim 1 also includes speech recognition means for converting the audio signal included in the video signal received by the receiving means, or the audio signal received by the receiving means, into character information by speech recognition. The character information obtained by the speech recognition means is recorded on a recording medium by the character information recording means, together with information associating it with the video signal or audio signal recorded on the recording medium by the recording means. Accordingly, when the receiving means receives a video signal, the received video signal (and the audio signal it includes) and the character information transcribing the speech represented by that audio signal are each recorded on the recording medium in association with each other. When the receiving means receives an audio signal, the received audio signal and the character information transcribing the speech it represents are likewise recorded in association with each other.

In the invention of claim 1, the search means searches the character information recorded on the recording medium by the character information recording means for a search target phrase specified by the user. When the search extracts character information containing the search target phrase, the presentation means either presents the video signal or audio signal corresponding to that character information by reading it from the recording medium and reproducing it, or presents information capable of identifying that video signal or audio signal. Thus, when the user specifies a particular phrase as the search target phrase, the recorded video signals or audio signals are searched for one that includes speech uttering that phrase. If such a signal exists, it is extracted, and either the signal is reproduced or information capable of identifying it is presented.
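As a non-limiting illustration of the search described above, the following Python sketch treats each piece of recognized character information as a record carrying its associating information, and checks whether the search target phrase occurs in it as a character string. The record layout (`TranscriptRecord` with `text`, `received_at`, and `channel` fields) and the function name `search_transcripts` are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TranscriptRecord:
    """Character information plus the information that ties it to a recording.

    All field names are assumptions made for this sketch.
    """
    text: str              # speech recognition output for one stretch of audio
    received_at: datetime  # when the corresponding audio signal was received
    channel: int           # reception channel


def search_transcripts(records, phrase):
    """Return every record whose text contains the phrase as a substring."""
    return [record for record in records if phrase in record.text]


# Made-up sample data, for illustration only.
records = [
    TranscriptRecord("a strong earthquake struck this morning",
                     datetime(2004, 8, 9, 19, 12, 34), 4),
    TranscriptRecord("share prices closed higher today",
                     datetime(2004, 8, 9, 21, 5, 0), 4),
]

for hit in search_transcripts(records, "earthquake"):
    # Each hit identifies the part of the recording to present.
    print(hit.received_at, hit.channel)
```

A plain substring test is the simplest reading of "included as a character string"; a practical system might also need to tolerate recognition errors, which this sketch does not attempt.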

In this way, the invention of claim 1 records the video signals and audio signals received by the receiving means on the recording medium without narrowing down what is recorded based on, for example, information representing the user's usual preferences. It then searches the character information obtained by speech recognition for the specified search target phrase, thereby finding video signals or audio signals that include speech uttering the phrase. Even when the user suddenly becomes interested in a new matter, the user need only specify a phrase related to that matter as the search target phrase. Provided a video signal or audio signal including speech uttering the phrase has been broadcast, the signal is reproduced or information capable of identifying it is presented, so that the user can view or listen to it.

For example, if a user who learns of a major incident wishes to view or listen to programs reporting the incident, specifying a phrase that identifies the incident as the search target phrase retrieves only the video signals or audio signals of programs reporting that incident. If the user's concern is instead the share price movement of a company involved in the incident, specifying the name of the company as the search target phrase reliably retrieves only the video signals or audio signals of programs reporting on that company. This avoids both failing to retrieve the video signal or audio signal the user wishes to view or listen to and, conversely, retrieving a large number of signals that include ones the user does not wish to view or listen to. According to the invention of claim 1, therefore, the user can easily and reliably view or listen to the video or audio the user desires from among broadcast video or audio.

In the invention of claim 1, as described in claim 3 for example, the information associating the character information with the video signal or audio signal may include date and time information representing the date and time at which the audio signal, before its conversion into character information, was broadcast. The presentation means may then be configured so that, when character information containing the search target phrase is extracted, it either reproduces and presents the video signal or audio signal broadcast at the date and time represented by the date and time information recorded together with the extracted character information, or presents, as the information capable of identifying the corresponding video signal or audio signal, information that includes that date and time information.
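The use of date and time information can be sketched as follows, under stated assumptions. The abstract speaks of reproducing "several minutes before and after" the moment the phrase was uttered; the three-minute margin below is an arbitrary placeholder, and the function names `playback_window` and `describe_hit` are hypothetical.

```python
from datetime import datetime, timedelta


def playback_window(uttered_at, margin=timedelta(minutes=3)):
    """Return (start, end) of the span to reproduce around the utterance.

    The default margin is an assumed value, not one given in the source.
    """
    return uttered_at - margin, uttered_at + margin


def describe_hit(uttered_at, channel):
    """Build a one-line label that lets the user identify the recording."""
    return f"{uttered_at:%Y-%m-%d %H:%M:%S} ch {channel}"


uttered_at = datetime(2004, 8, 9, 19, 12, 34)
start, end = playback_window(uttered_at)
print(describe_hit(uttered_at, 4), "->", start.time(), "to", end.time())
```

Either result corresponds to one of the two alternatives in the text: the window drives reproduction, while the label is the "information capable of identifying" the signal.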

In the invention of claim 1, when there are multiple video or audio channels the user may view or listen to, the apparatus may be configured, as described in claim 4 for example, with as many receiving means, recording means, speech recognition means, and character information recording means as there are channels of video signals or audio signals to be recorded, and with the search means searching the character information recorded on the recording medium by each character information recording means for the specified search target phrase. The video signals or audio signals to be recorded (for example, the video or audio of every channel the user may view) are then each recorded on the recording medium; the audio signal included in each recorded video signal, or each recorded audio signal, is converted into character information by speech recognition; and the resulting character information is recorded on the recording medium together with information associating it with the recorded video signal or audio signal. Because the search means searches the character information recorded by each character information recording means for the specified search target phrase, all video signals or audio signals that include speech uttering the phrase are extracted from the recorded signals of the multiple channels. According to the invention of claim 4, therefore, the user can view or listen to all video or audio, among that broadcast on the multiple channels being recorded, that includes speech uttering the search target phrase.
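The multi-channel arrangement might be sketched as below, assuming one store of recognized text per recorded channel. The dictionary layout (channel number mapped to a list of `(received_at, text)` pairs) and the name `search_all_channels` are illustrative assumptions only.

```python
from datetime import datetime


def search_all_channels(stores, phrase):
    """Search each channel's character information and merge the hits.

    stores maps a channel number to a list of (received_at, text) pairs.
    Returns (received_at, channel, text) tuples in chronological order.
    """
    hits = []
    for channel, entries in stores.items():
        for received_at, text in entries:
            if phrase in text:
                hits.append((received_at, channel, text))
    hits.sort()  # tuples compare by received_at first, then by channel
    return hits


# Made-up sample data, for illustration only.
stores = {
    4: [(datetime(2004, 8, 9, 19, 12, 34), "the earthquake struck at dawn")],
    6: [(datetime(2004, 8, 9, 18, 30, 0), "earthquake damage is being assessed"),
        (datetime(2004, 8, 9, 18, 45, 0), "and now the weather")],
}

for received_at, channel, text in search_all_channels(stores, "earthquake"):
    print(received_at, channel, text)
```

Merging the per-channel results into one chronological list is a design choice of this sketch; the text only requires that every channel's character information be searched.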

In the invention of claim 1, the user can specify the search target phrase, as described in claim 5 for example, either by entering it via input means or by uttering it. When the phrase is specified by utterance, the search means can perform the search by applying speech recognition to the user's utterance of the phrase and using the character information so obtained.

The information presentation method according to the invention of claim 6 receives a video signal or audio signal being broadcast; records the received video signal or audio signal on a recording medium; converts the audio signal included in the received video signal, or the received audio signal, into character information by speech recognition; records the character information obtained by speech recognition on a recording medium together with information associating it with the video signal or audio signal recorded on the recording medium; searches the recorded character information for a search target phrase specified by a user; and, when the search extracts character information containing the search target phrase, either presents the video signal or audio signal corresponding to that character information by reading it from the recording medium and reproducing it, or presents information for identifying the video signal or audio signal corresponding to the character information containing the search target phrase. As with the invention of claim 1, the user can therefore easily and reliably view or listen to the video or audio the user desires from among broadcast video or audio.

The information presentation program according to the invention of claim 7 causes a computer provided with a recording medium and with receiving means for receiving a video signal or audio signal being broadcast to function as: recording means for recording the video signal or audio signal received by the receiving means on the recording medium; speech recognition means for converting the audio signal included in the received video signal, or the received audio signal, into character information by speech recognition; character information recording means for recording the character information obtained by the speech recognition means on a recording medium together with information associating it with the video signal or audio signal recorded on the recording medium; search means for searching the character information recorded on the recording medium by the character information recording means for a search target phrase specified by a user; and presentation means which, when the search by the search means extracts character information containing the search target phrase, either presents the video signal or audio signal corresponding to that character information by reading it from the recording medium and reproducing it, or presents information capable of identifying the video signal or audio signal corresponding to the character information containing the search target phrase.

The program according to the invention of claim 7 causes a computer provided with the above receiving means and recording medium to function as the above recording means, speech recognition means, character information recording means, search means, and presentation means. When the computer executes the information presentation program of claim 7, it therefore functions as the information presentation apparatus of claim 1, and, as with the invention of claim 1, the user can easily and reliably view or listen to the video or audio the user desires from among broadcast video or audio.

As described above, the present invention receives a video signal or audio signal being broadcast and records it on a recording medium; converts the audio signal included in the received video signal, or the received audio signal, into character information by speech recognition; records the resulting character information on a recording medium together with information associating it with the recorded video signal or audio signal; searches the recorded character information for a search target phrase specified by a user; and, when character information containing the search target phrase is extracted, either presents the video signal or audio signal corresponding to that character information by reading it from the recording medium and reproducing it, or presents information capable of identifying that video signal or audio signal. The invention thus has the excellent effect that the user can easily and reliably view or listen to the video or audio the user desires from among broadcast video or audio.

An example of an embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 shows a moving picture recording/reproducing apparatus 10 according to this embodiment. The moving picture recording/reproducing apparatus 10 corresponds to the information presentation apparatus according to the present invention (specifically, the information presentation apparatus of claim 2) and is installed, for example, in the user's home.

The moving picture recording/reproducing apparatus 10 includes a TV tuner 14 connected to an antenna 12. When the antenna 12 receives electromagnetic waves in a TV frequency band such as VHF or UHF, a signal in that band is input from the antenna 12 to the TV tuner 14, and the TV tuner 14 demodulates, from the input signal, the analog television signal (TV signal) broadcast by the broadcasting station of a preset channel. The TV tuner 14 has a built-in timer and continues demodulating (receiving) the TV signal of the preset channel from a preset reception start time until a preset reception end time. The antenna 12 and the TV tuner 14 correspond to the receiving means according to the present invention. A signal processing unit 16 and a speech recognition unit 18 are connected to the TV tuner 14. The TV signal demodulated by the TV tuner 14 is sequentially input to the signal processing unit 16, and, of the video signal and audio signal included in the demodulated TV signal, only the audio signal is sequentially input to the speech recognition unit 18.

The signal processing unit 16 converts the analog TV signal sequentially input from the TV tuner 14 into digital moving image data, divides the converted data into segments of a fixed length (for example, one hour), and compresses (encodes) each segment according to a predetermined moving image compression format (for example, MPEG-1, MPEG-2, or MPEG-4), thereby sequentially generating compressed moving image files conforming to that format. So that video and audio can be synchronized when the moving image (video and audio) is played back, the signal processing unit 16, while generating each compressed moving image file, sequentially inserts time information (time stamps) into the compressed video data and compressed audio data in the file; each time stamp represents the elapsed time when the file is played back from its beginning. The signal processing unit 16 also sets, in the header of each compressed moving image file, reception start date/time information representing the date and time at which input of the corresponding TV signal from the TV tuner 14 began.

The broadcast date and time (reception date and time) of the TV signal from which given compressed video data and compressed audio data were produced can be determined from the time information sequentially inserted into that data together with the reception start date/time information set in the header of the compressed moving image file. These pieces of information correspond to the date/time information described in claim 3.
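By way of illustration only, the determination described above amounts to adding the elapsed-time stamp to the reception start date and time held in the file header. The following Python sketch assumes that the time stamp is expressed in seconds; the function name `broadcast_datetime` is hypothetical and does not appear in the embodiment.

```python
from datetime import datetime, timedelta


def broadcast_datetime(reception_start: datetime, elapsed_seconds: float) -> datetime:
    """Return the absolute broadcast (reception) date and time of a sample.

    reception_start: value assumed to be read from the file header.
    elapsed_seconds: time stamp inserted into the compressed data, assumed
                     here to be seconds elapsed from the start of the file.
    """
    return reception_start + timedelta(seconds=elapsed_seconds)


if __name__ == "__main__":
    header_start = datetime(2004, 8, 9, 19, 0, 0)
    # 754.5 s is 12 min 34.5 s after the start of the file.
    print(broadcast_datetime(header_start, 754.5))  # 2004-08-09 19:12:34.500000
```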

When outputting the TV signal to the signal processing unit 16, the TV tuner 14 also outputs reception channel information representing the channel of that TV signal. The signal processing unit 16 sets the reception channel information in the header of the generated compressed moving image file. Thus, based on the signals (information) input from the TV tuner 14, the signal processing unit 16 generates compressed moving image files in a format such as that shown in FIG. 2(A). The signal processing unit 16 is connected to an HDD (hard disk drive) 20, which holds a moving image file DB (database) 22 for accumulating the compressed moving image files, and sequentially stores each generated file in the moving image file DB 22. As a result, the TV signal received by the TV tuner 14 is recorded in the moving image file DB 22 as compressed moving image files without any narrowing of what is recorded based on, for example, information representing the user's usual preferences. The signal processing unit 16 thus corresponds to the recording means according to the present invention.

A voice dictionary 26 is connected to the voice recognition unit 18 and stores a large number of pattern data items that are referenced for pattern matching during voice recognition. The voice recognition unit 18 converts the audio signal sequentially input from the TV tuner 14 into digital audio data, then repeatedly takes out the audio data for one phrase or one word at a time, recognizes it by matching it against the pattern data stored in the voice dictionary 26, and generates character data (text data) representing the recognition result. This yields character data representing what is spoken in the input audio signal. The voice recognition unit 18 divides the character data into pieces, each corresponding to a fixed duration of audio data, and attaches to each piece date/time information representing the date and time at which the corresponding audio signal was input from the TV tuner 14 (approximately the date and time of reception by the TV tuner 14).

The voice recognition unit 18 also groups the character data and the attached date/time information into files, each covering a fixed period (for example, one hour), and sets in the header of each file (character data file) reception start date/time information representing the date and time at which input of the corresponding audio signal from the TV tuner 14 began. The broadcast date and time (reception date and time) of the TV signal containing the audio signal from which given character data was generated can be determined from the date/time information attached to that character data; this date/time information also corresponds to the date/time information described in claim 3.

When outputting the audio signal to the voice recognition unit 18, the TV tuner 14 also outputs reception channel information representing the channel of the corresponding TV signal, and the voice recognition unit 18 sets this reception channel information in the header of each character data file. Thus, based on the signals (information) input from the TV tuner 14, the voice recognition unit 18 generates character data files in a format such as that shown in FIG. 2(B). The voice recognition unit 18 is connected to the HDD 20, which holds a character data DB 24 for accumulating the character data files, and sequentially stores each generated character data file in the character data DB 24. The voice recognition unit 18 thus corresponds to both the voice recognition means and the character information recording means according to the present invention.
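As a non-limiting illustration, a character data file of the kind described (a header holding the reception start date and time and the reception channel, followed by pieces of character data each carrying its own date/time information) might be modeled as below. The class and function names are hypothetical, the exact layout of FIG. 2(B) is not reproduced, and the sketch assumes that recognized text arrives in time order and that a new file is started at the first piece of text received one hour or more after the current file began.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TextEntry:
    received_at: datetime  # date/time information attached to the text
    text: str              # recognized character data


@dataclass
class CharacterDataFile:
    reception_start: datetime  # header: reception start date and time
    channel: int               # header: reception channel information
    entries: list = field(default_factory=list)


def build_character_data_files(channel, recognized, unit_seconds=3600):
    """Group (received_at, text) pairs, given in time order, into files."""
    files = []
    for received_at, text in recognized:
        # Start a new file when none exists or the current one is full.
        if not files or (
            received_at - files[-1].reception_start
        ).total_seconds() >= unit_seconds:
            files.append(CharacterDataFile(received_at, channel))
        files[-1].entries.append(TextEntry(received_at, text))
    return files
```

Keeping the channel and start time in the header lets a later search skip whole files by inspecting the header alone, which is how the search process described below uses them.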

The compressed moving image files and character data files accumulated on the HDD 20 may be deleted by any method; for example, when the HDD 20 becomes full, files may be deleted in order starting from the one whose header shows the oldest reception start date and time.
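The oldest-first deletion mentioned above as one possible method can be sketched as follows. This is only an illustration under stated assumptions: each file is represented by a hypothetical `(reception_start, size_bytes, name)` tuple, and "full" is taken to mean that total size exceeds a given capacity.

```python
from datetime import datetime


def evict_oldest(files, capacity_bytes):
    """Delete files in order of oldest reception start until they fit.

    files: list of (reception_start, size_bytes, name) tuples (assumed form).
    Returns (kept_files, deleted_names).
    """
    kept = sorted(files, key=lambda f: f[0])  # oldest reception start first
    used = sum(size for _, size, _ in kept)
    deleted = []
    while kept and used > capacity_bytes:
        _, size, name = kept.pop(0)  # remove the oldest remaining file
        used -= size
        deleted.append(name)
    return kept, deleted
```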

また、ＨＤＤ２０には動画再生部２８と文字データ検索部３４が接続されている。動画再生部２８には画像（映像）を再生表示するためのディスプレイ３０と、音声を再生出力するためのスピーカ３２が接続されている。なお、ディスプレイ３０及びスピーカ３２としては、例えばテレビ受像機に内蔵されているディスプレイとスピーカを適用することができる。動画再生部２８は、再生対象の圧縮動画ファイルを指定する情報が入力されると、指定された再生対象の圧縮動画ファイルをＨＤＤ２０の動画ファイルＤＢ２２から読み出して順にデコードし、デコードによって得られた映像データが表す映像をディスプレイ３０に表示させると共に、デコードによって得られた音声データが表す音声をスピーカ３２から出力させることで、再生対象の圧縮動画ファイルが表す動画像（映像及び音声）を再生させる。 Further, the HDD 20 is connected with a moving image playback unit 28 and a character data search unit 34. A display 30 for reproducing and displaying an image (video) and a speaker 32 for reproducing and outputting audio are connected to the moving image reproducing unit 28. As the display 30 and the speaker 32, for example, a display and a speaker built in the television receiver can be applied. When the information specifying the playback target compressed video file is input, the video playback unit 28 reads out the specified playback target compressed video file from the video file DB 22 of the HDD 20 and sequentially decodes the video obtained by the decoding. The video represented by the data is displayed on the display 30 and the audio represented by the audio data obtained by decoding is output from the speaker 32, thereby reproducing the moving image (video and audio) represented by the compressed moving image file to be reproduced.

一方、文字データ検索部３４はマイクロコンピュータ等を含んで構成されており、後述する検索・再生処理を行う。また、文字データ検索部３４にはユーザが各種の情報を入力するための指定部３６が接続されている。指定部３６はキーボード等の情報入力手段を含んで構成されており、ＴＶチューナ１４の受信開始時刻及び受信終了時刻、受信チャンネル、文字データ検索部３４による検索における検索語句等の情報がユーザによって入力される。 On the other hand, the character data search unit 34 includes a microcomputer or the like, and performs search / playback processing to be described later. The character data search unit 34 is connected to a designation unit 36 for the user to input various information. The designation unit 36 is configured to include information input means such as a keyboard, and the user inputs information such as the reception start time and reception end time of the TV tuner 14, the reception channel, and a search phrase in the search by the character data search unit 34. Is done.

次に本実施形態の作用として、文字データ検索部３４で実行される検索・再生処理について、図３のフローチャートを参照しながら説明する。なお、この検索・再生処理は、例えばユーザが特定の事柄に興味を持ち、興味を持った事柄に触れているＴＶ番組の有無を確認すると共に、該当するＴＶ番組があれば視聴することを所望している等の場合に、指定部３６を介してＴＶ番組（圧縮動画ファイル）の検索がユーザから指示されることで実行される。 Next, as an operation of this embodiment, search / reproduction processing executed by the character data search unit 34 will be described with reference to the flowchart of FIG. In this search / playback process, for example, the user is interested in a specific matter, confirms the presence or absence of a TV program touching the matter of interest, and desires to view the TV program if there is a corresponding one. In such a case, a search for a TV program (compressed moving image file) is executed by an instruction from the user via the designation unit 36.

In step 100, the moving image playback unit 28 is made to display on the display 30 a message requesting input of a search phrase and a date/time range to be searched, thereby prompting the user for this input. In the next step 102, it is determined whether a search phrase has been input, and this determination is repeated until it is affirmative. A user who sees the message on the display 30 and understands that a search phrase and a date/time range are being requested enters a search phrase (for example, a phrase related to a matter of interest) via the designation unit 36. The date/time range may be omitted, for example when all compressed moving image files stored in the moving image file DB 22 are to be searched.

When the user has entered the search phrase and the date/time range, the determination in step 102 is affirmative and the process proceeds to step 104. There, a single character data file is taken out of the character data DB 24 and, based on the reception start date and time set in its header, it is determined whether the broadcast date/time range of the corresponding TV signal falls within the specified date/time range. If it does not, the next character data file is taken out, and this is repeated until a character data file corresponding to a program broadcast within the specified range is found; that file is then loaded. In step 106, all character data contained in the loaded file is compared with the search phrase, and in the next step 108 it is determined whether the file contains character data that includes the search phrase as a character string.

If the determination in step 108 is negative, the process proceeds to step 112 without further processing. If it is affirmative, the process proceeds to step 110, where the moving image playback unit 28 is made to display on the display 30, as information on a playback candidate moving image, the reception channel information set in the header of the character data file loaded in step 104 together with the date/time information (see FIG. 2(B)) attached to the character data that matched the search phrase (that is, the character data containing the search phrase as a character string); the process then proceeds to step 112. The matching character data itself (for example, a few lines before and after the search phrase) may also be shown on the display 30.

In step 112, whether the search is complete is determined by checking whether the headers of all character data files stored in the character data DB 24 have been examined in step 104. If the determination is negative, the process returns to step 104, and steps 104 to 112 are repeated until the determination in step 112 is affirmative. Consequently, when multiple pieces of character data containing the search phrase are extracted, the information on the corresponding playback candidate moving images is displayed as a list on the display 30.
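The loop of steps 104 to 112 can be sketched in Python as follows, purely as an illustration. The class names, the function name `search`, and the one-hour file length are assumptions; in particular, the sketch treats a file as being in range when its broadcast period overlaps the specified date/time range, which is one possible reading of the range check in step 104.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Entry:
    received_at: datetime  # date/time information attached to the text
    text: str


@dataclass
class CharFile:
    reception_start: datetime  # header
    channel: int               # header
    entries: list


def search(files, phrase, range_start=None, range_end=None,
           file_length=timedelta(hours=1)):
    """Return (channel, received_at, text) for every matching entry."""
    candidates = []
    for f in files:  # step 104: take out one file at a time
        file_end = f.reception_start + file_length
        if range_start is not None and file_end <= range_start:
            continue  # broadcast entirely before the requested range
        if range_end is not None and f.reception_start >= range_end:
            continue  # broadcast entirely after the requested range
        for e in f.entries:  # step 106: compare all character data
            if phrase in e.text:  # step 108: phrase contained as a string
                # step 110: record the playback candidate information
                candidates.append((f.channel, e.received_at, e.text))
    return candidates  # step 112: all file headers have been examined
```

An empty result corresponds to the case in which no playback candidate is listed on the display 30.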

When the determination in step 112 is affirmative, the moving image playback unit 28 is made to display on the display 30 a message announcing the end of the search, and the process proceeds to step 114, where it is determined whether the processing of steps 104 to 112 extracted any character data containing the search phrase as a character string. If the determination is negative, the process proceeds to step 122, where the moving image playback unit 28 is made to display on the display 30 an error message informing the user that no TV program (moving image) contains audio in which the specified search phrase is spoken, and the search/playback process then ends.

If, on the other hand, character data containing the search phrase was extracted, the determination in step 114 is affirmative and the process proceeds to step 116, where it is determined whether one of the playback candidate moving images whose information is shown on the display 30 has been selected and its playback instructed. If not, the process proceeds to step 120, where it is determined whether termination of the search/playback process has been instructed. If that determination is also negative, the process returns to step 116, and steps 116 and 120 are repeated until one of the two determinations is affirmative. If playback candidate information is on the display 30 when the end-of-search message appears, the user selects, via the designation unit 36, the candidate he or she wishes to view and instructs its playback. The determination in step 116 then becomes affirmative and the process proceeds to step 118, where the information on the selected candidate (the reception channel information and the date/time information that was attached to the character data) is passed to the moving image playback unit 28 as information on the moving image to be played, thereby instructing the moving image playback unit 28 to play it; the process then proceeds to step 120.

In response, the moving image playback unit 28 first takes a single compressed moving image file out of the moving image file DB 22 and, based on the reception start date and time set in its header, determines whether (a) the date and time represented by the date/time information notified from the character data search unit 34 (the date and time of reception of the TV signal in which the search phrase is spoken) falls within the date/time range over which the TV signal corresponding to that file was broadcast, and (b) the reception channel set in the file's header matches the notified reception channel. If the file does not satisfy these conditions, the next compressed moving image file is taken out, and this is repeated until a file satisfying the conditions is found. Once the compressed moving image file to be played has been found, the moving image playback unit 28 determines, based on the notified date/time information, the range of that file to be played back as a moving image. The playback range can be set, for example, so that only the few minutes of moving image before and after the timing represented by the notified date/time information (the timing at which the search phrase is spoken during playback of the file) are played.
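The file lookup and playback range determination described above can be illustrated by the following Python sketch. It is a minimal example under stated assumptions: each file lasts one hour, the margin of "several minutes" is taken as three minutes, the range is clipped to the bounds of the file, and the names `MovieFile`, `find_file`, and `playback_range` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

FILE_LENGTH = timedelta(hours=1)  # assumed length of one file


@dataclass
class MovieFile:
    reception_start: datetime  # header: reception start date and time
    channel: int               # header: reception channel


def find_file(files, channel, hit_time):
    """Return the file whose channel and broadcast period cover hit_time."""
    for f in files:
        in_period = f.reception_start <= hit_time < f.reception_start + FILE_LENGTH
        if f.channel == channel and in_period:
            return f
    return None


def playback_range(file_start, hit_time, margin=timedelta(minutes=3)):
    """Return (start, end) as elapsed time from the start of the file."""
    offset = hit_time - file_start  # when the phrase is spoken in the file
    start = max(timedelta(0), offset - margin)  # do not go before the file
    end = min(FILE_LENGTH, offset + margin)     # do not go past the file
    return start, end
```

Expressing the range as elapsed time from the start of the file makes it directly comparable with the time stamps inserted into the compressed video and audio data.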

続いて動画再生部２８は、再生対象の圧縮動画ファイルに含まれる圧縮映像データ及び圧縮音声データに挿入された時刻情報を順次参照し、参照した時刻情報が表す時刻が決定した再生範囲内に相当する時刻であれば、対応する圧縮映像データ又は圧縮音声データを抽出することを繰り返すことで、決定した再生範囲内に相当する圧縮映像データ及び圧縮音声データのみを抽出する。そして動画再生部２８は、抽出した圧縮映像データ及び圧縮音声データを時系列に並べて順にデコードし、デコードによって得られた映像データが表す映像をディスプレイ３０に表示させると共に、デコードによって得られた音声データが表す音声をスピーカ３２から出力させることで、決定した再生範囲内に相当する動画像（映像及び音声）を再生させる。これにより、ユーザは、放送されたＴＶ番組のうち、興味を持った事柄に触れている部分のみを動画像として視聴することができる。このように、文字データ検索部３４は本発明に係る検索手段に対応しており、文字データ検索部３４及び動画再生部２８は、本発明に係る提示手段に対応している。 Subsequently, the video playback unit 28 sequentially refers to the time information inserted in the compressed video data and the compressed audio data included in the compressed video file to be played back, and the time represented by the referenced time information corresponds to the determined playback range. If it is the time, the extraction of the corresponding compressed video data or compressed audio data is repeated to extract only the compressed video data and compressed audio data corresponding to the determined reproduction range. Then, the moving image reproduction unit 28 arranges the extracted compressed video data and compressed audio data in time series and decodes them in order, displays the video represented by the video data obtained by the decoding on the display 30, and the audio data obtained by the decoding. Is output from the speaker 32, and a moving image (video and audio) corresponding to the determined reproduction range is reproduced. Thereby, the user can view only a part touching a matter of interest in the broadcast TV program as a moving image. As described above, the character data search unit 34 corresponds to the search unit according to the present invention, and the character data search unit 34 and the moving image playback unit 28 correspond to the presentation unit according to the present invention.
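The extraction of only those pieces of compressed data whose time stamps fall within the determined playback range can be sketched as below. This is an illustration only: each piece of compressed data is represented by a hypothetical `(elapsed_seconds, kind, payload)` tuple, and actual decoding is outside the scope of the sketch.

```python
def extract_range(packets, start_seconds, end_seconds):
    """Select packets inside the playback range, ordered by time stamp.

    packets: iterable of (elapsed_seconds, kind, payload) tuples, where
             kind is "video" or "audio" (assumed representation).
    """
    selected = [p for p in packets if start_seconds <= p[0] <= end_seconds]
    # Arrange in time series before handing the data to the decoder.
    return sorted(selected, key=lambda p: p[0])
```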

また、再生候補動画の情報がディスプレイ３０に複数表示されていた場合には、上記の視聴を終えたユーザが他の再生候補動画の情報を選択して再生を指示すれば、ステップ１１６の判定が再度肯定され、選択された再生候補動画に対して上述の処理が繰り返されることで、放送されたＴＶ番組のうち興味を持った事柄に触れている別の部分を動画像として視聴することができる。 If a plurality of playback candidate video information are displayed on the display 30, if the user who has finished viewing the above selects other playback candidate video information and indicates playback, the determination in step 116 is made. By affirming again and repeating the above-described processing for the selected playback candidate video, it is possible to view another portion of the broadcast TV program that touches an interesting matter as a moving image. .

このように、本実施形態によれば、ユーザが突発的に新たな事柄に興味を持った等の場合にも、興味を持った事柄に関連する語句を検索語句として指定すれば、指定した検索語句を発声する音声が含まれるＴＶ番組が放送されていれば、このＴＶ番組をユーザが簡単かつ確実に視聴することができる。また、本実施形態では、放送されたＴＶ番組のうち、指定された検索語句が音声として発せられるタイミングを中心として前後数分間の動画像のみが再生されるので、検索語句が音声として発せられている箇所をユーザが探したりする手間も省け、視聴に要する時間を節約することができる。また、本実施形態では、音声信号を音声認識によって文字データに変換し、変換後の文字データに対して指定された検索語句の検索を行うことで、指定された検索語句が音声として発せられた動画像（ＴＶ番組）の検索を行うので、番組表等を利用する場合と比較して、ユーザが興味を持った事柄に触れている動画像（ＴＶ番組）を確実に抽出・提示することができる。 As described above, according to the present embodiment, even when the user suddenly becomes interested in a new matter, etc., if a phrase related to the matter of interest is designated as a search term, the designated search is performed. If a TV program including a voice that utters a phrase is broadcast, the user can easily and reliably view the TV program. Further, in the present embodiment, in the broadcast TV program, only the moving images for several minutes around the timing when the specified search word / phrase is uttered as voice are reproduced, so that the search word / phrase is uttered as voice. This saves the user from having to search for a location and saves the time required for viewing. Further, in the present embodiment, the specified search word is uttered as a voice by converting the voice signal into character data by voice recognition and performing a search for the specified search word with respect to the converted character data. Since a search for a moving image (TV program) is performed, it is possible to reliably extract and present a moving image (TV program) touching a matter that the user is interested in, as compared with the case of using a program guide or the like. it can.

なお、上記では指定された検索語句が音声として発せられるタイミングを中心として前後数分間の動画像のみを再生する例を説明したが、本発明はこれに限定されるものではなく、指定された検索語句が音声として発せられるＴＶ番組全体を再生するようにしてもよいし、動画像の再生範囲をユーザが任意に設定できるようにしてもよい。 In the above description, an example in which only a moving image for several minutes before and after the timing at which the specified search phrase is uttered as a voice is reproduced has been described. However, the present invention is not limited to this, and the specified search is performed. The entire TV program in which the phrase is uttered as sound may be played, or the playback range of the moving image may be arbitrarily set by the user.

また、上記では再生候補動画（指定された検索語句が音声として発せられる動画像）の情報として、受信チャンネル情報と日時情報をディスプレイ３０に表示させる例を説明したが、本発明はこれに限定されるものではなく、例えばインターネット上で公開されている電子番組表を参照する等により、番組名等の他の情報も取得・表示させるようにしてもよい。 In the above description, an example in which reception channel information and date / time information are displayed on the display 30 as information of a reproduction candidate video (a moving image in which a designated search phrase is uttered as sound) has been described. However, the present invention is not limited to this. For example, other information such as a program name may be acquired and displayed by referring to an electronic program guide published on the Internet.

The moving image recording/reproducing apparatus 10 according to the present embodiment can be configured from, for example, an HDD recorder and a personal computer (PC), but is not limited to this. For example, a PC with a built-in receiving means for receiving a video or audio signal being broadcast (for example, a TV tuner connected to an antenna) can be made to function as the moving image recording/reproducing apparatus 10 by having it execute a predetermined program. In that case, the predetermined program corresponds to the information presentation program described in claim 7.

Furthermore, although the configuration described above has one each of the TV tuner 14, the signal processing unit 16, and the voice recognition unit 18, the invention is not limited to this. As shown by way of example in FIG. 4, a plurality of moving image file/character data generation and recording units 40, each made up of a TV tuner 14, a signal processing unit 16, and a voice recognition unit 18, may be provided so that each unit 40 handles the TV signal of a different channel and the units carry out, in parallel, reception, generation and recording of compressed moving image files, voice recognition of the audio signal, and recording of character data files. In this case, the character data search unit 34 may be configured to search the character data files of all channels for the specified search phrase. Then, even when TV programs broadcast on different channels in the same time slot each touch on a matter of interest to the user, every such program can be reliably extracted and presented.
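A search across the character data of all channels, as contemplated for the configuration of FIG. 4, might look like the following sketch. The mapping from channel number to a list of `(received_at, text)` pairs is an assumed in-memory representation chosen for brevity, and the function name is hypothetical.

```python
from datetime import datetime


def search_all_channels(text_by_channel, phrase):
    """Return (received_at, channel) hits from every channel, oldest first.

    text_by_channel: dict mapping channel number to a list of
                     (received_at, text) pairs (assumed representation).
    """
    hits = []
    for channel, entries in text_by_channel.items():
        for received_at, text in entries:
            if phrase in text:
                hits.append((received_at, channel))
    # Sorting by time, then channel, lists programs from the same time
    # slot on different channels next to each other.
    return sorted(hits)
```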

また、上記では本実施形態に係る動画記録再生装置１０がユーザの自宅に設置される例を説明したが、これに限定されるものではなく、動画記録再生装置１０をインターネット等のコンピュータ・ネットワークに直接接続するか、或いはサーバを介して接続し、ユーザからコンピュータ・ネットワーク経由で検索語句が指定されると、検索結果や再生が指示された動画像のデータをコンピュータ・ネットワーク経由でユーザへ送信するサービスを提供するようにしてもよい。特に図４の構成を採用した場合、装置が大規模かつ高価となる可能性もあるが、上記のようにコンピュータ・ネットワークを利用することで、単一の動画記録再生装置１０を複数のユーザが利用可能となるので好適である。 Further, the example in which the moving picture recording / reproducing apparatus 10 according to the present embodiment is installed at the user's home has been described above. However, the present invention is not limited to this, and the moving picture recording / reproducing apparatus 10 is connected to a computer network such as the Internet. When a search term is specified from a user via a computer network, either directly or via a server, the search result or moving image data instructed to be played is transmitted to the user via the computer network. A service may be provided. In particular, when the configuration of FIG. 4 is adopted, the apparatus may be large and expensive. However, by using the computer network as described above, a single moving image recording / reproducing apparatus 10 can be used by a plurality of users. This is preferable because it can be used.

また、上記では圧縮動画ファイルの圧縮映像データ及び圧縮音声データに、圧縮画像ファイルを先頭から再生した際の経過時間を表す時刻情報（タイムスタンプ）が順次挿入される態様を説明したが、本発明はこれに限定されるものではなく、圧縮映像データ及び圧縮音声データへ圧縮される前のＴＶ信号の放送日時（受信日時）を表す日時情報を順次挿入するようにしてもよい。この態様において、上記の日時情報は請求項３に記載の日時情報に対応している。 In the above description, the mode in which time information (time stamp) indicating the elapsed time when the compressed image file is reproduced from the beginning is sequentially inserted into the compressed video data and the compressed audio data of the compressed moving image file has been described. However, the present invention is not limited to this, and date / time information indicating the broadcast date / time (reception date / time) of the TV signal before being compressed into compressed video data and compressed audio data may be sequentially inserted. In this aspect, the date and time information corresponds to the date and time information described in claim 3.

更に、上記では検索結果（指定された検索語句が音声として発声される再生候補動画の情報）を表示した後に、再生候補動画が選択されて再生が指示されると選択された動画像の再生を行う例を説明したが、これに限定されるものではなく、検索結果の表示（提示）のみを行う態様も本発明に含まれる。 Further, in the above, after displaying the search result (information of the playback candidate video in which the designated search phrase is uttered as sound), when the playback candidate video is selected and playback is instructed, the selected moving image is played back. Although the example to perform was demonstrated, it is not limited to this, The aspect which displays only a search result (presentation) is also contained in this invention.

Also, although the above describes an example of receiving an analog TV signal transmitted by wireless broadcasting, the invention is not limited to this. Needless to say, it is also applicable to receiving a TV signal of wired broadcasting, to receiving a digital TV signal, and to receiving a radio broadcast signal instead of a TV broadcast (in which case an AM/FM tuner or the like may be provided in place of the TV tuner 14).

A block diagram showing the schematic configuration of the moving picture recording/reproducing apparatus according to the present embodiment.
Conceptual diagrams showing examples of (A) the format of a compressed moving picture file and (B) the format of a character data file.
A flowchart showing the content of the search and reproduction processing, explaining the operation of the character data search section.
A block diagram showing another example of the schematic configuration of the moving picture recording/reproducing apparatus.

Explanation of symbols

10 Moving picture recording/reproducing apparatus
12 Antenna
14 TV tuner
16 Signal processing section
18 Speech recognition section
20 HDD
28 Moving picture reproduction section
30 Display
32 Speaker
34 Character data search section
38 Moving picture reproduction section
40 Moving picture file / character data generation and recording section

Claims

An information presentation apparatus comprising:
receiving means for receiving a video signal or audio signal being broadcast;
recording means for recording the video signal or audio signal received by the receiving means on a recording medium;
speech recognition means for converting an audio signal included in the received video signal, or the received audio signal, into character information by speech recognition;
character information recording means for recording, on a recording medium, the character information obtained by the speech recognition means together with information associating the character information with the video signal or audio signal recorded on the recording medium;
search means for searching the character information recorded on the recording medium by the character information recording means for a search target phrase specified by a user; and
presenting means for, when character information containing the search target phrase is extracted by the search means, either reading out from the recording medium and reproducing and presenting the video signal or audio signal corresponding to that character information, or presenting information that can identify the video signal or audio signal corresponding to the character information containing the search target phrase.

The information presentation apparatus according to claim 1, wherein the video signal being broadcast is an analog or digital television signal, and the recording means converts the television signal received by the receiving means into a compressed digital video signal and then records it on the recording medium.

The information presentation apparatus according to claim 1, wherein the information associating the character information with the video signal or audio signal includes date and time information indicating the date and time at which the audio signal, before conversion into the character information, was broadcast, and
wherein, when character information containing the search target phrase is extracted, the presenting means either reproduces and presents, based on the date and time information recorded together with the extracted character information, the video signal or audio signal broadcast at the date and time indicated by that date and time information, or presents information including the date and time information recorded together with the extracted character information as the information that can identify the corresponding video signal or audio signal.

The information presentation apparatus according to claim 1, wherein the receiving means, the recording means, the speech recognition means, and the character information recording means are each provided in the same number as the number of channels of the video signal or audio signal to be recorded, and the search means searches for the specified search target phrase in each set of character information recorded on the recording medium by the respective character information recording means.

The information presentation apparatus according to claim 1, wherein the search target phrase is specified by the user entering it through input means, or by the user uttering it.

An information presentation method comprising:
receiving a video signal or audio signal being broadcast, and recording the received video signal or audio signal on a recording medium;
converting an audio signal included in the received video signal, or the received audio signal, into character information by speech recognition, and recording the character information obtained by the speech recognition on a recording medium together with information associating the character information with the video signal or audio signal recorded on the recording medium;
searching the character information recorded on the recording medium for a search target phrase specified by a user; and
when character information containing the search target phrase is extracted by the search, either reading out from the recording medium and reproducing and presenting the video signal or audio signal corresponding to that character information, or presenting information that can identify the video signal or audio signal corresponding to the character information containing the search target phrase.

An information presentation program for causing a computer, which comprises receiving means for receiving a video signal or audio signal being broadcast and a recording medium, to function as:
recording means for recording the video signal or audio signal received by the receiving means on the recording medium;
speech recognition means for converting an audio signal included in the received video signal, or the received audio signal, into character information by speech recognition;
character information recording means for recording, on a recording medium, the character information obtained by the speech recognition means together with information associating the character information with the video signal or audio signal recorded on the recording medium;
search means for searching the character information recorded on the recording medium by the character information recording means for a search target phrase specified by a user; and
presenting means for, when character information containing the search target phrase is extracted by the search means, either reading out from the recording medium and reproducing and presenting the video signal or audio signal corresponding to that character information, or presenting information that can identify the video signal or audio signal corresponding to the character information containing the search target phrase.