JP2006033562A

JP2006033562A - Device for receiving onomatopoeia

Info

Publication number: JP2006033562A
Application number: JP2004211294A
Authority: JP
Inventors: Sakuma Hatakeno; 佐久磨畠野
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2004-07-20
Filing date: 2004-07-20
Publication date: 2006-02-02

Abstract

PROBLEM TO BE SOLVED: To provide an onomatopoeia receiving device that has a simple configuration, lets a hearing-impaired person watch TV, generates caption text with a high hit rate, and multiplexes it onto the received image.

SOLUTION: The device comprises an onomatopoeia-transmitting part 16 that generates a database of onomatopoeia corresponding to the background sounds of a preceding episode of a series program broadcast a plurality of times; an onomatopoeia-receiving part 31 that receives the database for the preceding episode from the onomatopoeia-transmitting part 16, based on the episode to be broadcast; a retrieving part 34 that retrieves, from the background sound parameters of the database received by the onomatopoeia-receiving part 31, the proximity background sound parameter closest to the background sound of the episode to be broadcast, and retrieves the proximity onomatopoeia corresponding to that parameter; and an image-compositing part 35 that composites the proximity onomatopoeia retrieved by the retrieving part 34 with the image information of the series program.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to an onomatopoeia receiving device that, based on the background sound contained in the audio signal of a series program broadcast weekly or daily, obtains the onomatopoeia for that background sound from a series-program onomatopoeia database, multiplexes the obtained onomatopoeia onto the received video signal, and outputs the result.

In recent years, a growing number of broadcast programs have used speech recognition to convert what performers say into text and display it multiplexed on the picture. This is effective for displaying translations of foreign-language conversation and as a hearing aid for hearing-impaired people. In the United States, descriptive information other than dialogue is sometimes multiplexed as closed captions for the hearing impaired, but current broadcasting cannot be said to be friendly to hearing-impaired viewers.
Patent Document 1 discloses a more practical semi-automatic caption program production system that efficiently supports manual caption production by caption program producers. The system comprises a caption text transcription unit that, based on the video, audio, and time code of a TV source program, transcribes caption text for speech and inputs additional information data such as background sounds; an automatic caption program data production unit that creates caption screens and assigns timing based on the transcribed caption text; and a caption program editing and preview unit that edits and previews the created caption program data.
Patent Document 1: JP 2003-224774 A

However, converting background sounds into text is harder than converting speech into text, and even the system of Patent Document 1 does no more than support the manual work. Even in fields where caption production is done professionally, no device that generates caption text automatically has been realized. The onomatopoeia receiving device that hearing-impaired people need in order to watch TV has therefore not been available.

The present invention has been made to solve the problems described above. Its object is to provide an onomatopoeia receiving device that hearing-impaired people can use to watch TV, that has a simple configuration, and that generates caption text with a high hit rate and multiplexes it onto the received video.

The present invention provides an onomatopoeia receiving device for use in an onomatopoeia transmission/reception system. When a program is received to which no onomatopoeia created from background sounds associated by their frequency distribution characteristics has been added, the system stores and transmits, over a channel separate from the program, a database of onomatopoeia created in advance from the background sounds used in the program; searches the database for and receives the onomatopoeia closest to the background sound used in the program; and adds the onomatopoeia corresponding to the background sound and outputs it as video together with the image information of the program. The onomatopoeia receiving device comprises a search unit that searches the background sound parameters of the database for the approximate background sound parameter closest to the background sound of the program and for the approximate onomatopoeia corresponding to that approximate background sound parameter, and an image synthesis unit that combines the approximate onomatopoeia found by the search unit with the image information of the program.

According to the present invention, the onomatopoeia receiving device is used in an onomatopoeia transmission/reception system that, when a program is received to which no onomatopoeia created from background sounds associated by their frequency distribution characteristics has been added, stores and transmits, over a channel separate from the program, a database of onomatopoeia created in advance from the background sounds used in the program, searches the database for and receives the onomatopoeia closest to the background sound used in the program, and adds the onomatopoeia corresponding to the background sound and outputs it as video together with the image information of the program. Because the device has the distinctive configuration of a search unit that searches the background sound parameters of the database for the approximate background sound parameter closest to the background sound of the program and for the corresponding approximate onomatopoeia, and an image synthesis unit that combines the approximate onomatopoeia found by the search unit with the image information of the program, it is possible to provide an onomatopoeia receiving device that hearing-impaired people can use to watch TV, that has a simple configuration, and that generates caption text with a high hit rate and multiplexes it onto the received video.

The onomatopoeia receiving device according to each embodiment of the present invention is described below with reference to FIGS. 1 to 4.
FIG. 1 is a diagram showing a schematic configuration example of an onomatopoeia generation/addition system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a configuration example of an onomatopoeia information transmission device according to an embodiment of the present invention.
FIG. 3 is a diagram showing a configuration example of an onomatopoeia generation/reception system according to an embodiment of the present invention.
FIG. 4 is a diagram showing a description example of an onomatopoeia database according to an embodiment of the present invention.

The configuration of the onomatopoeia generation/reception system is described first.
The onomatopoeia generation/reception system shown in FIG. 1 consists of an onomatopoeia information transmission device 1, the Internet 2, an onomatopoeia generation/addition device 3, and a broadcast receiving device 4. The onomatopoeia information transmission device 1 comprises an onomatopoeia ID (identification) input unit 11, a threshold determination unit 12, a background sound input unit 13, a feature parameter extraction unit 14, a table creation unit 15, and an onomatopoeia transmission unit 16. The onomatopoeia generation/addition device 3 comprises an onomatopoeia receiving unit 31, a background sound input unit 32, a feature parameter extraction unit 33, an onomatopoeia search unit 34, and an image synthesis unit 35.
The onomatopoeia transmission unit 16, which is the main part of the onomatopoeia information transmission device 1 shown in FIG. 2, comprises an adjustment value holding circuit 161, an onomatopoeia ID storage circuit 162, a correlation table storage circuit 163, and a transmission control circuit 164.
The onomatopoeia receiving unit 31, which is the main part of the onomatopoeia generation/addition device 3 shown in FIG. 3, comprises a communication control circuit 311, an adjustment value holding circuit 312, a correlation table storage circuit 313, and an onomatopoeia storage circuit 314.

The operation of the onomatopoeia generation/reception system is described next.
First, the user makes a viewing reservation for a program that is to be reproduced with onomatopoeia added. The onomatopoeia generation/addition device 3 searches the Internet 2 for published onomatopoeia information for that program and obtains the onomatopoeia IDs for the background sounds used in the program, the feature parameters of the onomatopoeia, and the threshold level information for the background sounds. When the reserved program is received, the device extracts feature parameters from the received background sound, searches the obtained onomatopoeia feature parameters for a similar entry, and generates a video signal onto which the corresponding onomatopoeia is multiplexed.

The operation is described in detail below.
First, the owner of the onomatopoeia information transmission device 1 decides the category of programs to be covered by the onomatopoeia providing service offered with the device. So-called series programs are one category suited to such a service. The owner records the chosen series program, and the background sound obtained by playing back the recording is input to the background sound input unit 13. The played-back program is watched on a viewing device (not shown), and the background sound level is monitored with a level meter (not shown). When the user (viewer) judges that the background sound level is at or above a predetermined level and that displaying an onomatopoeia would be useful, the viewer operates the onomatopoeia ID input unit 11 to enter the ID of the onomatopoeia (the onomatopoeia ID). The entered onomatopoeia ID is stored in the onomatopoeia ID storage circuit 162 of the onomatopoeia transmission unit 16. At the same time, the threshold determination unit 12 takes the volume level currently input to the background sound input unit 13 as the threshold level. The threshold level is input to the adjustment value holding circuit 161 of the onomatopoeia transmission unit 16 and held (stored) there.

At the moment the threshold is determined, the adjustment value holding circuit 161 outputs a command to the feature parameter extraction unit 14 to extract the feature parameters of the background sound. The feature parameter extraction unit 14 extracts feature parameters describing the center frequency component of the signal input from the background sound input unit 13 and the change in the average sound pressure of that component. The feature parameters are input to the table creation unit 15, which organizes them into a feature parameter table. Background sounds expressed by the same onomatopoeia are grouped together as background sounds having the same feature parameters. The grouped feature parameters are input to the correlation table storage circuit 163 and temporarily stored there. In other words, the onomatopoeia information, consisting of the threshold level and the feature parameters entered in association with an onomatopoeia ID, is stored in the respective storage circuits. The onomatopoeia information (onomatopoeia IDs, threshold levels, feature parameters, and so on) is grouped by program and stored. The onomatopoeia information grouped by program is made available over the Internet to users of the onomatopoeia generation/reception system, for example for a fee.
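The table creation step above can be sketched in Python as follows. This is only an illustration: the field names, the decision to average the feature values of background sounds that share an onomatopoeia ID, and the decision to keep the lowest recorded threshold are all assumptions, since the specification does not say how grouped parameters are merged.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Observation:
    """One operator annotation (all field names are hypothetical)."""
    program: str
    onomatopoeia_id: int
    center_freq: float       # center frequency component of the background sound
    pressure_change: float   # change in average sound pressure of that component
    threshold_level: float   # volume level captured when the ID was entered


def build_tables(observations):
    """Group observations by program, then by onomatopoeia ID."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[(obs.program, obs.onomatopoeia_id)].append(obs)

    tables = defaultdict(dict)
    for (program, oid), group in grouped.items():
        tables[program][oid] = {
            # Assumption: sounds with the same onomatopoeia are merged by averaging.
            "center_freq": mean(o.center_freq for o in group),
            "pressure_change": mean(o.pressure_change for o in group),
            # Assumption: keep the lowest threshold seen for this onomatopoeia.
            "threshold_level": min(o.threshold_level for o in group),
        }
    return dict(tables)
```

The per-program dictionary returned here plays the role of the onomatopoeia information that is grouped by program before being published.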

When the operation unit (not shown) of the broadcast receiving device 4 is operated to make a viewing reservation for a program received weekly or daily, the viewing reservation information is transmitted to the communication control circuit 311 of the onomatopoeia receiving unit 31 in the onomatopoeia generation/addition device 3. The communication control circuit 311 searches for onomatopoeia information published on the Internet 2 and finds that the onomatopoeia information transmission device 1 has published onomatopoeia information for the reserved program. The communication control circuit 311 then requests the onomatopoeia information for the reserved program that is published on the Internet 2. On receiving the request, the transmission control circuit 164 of the onomatopoeia transmission unit 16 in the onomatopoeia information transmission device 1 authenticates the communication control circuit 311 of the onomatopoeia generation/addition device 3 if necessary and then transmits the requested onomatopoeia information. The transmitted onomatopoeia information passes through the communication control circuit 311; the onomatopoeia IDs are stored in the onomatopoeia storage circuit 314, the threshold levels in the adjustment value holding circuit 312, and the feature parameters in the correlation table storage circuit 313.

When the reserved viewing time arrives, the broadcast receiving device 4 receives the reserved program. The received video signal is input to the image synthesis unit 35, and the audio signal is input to the background sound input unit 32. If the audio signal is a two-channel stereo signal, the difference between the left-channel and right-channel audio signals is input to the background sound input unit 32: the sum of the left and right channels contains mostly the performers' voices, whereas the difference contains mostly background sound. If the audio signal is a five-channel signal, the channels other than the center channel contain most of the background sound, so the channel signals other than the center channel are input to the background sound input unit 32.
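A minimal Python sketch of this channel selection follows. It assumes audio is given as plain lists of samples and that the center channel of a multichannel signal is keyed as "C"; both are illustrative conventions, not details from the specification.

```python
def background_from_stereo(left, right):
    """Return the L - R difference signal, which keeps mostly background sound.

    Components that are identical in both channels (typically centered
    dialogue) cancel in the difference.
    """
    if len(left) != len(right):
        raise ValueError("left and right channels must have the same length")
    return [l - r for l, r in zip(left, right)]


def background_from_multichannel(channels, center_name="C"):
    """Return all channels except the center one (the key name is an assumption)."""
    return {name: samples for name, samples in channels.items()
            if name != center_name}
```

For example, if a voice signal is present equally in both channels and a background signal appears with opposite sign in each, the difference removes the voice and leaves twice the background signal.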

The background sound component of the audio signal input to the background sound input unit 32 is passed to the feature parameter extraction unit 33, which detects whether the level of the input background sound signal exceeds the threshold level stored in the adjustment value holding circuit 312. When the level exceeds the threshold, the unit extracts feature parameters describing the center frequency component of the input background sound, the average sound pressure level of that frequency component, and its time-variation characteristics. The extracted feature parameters are input to the onomatopoeia search unit 34, which searches the feature parameters stored in the correlation table storage circuit 313 for an entry similar to the feature parameters of the input audio signal. When a similar entry is found, the onomatopoeia ID associated with it is read from the onomatopoeia storage circuit 314 and input to the image synthesis unit 35. The onomatopoeia ID is multiplexed onto the video signal input to the image synthesis unit 35, and the multiplexed video signal is output.
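The threshold check and similarity search can be sketched as a nearest-neighbour lookup. The Euclidean distance, the `max_distance` cut-off, and the assumption that feature values are already normalized to comparable scales are illustrative choices; the specification only requires that a "similar" feature parameter be found.

```python
import math


def find_onomatopoeia_id(level, threshold, features, table, max_distance):
    """Return the ID of the closest stored entry, or None.

    level        -- current background sound level
    threshold    -- threshold level held for the program
    features     -- tuple such as (center frequency, average level, time variation)
    table        -- dict mapping onomatopoeia ID -> stored feature tuple
    max_distance -- hypothetical similarity cut-off (not in the specification)
    """
    if level <= threshold:
        return None  # background sound too quiet: no caption is generated

    best_id, best_dist = None, math.inf
    for oid, stored in table.items():
        dist = math.dist(features, stored)  # Euclidean distance (assumption)
        if dist < best_dist:
            best_id, best_dist = oid, dist

    return best_id if best_dist <= max_distance else None
```

Returning `None` when nothing is close enough corresponds to the case where no similar feature parameter is found and no onomatopoeia is multiplexed.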

In a series program, the onomatopoeia IDs stored in the onomatopoeia storage circuit 314 and the onomatopoeia information, such as the feature parameters stored in the correlation table storage circuit 313, are in many cases similar to the onomatopoeia information for the previous or earlier broadcasts, so onomatopoeia with a high hit rate are obtained.
The relationship between the feature parameters of background sounds and the onomatopoeia is explained further with reference to FIG. 4.
The onomatopoeia information shown in FIG. 4 gives the feature parameters of onomatopoeia for the case where the series program is a baseball broadcast.
The feature parameters illustrated have frequency-component center values Pfc1, Pfc2, and Pfc3 (Pfc1 > Pfc2 > Pfc3) and average sound pressure levels whose change over time either decays, repeats at a slow period, or repeats at a fast period. Entries 1, 4, and 6 all have the center frequency Pfc1 but differ in how the average sound pressure changes over time; they are assigned the onomatopoeia カァーン (kaan), カンカン (kankan), and カッカッカッカッ (kakkakkakka), respectively. Likewise, the entries with center frequency Pfc2 are assigned ゴォーン (goon), ゴンゴン (gongon), and ゴッゴッゴッゴッ (goggoggoggo).
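The part of the FIG. 4 table described in the text can be written as a small lookup keyed by center frequency and temporal pattern. This is a hypothetical reconstruction: the pattern labels are invented for illustration, the pairing of each onomatopoeia with a pattern assumes the text lists them in the same order as the patterns (decaying, slow repetition, fast repetition), and the Pfc3 entries are omitted because the text does not give them.

```python
# Hypothetical reconstruction of the FIG. 4 entries named in the text.
# Keys: (center frequency label, temporal pattern of average sound pressure).
BASEBALL_TABLE = {
    ("Pfc1", "decaying"): "カァーン",
    ("Pfc1", "slow_repeat"): "カンカン",
    ("Pfc1", "fast_repeat"): "カッカッカッカッ",
    ("Pfc2", "decaying"): "ゴォーン",
    ("Pfc2", "slow_repeat"): "ゴンゴン",
    ("Pfc2", "fast_repeat"): "ゴッゴッゴッゴッ",
}


def lookup_onomatopoeia(center_freq_label, pattern, table=BASEBALL_TABLE):
    """Return the onomatopoeia for a (frequency, pattern) pair, or None."""
    return table.get((center_freq_label, pattern))
```

Keying on both values reflects the point made in the text: sounds with the same center frequency receive different onomatopoeia when their sound pressure evolves differently over time.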

This example shows onomatopoeia defined from the center frequency component of loud background sounds in a baseball broadcast and from the change over time of their average sound pressure. Correlation tables covering many more onomatopoeia can be defined by also using, for example, the distribution of the frequency components and the time-variation characteristics of the frequency components as feature parameters.
In the same way, by creating such an onomatopoeia correlation table for the content of each series program, a common onomatopoeia can be defined and used for a background sound that appears repeatedly in that series. For example, a background sound resembling white noise with a wide frequency distribution needs a different onomatopoeia depending on the scene: ザァー (zaa) for a rainy scene, ワァー (waa) for a cheering crowd, and パチパチパチ (pachi-pachi-pachi) for applause. For a series program, onomatopoeia closely related to the picture can be obtained, such as the onomatopoeia used for the same scene in past broadcasts.
The onomatopoeia is multiplexed onto the video signal and displayed at the moment the background sound level reaches a predetermined level. The feature parameters preferably also include the level value that governs this display timing.
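The display timing rule can be illustrated with a short sketch that reports the positions at which the background sound level rises to the threshold. Treating the level as a sequence of per-frame values and triggering only on a rising crossing (so that a sustained loud sound produces one caption rather than one per frame) are assumptions made for this example.

```python
def caption_onsets(levels, threshold):
    """Return indices where the level rises from below the threshold to at or above it."""
    onsets = []
    below = True  # assume the signal starts below the threshold
    for i, level in enumerate(levels):
        if level >= threshold and below:
            onsets.append(i)
        below = level < threshold
    return onsets
```

Called with the per-frame levels of a reserved program and the threshold level received for it, the function yields the frames at which an onomatopoeia would start to be displayed.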

As described above, in the onomatopoeia generation/addition system described as an embodiment of the present invention, the onomatopoeia receiving device 3 is used in an onomatopoeia transmission/reception system that, when a program is received to which no onomatopoeia created from background sounds associated by their frequency distribution characteristics has been added, stores and transmits, over a channel separate from the program, a database of onomatopoeia created in advance from the background sounds used in the program, searches the database for and receives the onomatopoeia closest to the background sound used in the program, and adds the onomatopoeia corresponding to the background sound and outputs it as video together with the image information of the program. The device comprises the search unit 34, which searches the background sound parameters of the database for the approximate background sound parameter closest to the background sound of the program and for the corresponding approximate onomatopoeia, and the image synthesis unit 35, which combines the approximate onomatopoeia found by the search unit with the image information of the program. This makes it possible to realize an onomatopoeia generation/addition system that hearing-impaired people can use to watch TV, that has a simple configuration, and that generates caption text with a high hit rate and multiplexes it onto the received video.

The present invention can be applied to a receiving system for the hearing impaired that multiplexes onomatopoeia related to the background sound contained in the received audio signal onto the received video and displays the result.

FIG. 1 is a diagram showing a schematic configuration example of an onomatopoeia generation/reception system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a configuration example of an onomatopoeia information transmission device according to an embodiment of the present invention.
FIG. 3 is a diagram showing a configuration example of an onomatopoeia generation/reception system according to an embodiment of the present invention.
FIG. 4 is a diagram showing a description example of an onomatopoeia database according to an embodiment of the present invention.

Explanation of symbols

1 Onomatopoeia information transmission device
2 Internet
3 Onomatopoeia generation/addition device
4 Broadcast receiving device
11 Onomatopoeia ID input unit
12 Threshold determination unit
13 Background sound input unit
14 Feature parameter extraction unit
15 Table creation unit
16 Onomatopoeia transmission unit
31 Onomatopoeia receiving unit
32 Background sound input unit
33 Feature parameter extraction unit
34 Onomatopoeia search unit
35 Image synthesis unit
161 Adjustment value holding circuit
162 Onomatopoeia ID storage circuit
163 Correlation table storage circuit
164 Transmission control circuit
311 Communication control circuit
312 Adjustment value holding circuit
313 Correlation table storage circuit
314 Onomatopoeia storage circuit

Claims

An onomatopoeia receiving device for use in an onomatopoeia transmission/reception system that, when receiving a program to which no onomatopoeia created from background sounds associated by frequency distribution characteristics has been added, stores and transmits, over a channel separate from the program, a database of onomatopoeia created in advance from the background sounds used in the program, searches the database for and receives the onomatopoeia closest to the background sound used in the program, adds the onomatopoeia corresponding to the background sound, and outputs video together with the image information of the program, the onomatopoeia receiving device comprising:
a search unit that searches the background sound parameters of the database for the approximate background sound parameter closest to the background sound of the program and for the approximate onomatopoeia corresponding to the approximate background sound parameter; and
an image synthesis unit that combines the approximate onomatopoeia found by the search unit with the image information of the program.