JP2015061277A

JP2015061277A - Reverberation device

Info

Publication number: JP2015061277A
Application number: JP2013195609A
Authority: JP
Inventors: 郁子澤谷; Ikuko Sawatani; 敏行西口; Toshiyuki Nishiguchi
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2013-09-20
Filing date: 2013-09-20
Publication date: 2015-03-30

Abstract

PROBLEM TO BE SOLVED: To add reverberation to an acoustic signal of a multichannel audio system on the basis of a video image.

SOLUTION: A reverberation device 1, for adding reverberation to direct sound on viewer front channels, disposed inside and in the periphery of a display unit 30 for displaying images, among the acoustic signals of the multichannel audio system, includes: a scene selection unit 11 for selecting a scene image to be a criterion of reverberation addition from the video image; a reflectivity selection unit 13 for selecting the reflectivity of each region in the scene image; an impulse response selection unit 14 for selecting an impulse response to each front channel, on the basis of the configuration of the front channel and the characteristic amount based on the reflectivity of each region; and a convolution processing unit 15 for convolution of the direct sound of the front channel with the impulse response.

Description

The present invention relates to a reverberation adding device, and in particular to a reverberation adding device that adds reverberation to the direct sound of the viewer's front channels, which are arranged in and around a display unit that displays video, among the sound signals of a multichannel sound system.

Sound systems for film and television range from the conventional 2ch stereo system, through surround systems such as 5.1ch, 7.1ch, 8.2ch, and 10.2ch, to the 22.2 multichannel sound system proposed for next-generation television, so the number of channels varies widely. In general, producing multichannel sound content becomes more laborious as the number of channels increases, and production time tends to grow. Moreover, three-dimensional sound reproduction systems such as 22.2 multichannel sound are more complex to control than two-dimensional sound reproduction. An environment that allows intuitive and simple production is therefore desired.

When listening to sound in a real space, a listener perceives sound arriving from all directions with both ears. The listener perceives both the direct sound from a source and the reverberant sound that resonates in the space and reaches the ears later than the direct sound. As for reverberant sound generation, devices that generate natural reverberation using a virtual sound source distribution generator are under development (see, for example, Patent Document 1). Sound generation devices that can control the clarity and the spaciousness of sound, intended for reverberation control in concert halls, are also under development (see, for example, Patent Document 2).

Patent Document 1: Japanese Patent Laid-Open No. H07-334176
Patent Document 2: Japanese Patent No. 4894422

In conventional two-dimensional sound systems, production tasks such as moving a sound image between two adjacent speakers or adding reverberation were relatively easy. In producing three-dimensional sound content such as 22.2 multichannel sound, however, controlling the spatial impression becomes laborious as the number of channels increases. An efficient and effective method of controlling spatial impression is therefore needed, particularly for multichannel sound such as three-dimensional sound systems. In particular, for content made up of video and sound, such as television programs and films, no device exists that generates multichannel reverberation using the video as a cue, or that estimates a reverberant space from an image and generates it.

In view of the above, an object of the present invention is to provide a reverberation adding device capable of adding reverberation to the sound signals of a multichannel sound system on the basis of video.

To solve the problems described above, a reverberation adding device according to the present invention adds reverberation to the direct sound of the viewer's front channels, which are arranged in and around a display unit that displays video, among the sound signals of a multichannel sound system. The device includes: a scene selection unit that selects, from the video, a scene image serving as the reference for adding reverberation; a reflectance selection unit that selects the reflectance of each region of the scene image; an impulse response selection unit that selects an impulse response for the front channels on the basis of the configuration of the front channels and a feature value based on the reflectance of each region; and a convolution processing unit that convolves the impulse response with the direct sound of the front channels.

Preferably, the device further includes a region analysis unit that analyzes the boundary and type of each region of the scene image, and the reflectance selection unit selects the reflectance of each region on the basis of the type of that region.

Preferably, the reflectance selection unit selects the reflectance on the basis of the material corresponding to the type of each region.

Preferably, the scene selection unit selects an image input by the user as the scene image.

The reverberation adding device according to the present invention makes it possible to add reverberation to the sound signals of a multichannel sound system on the basis of video.

FIG. 1 is a diagram showing the configuration of a reverberation adding device according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the processing of the reverberation adding device.
FIG. 3 is a diagram showing an example of selecting the reflectance of each region from a scene image.

Embodiments of the present invention are described in detail below with reference to the drawings. The following description uses 22.2 channel sound, the sound system of Super Hi-Vision, as an example of a multichannel sound system, but note that the present invention is not limited to 22.2 channel sound. The multichannel sound systems to which the present invention applies include not only systems in which the channels are arranged in a plane, as in conventional two-dimensional sound systems, but any sound system in which the channels are arranged and rendered in three dimensions.

FIG. 1 is a diagram showing the configuration of a reverberation adding device according to an embodiment of the present invention. The reverberation adding device 1 includes a control unit 10 and a storage unit 20, and is connected to a display unit 30, an operation unit 40, and speakers 50. The control unit 10 includes a scene selection unit 11, a region analysis unit 12, a reflectance selection unit 13, an impulse response selection unit 14, and a convolution processing unit 15. The storage unit 20 includes a region type database 21, a reflectance database 22, an impulse response database 23, and a sound signal database 24. The reverberation adding device 1 adds reverberation to the direct sound of the viewer's front channels, which are arranged in and around the display unit 30 that displays video, among the sound signals of a multichannel sound system. For 22.2 channel sound, for example, the reverberation adding device 1 adds reverberation to the direct sound of 11 front channels in total: 3 upper-layer channels (TpFL, TpFC, TpFR), 5 middle-layer channels (FL, FLc, FC, FRc, FR), and 3 lower-layer channels (BtFL, BtFC, BtFR).
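As a small illustration of the channel grouping just described, the 11 front channels can be written down per layer. The channel labels come from the text; the dictionary layout and the names `FRONT_CHANNELS` and `front_channel_list` are assumptions made only for this sketch.

```python
# Front channels of 22.2 channel sound, grouped by layer as listed in the text.
# The data layout and all names here are illustrative assumptions.
FRONT_CHANNELS = {
    "upper": ["TpFL", "TpFC", "TpFR"],
    "middle": ["FL", "FLc", "FC", "FRc", "FR"],
    "lower": ["BtFL", "BtFC", "BtFR"],
}


def front_channel_list(layers=FRONT_CHANNELS):
    """Flatten the layers into one ordered list of channel labels."""
    return [ch for layer in ("upper", "middle", "lower") for ch in layers[layer]]
```

A later stage could iterate over `front_channel_list()` to apply one impulse response per channel.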

The processing units 11 to 15 of the control unit 10 are implemented by a suitable processor; they may share a common processor or each be implemented as a separate processor. The storage unit 20 is a suitable storage device and need not be built into the reverberation adding device 1; an external storage device accessed through a communication interface may be used instead. The display unit 30 is a suitable display. While checking the video shown on the display unit 30, the user of the reverberation adding device 1 can operate the device through the operation unit 40, which is a suitable operation interface. The speakers 50 are a speaker group that supports the multichannel sound system, and the sound signal to which the reverberation adding device 1 has added reverberation is reproduced, channel by channel, from the corresponding speaker 50.

The functional units of the reverberation adding device 1 are described below, following the flowchart. FIG. 2 is a flowchart showing the processing of the reverberation adding device 1. When sound content is produced for a program, the scene selection unit 11 selects, from the video, a scene image that serves as the reference for adding reverberation (step S101). Such a scene image is an image that represents the characteristics of the acoustic space in the video: for example, an indoor space such as a concert hall, gymnasium, or office, or an outdoor space such as a city street, a park, or a forest. When the scene selection unit 11 selects an appropriate scene image, the reverberation adding process described later reproduces the acoustic space of the video aurally and gives the viewer a stronger sense of presence. A selected scene image can continue to be used for as long as the same acoustic space persists in the video; preferably, a new scene image is selected when the acoustic space in the program video changes. Which image is selected is a design choice: it may be, for example, the first image after the acoustic space changes, or an image at any point during the period in which the same acoustic space continues.
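One of the selection policies mentioned above, taking the first image after each change of acoustic space, can be sketched as follows. The sketch assumes that each frame already carries an acoustic-space label (how such labels are obtained is not specified in the text), and the function name `select_scene_frames` is hypothetical.

```python
def select_scene_frames(space_labels):
    """Return the index of the first frame of each run of identical labels.

    `space_labels` is a per-frame list of acoustic-space labels
    (an assumption made for this sketch, e.g. "hall", "park").
    """
    selected = []
    previous = None
    for index, label in enumerate(space_labels):
        # A new scene image is chosen whenever the acoustic space changes.
        if index == 0 or label != previous:
            selected.append(index)
        previous = label
    return selected
```

Each returned index marks a frame that would be used as the scene image until the next change of acoustic space.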

The user can also step through the video on the display unit 30 frame by frame and use the operation unit 40 to manually select the scene image whose acoustic space is to be reproduced. Furthermore, the scene selection unit 11 can select as the scene image not only an image from the program video but also any image, such as a photograph, input by the user. For example, by inputting as the scene image a still image for which suitable reverberation was added to a similar acoustic space in the past, the user can easily apply a past reverberation pattern to the current program. In addition, a program producer may know from experience that adding reverberation based on an image of a different acoustic space gives a stronger sense of presence than using the acoustic space shown in the program video; the present invention supports this kind of flexible reverberation as well.

The region analysis unit 12 analyzes the boundary and type of each region of the scene image (step S102). First, the region analysis unit 12 extracts the boundaries of the regions of the scene image. Next, it determines the type (texture) of each region on the basis of the region's color, density, shape, area, coordinates, frequency characteristics after an FFT, and so on. Information on region types is stored in the region type database 21 in association with color, density, shape, area, coordinates, frequency characteristics after an FFT, and so on. For example, the region analysis unit 12 refers to the region type database 21 and classifies the regions of the scene image as "road, stone wall, hedge, tree, sky" and the like. Boundary extraction and type determination may use any known technique and are not described in detail here. If the region analysis unit 12 has not analyzed the boundary or type of a region correctly, the user can manually correct it to the desired boundary or type with the operation unit 40 while checking the image on the display unit 30.
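The text leaves the classification technique open. Purely as an illustration, the sketch below matches a region against a small database by nearest neighbor over two of the listed cues, mean color and vertical position. The database contents, the feature layout, and the names `REGION_TYPE_DB` and `classify_region` are all assumptions and do not represent the patent's method.

```python
import math

# Hypothetical region type database:
# type -> (mean R, mean G, mean B, vertical center with 0 = top and 1 = bottom)
REGION_TYPE_DB = {
    "sky": (135, 190, 235, 0.15),
    "tree": (40, 110, 45, 0.40),
    "road": (110, 110, 110, 0.90),
}


def classify_region(features, database=REGION_TYPE_DB):
    """Return the region type whose stored features are nearest to `features`."""
    def distance(stored):
        # Scale the vertical position so it is comparable with 0-255 color values.
        a = (*features[:3], features[3] * 255)
        b = (*stored[:3], stored[3] * 255)
        return math.dist(a, b)

    return min(database, key=lambda name: distance(database[name]))
```

A practical system would use richer features (shape, area, FFT-based texture) and a proper segmentation step; a misclassified region would then be corrected by the user as described above.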

The reflectance selection unit 13 selects the reflectance of each region of the scene image (step S103). Reflectance here is a numerical measure of how strongly sound reverberates at each region, and preferably takes a different value for each frequency. The reflectances are stored in the reflectance database 22, each in association with a region type. For example, the reflectance selection unit 13 selects the reflectance of each region on the basis of the region type determined by the region analysis unit 12. Reflectance can be stored not only in association with region types such as "road, stone wall, hedge, tree, sky" but also in association with the material corresponding to each region type. FIG. 3 is a diagram showing an example of selecting the reflectance of each region from a scene image. After the region analysis unit 12 divides the scene image of FIG. 3 (a) into regions of the types "road, stone wall, hedge, tree, sky", the reflectance selection unit 13 determines, as shown in FIG. 3 (b), that the materials of these regions are "concrete, stone, grass, wood, open space", respectively. Even for the same region type "road", a concrete road and a dirt road have different reflectances, so selecting the reflectance on the basis of the material corresponding to each region type allows more appropriate reverberation to be added. The reflectance database 22 can store the reflectance of each material for each frequency as follows.
{concrete, stone, grass, wood, open space} =
{...,
500 Hz: 0.6, 0.28, 0.75, 0.7, 0,
1 kHz: 0.59, 0.28, 0.70, 0.65, 0,
...}
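The lookup described above can be sketched as a two-step mapping from region type to material and from material to a per-frequency value. Only the 500 Hz and 1 kHz rows are given in the text, so only those are included. The mapping follows the FIG. 3 example, and the names `MATERIALS`, `REFLECTANCE_DB`, `TYPE_TO_MATERIAL`, and `select_reflectance` are assumptions for this sketch.

```python
# Column order of the reflectance table, as given in the text.
MATERIALS = ("concrete", "stone", "grass", "wood", "open space")

# Frequency in Hz -> reflectance per material. Only the two rows given in the
# text are included; the other bands are elided in the source.
REFLECTANCE_DB = {
    500: (0.6, 0.28, 0.75, 0.7, 0.0),
    1000: (0.59, 0.28, 0.70, 0.65, 0.0),
}

# Region type -> material, following the example of FIG. 3 (b).
TYPE_TO_MATERIAL = {
    "road": "concrete",
    "stone wall": "stone",
    "hedge": "grass",
    "tree": "wood",
    "sky": "open space",
}


def select_reflectance(region_type, frequency_hz):
    """Look up the reflectance of a region type at one frequency band."""
    material = TYPE_TO_MATERIAL[region_type]
    return REFLECTANCE_DB[frequency_hz][MATERIALS.index(material)]
```

Keeping the type-to-material mapping separate from the table means a "road" region can be reassigned from concrete to another material without touching the reflectance values.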

If the reflectance selection unit 13 has not selected the reflectance of a region correctly, or has determined the material incorrectly, the user can manually correct it to the desired reflectance or material with the operation unit 40 while checking the image on the display unit 30.

The impulse response selection unit 14 selects an impulse response for the front channels on the basis of the configuration of the front channels arranged in and around the display unit 30 and a feature value based on the reflectance of each region (step S104). An impulse response represents the reverberation characteristics of a sound signal. The impulse responses are stored in the impulse response database 23, each in association with a front channel configuration and a feature value based on the reflectance of each region. First, the impulse response selection unit 14 refers to the impulse response database 23 and checks whether an impulse response corresponding to the front channel configuration exists. As described above, for 22.2 channel sound the front channels are 11 channels in total: 3 upper-layer, 5 middle-layer, and 3 lower-layer channels. If the impulse response database 23 contains impulse responses corresponding to the front channel configuration, the impulse response selection unit 14 calculates a feature value from the reflectance of each region and obtains the impulse response closest to that feature value. The feature value based on the reflectance of each region can be designed as appropriate by a person skilled in the art and is not described in detail here. If the configuration (physical arrangement) of the speakers 50 differs from the logical arrangement of the channels of the sound signal, the impulse response selection unit 14 can correct the selected impulse response of each channel as appropriate on the basis of the difference in arrangement.
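Because the feature value is left to the designer, the following is only one possible sketch: it assumes an area-weighted mean reflectance per frequency band as the feature and a Euclidean nearest-neighbor search over database entries that match the channel configuration. The entry layout (`config`, `feature`, `name`) and both function names are hypothetical.

```python
import math


def reflectance_feature(regions):
    """Area-weighted mean reflectance per band (an assumed feature design).

    `regions` is a list of (area, [reflectance per frequency band]) pairs.
    """
    total_area = sum(area for area, _ in regions)
    bands = len(regions[0][1])
    return [
        sum(area * refl[b] for area, refl in regions) / total_area
        for b in range(bands)
    ]


def select_impulse_response(database, channel_config, feature):
    """Return the entry for `channel_config` nearest to `feature`, or None."""
    candidates = [entry for entry in database if entry["config"] == channel_config]
    if not candidates:
        return None  # no impulse response stored for this channel configuration
    return min(candidates, key=lambda entry: math.dist(entry["feature"], feature))
```

A `None` result corresponds to the case in which no matching impulse response exists in the database and a new one has to be generated.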

If no corresponding impulse response exists in the impulse response database 23, the impulse response selection unit 14 can generate a new impulse response by combining impulse responses stored in the impulse response database 23, or generate a new impulse response using IIR (Infinite Impulse Response). In this way, a reverberant space suited to the scene image can be generated.
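The text does not say how stored impulse responses are combined or which IIR structure is used. As a sketch under those assumptions, the code below blends two responses by a weighted sum and derives a response from a feedback comb filter, a simple IIR structure chosen here only for illustration. Both function names are hypothetical.

```python
def blend_impulse_responses(ir_a, ir_b, weight_a):
    """Weighted sum of two impulse responses; the shorter one is zero-padded."""
    length = max(len(ir_a), len(ir_b))
    a = list(ir_a) + [0.0] * (length - len(ir_a))
    b = list(ir_b) + [0.0] * (length - len(ir_b))
    return [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]


def comb_impulse_response(delay, gain, length):
    """First `length` samples of the impulse response of the feedback comb
    filter y[n] = x[n] + gain * y[n - delay]."""
    out = [0.0] * length
    for n in range(length):
        x = 1.0 if n == 0 else 0.0  # unit impulse input
        feedback = gain * out[n - delay] if n >= delay else 0.0
        out[n] = x + feedback
    return out
```

With `abs(gain) < 1` the comb response decays geometrically, so a truncated version can be used as a finite impulse response in the convolution stage.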

The convolution processing unit 15 convolves the impulse response selected for each channel with the direct sound of the front channels (step S105). In this embodiment the direct sound of the multichannel sound is stored in the sound signal database 24, but the convolution processing unit 15 may instead receive direct sound from another device through a sound signal input interface rather than from the built-in storage unit 20. The convolution processing unit 15 convolves the impulse response, channel by channel, with direct sound such as footsteps, crowd murmur, or birdsong, and generates the output signal for the corresponding speaker 50 (step S106). The reverberation adding device 1 of this embodiment can operate in real time; in that case, for as long as direct sound continues to be input to the sound signal input interface, the convolution processing unit 15 performs the impulse response convolution and reproduction from the corresponding speakers 50 continues.
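Step S105 amounts to a per-channel linear convolution. The sketch below uses a direct-form convolution in pure Python for clarity. The function names and the dictionary layout (channel label to sample list) are assumptions, and a real-time implementation would more likely use a block-based method such as FFT overlap-add.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution.

    The output has len(signal) + len(impulse_response) - 1 samples.
    """
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def add_reverberation(direct_sounds, impulse_responses):
    """Convolve each channel's direct sound with that channel's impulse response.

    Both arguments map a channel label (e.g. "FC") to a list of samples.
    """
    return {
        channel: convolve(samples, impulse_responses[channel])
        for channel, samples in direct_sounds.items()
    }
```

For example, `add_reverberation({"FC": x}, {"FC": h})` yields the signal sent to the FC speaker.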

As described above, according to this embodiment, the scene selection unit 11 selects from the video a scene image that serves as the reference for adding reverberation, the reflectance selection unit 13 selects the reflectance of each region of the scene image, the impulse response selection unit 14 selects an impulse response for the front channels on the basis of the front channel configuration and a feature value based on the reflectance of each region, and the convolution processing unit 15 convolves the impulse response with the direct sound of the front channels. This makes it possible to add reverberation to the sound signals of a multichannel sound system on the basis of video. In particular, reproducing the acoustic character of a space to match the video, a task that has depended on the producer's skill, can be done automatically simply by selecting a representative still image of the desired reverberant space from the frames of the video. A reverberant space with at least a certain level of presence can thus be generated regardless of the producer's skill. The effort of manually adjusting each channel, as in conventional practice, is also reduced, which simplifies the user's work in adding reverberation and can substantially shorten content production time. If the producer wants to adjust the automatically generated reverberant space to match a creative intention, it can be changed into the intended reverberant space by manual operation and manual input. The embodiment therefore both shortens the time needed to generate a reverberant space that matches the video and preserves the creative freedom needed to further enhance presence and dramatic effect.

The region analysis unit 12 analyzes the boundary and type of each region of the scene image, and the reflectance selection unit 13 selects the reflectance of each region on the basis of its type. Reflectance can therefore be set according to region types (road, sky, and so on) that the user can understand intuitively. Compared with deriving reflectance directly from, for example, the pixel values of each region or the results of FFT processing (sequences of numbers), this makes reflectance easier to set and manage, and makes it easier for the user to correct the results of automatic processing.

The reflectance selection unit 13 can also select the reflectance on the basis of the material corresponding to the type of each region. For a region such as a "road", whose material may be "concrete", "dirt", or something else, this allows a more appropriate reflectance to be selected and more accurate reverberation to be added.

The scene selection unit 11 can also select an image input by the user as the scene image. For example, by inputting as the scene image a still image for which suitable reverberation was added to a similar acoustic space in the past, the user can easily apply a past reverberation pattern to the current program. In addition, a program producer may know from experience that adding reverberation based on an image of a different acoustic space gives a stronger sense of presence than using the acoustic space shown in the program video; the present invention supports this kind of flexible reverberation as well. Generating a reverberant space that resembles the video from arbitrary image data in this way lets the sound source be heard in the reverberant space best suited to each scene. Because it also leaves room for the user to adjust toward the desired acoustic space, it achieves both more efficient sound production work and preserved creativity.

Although the present invention has been described with reference to the drawings and embodiments, note that a person skilled in the art can readily make various changes and modifications on the basis of this disclosure. Such changes and modifications are therefore included in the scope of the present invention. For example, the functions included in each functional unit, each step, and so on can be rearranged as long as no logical inconsistency results, and multiple functional units or steps can be combined into one or divided.

DESCRIPTION OF SYMBOLS
1 Reverberation adding device
10 Control unit
11 Scene selection unit
12 Region analysis unit
13 Reflectance selection unit
14 Impulse response selection unit
15 Convolution processing unit
20 Storage unit
21 Region type database
22 Reflectance database
23 Impulse response database
24 Sound signal database
30 Display unit
40 Operation unit
50 Speaker

Claims

A reverberation adding device that adds reverberation to the direct sound of the viewer's front channels, which are arranged in and around a display unit that displays video, among the sound signals of a multichannel sound system, the device comprising:
a scene selection unit that selects, from the video, a scene image serving as a reference for adding reverberation;
a reflectance selection unit that selects the reflectance of each region of the scene image;
an impulse response selection unit that selects an impulse response for the front channels on the basis of the configuration of the front channels and a feature value based on the reflectance of each region; and
a convolution processing unit that convolves the impulse response with the direct sound of the front channels.

The reverberation adding device according to claim 1, further comprising a region analysis unit that analyzes the boundary and type of each region of the scene image,
wherein the reflectance selection unit selects the reflectance of each region on the basis of the type of that region.

The reverberation adding device according to claim 2, wherein the reflectance selection unit selects the reflectance on the basis of a material corresponding to the type of each region.

The reverberation adding device according to any one of claims 1 to 3, wherein the scene selection unit selects an image input by a user as the scene image.