JP2009296298A

JP2009296298A - Sound signal processing device and method

Info

Publication number: JP2009296298A
Application number: JP2008147755A
Authority: JP
Inventors: Kiyotaka Nagai (永井 清隆); Mikio Oda (小田 幹夫)
Original assignee: Panasonic Corp
Current assignee: Panasonic Corp
Priority date: 2008-06-05
Filing date: 2008-06-05
Publication date: 2009-12-17

Abstract

Problem: To suppress unnatural temporal fluctuation of sound, such as the breathing phenomenon, caused by single-band DRC or AGC applied to an audio signal.
Solution: The device includes a frequency analysis unit 190 that performs frequency analysis of an input audio signal, and a loudness smoothing time constant calculation unit 170 and a gain smoothing time constant calculation unit 180 that calculate, based on the frequency analysis result, the time constants of a loudness smoothing unit 140 and a gain smoothing unit 160, respectively. When the low-frequency component is larger than a predetermined value, and/or when the ratio of the low-frequency component to all frequency components is larger than a predetermined value, the time constants are increased, which yields an easy-to-hear audio signal with the unnatural temporal fluctuation suppressed.
Selected drawing: FIG. 1

Description

The present invention relates to an audio signal processing device and method for controlling the dynamic range of an audio signal.

Dynamic range control/compression (DRC) and automatic gain control (AGC) are known audio signal processing methods that appropriately reduce the dynamic range of an audio signal to make it easier to hear.

FIG. 6 is a block diagram showing the configuration of a conventional audio signal processing device that uses the DRC described in Non-Patent Document 1. In FIG. 6, 600 is a multiplication unit, 610 is a loudness measurement unit, 620 is a gain calculation unit, 630 is a gain smoothing unit, and 640 is a gain smoothing time constant calculation unit. Its operation is described below.

The loudness measurement unit 610 measures and outputs the loudness (a measure of the perceived magnitude of sound) of the input audio signal. It passes the input audio signal through a frequency weighting filter, calculates the mean square of the result block by block (256 samples at a sampling frequency of 48 kHz), and outputs this value as the loudness.

The gain calculation unit 620 calculates a gain from the loudness based on a loudness-versus-gain function set in advance to control the dynamic range of the input audio signal.

The gain smoothing time constant calculation unit 640 calculates the time constant used to time-smooth the gain and outputs it as the gain smoothing time constant. It first calculates a loudness fluctuation value by subtracting the smoothed loudness (the past loudness smoothed over time) from the current loudness measured by the loudness measurement unit 610. Based on this fluctuation value, it then selects and outputs one of the following four time constants. The value of the time constant increases from 1) to 4).
1) Fast attack time constant (when the loudness fluctuation value is positive and greater than the attack threshold)
2) Slow attack time constant (when the loudness fluctuation value is positive and less than or equal to the attack threshold)
3) Fast release time constant (when the loudness fluctuation value is negative and less than the release threshold)
4) Slow release time constant (when the loudness fluctuation value is negative and greater than or equal to the release threshold)
The gain smoothing unit 630 time-smooths the gain from the gain calculation unit 620 using the gain smoothing time constant from the gain smoothing time constant calculation unit 640, and calculates and outputs the smoothed gain.
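The four-way selection performed by unit 640 can be sketched as follows. This is a minimal illustration rather than the method of Non-Patent Document 1: the function name, the dictionary keys, and all numeric values are hypothetical. A fluctuation of exactly zero is treated as a release, which the text does not specify.

```python
def select_time_constant(loudness, smoothed_loudness,
                         attack_threshold, release_threshold, constants):
    """Pick one of four time constants from the loudness fluctuation value."""
    # Fluctuation = current loudness minus smoothed (past) loudness.
    delta = loudness - smoothed_loudness
    if delta > 0:
        # Rising loudness: fast attack only for a jump above the threshold.
        key = "fast_attack" if delta > attack_threshold else "slow_attack"
    else:
        # Falling loudness (zero treated as release, an assumption here).
        key = "fast_release" if delta < release_threshold else "slow_release"
    return constants[key]


# Hypothetical values in seconds, increasing from 1) to 4) as in the text.
EXAMPLE_CONSTANTS = {
    "fast_attack": 0.005,
    "slow_attack": 0.05,
    "fast_release": 0.2,
    "slow_release": 2.0,
}

if __name__ == "__main__":
    # A jump of +10 with an attack threshold of 6 selects the fast attack.
    print(select_time_constant(10.0, 0.0, 6.0, -6.0, EXAMPLE_CONSTANTS))
```

The release threshold is a negative number in this sketch, so "less than the release threshold" means a large drop in loudness.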

The multiplication unit 600 multiplies the input audio signal by the smoothed gain to calculate and output the output audio signal.

The device of FIG. 6 can appropriately reduce the dynamic range of the input audio signal and output an audio signal that is easy to hear even in an environment that is not quiet.

In addition, Patent Document 1 describes a conventional automatic gain control device that uses AGC to bring the input signal to a constant level for output.

Non-Patent Document 1 and Patent Document 1 describe single-band processing, in which DRC or AGC is applied to the entire frequency range at once. Non-Patent Document 2 describes multi-band processing, in which the audio signal is divided into a plurality of frequency bands and DRC or AGC is applied to each band. This document uses the terms "single-band" and "multi-band", which are common for audio compressors and similar equipment; Non-Patent Document 2 instead uses "single-channel" and "multi-channel", which are common for hearing aids.
Patent Document 1: Japanese Patent No. 4014429
Non-Patent Document 1: Charles Q. Robinson and one other, "Dynamic Range Control via Metadata", 107th AES Convention, September 1999, Preprint 5028
Non-Patent Document 2: Harvey Dillon, "Hearing Aid Handbook" (Japanese edition, translation supervised by Masafumi Nakagawa), Ishiyaku Publishers, Inc., October 2004, Chapter 6, pp. 155-181

However, the conventional single-band audio signal processing devices described in Non-Patent Document 1 and Patent Document 1 have the problem of unnatural temporal fluctuation of sound, such as the breathing phenomenon (a phenomenon in which the level of noise or background sound varies with the signal, so that the sound seems to be taking breaths). Using a slow release time constant to suppress the breathing phenomenon reduces the sense of temporal fluctuation, but the gain of the audio signal stays reduced for longer, so information such as the resonance of percussive sounds is lost. The audio signal processing device described in Non-Patent Document 1 mitigates the breathing problem by adaptively switching among a plurality of time constants. Nevertheless, further improvement has been desired.

In the multi-band processing described in Non-Patent Document 2, on the other hand, each band is processed independently, so the breathing phenomenon is less likely to occur. However, the spectrum is flattened and the frequency balance between bands changes.

The present invention solves these conventional problems. Its object is to provide an audio signal processing device and method that suppress unnatural temporal fluctuation of sound, such as the breathing phenomenon, caused by single-band DRC or AGC applied to an audio signal.

To solve this problem, the audio signal processing device of the present invention comprises: a multiplication unit that multiplies an input audio signal by a gain to calculate an output audio signal; a loudness measurement unit that measures the loudness of the input audio signal or the output audio signal; a loudness smoothing unit that time-smooths the measured loudness with a predetermined time constant; a gain calculation unit that calculates the gain based on the smoothed loudness; a frequency analysis unit that performs frequency analysis of the input audio signal or the output audio signal; and a loudness smoothing time constant calculation unit that calculates the time constant of the loudness smoothing unit based on the analysis result of the frequency analysis unit. Time-smoothing the loudness with a time constant calculated from the frequency analysis result suppresses unnatural temporal fluctuation of sound such as the breathing phenomenon.

The loudness smoothing time constant calculation unit is further characterized in that it increases the time constant of the loudness smoothing unit when the low-frequency component is larger than a predetermined value, and/or when the ratio of the low-frequency component to all frequency components is larger than a predetermined value.

The device is further characterized by comprising a gain smoothing unit that time-smooths the gain with a predetermined time constant to calculate a smoothed gain, the multiplication unit multiplying the input audio signal by the smoothed gain.

The device is further characterized by comprising a gain smoothing time constant calculation unit that calculates the time constant of the gain smoothing unit based on the frequency analysis result.

Another device of the invention is characterized by comprising: a multiplication unit that multiplies an input audio signal by a smoothed gain to calculate an output audio signal; a loudness measurement unit that measures the loudness of the input audio signal or the output audio signal; a gain calculation unit that calculates a gain based on the measured loudness; a gain smoothing unit that time-smooths the calculated gain with a predetermined time constant to calculate the smoothed gain; a frequency analysis unit that performs frequency analysis of the input audio signal or the output audio signal; and a gain smoothing time constant calculation unit that calculates the time constant of the gain smoothing unit based on the analysis result of the frequency analysis unit.

The gain smoothing time constant calculation unit is further characterized in that it increases the time constant of the gain smoothing unit when the low-frequency component is larger than a predetermined value, and/or when the ratio of the low-frequency component to all frequency components is larger than a predetermined value.

The low-frequency component is further characterized as being a component at or below a frequency of 30 Hz to 150 Hz.

The present invention makes use of the psychoacoustic finding that, for sounds above the threshold of hearing, a low-frequency sound undergoes a larger change in perceived loudness than a mid- or high-frequency sound for the same change in signal level. When the low-frequency component is large, and/or when the ratio of the low-frequency component to all frequency components is large, the loudness and/or the gain is time-smoothed with a relatively large time constant. This yields an easy-to-hear audio signal in which the unnatural temporal fluctuation of single-band DRC or AGC is suppressed. Because the gain control is single-band, it does not change the frequency balance.

The best mode for carrying out the present invention is described below with reference to the drawings.

(Embodiment 1)
FIG. 1 is a block diagram showing the configuration of the audio signal processing device according to Embodiment 1 of the present invention. In FIG. 1, 100 is an overlapping block division unit that divides the input audio signal into overlapping blocks; 110 is a multiplication unit that multiplies the output of the block division unit 100 by the output of the gain smoothing unit 160; 120 is an overlap-add synthesis unit that overlap-adds the output of 110 to synthesize the output audio signal; 130 is a loudness measurement unit that measures the loudness of the input audio signal; 140 is a loudness smoothing unit that time-smooths the output of the loudness measurement unit 130 using the output of the loudness smoothing time constant calculation unit 170; 150 is a gain calculation unit that calculates a gain from the output of the loudness smoothing unit 140; 160 is a gain smoothing unit that time-smooths the output of the gain calculation unit 150 using the output of the gain smoothing time constant calculation unit 180; 170 is a loudness smoothing time constant calculation unit that calculates the time constant of the loudness smoothing unit 140 from the output of the loudness smoothing unit 140 and the output of the frequency analysis unit 190; 180 is a gain smoothing time constant calculation unit that calculates the time constant of the gain smoothing unit 160 from the output of the loudness smoothing unit 140 and the output of the frequency analysis unit 190; and 190 is a frequency analysis unit that performs frequency analysis of the input audio signal and outputs the analysis result.

The audio signal processing device of FIG. 1 performs single-band, block-by-block DRC processing. Its operation is described below.

The overlapping block division unit 100 divides the input audio signal into overlapping blocks and outputs them. That is, as shown in (Equation 1), the input audio signal x(n) is multiplied by a window function w(n) with 50% overlap to calculate the divided audio signal y(n, t), which is split into overlapping blocks. Here, n is the sample time index, t is the block number, N is the block length, and N/2 is the block shift length. When the sampling frequency Fs is 48 kHz, N is set to a value from 64 to 4096, for example.

As the window function w(n), the Hanning window shown in (Equation 2) is used, for example.

The multiplication unit 110 multiplies the divided audio signal from the overlapping block division unit 100 by the smoothed gain Gs(t) from the gain smoothing unit 160 and outputs the gain-controlled divided audio signal.

The overlap-add synthesis unit 120 overlap-adds the gain-controlled divided audio signal from the multiplication unit 110 as shown in (Equation 3) to synthesize and output the output audio signal z(n).

In this way, cross-fade overlap-add is performed, and the block-by-block divided audio signals can be joined smoothly.
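The block division, per-block gain multiplication, and overlap-add performed by units 100, 110, and 120 can be sketched as follows. Equations 1 to 3 are not reproduced in this text, so the periodic Hann form of the window, the function names, and the handling of the signal edges are assumptions made for illustration.

```python
import math


def hann_window(N):
    # Periodic Hann window: two copies shifted by N/2 sum to exactly 1.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / N) for n in range(N)]


def split_blocks(x, N):
    """Divide x into 50%-overlapping windowed blocks y(n, t)."""
    hop = N // 2
    w = hann_window(N)
    return [[x[start + n] * w[n] for n in range(N)]
            for start in range(0, len(x) - N + 1, hop)]


def overlap_add(blocks, gains, N):
    """Apply one gain per block, then overlap-add to form z(n)."""
    hop = N // 2
    z = [0.0] * (hop * (len(blocks) - 1) + N)
    for t, (block, g) in enumerate(zip(blocks, gains)):
        for n in range(N):
            z[t * hop + n] += g * block[n]
    return z


if __name__ == "__main__":
    x = [float(i % 7) for i in range(64)]
    blocks = split_blocks(x, 16)
    z = overlap_add(blocks, [1.0] * len(blocks), 16)
    # With unity gains the fully overlapped region reproduces the input.
    print(max(abs(z[n] - x[n]) for n in range(8, 56)))
```

Because adjacent windows cross-fade, a gain that changes from one block to the next is blended over half a block instead of switching abruptly.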

The loudness measurement unit 130 measures and outputs the block-by-block loudness of the input audio signal. Various loudness measurement methods have been proposed; Embodiment 1 uses a method based on a weighting filter. First, the input audio signal x(n) is passed through a frequency weighting filter to produce a signal v(n). Next, as shown in (Equation 4), v(n) is multiplied by the window function w(n) to divide it into blocks, and the mean square is calculated for each block to obtain the loudness L(t) of block number t. As the frequency weighting filter, a filter having the characteristics described in ITU-R Recommendation BS.1770, which standardizes loudness measurement, is used, for example.
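A minimal sketch of the block-by-block mean-square measurement follows. It assumes the frequency weighting filter has already been applied, so its input is v(n). The periodic Hann window and the normalization by N are assumptions, since Equation 4 is not reproduced in this text, and the function name is hypothetical.

```python
import math


def block_loudness(v, N):
    """Windowed mean square of each 50%-overlapping block of v(n)."""
    hop = N // 2
    # Same window as used for block division (periodic Hann, assumed).
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / N) for n in range(N)]
    loudness = []
    for start in range(0, len(v) - N + 1, hop):
        acc = sum((w[n] * v[start + n]) ** 2 for n in range(N))
        loudness.append(acc / N)  # normalization by N is an assumption
    return loudness


if __name__ == "__main__":
    # Doubling the amplitude quadruples the mean-square value.
    quiet = block_loudness([0.1] * 32, 16)
    loud = block_loudness([0.2] * 32, 16)
    print(loud[0] / quiet[0])
```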

The loudness smoothing unit 140 first compares the loudness L(t) of block number t from the loudness measurement unit 130 with the smoothed loudness Ls(t−1) of the previous block number (t−1). If L(t) > Ls(t−1), it determines that the state is attack; otherwise it determines that the state is release. It outputs the result as attack/release information to the loudness smoothing time constant calculation unit 170 and the gain smoothing time constant calculation unit 180. Next, the loudness smoothing unit 140 time-smooths the loudness L(t) from the loudness measurement unit 130 with the loudness smoothing time constant Tl(t), which the loudness smoothing time constant calculation unit 170 calculated based on the attack/release information, and calculates and outputs the smoothed loudness Ls(t). The loudness is time-smoothed according to (Equation 5), where the value of A is calculated according to (Equation 6).
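The attack/release decision and the smoothing step can be sketched as follows. Equations 5 and 6 are not reproduced in this text, so the sketch assumes a common first-order form, Ls(t) = A·Ls(t−1) + (1−A)·L(t) with A = exp(−hop / (Fs·Tl)), which may differ from the patent's equations. It also uses two fixed time constants in place of the output of unit 170. All names are hypothetical.

```python
import math


def smoothing_coefficient(time_constant_s, hop, fs):
    # Assumed form of the coefficient A for a block shift of `hop` samples.
    return math.exp(-hop / (fs * time_constant_s))


def smooth_loudness(loudness_blocks, hop, fs, attack_tc, release_tc,
                    initial=0.0):
    """Return (Ls(t), is_attack) for each block."""
    ls_prev = initial
    result = []
    for L in loudness_blocks:
        is_attack = L > ls_prev  # attack if L(t) > Ls(t-1), else release
        tc = attack_tc if is_attack else release_tc
        A = smoothing_coefficient(tc, hop, fs)
        ls = A * ls_prev + (1.0 - A) * L
        result.append((ls, is_attack))
        ls_prev = ls
    return result


if __name__ == "__main__":
    # A loud burst followed by silence: fast rise, slow decay.
    blocks = [1.0] * 5 + [0.0] * 5
    for ls, is_attack in smooth_loudness(blocks, 128, 48000.0, 0.005, 0.5):
        print(round(ls, 4), "attack" if is_attack else "release")
```

A larger time constant gives A closer to 1, so the smoothed loudness follows the measured loudness more slowly.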

As shown in (Equation 7), the gain calculation unit 150 uses a preset loudness-versus-gain DRC function F() to calculate the block-by-block gain G(t) from the smoothed loudness Ls(t) supplied by the loudness smoothing unit 140.

FIG. 2 is a schematic diagram showing an example of the loudness-versus-gain DRC function. In FIG. 2, the horizontal axis is loudness on a logarithmic axis, and the vertical axis is gain in dB (20 log₁₀ G(t)). The DRC function of FIG. 2 is divided into the following three regions.
1) When the loudness is at or below the first threshold L1, the signal is amplified with G(t) > 1.
2) When the loudness is greater than the first threshold and at or below the second threshold L2, G(t) = 1 (that is, 0 dB) and the gain is not changed.
3) When the loudness is greater than the second threshold, the signal is attenuated with G(t) < 1.

Because the DRC function of FIG. 2 amplifies quiet sounds and attenuates loud sounds, it reduces the dynamic range and produces an audio signal that is easy to hear even in an environment that is not quiet.
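A three-region function of this kind can be sketched as follows. The thresholds, the slopes, and the straight-line shape of the outer regions are hypothetical, since FIG. 2 is not reproduced in this text. Loudness is taken in dB to match the logarithmic horizontal axis.

```python
def drc_gain_db(loudness_db, l1_db=-40.0, l2_db=-20.0,
                boost_slope=0.5, cut_slope=0.5):
    """Gain in dB for a given loudness in dB (all parameters hypothetical)."""
    if loudness_db <= l1_db:
        # Region 1: amplify quiet sounds. The gain reaches 0 dB exactly
        # at L1 so that the curve is continuous.
        return boost_slope * (l1_db - loudness_db)
    if loudness_db <= l2_db:
        # Region 2: leave the gain unchanged (0 dB, G = 1).
        return 0.0
    # Region 3: attenuate loud sounds.
    return -cut_slope * (loudness_db - l2_db)


def db_to_linear(gain_db):
    # Inverse of the vertical axis of FIG. 2, 20*log10(G).
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    for loudness in (-50.0, -30.0, -10.0):
        g_db = drc_gain_db(loudness)
        print(loudness, g_db, db_to_linear(g_db))
```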

The gain smoothing unit 160 time-smooths the gain G(t) from the gain calculation unit 150 with the gain smoothing time constant Tg(t) calculated by the gain smoothing time constant calculation unit 180, and calculates and outputs the smoothed gain Gs(t). The gain is time-smoothed according to (Equation 8), where the value of B is calculated according to (Equation 9).

The frequency analysis unit 190 performs block-by-block frequency analysis of the input audio signal and outputs the frequency analysis result. It first applies the short-time discrete Fourier transform of (Equation 10) to the input audio signal x(n) to calculate the Fourier transform coefficients X(k, t). Here, w(n) is the window function described above, and k is the Fourier transform coefficient index. The short-time discrete Fourier transform can be computed efficiently with a fast Fourier transform.

Next, according to (Equation 11), the Fourier transform coefficients are grouped by frequency band, and the signal level P(m, t) of each frequency band is calculated. Here, m is the frequency band number and M is the number of frequency bands.

In (Equation 11), k1(m) and k2(m) are the first and last Fourier transform coefficient indices of frequency band m, respectively. In Embodiment 1, M = 2: the coefficients are grouped into a low-frequency component P(0, t) at or below a frequency of 30 Hz to 150 Hz and a mid/high-frequency component P(1, t) covering the rest, and these are output as the analysis result. When calculating the signal level P(m, t) of each frequency band, the coefficients may instead be multiplied by a weighting coefficient c(k), as shown in (Equation 12), in place of (Equation 11).
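The two-band analysis can be sketched with a direct DFT as follows. Equations 10 to 12 are not reproduced in this text, so using the sum of squared magnitudes as the signal level, the 150 Hz cutoff, and the inclusion of the DC bin in the low band are all assumptions. The helper `sine` exists only for the usage example. A real implementation would use an FFT, as the text notes.

```python
import cmath
import math


def sine(freq_hz, fs, N):
    # Test-signal helper for the example below (not part of the analysis).
    return [math.sin(2.0 * math.pi * freq_hz * n / fs) for n in range(N)]


def band_levels(block, fs, low_cutoff_hz=150.0):
    """Return (P_low, P_rest) for one block of the input signal."""
    N = len(block)
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / N) for n in range(N)]
    xw = [block[n] * w[n] for n in range(N)]
    p_low, p_rest = 0.0, 0.0
    for k in range(N // 2 + 1):  # non-negative frequencies only
        X = sum(xw[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N))
        power = abs(X) ** 2
        if k * fs / N <= low_cutoff_hz:
            p_low += power
        else:
            p_rest += power
    return p_low, p_rest


if __name__ == "__main__":
    fs, N = 8000.0, 256
    for f in (93.75, 1000.0):
        p_low, p_rest = band_levels(sine(f, fs, N), fs)
        print(f, p_low / (p_low + p_rest))
```

The low-band share p_low / (p_low + p_rest) printed here is the ratio that the time constant calculation units take as input.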

Alternatively, a band-splitting filter bank may be used instead of the short-time discrete Fourier transform to calculate the signal level of each frequency band.

The loudness smoothing time constant calculation unit 170 calculates and outputs the loudness smoothing time constant used by the loudness smoothing unit 140, based on the attack/release information from the loudness smoothing unit 140 and the signal level of each frequency band from the frequency analysis unit 190. Based on the attack/release information, it calculates different time constant values for the attack state and the release state. In the attack state, the time constant is normally calculated to be smaller (faster) than in the release state.

In Embodiment 1, when the low-frequency component P(0, t) is at or below a predetermined value, a small (fast) time constant is calculated. Otherwise, when the ratio of the low-frequency component P(0, t) to all frequency components (P(0, t) + P(1, t)) is larger than a predetermined value, a large time constant is calculated. FIG. 3 is a schematic diagram showing an example of the time constant as a function of this ratio: when the ratio is larger than a predetermined value R1, a larger time constant is calculated in accordance with the ratio.
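One possible shape for this calculation is sketched below. FIG. 3 is not reproduced in this text, so the linear ramp above R1, the use of the small time constant when the ratio is at or below R1, and every parameter name and value are assumptions. The caller is expected to pass different `base_tc` and `max_tc` values for the attack and release states.

```python
def smoothing_time_constant(p_low, p_rest, base_tc, max_tc,
                            level_threshold, r1):
    """Time constant from the low-band level and its share of the total.

    Assumes 0 <= r1 < 1 and level_threshold >= 0.
    """
    if p_low <= level_threshold:
        # Low-frequency component is small: use the fast time constant.
        return base_tc
    ratio = p_low / (p_low + p_rest)
    if ratio <= r1:
        # Assumed behaviour at or below R1: keep the fast time constant.
        return base_tc
    # Above R1 the time constant grows with the ratio (linear ramp assumed).
    return base_tc + (max_tc - base_tc) * (ratio - r1) / (1.0 - r1)


if __name__ == "__main__":
    # Hypothetical release-state values: 0.1 s up to 1.0 s, R1 = 0.5.
    for p_low, p_rest in ((0.0, 1.0), (1.0, 1.0), (3.0, 1.0), (1.0, 0.0)):
        print(p_low, p_rest,
              smoothing_time_constant(p_low, p_rest, 0.1, 1.0, 0.01, 0.5))
```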

FIG. 4 shows the normal equal-loudness-level contours for pure tones, for otologically normal persons under free-field listening conditions, as given in ISO 226:2003 "Acoustics - Normal equal-loudness-level contours". All points on a given contour are perceived as equally loud regardless of the frequency of the pure tone. The vertical axis shows the sound pressure level required at each frequency to be perceived as equally loud. As FIG. 4 shows, for sounds above the threshold of hearing, a low-frequency sound undergoes a larger change in perceived loudness than a mid- or high-frequency sound for the same change in signal level. In other words, once a low-frequency sound becomes audible, its perceived loudness changes more with signal level than that of a mid- or high-frequency sound.

The loudness smoothing time constant calculation unit 170 therefore uses this psychoacoustic finding: when the low-frequency component is larger than a predetermined value and the ratio of the low-frequency component to all frequency components is larger than a predetermined value, it increases the smoothing time constant, which suppresses changes in perceived loudness.

The gain smoothing time constant calculation unit 180 calculates and outputs the gain smoothing time constant used by the gain smoothing unit 160, based on the attack/release information from the loudness smoothing unit 140 and the signal level of each frequency band from the frequency analysis unit 190. Its operation is the same as that of the loudness smoothing time constant calculation unit 170, so the description is omitted.

The time constants calculated by the loudness smoothing time constant calculation unit 170 and the gain smoothing time constant calculation unit 180 may be the same value. In that case a single unit can serve both purposes, so only one of the two is needed.

As described above, the audio signal processing device of Embodiment 1 includes the frequency analysis unit 190, which performs frequency analysis of the input audio signal, and the loudness smoothing time constant calculation unit 170 and the gain smoothing time constant calculation unit 180, which calculate the time constants for loudness smoothing and gain smoothing, respectively, based on the analysis result of the frequency analysis unit 190. Using the psychoacoustic finding that a change in signal level produces a large change in perceived loudness at low frequencies, the device increases these time constants when the low-frequency component is larger than a predetermined value and the ratio of the low-frequency component to all frequency components is larger than a predetermined value, and then performs the loudness smoothing and the gain smoothing. The result is an easy-to-hear audio signal with a reduced sense of fluctuation.

The audio signal processing device of Embodiment 1 includes both the loudness smoothing unit 140 with its loudness smoothing time constant calculation unit 170 and the gain smoothing unit 160 with its gain smoothing time constant calculation unit 180. A configuration with only one of these two pairs is also possible.

In Embodiment 1, the loudness smoothing time constant calculation unit 170 and the gain smoothing time constant calculation unit 180 increase the time constant when the low-frequency component is larger than a predetermined value and the ratio of the low-frequency component to all frequency components is larger than a predetermined value. Alternatively, they may increase the time constant when either one of these two conditions is satisfied.

Also, in Embodiment 1 the loudness smoothing time constant calculation unit 170 and the gain smoothing time constant calculation unit 180 calculate the time constant based on the low-frequency component, but the same idea may be applied to the high-frequency component to calculate the time constant.

(Embodiment 2)
FIG. 5 is a block diagram showing the configuration of the audio signal processing apparatus according to Embodiment 2 of the present invention. The audio signal processing apparatus of Embodiment 1 in FIG. 1 is a feed-forward, block-by-block DRC configuration, whereas the audio signal processing apparatus of Embodiment 2 in FIG. 5 is a feedback, sample-by-sample AGC configuration.

In FIG. 5, 500 is a multiplication unit that multiplies the input audio signal by the output of the gain smoothing unit 540; 510 is a loudness measurement unit that measures the loudness of the output audio signal; 520 is a loudness smoothing unit that smooths the output of the loudness measurement unit 510 using the output of the loudness smoothing time constant calculation unit 550; 530 is a gain calculation unit that calculates a gain from the output of the loudness smoothing unit 520; 540 is a gain smoothing unit that smooths the output of the gain calculation unit 530 using the output of the gain smoothing time constant calculation unit 560; 550 is a loudness smoothing time constant calculation unit that calculates the time constant of the loudness smoothing unit 520 from the output of the loudness smoothing unit 520 and the output of the frequency analysis unit 570; 560 is a gain smoothing time constant calculation unit that calculates the time constant of the gain smoothing unit 540 from the output of the loudness smoothing unit 520 and the output of the frequency analysis unit 570; and 570 is a frequency analysis unit that performs frequency analysis of the output audio signal and outputs the analysis result. The audio signal processing apparatus in FIG. 5 performs single-band, sample-by-sample AGC processing. Its operation is described below.

As shown in (Equation 13), the multiplication unit 500 multiplies the input audio signal x(n) (where n is the sample time index) by the smoothed per-sample gain Gs(n) from the gain smoothing unit 540 to calculate the gain-controlled output audio signal z(n).

The loudness measurement unit 510 measures the loudness of the output audio signal from the multiplication unit 500 and outputs it. Embodiment 1 measured loudness with a frequency weighting filter. Embodiment 2 uses no frequency weighting filter (that is, a flat frequency characteristic) and, as shown in (Equation 14), measures the loudness L(n) at sample time index n by calculating the mean square of the past N samples of the output audio signal. When the sampling frequency Fs is 48 kHz, for example, N is set to a value from 64 to 4096.
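The measurement described above can be sketched as a running mean of squared samples. This is a minimal illustration, not the text of (Equation 14), which is not reproduced here. The class name `MeanSquareLoudness`, the default window length, and the choice to treat samples before the start of the signal as zero are all assumptions made for the example.

```python
from collections import deque


class MeanSquareLoudness:
    """Running mean square of the most recent N samples (flat frequency weighting)."""

    def __init__(self, window: int = 1024):
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        # Assumption: samples before the start of the signal are treated as zero.
        self._squares = deque([0.0] * window, maxlen=window)
        self._sum = 0.0

    def update(self, z: float) -> float:
        """Push one output sample z(n) and return the loudness L(n)."""
        oldest = self._squares[0]
        sq = z * z
        self._squares.append(sq)  # maxlen discards the oldest entry
        self._sum += sq - oldest
        # Guard against tiny negative values caused by floating-point cancellation.
        return max(self._sum, 0.0) / self.window
```

Keeping a running sum avoids re-adding N squares for every sample, which matters when the measurement runs once per sample as in this embodiment.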

The loudness smoothing unit 520 first compares the loudness L(n) at sample time index n from the loudness measurement unit 510 with the smoothed loudness Ls(n-1) at the preceding sample time index (n-1). If L(n) > Ls(n-1), it determines that the signal is in the attack state; otherwise it determines that the signal is in the release state. It outputs the result as attack/release information to the loudness smoothing time constant calculation unit 550 and the gain smoothing time constant calculation unit 560. Next, the loudness smoothing unit 520 smooths the loudness L(n) from the loudness measurement unit 510 over time using the loudness smoothing time constant Tl(n), which the loudness smoothing time constant calculation unit 550 calculated based on the attack/release information, and calculates and outputs the smoothed loudness Ls(n). The loudness is smoothed according to (Equation 15), where the value of A is calculated according to (Equation 16).
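The attack/release decision and the time smoothing can be sketched as follows. The comparison L(n) > Ls(n-1) comes from the description. The smoothing recursion and the coefficient formula are assumptions: (Equation 15) and (Equation 16) are not reproduced in this text, so the sketch substitutes a common one-pole exponential smoother with A = exp(-1 / (T * Fs)). The function names are hypothetical.

```python
import math


def smoothing_coefficient(time_constant_s: float, fs: float = 48000.0) -> float:
    """Assumed coefficient form: A = exp(-1 / (T * Fs))."""
    return math.exp(-1.0 / (time_constant_s * fs))


def is_attack(loudness: float, prev_smoothed: float) -> bool:
    """Attack if L(n) > Ls(n-1); otherwise release, as stated in the description."""
    return loudness > prev_smoothed


def smooth_loudness(loudness: float, prev_smoothed: float,
                    time_constant_s: float, fs: float = 48000.0) -> float:
    """Assumed one-pole smoother: Ls(n) = A * Ls(n-1) + (1 - A) * L(n)."""
    a = smoothing_coefficient(time_constant_s, fs)
    return a * prev_smoothed + (1.0 - a) * loudness
```

With this form, a larger time constant pushes A closer to 1, so Ls(n) follows L(n) more slowly. That is the behavior the time constant calculation units rely on when they lengthen the time constant for bass-heavy signals.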

As shown in (Equation 17), the gain calculation unit 530 calculates the per-sample gain G(n) from the smoothed loudness Ls(n) supplied by the loudness smoothing unit 520 and a preset target loudness LT. If the gain value is larger than a preset upper limit, it is limited to that upper limit. If the gain value is smaller than a preset lower limit, it is limited to that lower limit.
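A minimal sketch of the gain calculation with its limits follows. The clamping to an upper and a lower limit is taken from the description. The gain law itself is an assumption, because (Equation 17) is not reproduced in this text: since the loudness here is a mean square (a power-like quantity), the sketch uses the amplitude ratio sqrt(LT / Ls). The function name, the limit values, and the handling of zero loudness are hypothetical.

```python
import math


def compute_gain(smoothed_loudness: float, target_loudness: float,
                 g_min: float = 0.1, g_max: float = 10.0) -> float:
    """Return the per-sample gain G(n), limited to [g_min, g_max]."""
    if smoothed_loudness <= 0.0:
        # Assumption: for silence the unclamped gain is unbounded, so use the upper limit.
        return g_max
    # Assumed gain law: amplitude gain that maps mean-square Ls onto the target LT.
    gain = math.sqrt(target_loudness / smoothed_loudness)
    return min(max(gain, g_min), g_max)
```

The upper limit keeps the apparatus from amplifying noise without bound during quiet passages, and the lower limit bounds the attenuation applied to very loud passages.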

The gain smoothing unit 540 smooths the gain G(n) from the gain calculation unit 530 over time using the gain smoothing time constant Tg(n) from the gain smoothing time constant calculation unit 560, and calculates and outputs the smoothed gain Gs(n). The gain is smoothed according to (Equation 18), where the value of B is calculated according to (Equation 19).

The frequency analysis unit 570 first takes the output audio signal z(n) as input and uses a band-splitting filter bank to calculate signals Z(m, n) divided into M frequency bands (where m is the frequency band index). Next, according to (Equation 20), it calculates the sum of squares of the past N samples in each frequency band and outputs it as the signal level P(m, n) of that band.
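The two steps above (band splitting, then a sum of squares over the past N samples per band) can be sketched for M = 2. The description does not specify the filter bank design, so the one-pole low-pass filter with a complementary high band used here is purely an assumption chosen to keep the example short. The class name, the cutoff, and the window length are hypothetical.

```python
import math
from collections import deque


class TwoBandAnalyzer:
    """Split z(n) into a low band and a mid/high band and track P(m, n) for m = 0, 1."""

    def __init__(self, cutoff_hz: float = 150.0, fs: float = 48000.0,
                 window: int = 1024):
        # Assumed filter: one-pole low-pass; the high band is the remainder z - low.
        self._alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
        self._low = 0.0
        self._hist = [deque([0.0] * window, maxlen=window) for _ in range(2)]
        self._sums = [0.0, 0.0]

    def update(self, z: float) -> tuple:
        """Push one sample and return (P(0, n), P(1, n)) as sums of squares."""
        self._low += self._alpha * (z - self._low)
        bands = (self._low, z - self._low)
        for m, value in enumerate(bands):
            sq = value * value
            self._sums[m] += sq - self._hist[m][0]  # drop the oldest, add the newest
            self._hist[m].append(sq)
        return tuple(max(s, 0.0) for s in self._sums)
```

A practical implementation would likely use a steeper crossover, since a one-pole filter leaks considerable mid-band energy into the low band.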

As in Embodiment 1, the frequency analysis unit 570 of Embodiment 2 uses M = 2: it calculates the low-frequency component P(0, n), of 30 Hz to 150 Hz or below, and the remaining mid/high-frequency component P(1, n), and outputs them as the analysis result.

The loudness smoothing time constant calculation unit 550 calculates and outputs the time constant used by the loudness smoothing unit 520, based on the attack/release information from the loudness smoothing unit 520 and the signal level of each frequency band from the frequency analysis unit 570. Like the loudness smoothing time constant calculation unit 170 of Embodiment 1, the loudness smoothing time constant calculation unit 550 increases the time constant when the low-frequency component is larger than a predetermined value and the ratio of the low-frequency component to all frequency components is larger than a predetermined value.
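The rule described above can be sketched as a small function. The two conditions (low-band level above a threshold, low-band ratio above a threshold) and the dependence on attack/release information come from the description. Everything numeric is an assumption: the base attack and release values, both thresholds, and the fixed `stretch` factor are placeholders, and the function name is hypothetical. The `require_both` flag covers the variant in which either condition alone is enough.

```python
def smoothing_time_constant(attack: bool, p_low: float, p_high: float,
                            attack_s: float = 0.01, release_s: float = 0.5,
                            level_threshold: float = 1.0,
                            ratio_threshold: float = 0.5,
                            stretch: float = 4.0,
                            require_both: bool = True) -> float:
    """Return a time constant in seconds, lengthened for bass-heavy signals."""
    base = attack_s if attack else release_s
    total = p_low + p_high
    ratio = p_low / total if total > 0.0 else 0.0
    level_high = p_low > level_threshold    # low-frequency component is large
    ratio_high = ratio > ratio_threshold    # low band dominates the spectrum
    if require_both:
        bass_heavy = level_high and ratio_high
    else:
        bass_heavy = level_high or ratio_high
    # Assumption: a fixed multiplier; a smooth function of the ratio is equally possible.
    return base * stretch if bass_heavy else base
```

For example, `smoothing_time_constant(True, 10.0, 1.0)` lengthens the attack time constant because both conditions hold, while `smoothing_time_constant(True, 10.0, 100.0)` leaves it unchanged because the low band is strong but not dominant.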

Similarly, the gain smoothing time constant calculation unit 560 calculates and outputs the time constant used by the gain smoothing unit 540, based on the attack/release information from the loudness smoothing unit 520 and the signal level of each frequency band from the frequency analysis unit 570. The gain smoothing time constant calculation unit 560 operates in the same way as the loudness smoothing time constant calculation unit 550, so its description is omitted.

Note that the loudness smoothing time constant calculation unit 550 and the gain smoothing time constant calculation unit 560 may calculate the same time constant value. In that case a single unit can serve both purposes, so only one of the two needs to be provided.

As described above, the audio signal processing apparatus of Embodiment 2 includes the frequency analysis unit 570, which performs frequency analysis of the output audio signal, and the loudness smoothing time constant calculation unit 550 and the gain smoothing time constant calculation unit 560, which calculate the time constants for loudness smoothing and gain smoothing, respectively, based on the analysis result of the frequency analysis unit 570. Drawing on the psychoacoustic finding that, at low frequencies, a change in signal level produces a large change in perceived loudness, the apparatus increases these time constants when the low-frequency component is larger than a predetermined value and the ratio of the low-frequency component to all frequency components is larger than a predetermined value, and then performs the loudness smoothing and the gain smoothing. The result is an easy-to-hear audio signal with a reduced sense of sound fluctuation.
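To show how the units of Embodiment 2 fit together, the following sketch runs the whole feedback loop sample by sample. It is an illustration under several assumptions, not the patented implementation: the equations of this embodiment are not reproduced in this text, so the sketch uses a one-pole smoother with coefficient exp(-1 / (T * Fs)), the gain law sqrt(LT / Ls), a one-pole two-band split, a single shared time constant for both smoothers (a variant the description permits), and a one-sample delay in the feedback path. The function name `run_agc` and all default parameter values are hypothetical.

```python
import math
from collections import deque


def run_agc(x, fs=48000.0, window=480, target=0.01, cutoff_hz=150.0,
            attack_s=0.005, release_s=0.2, stretch=4.0,
            level_threshold=1.0, ratio_threshold=0.5,
            g_min=0.1, g_max=10.0):
    """Sample-by-sample feedback AGC sketch; returns the output samples z(n)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)  # assumed one-pole low-pass
    low = 0.0
    # Squared-sample histories for the full band, the low band and the mid/high band.
    hist = [deque([0.0] * window, maxlen=window) for _ in range(3)]
    sums = [0.0, 0.0, 0.0]
    ls = target  # smoothed loudness Ls(n-1); the starting value is an assumption
    gs = 1.0     # smoothed gain Gs(n-1); the starting value is an assumption
    out = []
    for xn in x:
        # Multiplication unit 500: apply the latest smoothed gain (one-sample loop delay).
        z = xn * gs
        out.append(z)

        # Frequency analysis unit 570: split z(n) into two bands.
        low += alpha * (z - low)
        high = z - low

        # Running sums of squares over the past `window` samples.
        for i, value in enumerate((z, low, high)):
            sq = value * value
            sums[i] += sq - hist[i][0]
            hist[i].append(sq)
        loudness = max(sums[0], 0.0) / window  # loudness measurement unit 510
        p_low = max(sums[1], 0.0)
        p_high = max(sums[2], 0.0)

        # Time constant calculation units 550/560 (one shared value in this sketch).
        attack = loudness > ls
        t = attack_s if attack else release_s
        total = p_low + p_high
        if total > 0.0 and p_low > level_threshold and p_low / total > ratio_threshold:
            t *= stretch
        coeff = math.exp(-1.0 / (t * fs))

        # Loudness smoothing 520, gain calculation 530, gain smoothing 540.
        ls = coeff * ls + (1.0 - coeff) * loudness
        if ls <= 0.0:
            gain = g_max
        else:
            gain = min(max(math.sqrt(target / ls), g_min), g_max)
        gs = coeff * gs + (1.0 - coeff) * gain
    return out
```

One consequence of the assumed gain law is worth noting: because the loudness is measured after the multiplier, a steady input does not settle at the target itself. At the fixed point of this sketch (with the gain inside its limits) the output mean square is the geometric mean of the target and the input mean square, so the loop compresses level differences rather than removing them. A different gain law would change this behavior.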

The audio signal processing apparatus of Embodiment 2 includes both the loudness smoothing unit 520 together with the loudness smoothing time constant calculation unit 550 that calculates its time constant, and the gain smoothing unit 540 together with the gain smoothing time constant calculation unit 560 that calculates its time constant. However, the apparatus may instead include only one of these two pairs.

In Embodiment 2, the loudness smoothing time constant calculation unit 550 and the gain smoothing time constant calculation unit 560 increase the time constant when the low-frequency component is larger than a predetermined value and the ratio of the low-frequency component to all frequency components is larger than a predetermined value. Alternatively, they may increase the time constant when either one of these two conditions is satisfied.

Also, in Embodiment 2 the loudness smoothing time constant calculation unit 550 and the gain smoothing time constant calculation unit 560 calculate the time constant based on the low-frequency component, but the same idea may be applied to the high-frequency component to calculate the time constant.

Embodiment 1 processes the signal block by block, but it may instead process the signal sample by sample, as in Embodiment 2.

Likewise, Embodiment 2 processes the signal sample by sample, but it may instead process the signal block by block, as in Embodiment 1.

The audio signal processing apparatus of the present invention may be implemented as a computer running a program that causes the computer to execute the processing of each block.

As described above, the audio signal processing apparatus according to the present invention controls the dynamic range of an audio signal while suppressing unnatural temporal fluctuation of the sound, and thereby produces an easy-to-hear audio signal. It is therefore useful as an audio signal processing apparatus for televisions, radios, DVD equipment, video cameras, minicomputers, mobile phones, and the like.

Block diagram showing the configuration of the audio signal processing apparatus according to Embodiment 1 of the present invention.
Schematic diagram showing an example of the loudness-versus-gain DRC function used by the gain calculation unit 150 of Embodiment 1 of the present invention.
Schematic diagram showing an example of the time constant as a function of the ratio of the low-frequency component to all frequency components in Embodiments 1 and 2 of the present invention.
Graph showing the reference equal-loudness contours for pure tones, for otologically normal persons under free-field listening conditions, given in ISO 226:2003 "Acoustics - Normal equal-loudness-level contours".
Block diagram showing the configuration of the audio signal processing apparatus according to Embodiment 2 of the present invention.
Block diagram showing the configuration of the conventional audio signal processing apparatus described in Non-Patent Document 1.

Explanation of symbols

100 Overlapping block division unit
110, 500 Multiplication unit
120 Overlap-add synthesis unit
130, 510 Loudness measurement unit
140, 520 Loudness smoothing unit
150, 530 Gain calculation unit
160, 540 Gain smoothing unit
170, 550 Loudness smoothing time constant calculation unit
180, 560 Gain smoothing time constant calculation unit
190, 570 Frequency analysis unit

Claims

1. An audio signal processing apparatus comprising:
a multiplication unit that multiplies an input audio signal by a gain to calculate an output audio signal;
a loudness measurement unit that measures the loudness of the input audio signal or the output audio signal;
a loudness smoothing unit that performs time smoothing on the measured loudness with a predetermined time constant;
a gain calculation unit that calculates the gain based on the smoothed loudness;
a frequency analysis unit that performs frequency analysis of the input audio signal or the output audio signal; and
a loudness smoothing time constant calculation unit that calculates the time constant of the loudness smoothing unit based on an analysis result of the frequency analysis unit.

2. The audio signal processing apparatus according to claim 1, wherein the loudness smoothing time constant calculation unit performs the calculation so as to increase the time constant of the loudness smoothing unit when the low-frequency component is larger than a predetermined value and/or when the ratio of the low-frequency component to all frequency components is larger than a predetermined value.

3. The audio signal processing apparatus according to claim 1, further comprising a gain smoothing unit that performs time smoothing on the gain with a predetermined time constant and calculates a smoothed gain,
wherein the multiplication unit multiplies the input audio signal by the smoothed gain.

4. The audio signal processing apparatus according to claim 3, further comprising a gain smoothing time constant calculation unit that calculates the time constant of the gain smoothing unit based on the frequency analysis result.

5. An audio signal processing apparatus comprising:
a multiplication unit that multiplies an input audio signal by a smoothed gain to calculate an output audio signal;
a loudness measurement unit that measures the loudness of the input audio signal or the output audio signal;
a gain calculation unit that calculates a gain based on the measured loudness;
a gain smoothing unit that performs time smoothing on the calculated gain with a predetermined time constant and calculates the smoothed gain;
a frequency analysis unit that performs frequency analysis of the input audio signal or the output audio signal; and
a gain smoothing time constant calculation unit that calculates the time constant of the gain smoothing unit based on an analysis result of the frequency analysis unit.

6. The audio signal processing apparatus according to claim 5, wherein the gain smoothing time constant calculation unit performs the calculation so as to increase the time constant of the gain smoothing unit when the low-frequency component is larger than a predetermined value and/or when the ratio of the low-frequency component to all frequency components is larger than a predetermined value.

7. The audio signal processing apparatus according to claim 2, wherein the low-frequency component is a component of 30 Hz to 150 Hz or below.

8. An audio signal processing method comprising:
a multiplication step of multiplying an input audio signal by a gain to calculate an output audio signal;
a loudness measurement step of measuring the loudness of the input audio signal or the output audio signal;
a loudness smoothing step of performing time smoothing on the measured loudness with a predetermined time constant;
a gain calculation step of calculating the gain based on the smoothed loudness;
a frequency analysis step of performing frequency analysis of the input audio signal or the output audio signal; and
a loudness smoothing time constant calculation step of calculating the time constant of the loudness smoothing based on an analysis result of the frequency analysis.

9. An audio signal processing method comprising:
a multiplication step of multiplying an input audio signal by a smoothed gain to calculate an output audio signal;
a loudness measurement step of measuring the loudness of the input audio signal or the output audio signal;
a gain calculation step of calculating a gain based on the measured loudness;
a gain smoothing step of performing time smoothing on the calculated gain with a predetermined time constant and calculating the smoothed gain;
a frequency analysis step of performing frequency analysis of the input audio signal or the output audio signal; and
a gain smoothing time constant calculation step of calculating the time constant of the gain smoothing based on an analysis result of the frequency analysis.

10. A program for causing a computer to execute each step of the audio signal processing method according to claim 8 or claim 9.