JP2001117573A

JP2001117573A - Audio spectrum enhancement method / apparatus and audio decoding apparatus

Info

Publication number: JP2001117573A
Application number: JP29850599A
Authority: JP
Inventors: Masahiro Oshikiri; 正浩押切; Kimio Miseki; 公生三関
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1999-10-20
Filing date: 1999-10-20
Publication date: 2001-04-27

Abstract

(57) [Summary]

[Problem] To provide ideal speech spectrum enhancement that produces no inappropriate spectral tilt and neither shifts the convex parts (peaks) of the spectrum nor emphasizes its concave parts (valleys).

[Solution] An LPC coefficient calculation unit 102 calculates LPC coefficients representing the amplitude spectrum envelope of a speech signal. A convex frequency / concave frequency determination unit 103 then finds the convex frequencies and concave frequencies of the amplitude spectrum envelope, and a convex band / concave band determination unit 104 determines, from these frequencies, convex bands and concave bands that contain the convex frequencies and concave frequencies, respectively. A filter construction unit 106 constructs a filter whose characteristic emphasizes the amplitude spectrum of the frequency components in the convex bands and attenuates the amplitude spectrum of the frequency components in the concave bands. A multiplier 109 multiplies by this filter characteristic to filter the speech signal, thereby performing spectrum enhancement.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a spectrum enhancement method and apparatus that emphasize the convex parts and attenuate the concave parts of the spectrum of a speech signal, and to a speech decoding apparatus using them.

[0002]

[Description of the Prior Art] Techniques for compressing and encoding speech signals efficiently at low bit rates are important for making effective use of radio spectrum and reducing communication costs in mobile communications, such as car telephones, and in corporate in-house communications. The CELP (Code Excited Linear Prediction) scheme is known as a speech coding scheme capable of synthesizing high-quality speech at bit rates of 8 kbps or less.

[0003] Since the CELP scheme was presented by M. R. Schroeder and B. S. Atal in "Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates", Proc. ICASSP, 1985, pp. 937-939 (Reference 1), it has attracted attention as a scheme capable of synthesizing high-quality speech, and various studies have been made on improving its quality and reducing its computational cost. At bit rates as low as 8 kbit/s or less, however, the quality of the decoded speech is still insufficient.

[0004] Against this background, several techniques have been reported that apply post-processing to the decoded speech to improve its perceptual quality. For example, P. Kroon and B. S. Atal of AT&T Bell Laboratories, in "Quantization Procedures for the Excitation in CELP Coders", Proc. ICASSP, 1987, pp. 1649-1652 (Reference 2), reported a method in which a post-processing filter with a smoothed characteristic is constructed by multiplying the LPC coefficients (linear prediction coefficients) sent to the decoder by weighting factors, and the decoded speech is filtered with this filter to obtain the synthesized speech. In the z-transform domain, this post-processing filter is expressed by equation (1).

[0005]

[Equation 1]

[0006] Here, A(z/β) is given by equation (2).

[0007]

[Equation 2]
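The formulas for equations (1) and (2) do not survive in this text. Based only on the description in paragraphs [0004] to [0008] (an all-pole filter F1(z) built from LPC coefficients weighted by a factor β), a plausible reconstruction is sketched below. The sign convention of A(z), the summation limit NP (taken from paragraph [0037]), and the range of β are assumptions, not the original formulas.

```latex
% Assumed convention: A(z) = 1 + \sum_{i=1}^{NP} \alpha(i) z^{-i}, with 0 < \beta < 1
F_1(z) = \frac{1}{A(z/\beta)} \tag{1}

A(z/\beta) = 1 + \sum_{i=1}^{NP} \alpha(i)\, \beta^{i}\, z^{-i} \tag{2}
```

Under the opposite sign convention, the plus sign in front of the sum becomes a minus sign; the structure of the filter is otherwise unchanged.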

[0008] However, an all-pole filter F1(z) such as that of equation (1) contains an inappropriate spectral tilt, which makes the synthesized speech sound muffled. Japanese Patent No. 2887286 (Reference 3) discloses a spectrum enhancement method that addresses this problem. Reference 3 proposes cascading a pole-zero filter, designed with spectral tilt compensation in mind, with a first-order all-zero filter. In the z-transform domain, the transfer function of this filter is expressed by equation (3).

[0009]

[Equation 3]
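The formula for equation (3) does not survive in this text. The surrounding paragraphs describe a pole-zero filter with numerator term A(z/γ) and denominator term A(z/β), cascaded with a first-order all-zero term (1 - μz^-1), so a reconstruction consistent with that description would be the following. The name F2(z) is taken from paragraph [0015]; the exact layout of the original formula is an assumption.

```latex
F_2(z) = \frac{A(z/\gamma)}{A(z/\beta)} \left( 1 - \mu z^{-1} \right) \tag{3}
```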

[0010] With this spectrum enhancement filter, the terms A(z/γ) and (1 - μz^-1) in equation (3) act to compensate for the inappropriate spectral tilt of the term A(z/β), so the muffling of the synthesized speech is reduced. However, depending on the characteristics of the signal input to the spectrum enhancement filter, the tilt compensation can be excessive or insufficient. As a result, in some portions of the output signal the low band is emphasized and the sound becomes muffled, while in other portions the high band is emphasized and the sound becomes too bright. Consequently, the output signal of the spectrum enhancement filter gives an unsteady, wavering impression when auditioned. In addition, the term A(z/γ) in equation (3) can shift the positions of the convex parts of the spectrum or emphasize its concave parts, which degrades the output signal of the spectrum enhancement filter.

[0011]

[Problems to be Solved by the Invention] First, the problems of the method described in Reference 2 are explained. In this method, a spectrum enhancement filter is constructed according to equation (1) using the LPC coefficients obtained at the decoder. Fig. 21 shows the LPC spectrum (solid line) of the synthesis filter H(z) formed from the LPC coefficients α(i) of a short segment of decoded speech, and the spectral characteristic (dotted line) of the spectrum enhancement filter F1(z) proposed in Reference 2, with β = 0.4 in F1(z). The synthesis filter H(z) is expressed as follows.

[0012]

[Equation 4]

[0013] This expression is referred to below as equation (4).

[0014] As Fig. 21 shows, the spectral characteristic of the spectrum enhancement filter F1(z) emphasizes the spectrum in the low band and attenuates it increasingly toward the high band, so the low band tends to be over-emphasized. This tendency appears in voiced segments, which make up a large part of a speech signal. As a result, the speech signal after spectrum enhancement sounds muffled.
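As an illustration only, the low-band emphasis of an all-pole filter 1/A(z/β) can be checked numerically. The sketch below assumes the convention A(z) = 1 + Σ α(i) z^-i and uses a hypothetical first-order coefficient α(1) = -0.9 standing in for a voiced, low-pass-like envelope; neither the convention nor the coefficient comes from the source. Only β = 0.4 and the 8000 Hz sampling frequency are taken from the description.

```python
import cmath
import math


def bandwidth_expand(alpha, beta):
    """Coefficients of A(z/beta): alpha(i) * beta**i for i = 1..NP."""
    return [a * beta ** (i + 1) for i, a in enumerate(alpha)]


def all_pole_gain_db(alpha, freq_hz, fs_hz):
    """Gain in dB of 1 / (1 + sum_i alpha(i) z^-i) at freq_hz (assumed sign convention)."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    a_of_z = 1.0 + sum(a * cmath.exp(-1j * w * (i + 1)) for i, a in enumerate(alpha))
    return -20.0 * math.log10(abs(a_of_z))


# Hypothetical first-order LPC model (illustrative, not from the source).
alpha = [-0.9]
fs = 8000.0
f1_coeffs = bandwidth_expand(alpha, beta=0.4)

gain_dc = all_pole_gain_db(f1_coeffs, 0.0, fs)
gain_nyquist = all_pole_gain_db(f1_coeffs, fs / 2.0, fs)
print(f"F1 gain at 0 Hz: {gain_dc:+.2f} dB, at {fs / 2:.0f} Hz: {gain_nyquist:+.2f} dB")
```

For this toy model the filter boosts the lowest frequencies and attenuates the highest ones, which is the kind of tilt that paragraph [0014] associates with muffled output.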

[0015] Similarly, Fig. 22 shows the characteristic (dotted line) of the spectrum enhancement filter F2(z) of Reference 3, with the LPC spectrum of the synthesis filter H(z) drawn as a solid line. Compared with F1(z), the over-emphasis of the low band is reduced in F2(z). This is because the numerator term A(z/γ) and the term (1 - μz^-1) in equation (3) act to compensate for the inappropriate spectral tilt. In this example, β = 0.8, γ = 0.5, and μ = 0.4.

[0016] Thus, Reference 3 reduces the muffling observed with Reference 2, but a spectral tilt is still present in this example and the tendency to emphasize the low band remains. Moreover, there is no guarantee that this filter compensates for the spectral tilt appropriately in other segments; over-emphasis of the low band in some segments, or of the high band in others, can always occur.

[0017] Besides the spectral tilt problem, there is also the problem that the frequencies of the spectral maxima and minima are shifted. In Fig. 22, let f1 and f2 be the frequencies of the spectral maxima of H(z), and let fe be the frequency of the maximum of the spectral characteristic of the spectrum enhancement filter F2(z). Because f1 ≠ fe and f2 ≠ fe, the frequencies f1 and f2 of the spectral maxima move away from their true values: f1 shifts toward higher frequencies and f2 toward lower frequencies. The frequencies of spectral minima can shift in the same way; in Fig. 22, for example, the frequency f3 of a spectral minimum shifts toward higher frequencies. The relative positions of the spectral maxima and minima are closely tied to phonemic information, so shifting their frequencies leads to quality degradation.

[0018] Furthermore, when the frequencies of two spectral maxima are close together, the spectral minimum between them can end up being emphasized. In Fig. 22, the spectrum enhancement filter F2(z) acts to emphasize the region around the frequency f4 of a spectral minimum. This also corrupts the phonemic information and causes quality degradation.

[0019] An object of the present invention is to provide an ideal speech spectrum enhancement method and apparatus that produce no inappropriate spectral tilt and neither shift the convex parts of the spectrum nor emphasize its concave parts, and a speech decoding apparatus using them.

[0020]

[Means for Solving the Problems] To solve the above problems, a speech spectrum enhancement method according to the present invention determines a convex band and a concave band that contain, respectively, a convex frequency and a concave frequency of the amplitude spectrum envelope of a speech signal; constructs a filter whose characteristic emphasizes the amplitude spectrum of the frequency components in the convex band and attenuates the amplitude spectrum of the frequency components in the concave band; and filters the speech signal with this filter. Typically, the convex band and the concave band are bands whose center frequencies are the convex frequency and the concave frequency, respectively.

[0021] A speech spectrum enhancement apparatus according to the present invention comprises: means for obtaining the amplitude spectrum envelope of a speech signal; means for obtaining the convex frequencies and concave frequencies of this amplitude spectrum envelope; means for determining, from the convex frequencies and concave frequencies, convex bands and concave bands that contain the convex frequencies and concave frequencies, respectively; and means for constructing a filter whose characteristic emphasizes the amplitude spectrum of the frequency components in the convex bands and attenuates the amplitude spectrum of the frequency components in the concave bands, and for filtering the speech signal with this filter.

[0022] More specifically, another speech spectrum enhancement apparatus according to the present invention comprises: means for obtaining the amplitude spectrum envelope of a speech signal; means for obtaining the convex frequencies and concave frequencies of this amplitude spectrum envelope; means for determining convex bands and concave bands that contain the convex frequencies and concave frequencies, respectively; and means for constructing a filter that emphasizes the amplitude spectrum of the frequency components in the convex bands by multiplying it by a predetermined convex magnification, attenuates the amplitude spectrum of the frequency components in the concave bands by multiplying it by a predetermined concave magnification, and multiplies the amplitude spectrum of the frequency components in neither the convex bands nor the concave bands by a magnification set to be no less than the maximum value of the convex magnification and no less than the minimum value of the concave magnification, and for filtering the speech signal with this filter.

[0023] Thus, in the present invention, the convex frequencies and concave frequencies are obtained from the amplitude spectrum envelope of the speech signal, and the convex magnification and concave magnification used for spectrum emphasis and attenuation are determined so that these frequencies are not shifted. Specifically, the frequency components of a convex band (those near the convex frequency) are emphasized by one and the same magnification, and likewise the frequency components of a concave band (those near the concave frequency) are attenuated by one and the same magnification.

[0024] With this processing, an inappropriate spectral tilt, in which the low band or the high band is over-emphasized, cannot arise in principle. In addition, because each convex frequency and its neighborhood are emphasized by the same magnification, and each concave frequency and its neighborhood are attenuated by the same magnification, the convex and concave frequencies are not shifted.

[0025] Furthermore, because the magnifications are determined so that convex frequencies are emphasized and concave frequencies are attenuated, the concave frequency between two convex frequencies is not emphasized even when the two convex frequencies are close together.

[0026] In the present invention, the speech signal may be filtered in either the frequency domain or the time domain. Filtering in the frequency domain allows the degree of spectrum emphasis and attenuation to be controlled precisely, and it fits well with coding schemes that operate in the frequency domain, such as MBE (Multi-Band Excitation) coding, which makes it easy to apply to such schemes. Filtering in the time domain removes the need to transform the speech signal into the frequency domain, which reduces the amount of computation.

[0027] The amplitude spectrum envelope is obtained, for example, as an LPC spectrum. In that case, accurate convex and concave frequencies can be obtained from the LPC spectrum with relatively little computation, which contributes to improved quality.

[0028] In the present invention, at least one of the width and the frequency position of the convex band containing a convex frequency of the amplitude spectrum is desirably determined from the positional relationship between that convex frequency and the concave frequencies on either side of it. The width or frequency position of the convex band can then be determined adaptively for each convex frequency to be emphasized, which contributes to improved quality.

[0029] In the present invention, the convex magnification and the concave magnification are preferably determined on the basis of the amplitude spectrum envelope. Adaptively controlling the convex and concave magnifications according to the magnitude of the amplitude spectrum of the frequency components in the convex bands to be emphasized and the concave bands to be attenuated contributes to improved quality.

[0030] Furthermore, in a speech decoding apparatus according to the present invention, a speech decoding unit decodes encoded data of a speech signal and outputs a decoded speech signal together with parameters, such as LPC coefficients, that contain at least information on the amplitude spectrum of the speech signal; the decoded speech signal and the parameters are input to a spectrum enhancement unit constituted by the speech spectrum enhancement apparatus described above. The spectrum enhancement unit obtains the spectrum envelope from the parameters, such as the LPC coefficients, and performs spectrum enhancement by applying the same filtering as described above to the decoded speech signal.

[0031]

[Embodiments of the Invention] [First Embodiment] As a first embodiment of the present invention, an example is described in which a time-domain input signal is filtered by a spectrum enhancement filter in the frequency domain.

[0032] Fig. 1 shows a spectrum enhancement apparatus according to the first embodiment of the present invention. It comprises an LPC coefficient input terminal 101, an LPC spectrum calculation unit 102, a convex frequency / concave frequency determination unit 103, a convex band / concave band determination unit 104, a convex magnification / concave magnification determination unit 105, a filter construction unit 106, a speech input terminal 107, a time-frequency transform unit 108, a multiplier 109, a frequency-time transform unit 110, a gain calculation unit 111, a multiplier 112, and a speech output terminal 113.

[0033] First, the processing procedure of this embodiment is described with reference to the flowchart of Fig. 2. The LPC coefficients input from the input terminal 101 are supplied to the LPC spectrum calculation unit 102 and to the convex magnification / concave magnification determination unit 105. The LPC spectrum calculation unit 102 calculates the LPC spectrum from the LPC coefficients (step S1001), and the convex frequency / concave frequency determination unit 103 then determines the convex frequencies and concave frequencies from the LPC spectrum (step S1002). The convex band / concave band determination unit 104 further determines the convex bands and concave bands from the convex frequencies and concave frequencies (step S1003).

[0034] The convex magnification / concave magnification determination unit 105 obtains the convex magnification and the concave magnification from the LPC coefficients (step S1004). The filter construction unit 106 constructs the spectrum enhancement filter from the convex bands and concave bands obtained by the convex band / concave band determination unit 104 and the convex magnification and concave magnification obtained by the convex magnification / concave magnification determination unit 105 (step S1005).

[0035] Meanwhile, the speech signal input from the input terminal 107 is transformed into a frequency-domain signal by the time-frequency transform unit 108 (step S1006). The multiplier 109 multiplies the spectrum of this frequency-domain speech signal by the filter characteristic coefficients of the spectrum enhancement filter constructed by the filter construction unit 106 (step S1007), and the frequency-time transform unit 110 transforms the output signal of the multiplier 109 back into a time-domain signal (step S1008).
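Steps S1006 to S1008 can be sketched as follows. The description does not name the transform, so this sketch assumes a plain discrete Fourier transform over one frame and real, zero-phase gains per frequency bin; windowing and frame overlap are omitted. The function names and the example gains are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive forward DFT (stands in for the time-frequency transform unit)."""
    n_pts = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n_pts) for t in range(n_pts))
            for k in range(n_pts)]


def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n_pts = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n_pts)
                for k in range(n_pts)).real / n_pts
            for t in range(n_pts)]


def filter_in_frequency_domain(frame, gains):
    """gains[k] is the real gain for bin k = 0..N/2; upper bins use the mirrored gain."""
    n_pts = len(frame)
    spec = dft(frame)                                    # step S1006
    for k in range(n_pts):
        spec[k] *= gains[k] if k <= n_pts // 2 else gains[n_pts - k]  # step S1007
    return idft(spec)                                    # step S1008


# Illustrative frame: two sinusoids, one in a "convex" bin and one in a "concave" bin.
N = 16
frame = [math.cos(2 * math.pi * 2 * t / N) + math.cos(2 * math.pi * 5 * t / N)
         for t in range(N)]
gains = [1.0] * (N // 2 + 1)
gains[2] = 1.5   # hypothetical convex magnification
gains[5] = 0.5   # hypothetical concave magnification
enhanced = filter_in_frequency_domain(frame, gains)
```

Because the gains are real and mirrored across the Nyquist bin, the output stays real and each component is scaled without any phase change.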

[0036] The gain calculation unit 111 calculates a correction gain such that the magnitude of the output signal of the frequency-time transform unit 110 matches the magnitude of the signal input from the input terminal 107 (step S1009), and the multiplier 112 multiplies the output signal of the frequency-time transform unit 110 by this correction gain (step S1010). The corrected speech signal is then output from the output terminal 113.
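A minimal sketch of steps S1009 and S1010 follows. The description only says that the "magnitude" of the output is matched to that of the input; this sketch assumes that magnitude means the energy of one frame, which is an interpretation rather than something stated in the source. The function names are hypothetical.

```python
import math


def correction_gain(original, enhanced):
    """Gain that makes the frame energy of `enhanced` equal to that of `original`."""
    energy_in = sum(v * v for v in original)
    energy_out = sum(v * v for v in enhanced)
    if energy_out == 0.0:
        return 1.0  # nothing to scale; avoid division by zero
    return math.sqrt(energy_in / energy_out)


def apply_gain(signal, gain):
    """Multiply every sample by the correction gain (role of multiplier 112)."""
    return [gain * v for v in signal]


# Illustrative frames: the enhanced frame is twice as large as the input frame.
original = [1.0, -2.0, 2.0]
enhanced = [2.0, -4.0, 4.0]
gain = correction_gain(original, enhanced)
corrected = apply_gain(enhanced, gain)
```

In this example the correction gain is 0.5, so the corrected frame has the same energy as the input frame.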

[0037] Next, the operation of this embodiment is described in more detail. Let the LPC coefficients input from the input terminal 101 be {α(i); i = 1 to NP}, where NP is the order of the LPC coefficients. The LPC spectrum X(n) that the LPC spectrum calculation unit 102 calculates from the LPC coefficients can be obtained by substituting z = exp(j2πn/N) into the filter given by equation (4). That is, the LPC spectrum X(n) is expressed as follows.

[0038]

[Equation 5]

[0039] With the LPC spectrum X(n) expressed in this way, when the sampling frequency of the input speech signal is Fs [Hz], the spectral resolution of X(n) is Fs/N [Hz]. For example, with a sampling frequency Fs = 8000 [Hz] and N = 1000, the LPC spectrum X(n) is represented with a resolution of 8 [Hz].
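The evaluation of X(n) on N points can be sketched as below. The sketch assumes the synthesis filter has the form 1 / (1 + Σ α(i) z^-i) and that X(n) is its magnitude at z = exp(j2πn/N); the sign convention is an assumption because the formula itself is not available here. The two-coefficient resonator used as input is hypothetical and only serves to show the 8 Hz grid.

```python
import cmath
import math


def lpc_spectrum(alpha, n_points):
    """X(n) = 1 / |1 + sum_i alpha(i) exp(-j 2 pi n i / N)|, n = 0..N-1 (assumed convention)."""
    spectrum = []
    for n in range(n_points):
        a = 1.0 + sum(c * cmath.exp(-2j * math.pi * n * (i + 1) / n_points)
                      for i, c in enumerate(alpha))
        spectrum.append(1.0 / abs(a))
    return spectrum


def bin_to_hz(n, fs_hz, n_points):
    """Frequency in Hz of spectrum index n."""
    return n * fs_hz / n_points


fs, n_points = 8000.0, 1000
resolution_hz = fs / n_points  # Fs / N = 8 Hz, as in the description

# Hypothetical second-order resonator near 1000 Hz (pole radius 0.9).
alpha = [-2.0 * 0.9 * math.cos(2.0 * math.pi * 1000.0 / fs), 0.81]
x = lpc_spectrum(alpha, n_points)
half = x[: n_points // 2 + 1]                      # bins from 0 Hz to Fs/2
peak_bin = max(range(len(half)), key=half.__getitem__)
peak_hz = bin_to_hz(peak_bin, fs, n_points)
```

The peak frequency found this way is always a multiple of the 8 Hz resolution, which is why a finer grid (larger N) locates convex and concave frequencies more precisely.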

[0040] In this embodiment, the LPC spectrum is used as the means for obtaining the spectrum envelope, but the spectrum envelope may also be obtained by other means, such as the LPC cepstrum or the FFT cepstrum.

[0041] Next, the convex frequency / concave frequency determination unit 103 finds the frequencies at which the LPC spectrum X(n) takes local maxima and local minima, that is, the convex frequencies and the concave frequencies. The LPC spectrum X(n) takes a local maximum or a local minimum where its derivative with respect to n, dX(n)/dn, is 0. In practice, a local maximum is deemed to exist at frequency n when equations (6) and (7) are both satisfied, and a local minimum is deemed to exist when equations (6) and (8) are both satisfied.

[0042]

[Equation 6]

[0043] The frequencies n at which the LPC spectrum X(n) takes local maxima are denoted as convex frequencies {pk(j); j = 1 to NPK}, and the frequencies n at which it takes local minima are denoted as concave frequencies {vy(k); k = 1 to NVY}. Fig. 3 illustrates this. In Fig. 3, spectral maxima exist at 500 Hz, 1000 Hz, 2500 Hz, and 3200 Hz, and spectral minima exist at 800 Hz, 2000 Hz, and 3000 Hz. Therefore NPK = 4 and NVY = 3, and the convex frequencies {pk(j); j = 1 to 4} and concave frequencies {vy(k); k = 1 to 3} are pk(1) = 500, pk(2) = 1000, pk(3) = 2500, pk(4) = 3200, vy(1) = 800, vy(2) = 2000, and vy(3) = 3000.
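The search for convex and concave frequencies can be sketched with a discrete stand-in for the derivative test. The exact conditions of equations (6) to (8) are not available in this text, so the sketch assumes a common substitute: a sign change of the first difference of X(n). The function name and the toy envelope are hypothetical.

```python
def find_convex_concave(x):
    """Return (pk, vy): indices n where x has a strict local maximum / minimum.

    Assumed criterion: the first difference changes sign at n.
    Flat plateaus (equal neighbouring values) are not reported.
    """
    pk, vy = [], []
    for n in range(1, len(x) - 1):
        rising_before = x[n] - x[n - 1]
        rising_after = x[n + 1] - x[n]
        if rising_before > 0 and rising_after < 0:
            pk.append(n)   # slope goes from positive to negative: convex frequency
        elif rising_before < 0 and rising_after > 0:
            vy.append(n)   # slope goes from negative to positive: concave frequency
    return pk, vy


# Toy envelope (illustrative values only).
envelope = [1, 3, 2, 1, 4, 6, 5, 2, 3, 1]
pk, vy = find_convex_concave(envelope)
print(pk, vy)  # [1, 5, 8] [3, 7]
```

An index n found this way corresponds to the frequency n × Fs / N in Hz, so the resulting lists play the roles of {pk(j)} and {vy(k)}.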

[0044] Next, the convex band / concave band determination unit 104 receives the convex frequencies {pk(j); j = 1 to NPK} and the concave frequencies {vy(k); k = 1 to NVY} and obtains the convex bands {Bpk(j); j = 1 to NPK} and the concave bands {Bvy(k); k = 1 to NVY}.

[0045] In the amplitude spectrum envelope of speech, the shape of the envelope differs around each convex frequency, and likewise around each concave frequency. For example, the shapes of the envelope around the convex frequencies (pk(1), pk(2), pk(3), pk(4)) and around the concave frequencies (vy(1), vy(2), vy(3)) in Fig. 3 all differ from one another. It is therefore desirable to determine the convex bands and concave bands adaptively according to these shapes.

[0046] An example of adapting the convex bands and concave bands to the shape of the spectrum envelope around the convex and concave frequencies is described in the second embodiment below. To keep the explanation simple, this embodiment describes a method that determines the convex bands and concave bands with predetermined fixed widths (bandwidths).

[0047] Fig. 4 schematically shows the relationship between the convex bands {Bpk(j); j = 1 to NPK} and the concave bands {Bvy(k); k = 1 to NVY}. In Fig. 4, each convex band is set to have a total bandwidth of 240 Hz centered on its convex frequency. Similarly, each concave band is set to have a total bandwidth of 120 Hz centered on its concave frequency. The convex band / concave band determination unit 104 determines the convex bands and concave bands in this way and supplies this information to the filter construction unit 106.
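The fixed-width band assignment can be sketched as follows, using the convex and concave frequencies listed for Fig. 3 together with the 240 Hz and 120 Hz widths given for Fig. 4. Combining those two sets of numbers, clipping the bands to the range 0 Hz to Fs/2, and the function name are all assumptions made for illustration.

```python
def fixed_width_bands(center_freqs_hz, width_hz, fs_hz):
    """Return (low, high) band edges in Hz, centred on each frequency.

    Edges are clipped to [0, Fs/2]; the clipping is an assumption of this sketch.
    """
    half = width_hz / 2.0
    nyquist = fs_hz / 2.0
    return [(max(0.0, f - half), min(nyquist, f + half)) for f in center_freqs_hz]


pk = [500.0, 1000.0, 2500.0, 3200.0]   # convex frequencies from the Fig. 3 example
vy = [800.0, 2000.0, 3000.0]           # concave frequencies from the Fig. 3 example

bpk = fixed_width_bands(pk, 240.0, 8000.0)   # convex bands, 240 Hz wide
bvy = fixed_width_bands(vy, 120.0, 8000.0)   # concave bands, 120 Hz wide
print(bpk)
print(bvy)
```

With these widths no convex band touches a concave band, so no overlap handling is needed for this configuration.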

【００４８】図５は、凸部帯域の設定値を500Hz、凹部
帯域の設定値を100Hzとした場合のものである。この場
合、図５中に示されるように凸部帯域と凹部帯域で重な
る部分が出現する。このようなとき、凸部帯域を優先さ
せるように凸部帯域および凹部帯域は設定される。図５
においては、凹部周波数800Hz, 3000Hzに対応する凹部
帯域は凸部帯域と重なっているため、凹部帯域の重なっ
ている部分は除かれる。一方、凸部周波数500Hz,1000Hz
に対応する凸部帯域は互いに重なりを持つ。このような
場合、これらは１つの凸部帯域とみなす。結果として、
凸部帯域{Bpk(j);j = 1〜NPK}と凹部帯域{Bvy(k); k =
1〜NVY}は、図５に示される通りになる。ここで、NPK=
3、NVY=1と修正される。FIG. 5 shows the case where the convex bandwidth is set to 500 Hz and the concave bandwidth to 100 Hz. In this case, as FIG. 5 shows, some convex and concave bands overlap. When this happens, the bands are set so that the convex bands take priority. In FIG. 5, the concave bands for the concave frequencies 800 Hz and 3000 Hz overlap convex bands, so the overlapping portions of those concave bands are removed. The convex bands for the convex frequencies 500 Hz and 1000 Hz overlap each other, and in that case they are treated as a single convex band. As a result, the convex bands {Bpk(j); j = 1 to NPK} and concave bands {Bvy(k); k = 1 to NVY} are as shown in FIG. 5, with the counts revised to NPK = 3 and NVY = 1.
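The fixed-width band rule with convex priority can be sketched as follows. This is only an illustration under stated assumptions: the function name `fixed_width_bands`, the tuple representation of a band, and the treatment of touching bands as overlapping are choices made here, and the peak and valley frequencies in the usage note are hypothetical values, not the ones plotted in FIG. 5.

```python
def fixed_width_bands(peaks, valleys, peak_bw, valley_bw):
    """Return (convex_bands, concave_bands) as lists of (low_hz, high_hz).

    Assumptions of this sketch: bands are centred on each frequency,
    convex bands that overlap or touch are merged into one, and any part
    of a concave band that falls inside a convex band is removed.
    """
    # Convex bands centred on each peak, merged when they overlap or touch.
    convex = []
    for lo, hi in sorted((f - peak_bw / 2, f + peak_bw / 2) for f in peaks):
        if convex and lo <= convex[-1][1]:
            convex[-1] = (convex[-1][0], max(convex[-1][1], hi))
        else:
            convex.append((lo, hi))

    # Concave bands centred on each valley, with convex bands taking priority.
    concave = []
    for f in sorted(valleys):
        pieces = [(f - valley_bw / 2, f + valley_bw / 2)]
        for c_lo, c_hi in convex:
            remaining = []
            for lo, hi in pieces:
                if hi <= c_lo or lo >= c_hi:
                    remaining.append((lo, hi))  # no overlap with this convex band
                    continue
                if lo < c_lo:
                    remaining.append((lo, c_lo))  # part below the convex band
                if hi > c_hi:
                    remaining.append((c_hi, hi))  # part above the convex band
            pieces = remaining
        concave.extend(pieces)
    return convex, concave
```

With hypothetical peaks at 500, 1000, 2200 and 3100 Hz and valleys at 800, 1600 and 3000 Hz, calling `fixed_width_bands(..., 500, 100)` merges the first two convex bands and drops the two concave bands that lie inside convex bands, leaving three convex bands and one concave band.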

【００４９】凸部倍率／凹部倍率決定部１０５では、入
力端子１０１から与えられるLPC係数を用いて凸部倍率
と凹部倍率の少なくとも一方を決定する。これらの倍率
はスペクトル強調フィルタの強さを制御するために用い
られる。基本的な考え方としては、入力される音声信号
の振幅スペクトル概形において、スペクトル極大値とス
ペクトル極小値の振幅の差が大きい場合にホルマントが
はっきり現れていると考えてスペクトル強調を強くか
け、逆の場合にはホルマントがほとんど現れていないと
考えてスペクトル強調を弱くかけるようにすれば良い。The convex magnification / concave magnification determining unit 105 determines at least one of the convex magnification and the concave magnification using the LPC coefficients supplied from the input terminal 101. These magnifications control the strength of the spectrum enhancement filter. The basic idea is as follows: when the difference in amplitude between the spectral maxima and minima of the amplitude spectrum outline of the input speech signal is large, the formants are regarded as clearly present and spectrum enhancement is applied strongly; in the opposite case, the formants are regarded as barely present and spectrum enhancement is applied weakly.

【００５０】本実施形態では、ホルマントの現われ方に
対応するものとして、式(5)で表される合成フィルタの
予測ゲインを用いる。つまり、合成フィルタの予測ゲイ
ンが大きい場合にはスペクトル強調フィルタを強くか
け、予測ゲインが小さい場合にはスペクトル強調フィル
タを弱くかけるようにする。In this embodiment, the prediction gain of the synthesis filter given by expression (5) is used as a measure of how clearly the formants appear. That is, the spectrum enhancement filter is applied strongly when the prediction gain of the synthesis filter is large and weakly when it is small.

【００５１】具体的には、合成フィルタのフィルタゲイ
ンを式(9)で推定し、その推定値に従い凸部倍率と凹部
倍率を決定する。本実施形態では簡単のため、凹部倍率
を1.0と固定し、凸部倍率を合成フィルタの予測ゲイン
から求める場合について説明することとする。合成フィ
ルタの予測ゲインはデシベルで表すと、Specifically, the gain of the synthesis filter is estimated with expression (9), and the convex and concave magnifications are determined from the estimate. For simplicity, this embodiment describes the case where the concave magnification is fixed at 1.0 and the convex magnification is derived from the prediction gain of the synthesis filter. Expressed in decibels, the prediction gain of the synthesis filter

【００５２】[0052]

【数７】 (Equation 7)

【００５３】と推定される。ここで、ref(i)はPARCOR係
数を表し、LPC係数α(i)から周知の方法により求めるこ
とができる。次に、予測ゲインを用いて式(10)に従い凸
部倍率を決定する。is estimated as above. Here, ref(i) denotes the PARCOR coefficients, which can be obtained from the LPC coefficients α(i) by a well-known method. Next, the convex magnification is determined from the prediction gain according to expression (10).

【００５４】[0054]

【数８】 (Equation 8)

【００５５】ここで、TAおよびTBは凸部倍率を決定する
ための定数を表し、MIN( )は最小値を出力する関数を表
す。本実施形態では、TA=1.0、TB=3.0とする。このよう
にして求めた凸部倍率{Gpk(j); j = 1〜NPK}と凹部倍率
{Gvy(k); k = 1〜NVY}をフィルタ構成部１０６に与え
る。Here, TA and TB are constants used to determine the convex magnification, and MIN( ) is a function that returns the minimum value. In this embodiment, TA = 1.0 and TB = 3.0. The convex magnifications {Gpk(j); j = 1 to NPK} and concave magnifications {Gvy(k); k = 1 to NVY} obtained in this way are passed to the filter configuration unit 106.
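The two steps above can be sketched as follows. Expressions (9) and (10) are not reproduced in this text, so both functions are assumptions: `prediction_gain_db` uses the standard relation between PARCOR coefficients and the prediction gain of an all-pole filter, and `convex_magnification` is a purely hypothetical mapping (the `slope` parameter and the linear form are invented for illustration) that only shares the constants TA, TB and the use of MIN( ) with the text.

```python
import math


def prediction_gain_db(parcor):
    """Prediction gain in dB from PARCOR coefficients (each |k| < 1).

    Uses the textbook relation gain = -10 * log10(prod(1 - k_i^2)).
    Assumed here to correspond to expression (9), which is not shown.
    """
    residual = 1.0
    for k in parcor:
        residual *= 1.0 - k * k
    return -10.0 * math.log10(residual)


def convex_magnification(gain_db, ta=1.0, tb=3.0, slope=0.1):
    """Hypothetical stand-in for expression (10).

    Grows linearly from TA with the prediction gain and is capped at TB
    by MIN(); the real mapping may differ.
    """
    return min(ta + slope * gain_db, tb)
```

A flat spectrum (all PARCOR coefficients zero) gives a prediction gain of 0 dB and therefore a convex magnification of TA = 1.0, that is, no enhancement, while a strongly peaked spectrum saturates at TB = 3.0.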

【００５６】本実施形態では、簡単のために全ての凸部
倍率{Gpk(j); j = 1〜NPK}を式(10)と式(11)に従い同一
の値に設定する方法について説明を行ったが、各凸部倍
率をそれぞれ適応的に設定する方法を用いても良い。例
えば、凸部帯域{Bpk(j); j =1〜NPK}にそれぞれ含まれ
る振幅スペクトル概形の平均値に基づいて凸部倍率{Gpk
(j); j = 1〜NPK}を設定する方法が考えられる。また、
本実施形態の他に凸部倍率を1.0に固定して、凹部倍率
を式(9)の結果に応じて設定する方法を用いることも可
能である。同様に、凸部倍率と凹部倍率の両者を設定す
る方法を用いることも可能である。For simplicity, this embodiment has described a method that sets all convex magnifications {Gpk(j); j = 1 to NPK} to the same value according to expressions (10) and (11), but each convex magnification may instead be set adaptively. For example, the convex magnifications {Gpk(j); j = 1 to NPK} could be set from the average value of the amplitude spectrum outline within each convex band {Bpk(j); j = 1 to NPK}. Alternatively, the convex magnification can be fixed at 1.0 and the concave magnification set according to the result of expression (9). It is likewise possible to set both the convex and the concave magnification.

【００５７】フィルタ構成部１０６では、凸部帯域／凹
部帯域決定部１０４から凸部帯域{Bpk(j); j = 1〜NPK}
と凹部帯域{Bvy(k); k = 1〜NVY}の情報を受け取り、凸
部倍率／凹部倍率決定部１０５から凸部倍率{Gpk(j); j
= 1〜NPK}と凹部倍率{Gvy(k); k = 1〜NVY}の情報を受
け取って、スペクトル強調フィルタを構成する。ここで
は、図４に示した凸部帯域と凹部帯域を用い、かつ全て
の凸部倍率がGpk倍、全ての凹部倍率がGvy倍の場合につ
いて説明する。その模式図を図６に示す。The filter configuration unit 106 receives the convex bands {Bpk(j); j = 1 to NPK} and concave bands {Bvy(k); k = 1 to NVY} from the convex band / concave band determining unit 104, and the convex magnifications {Gpk(j); j = 1 to NPK} and concave magnifications {Gvy(k); k = 1 to NVY} from the convex magnification / concave magnification determining unit 105, and constructs the spectrum enhancement filter. The case described here uses the convex and concave bands shown in FIG. 4, with every convex magnification equal to Gpk and every concave magnification equal to Gvy. FIG. 6 shows this schematically.

【００５８】図６において、凸部帯域{Bpk(j); j = 1〜
NPK}に含まれる周波数成分の振幅スペクトルはGpk倍と
なるようにスペクトル強調フィルタの特性は決定され
る。同様に、凹部帯域{Bvy(k); k = 1〜NVY}に含まれる
周波数成分の振幅スペクトルはGvy倍されるようにスペ
クトル強調フィルタの特性は決定される。In FIG. 6, the characteristic of the spectrum enhancement filter is determined so that the amplitude spectrum of the frequency components within the convex bands {Bpk(j); j = 1 to NPK} is multiplied by Gpk. Similarly, it is determined so that the amplitude spectrum of the frequency components within the concave bands {Bvy(k); k = 1 to NVY} is multiplied by Gvy.

【００５９】次に、凸部帯域および凹部帯域のいずれに
も属さない周波数に対応する倍率の決定を行う。この周
波数に対応する倍率は、凸部倍率の最大値以下かつ凹部
倍率の最小値以上の範囲にあることが望ましい。さらに
いえば、現在着目している周波数に低域側で近い凸部帯
域もしくは凹部帯域、高域側で近い凸部帯域もしくは凹
部帯域で定められている２つの倍率を基に決定すること
が望ましい。Next, the magnification is determined for frequencies that belong to neither a convex band nor a concave band. This magnification should lie between the minimum concave magnification and the maximum convex magnification. More specifically, it is desirable to derive it from two magnifications: that of the nearest convex or concave band on the low-frequency side of the frequency of interest and that of the nearest convex or concave band on its high-frequency side.

【００６０】このことを、図６を用いて具体的に説明す
る。図６において、凸部帯域および凹部帯域のいずれに
も属さずかつ倍率がまだ決まっていない、現在着目して
いる周波数をftとする。周波数ftの成分に対する倍率
は、ftの低域側に存在する凸部帯域Bpk(2)に対応する倍
率Gpkと、ftの高域側に存在する凹部帯域Bvy(2)に対応
する倍率Gvyとを用いて設定する。その具体的な例とし
て、線形補間して倍率を決定する方法が考えられる。図
６において、凸部帯域Bpk(2)の最高周波数をfpk、凹部
帯域Bvy(2)の最低周波数をfvyとしたとき、周波数ftの
成分に対する倍率Gftは、This is explained concretely with reference to FIG. 6. In FIG. 6, let ft be the frequency of interest, which belongs to neither a convex band nor a concave band and whose magnification has not yet been determined. The magnification for the component at ft is set using the magnification Gpk of the convex band Bpk(2) on the low-frequency side of ft and the magnification Gvy of the concave band Bvy(2) on its high-frequency side. One concrete approach is linear interpolation. In FIG. 6, let fpk be the highest frequency of the convex band Bpk(2) and fvy the lowest frequency of the concave band Bvy(2); the magnification Gft for the component at ft is then

【００６１】[0061]

【数９】 (Equation 9)

【００６２】と表される。倍率が決まっていない他の周
波数成分に対する倍率も、同様の方法により決定するこ
とができる。is expressed as above. The magnifications for the other frequency components whose magnification has not yet been determined can be obtained in the same way.
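The linear interpolation described for the frequency ft can be sketched as follows. The function name and argument order are assumptions made for illustration; the interpolation itself is ordinary linear interpolation between the edge of the lower band (fpk, Gpk in the text's example) and the edge of the upper band (fvy, Gvy).

```python
def interpolate_magnification(ft, f_low, g_low, f_high, g_high):
    """Linearly interpolate the magnification at ft.

    (f_low, g_low): upper edge and magnification of the band below ft.
    (f_high, g_high): lower edge and magnification of the band above ft.
    """
    if not f_low < f_high:
        raise ValueError("f_low must be below f_high")
    if not f_low <= ft <= f_high:
        raise ValueError("ft must lie between the two band edges")
    weight = (ft - f_low) / (f_high - f_low)
    return g_low + weight * (g_high - g_low)
```

For example, halfway between a convex band edge at 1000 Hz with magnification 2.0 and a concave band edge at 2000 Hz with magnification 1.0, the interpolated magnification is 1.5.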

【００６３】また、図６における周波数0.0 〜 fl [Hz]
( fl はBpk(1)の最低周波数)の範囲に属する周波数成
分に対する倍率は、周波数0.0での倍率が決定していな
いため、求めることができない。この場合、周波数0.0
での倍率をGvy 〜 Gpkの間にあると仮定して求める必要
がある。本実施形態では、周波数0.0での倍率をflが属
する帯域に対応する倍率と対極の倍率 (この場合、Gvy)
にあると仮定する。同様に、周波数fh〜4000 [Hz] ( f
hはBpk(4)の最大周波数)に属する成分に対する倍率は、
fhが属する帯域に対応する倍率と対極の倍率(この場合G
vy)にあるとして倍率を求める。以上のような手続きに
従い決定された倍率ベクトル{G(n); n = 0 〜 N/2}は、
図７のようになる。The magnification for frequency components in the range 0.0 to fl [Hz] in FIG. 6 (fl being the lowest frequency of Bpk(1)) cannot be obtained as it stands, because the magnification at frequency 0.0 has not been determined. The magnification at frequency 0.0 must therefore be assumed to lie between Gvy and Gpk. In this embodiment, the magnification at frequency 0.0 is assumed to be the opposite of the magnification of the band to which fl belongs (Gvy in this case). Similarly, for components in the range fh to 4000 [Hz] (fh being the highest frequency of Bpk(4)), the magnification at the upper end of the range is taken to be the opposite of the magnification of the band to which fh belongs (Gvy in this case). The magnification vector {G(n); n = 0 to N/2} determined by the above procedure is as shown in FIG. 7.
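Putting the band magnifications, the interpolation between bands, and the edge rule together, the magnification vector can be sketched as below. This is a minimal sketch under assumptions: the function name, the `(low_hz, high_hz, kind)` band tuples, the uniform bin spacing up to `f_max`, and the single Gpk and Gvy values are all choices made here for illustration.

```python
def magnification_vector(bands, g_pk, g_vy, n_bins, f_max=4000.0):
    """Return G(n) for n = 0..n_bins, where bin n sits at f_max * n / n_bins.

    bands: sorted, non-overlapping (low_hz, high_hz, kind) tuples with
    kind "pk" (convex) or "vy" (concave). Inside a band the band's own
    magnification is used; between bands it is linearly interpolated; at
    0 Hz and f_max the magnification opposite to the nearest band is assumed.
    """
    gain = {"pk": g_pk, "vy": g_vy}
    opposite = {"pk": g_vy, "vy": g_pk}

    # Build (frequency, magnification) knots of a piecewise-linear curve.
    knots = []
    if bands[0][0] > 0.0:
        knots.append((0.0, opposite[bands[0][2]]))
    for lo, hi, kind in bands:
        knots.append((lo, gain[kind]))
        knots.append((hi, gain[kind]))
    if bands[-1][1] < f_max:
        knots.append((f_max, opposite[bands[-1][2]]))

    # Evaluate the curve at every bin frequency.
    out = []
    j = 0
    for n in range(n_bins + 1):
        f = f_max * n / n_bins
        while j + 2 < len(knots) and f > knots[j + 1][0]:
            j += 1
        (f0, g0), (f1, g1) = knots[j], knots[j + 1]
        out.append(g0 if f1 == f0 else g0 + (g1 - g0) * (f - f0) / (f1 - f0))
    return out
```

With a single convex band from 1000 to 2000 Hz, Gpk = 2.0, Gvy = 1.0 and five bins spaced 1000 Hz apart, the curve rises from Gvy at 0 Hz to Gpk at the band and falls back to Gvy at 4000 Hz.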

【００６４】フィルタ構成部１０６における凸部帯域お
よび凹部帯域のいずれにも属さない周波数成分に対する
倍率の別の決定法として、予め定められた閾値以上に大
きく倍率が変化しないように制限を設ける方法がある。
例えば、図７に記載されている周波数f1〜f2、f3〜f4、
f5〜f6の間での倍率は急激に変化している。このような
区間では、近接する周波数で過度のスペクトル強調、ス
ペクトル減衰が行われてしまい、品質劣化を招く恐れが
ある。As another way for the filter configuration unit 106 to determine the magnification of frequency components that belong to neither a convex band nor a concave band, the magnification can be restricted so that it does not change by more than a predetermined threshold. For example, the magnification changes abruptly over the frequency ranges f1 to f2, f3 to f4, and f5 to f6 shown in FIG. 7. In such ranges, excessive spectrum enhancement and attenuation are applied at nearby frequencies, which may degrade quality.

【００６５】この問題を回避するために、予め定められ
た閾値以上に大きく倍率が変化しないように倍率に制限
を設ける方法は有効である。具体的な制限の方法とし
て、線形補間する際の補間関数の傾きの絶対値が閾値以
上とならないようにする方法がある。この方法を用いる
と、現在着目している周波数をftとしたとき、この周波
数ftに対する倍率Gftは、To avoid this problem, it is effective to restrict the magnification so that it does not change by more than a predetermined threshold. One concrete restriction keeps the absolute value of the slope of the interpolation function used in the linear interpolation below the threshold. With this method, taking ft as the frequency of interest, the magnification Gft for ft

【００６６】[0066]

【数１０】 (Equation 10)

【００６７】と求めることができる。ここで、Tslは倍
率に制限を与える閾値、sign( )は符号を返す関数を表
す。can be obtained as above. Here, Tsl is the threshold that limits the magnification, and sign( ) is a function that returns the sign of its argument.
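One way to read the slope restriction is sketched below. The exact form of the expression is not reproduced in this text, so this is an assumption: the slope of the interpolation line is measured from the lower band edge, its magnitude is clipped to Tsl with MIN, and its sign is kept with sign( ). The function name and the unit of `t_sl` (magnification per Hz) are choices made here.

```python
def limited_magnification(ft, f_low, g_low, f_high, g_high, t_sl):
    """Interpolated magnification at ft with the slope magnitude capped at t_sl.

    Assumed form: G(ft) = g_low + sign(slope) * MIN(|slope|, t_sl) * (ft - f_low),
    where slope is that of the unrestricted linear interpolation.
    """
    slope = (g_high - g_low) / (f_high - f_low)
    sign = 1.0 if slope >= 0.0 else -1.0
    limited = sign * min(abs(slope), t_sl)
    return g_low + limited * (ft - f_low)
```

When the unrestricted slope is already within the threshold, this reduces to ordinary linear interpolation. When it is not, the curve leaves the lower band edge more gently and does not reach `g_high` at `f_high`; how the remaining difference is handled is not specified by this sketch.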

【００６８】図８に、上記制限を設けた場合の倍率ベク
トル{G(n); n = 0 〜 N/2}の様子を示す。この図８で
は、凸部帯域、凹部帯域、凸部倍率、凹部倍率は図７と
同様のものを用いてある。図８から分かるように、隣接
する周波数で急激に倍率が変化する部分が無くなり、前
述したような問題を回避することが可能となる。FIG. 8 shows the magnification vector {G(n); n = 0 to N/2} when the above restriction is applied. FIG. 8 uses the same convex bands, concave bands, convex magnification, and concave magnification as FIG. 7. As FIG. 8 shows, the portions where the magnification changes abruptly between adjacent frequencies disappear, so the problem described above is avoided.

【００６９】次に、入力端子１０７から音声信号s(i)が
与えられる。この入力音声信号と入力端子1０１から入
力されるLPC係数とは時間的な対応、つまり同期がとら
れているものとする。時間−周波数変換部１０８では、
入力端子１０７から入力される音声信号がDFTやDCTなど
の周知の技術により周波数領域の信号に変換される。そ
の際、必要であれば窓掛けも行われる。入力信号を時間
−周波数変換して求めたスペクトルを{S(n); n=0 〜 N/
2}と表す。Next, the speech signal s(i) is supplied from the input terminal 107. This input speech signal and the LPC coefficients supplied from the input terminal 101 are assumed to correspond in time, that is, to be synchronized. The time-frequency conversion unit 108 converts the speech signal from the input terminal 107 into a frequency-domain signal by a well-known technique such as the DFT or DCT, applying a window if necessary. The spectrum obtained by this time-frequency conversion is denoted {S(n); n = 0 to N/2}.

【００７０】次に、乗算器１０９において入力される音
声信号のスペクトル{S(n); n=0 〜N/2}と倍率ベクトル
{G(n); n=0 〜 N/2}を次式に従い乗算を行う。ただし、
乗算後のスペクトルを{U(n); n=0 〜 N/2}とする。Next, the multiplier 109 multiplies the spectrum {S(n); n = 0 to N/2} of the input speech signal by the magnification vector {G(n); n = 0 to N/2} according to the following expression, where the spectrum after multiplication is denoted {U(n); n = 0 to N/2}.

【００７１】[0071]

【数１１】 [Equation 11]

【００７２】この処理は、入力される音声信号のスペク
トルを倍率ベクトルによって周波数領域で強調・減衰す
ることを意味する。This processing means that the spectrum of the input audio signal is emphasized and attenuated in the frequency domain by the magnification vector.
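The multiplication performed by the multiplier 109 amounts to a bin-by-bin product U(n) = G(n) S(n). A minimal sketch, assuming the spectrum is held as a list of complex DFT bins for n = 0 to N/2 and the magnifications as real numbers (the function name is hypothetical):

```python
def apply_magnification(spectrum, gains):
    """Return U(n) = G(n) * S(n) for every bin n = 0..N/2."""
    if len(spectrum) != len(gains):
        raise ValueError("spectrum and gains must have the same length")
    return [g * s for s, g in zip(spectrum, gains)]
```

Because each G(n) is real, only the amplitude of each bin changes and its phase is left untouched.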

【００７３】次に、乗算器１０９で算出されたスペクト
ル{U(n); n=0 〜 N/2}は、周波数−時間変換部１０８に
よって時間領域の信号に変換される。この変換は、時間
−周波数変換部１０８の逆変換として規定される。時間
−周波数変換部１０８で求められた時間領域の信号をu
(i)と表す。Next, the spectrum {U(n); n = 0 to N/2} calculated by the multiplier 109 is converted into a time-domain signal by the frequency-time conversion unit 110. This conversion is defined as the inverse of the transform performed by the time-frequency conversion unit 108. The time-domain signal obtained by the frequency-time conversion unit 110 is denoted u(i).

【００７４】次に、ゲイン算出部１１１においてスペク
トル強調後の信号ｕ（ｉ）のゲイン補正を行うためのゲ
インが算出される。ゲイン算出部１１１では、入力され
る音声信号s(i)とスペクトル強調後の信号u(i)を用い
て、補正ゲインgvを式(15)に従って求める。Next, the gain calculating section 111 calculates a gain for performing gain correction of the signal u (i) after spectrum enhancement. The gain calculator 111 calculates a correction gain gv according to the equation (15) using the input audio signal s (i) and the signal u (i) after spectrum enhancement.

【００７５】[0075]

【数１２】 (Equation 12)

【００７６】乗算器１１２では、スペクトル強調後の信
号u(i)に補正ゲインgvを乗じて、ゲイン補正後の信号v
(i)を生成する。ゲイン補正後の信号v(i)は、次式に従
い求められる。このゲイン補正後の信号v(i)は、出力端
子１１３から出力される。このようにしてスペクトル強
調がなされた音声信号を得ることができる。The multiplier 112 multiplies the spectrum-enhanced signal u(i) by the correction gain gv to generate the gain-corrected signal v(i), which is obtained according to the following expression. The gain-corrected signal v(i) is output from the output terminal 113. A spectrum-enhanced speech signal is obtained in this way.

【００７７】[0077]

【数１３】 (Equation 13)
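The gain correction can be sketched as follows. Expression (15) is not reproduced in this text, so the power-matching form of the gain is an assumption, chosen so that v(i) has the same power as s(i); the function names and the fallback gain of 1.0 for a silent frame are also choices made here.

```python
import math


def correction_gain(s, u):
    """Gain gv that makes gv * u have the same power as s (assumed form)."""
    power_in = sum(x * x for x in s)
    power_out = sum(x * x for x in u)
    return math.sqrt(power_in / power_out) if power_out > 0.0 else 1.0


def gain_corrected(s, u):
    """Return v(i) = gv * u(i)."""
    gv = correction_gain(s, u)
    return [gv * x for x in u]
```

If enhancement doubled the amplitude of every sample, the correction gain is 0.5 and the output returns to the input level while keeping the enhanced spectral shape.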

【００７８】なお、本実施形態では時間領域の信号同士
を用いてゲイン補正を行う例について示したが、周波数
領域の信号同士を用いてゲイン補正を行っても良い。具
体的には、乗算器１０９の出力信号のパワーを時間−周
波数変換部１０８の出力信号のパワーに一致するように
補正ゲインを算出すればよい。In the present embodiment, an example has been described in which the gain correction is performed using the signals in the time domain, but the gain correction may be performed using the signals in the frequency domain. Specifically, the correction gain may be calculated such that the power of the output signal of the multiplier 109 matches the power of the output signal of the time-frequency conversion unit 108.

【００７９】［第２の実施形態］音声の振幅スペクトル
概形において、凸部周波数及びその近傍のスペクトル概
形は、ピークの鋭さや幅の広さなどで各々異なる形状を
持つ。例として、図３における凸部周波数(pk(1),pk
(2),pk(3),pk(4))近傍のスペクトル概形に着目すると、
それぞれ異なる形状を有することが分かる。従って、こ
れらのスペクトル概形の形状に適応させて凸部帯域を決
定することが望ましいことは、第１の実施形態の説明中
で述べた通りである。[Second Embodiment] In the amplitude spectrum outline of speech, the outline at and around each convex frequency differs in shape, for example in how sharp or how wide the peak is. The outlines near the convex frequencies (pk(1), pk(2), pk(3), pk(4)) in FIG. 3, for example, each have a different shape. As stated in the description of the first embodiment, it is therefore desirable to determine the convex bands so that they adapt to the shape of these outlines.

【００８０】そこで、本発明の第２の実施形態として、
凸部周波数近傍のスペクトル概形の形状に適した凸部帯
域を決定する実施形態について図９を用いて説明する。
図９は、１つの凸部周波数pk(j)とその両端に位置する
２つの凹部周波数vy(k), vy(k+1)の様子を表している。A second embodiment of the present invention, which determines convex bands suited to the shape of the spectrum outline near each convex frequency, is therefore described with reference to FIG. 9. FIG. 9 shows one convex frequency pk(j) and the two concave frequencies vy(k) and vy(k+1) on either side of it.

【００８１】まず、凸部周波数pk(j)と左側に位置する
凹部周波数vy(k)との、予め定められた比率に基づく内
分点に位置する周波数FLを求める。同様に、凸部周波数
pk(j)と右側に位置する凹部周波数vy(k+1)との、予め定
められた比率に基づく内分点に位置する周波数FRを求め
る。そして、周波数FLからFRまでの帯域を凸部周波数pk
(j)を中心とする凸部帯域Bpk(j)とする。図９では、内
分点を求める際の比率を0.5としている。また、この方
法において、凸部帯域Bpk(j)が大きくなりすぎないよう
に、予め定められた長さ以下に制限を設けても良い。First, the frequency FL is obtained as the internal division point, at a predetermined ratio, between the convex frequency pk(j) and the concave frequency vy(k) to its left. Similarly, the frequency FR is obtained as the internal division point, at a predetermined ratio, between pk(j) and the concave frequency vy(k+1) to its right. The band from FL to FR is then taken as the convex band Bpk(j) centered on the convex frequency pk(j). In FIG. 9, the ratio used for the internal division points is 0.5. In this method, the convex band Bpk(j) may also be limited to a predetermined maximum width so that it does not become too wide.
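The internal-division rule can be sketched as below. The direction in which the ratio is measured (from the convex frequency toward each concave frequency) and the way the optional width limit shrinks the band (proportionally on both sides) are assumptions of this sketch, as are the function and parameter names.

```python
def convex_band(pk, vy_left, vy_right, ratio=0.5, max_width=None):
    """Return (FL, FR) for the convex band around the convex frequency pk.

    FL and FR divide the segments pk..vy_left and pk..vy_right internally
    at `ratio`, measured from pk. If max_width is given and the band is
    wider, both sides are shrunk by the same factor.
    """
    fl = pk - ratio * (pk - vy_left)
    fr = pk + ratio * (vy_right - pk)
    if max_width is not None and fr - fl > max_width:
        scale = max_width / (fr - fl)
        fl = pk - scale * (pk - fl)
        fr = pk + scale * (fr - pk)
    return fl, fr
```

With a convex frequency at 1000 Hz, concave frequencies at 800 Hz and 1400 Hz, and the ratio 0.5 used in FIG. 9, the band runs from 900 Hz to 1200 Hz.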

【００８２】また、図１０に示すように、凸部周波数pk
(j)の両側にそれぞれ位置する凹部周波数vy(k),vy(k+1)
のいずれか一方（この例ではvy(k+1)）が他方（この例
ではvy(k)）に比べ著しく離れた位置にある場合、凸部
帯域Bpk(j)が極端にずれてしまうおそれがある。As shown in FIG. 10, when one of the concave frequencies vy(k), vy(k+1) on either side of the convex frequency pk(j) (vy(k+1) in this example) lies much farther away than the other (vy(k) in this example), the convex band Bpk(j) may be skewed severely to one side.

【００８３】この問題を回避するために、凸部周波数pk
(j)と内分点に位置する周波数FL,FRとの距離を用いて凸
部帯域に制限を与える方法が考えられる。具体的には、
凸部周波数pk(j)とFLとの距離をDL、凸部周波数pk(j)と
FRとの距離をDRとすると、距離DL、DRは次のように表さ
れる。To avoid this problem, the convex band can be restricted using the distances between the convex frequency pk(j) and the internal division frequencies FL and FR. Specifically, let DL be the distance between pk(j) and FL, and DR the distance between pk(j) and FR; DL and DR are then expressed as follows.

【００８４】[0084]

【数１４】 [Equation 14]

【００８５】次に、DLの大きさがDRの定数C ( C > 1.0
)倍を超えているか、もしくはDRの大きさがDLのC倍を
超えているかを判定し、超えているようであったらDLま
たはDRを制限する。具体的には、Next, it is determined whether DL exceeds C times DR, or DR exceeds C times DL, where C is a constant (C > 1.0); if so, DL or DR is limited. Specifically,

【００８６】[0086]

【数１５】 (Equation 15)

【００８７】とし、DL'、DR'に対応する周波数FL'、FR'
を求め、FL'からFR'までの帯域を凸部周波数pk(j)の凸
部帯域Bpk(j)とする。図１０にその様子を示す。ここで
定数Cは2.0としている。また、この方法においても、凸
部帯域Bpk(j)が大きくなりすぎないように、予め定めら
れた長さ以下に制限を設けても良い。the limited distances DL' and DR' are set as above, the corresponding frequencies FL' and FR' are obtained, and the band from FL' to FR' is taken as the convex band Bpk(j) for the convex frequency pk(j). FIG. 10 illustrates this, with the constant C set to 2.0. In this method too, the convex band Bpk(j) may be limited to a predetermined maximum width so that it does not become too wide.
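The asymmetry limit can be sketched as follows. The expressions for DL, DR, DL' and DR' are not reproduced in this text, so the forms used here (DL = pk − FL, DR = FR − pk, each side clipped to C times the other) are assumptions consistent with the description; the function name is hypothetical.

```python
def limit_asymmetry(pk, fl, fr, c=2.0):
    """Return (FL', FR') so that neither side exceeds c times the other.

    Assumed form: DL' = MIN(DL, c * DR) and DR' = MIN(DR, c * DL),
    with DL = pk - fl and DR = fr - pk.
    """
    dl = pk - fl
    dr = fr - pk
    dl_limited = min(dl, c * dr)
    dr_limited = min(dr, c * dl)
    return pk - dl_limited, pk + dr_limited
```

For a convex frequency at 1000 Hz with FL = 900 Hz and FR = 1500 Hz, the right side (500 Hz) exceeds twice the left side (100 Hz) and is cut back to 200 Hz, giving a band from 900 Hz to 1200 Hz.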

【００８８】［第３の実施形態］図１１に、本発明の第
３の実施形態に係る音声スペクトル強調装置の構成を示
す。図１１において、図１と同じ名称を有する構成要素
は、機能も図１の場合と同じであるので、ここでは説明
を省略する。[Third Embodiment] FIG. 11 shows the configuration of a speech spectrum enhancing apparatus according to a third embodiment of the present invention. In FIG. 11, components having the same names as those in FIG. 1 have the same functions as those in FIG. 1, and a description thereof will be omitted.

【００８９】第１の実施形態では、スペクトル強調フィ
ルタを周波数領域で構成していたのに対し、本実施形態
では時間領域で構成する点が異なっており、図１におけ
る時間−周波数変換部１０８が除去され、これに伴って
周波数−時間変換部１１０も除去されている。また、図
１２は本実施形態の処理の流れを示すフローチャートで
あり、ステップＳ2001〜Ｓ2004の処理は図２のステップ
Ｓ1001〜Ｓ1004と同様である。Whereas the first embodiment constructs the spectrum enhancement filter in the frequency domain, this embodiment constructs it in the time domain. The time-frequency conversion unit 108 of FIG. 1 is therefore removed, and with it the frequency-time conversion unit 110. FIG. 12 is a flowchart of the processing in this embodiment; steps S2001 to S2004 are the same as steps S1001 to S1004 of FIG. 2.

【００９０】本実施形態では、フィルタ構成部２０６に
おいて、先ず凸部帯域／凹部帯域決定部２０４により決
定された凸部帯域及び凹部帯域、凸部倍率／凹部倍率決
定部２０５により決定された凸部倍率及び凹部倍率を用
いて、図７や図８で示したようなスペクトル強調フィル
タの特性（または倍率ベクトル) {G(n); n=0 〜 N/2}を
決定した後、この特性を有する時間領域のフィルタを構
成する（ステップＳ2005）。In this embodiment, the filter configuration unit 206 first determines the characteristic (magnification vector) {G(n); n = 0 to N/2} of the spectrum enhancement filter, as illustrated in FIG. 7 and FIG. 8, from the convex and concave bands determined by the convex band / concave band determining unit 204 and the convex and concave magnifications determined by the convex magnification / concave magnification determining unit 205. It then constructs a time-domain filter having this characteristic (step S2005).

【００９１】具体的には、まず所望のスペクトル強調フ
ィルタの特性のパワースペクトルを求め、このパワース
ペクトルを逆フーリエ変換して自己相関関数を求め、こ
の自己相関関数について周知のLPC分析法で分析を行
い、LPC係数{ap(m); m = 1 〜 M}を求める。このように
して求めたLPC係数{ap(m); m = 1 〜 M}を全極型フィル
タの特性として用いて時間領域のスペクトル強調フィル
タを構成し、このフィルタを用いてフィルタリング部２
１０によりフィルタリングを行って、入力される音声信
号のスペクトル強調を行い（ステップＳ2006）、次いで
ゲイン算出部２１１で補正ゲインを算出し（ステップＳ
2007）、この補正ゲインを乗算器２１２によりスペクト
ル強調後の音声信号に対して乗じる（ステップＳ200
8）。Specifically, the power spectrum of the desired spectrum enhancement filter characteristic is computed first, and its inverse Fourier transform gives an autocorrelation function. This autocorrelation function is analyzed by a well-known LPC analysis method to obtain the LPC coefficients {ap(m); m = 1 to M}. These coefficients are used as the characteristic of an all-pole filter to construct the time-domain spectrum enhancement filter, and the filtering unit 210 filters the input speech signal with it to perform spectrum enhancement (step S2006). The gain calculation unit 211 then calculates the correction gain (step S2007), and the multiplier 212 multiplies the spectrum-enhanced speech signal by this correction gain (step S2008).
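The route from the desired magnification vector to all-pole coefficients can be sketched as below. The text only says "a well-known LPC analysis method"; the Levinson-Durbin recursion used here is one such method and is an assumption, as are the function names and the sign convention A(z) = 1 + Σ a[m] z^-m, which may differ from the convention behind ap(m).

```python
import math


def autocorr_from_gains(gains, order):
    """Autocorrelation r[0..order] of the power spectrum G(n)^2, n = 0..N/2.

    Evaluates the inverse DFT of the symmetric power spectrum directly.
    """
    half = len(gains) - 1
    n_fft = 2 * half
    power = [g * g for g in gains]
    r = []
    for m in range(order + 1):
        acc = power[0] + power[half] * math.cos(math.pi * m)
        for n in range(1, half):
            acc += 2.0 * power[n] * math.cos(2.0 * math.pi * n * m / n_fft)
        r.append(acc / n_fft)
    return r


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a, err) with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err
```

Under this convention the enhancement filter would be 1/A(z), applied in the time domain as u(i) = s(i) − Σ a[m] u(i − m). A flat magnification vector yields coefficients that are all zero, so the filter passes the signal unchanged.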

【００９２】また、前述の自己相関関数を基に、所望の
特性に近くなるよう極零型フィルタを周知の方法に従い
生成して時間領域のスペクトル強調フィルタを構成し、
このフィルタを用いてフィルタリング部２１０によりフ
ィルタリングを行い、入力された音声信号のスペクトル
強調を行うこともできる。このようなフィルタとして
は、Modified Yule Walkerフィルタなどが知られてい
る。Alternatively, a pole-zero filter that approximates the desired characteristic can be generated from the above autocorrelation function by a well-known method and used as the time-domain spectrum enhancement filter, with the filtering unit 210 applying it to enhance the spectrum of the input speech signal. The modified Yule-Walker filter is a known example of such a filter.

【００９３】［第４の実施形態］図１３に、本発明の第
４の実施形態に係る音声スペクトル強調装置の構成を示
す。図１３において、図１と同じ名称を有する構成要素
は、機能も図１の場合と同じであるので、ここでは説明
を省略する。[Fourth Embodiment] FIG. 13 shows a configuration of a speech spectrum emphasizing apparatus according to a fourth embodiment of the present invention. In FIG. 13, components having the same names as those in FIG. 1 have the same functions as those in FIG.

【００９４】第１の実施形態では、振幅スペクトル概形
としてLPC係数から求められるLPCスペクトルを用いてい
たのに対して、本実施形態では時間−周波数変換部３０
８により算出される音声信号の周波数領域の信号を用い
て振幅スペクトル概形算出部３０２で振幅スペクトル概
形を求めている点が異なっている。従って、本実施形態
ではLPC係数を入力として与える必要がない。また、図
１４は本実施形態の処理の流れを示すフローチャートで
あり、ステップＳ3003〜Ｓ3010の処理は図２のステップ
Ｓ1003〜Ｓ1010と同様である。Whereas the first embodiment uses the LPC spectrum derived from the LPC coefficients as the amplitude spectrum outline, in this embodiment the amplitude spectrum outline calculation unit 302 derives the outline from the frequency-domain speech signal computed by the time-frequency conversion unit 308. This embodiment therefore does not need LPC coefficients as an input. FIG. 14 is a flowchart of the processing in this embodiment; steps S3003 to S3010 are the same as steps S1003 to S1010 of FIG. 2.

【００９５】すなわち、入力端子３０７から入力された
音声信号は、まず時間−周波数変換部３０８で周波数領
域の信号に変換され（ステップＳ3001）、これにより生
成されたスペクトル{S(n); n=0 〜 N/2}を用いて振幅ス
ペクトル概形算出部３０２で振幅スペクトル概形が算出
される（ステップＳ3002）。振幅スペクトル概形の具体
的な算出方法として、例えばスペクトル{S(n); n=0 〜
N/2}の移動平均値を用いる。移動平均スペクトル{Sa
(n); n=0 〜 N/2}は、スペクトル{S(n); n=0 〜 N/2}を
用いて次式のように算出できる。That is, the speech signal from the input terminal 307 is first converted into a frequency-domain signal by the time-frequency conversion unit 308 (step S3001), and the amplitude spectrum outline calculation unit 302 calculates the amplitude spectrum outline from the resulting spectrum {S(n); n = 0 to N/2} (step S3002). One concrete way to calculate the outline is to take a moving average of the spectrum {S(n); n = 0 to N/2}. The moving-average spectrum {Sa(n); n = 0 to N/2} can be calculated from {S(n); n = 0 to N/2} as follows.

【００９６】[0096]

【数１６】 (Equation 16)

【００９７】Mは移動平均の窓長を表す。このようにし
て求めた移動平均スペクトルは、凸部周波数／凹部周波
数決定部３０３に与えられる。M represents the window length of the moving average. The moving average spectrum obtained in this manner is provided to the convex / concave frequency determining unit 303.
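A moving-average outline can be sketched as follows. The exact expression is not reproduced in this text, so the details are assumptions: the window of length M is taken as centered on each bin, M is assumed odd, the input is the amplitude of each bin, and the window is truncated at the ends of the spectrum rather than padded.

```python
def moving_average_spectrum(magnitudes, window):
    """Return Sa(n): the mean of `magnitudes` over a centered window.

    magnitudes: amplitude of S(n) for n = 0..N/2.
    window: moving-average length M (assumed odd); truncated at the edges.
    """
    half = window // 2
    smoothed = []
    for n in range(len(magnitudes)):
        lo = max(0, n - half)
        hi = min(len(magnitudes), n + half + 1)
        smoothed.append(sum(magnitudes[lo:hi]) / (hi - lo))
    return smoothed
```

A longer window gives a smoother outline with fewer convex and concave frequencies, so M trades off detail against robustness to the harmonic fine structure of the spectrum.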

【００９８】また、本実施形態では、凸部倍率／凹部倍
率決定部３０４において、入力された音声信号のスペク
トル{S(n); n=0 〜 N/2}、または移動平均スペクトル{S
a(n); n=0 〜 N/2}を用いて凸部倍率と凹部倍率が決定
される。その具体的な方法としては、例えばスペクトル
{S(n); n=0 〜 N/2}のパワーを用いる方法、移動平均ス
ペクトル{Sa(n); n=0 〜 N/2}の上限値と下限値の差を
用いる方法などが考えられる。In this embodiment, the convex magnification / concave magnification determining unit 304 determines the convex and concave magnifications using the spectrum {S(n); n = 0 to N/2} of the input speech signal or the moving-average spectrum {Sa(n); n = 0 to N/2}. Possible methods include using the power of the spectrum {S(n); n = 0 to N/2}, or using the difference between the upper and lower limits of the moving-average spectrum {Sa(n); n = 0 to N/2}.

【００９９】［第５の実施形態］図１５に、本発明の第
５の実施形態に係る音声スペクトル強調装置の構成を示
す。図１５において、図１と同じ名称を有する構成要素
は、機能も図１の場合と同じであるので、ここでは説明
を省略する。また、図１６は本実施形態の処理の流れを
示すフローチャートであり、ステップＳ4001〜Ｓ4005，
Ｓ4007〜Ｓ4009の処理は図２のステップＳ1001〜Ｓ100
5，Ｓ1007〜Ｓ1009と同様である。[Fifth Embodiment] FIG. 15 shows the configuration of a speech spectrum enhancement apparatus according to a fifth embodiment of the present invention. Components in FIG. 15 that have the same names as in FIG. 1 also have the same functions, so their description is omitted here. FIG. 16 is a flowchart of the processing in this embodiment; steps S4001 to S4005 and S4007 to S4009 are the same as steps S1001 to S1005 and S1007 to S1009 of FIG. 2.

【０１００】本実施形態では、入力端子４０７から与え
られる信号が周波数領域の信号、例えば音声スペクトル
である点が第１の実施形態と異なっている。従って、図
１における時間−周波数変換部１０８は不要であり、ス
テップＳ4005で構成されたスペクトル強調フィルタの特
性が音声信号の周波数スペクトルに乗算される（ステッ
プＳ4006）。This embodiment is different from the first embodiment in that the signal supplied from the input terminal 407 is a signal in the frequency domain, for example, a voice spectrum. Therefore, the time-frequency conversion unit 108 in FIG. 1 is unnecessary, and the frequency spectrum of the audio signal is multiplied by the characteristic of the spectrum emphasis filter configured in step S4005 (step S4006).

【０１０１】本実施形態の構成は、周波数領域で符号化
を行う方式、例えばMBE(Multi-bandExcitation)符号化
などのように復号された信号が周波数領域で表される場
合に適用でき、このような場合には時間−周波数変換部
を省略できる利点がある。The configuration of the present embodiment can be applied to a method of performing encoding in the frequency domain, for example, when a decoded signal is represented in the frequency domain as in MBE (Multi-band Excitation) encoding. In such a case, there is an advantage that the time-frequency conversion unit can be omitted.

【０１０２】また、本実施形態の構成では、補正ゲイン
算出部４１１では入力された音声スペクトルと乗算器４
０８の出力であるスペクトル強調後の信号を用いて求め
られる。さらに、周波数−時間変換部４１０では、乗算
器４１２の出力であるゲイン補正後の信号を用いて変換
が行われ、この出力信号が出力端子４１３から出力され
る。In the configuration of this embodiment, the correction gain calculation unit 411 obtains the correction gain using the input speech spectrum and the spectrum-enhanced signal output by the multiplier 408. The frequency-time conversion unit 410 then transforms the gain-corrected signal output by the multiplier 412, and the resulting signal is output from the output terminal 413.

【０１０３】［第６の実施形態］図１７に、本発明の第
６の実施形態に係る音声スペクトル強調装置の構成を示
す。図１７において、図１、図１５と同じ名称を有する
構成要素は、機能も図１、図１５の場合と同じであるの
で、ここでは説明を省略する。また、図１８は本実施形
態の処理の流れを示すフローチャートであり、ステップ
Ｓ5002〜Ｓ5009の処理は図１６のステップＳ4002〜Ｓ40
09と同様である。[Sixth Embodiment] FIG. 17 shows the configuration of a speech spectrum enhancement apparatus according to a sixth embodiment of the present invention. Components in FIG. 17 that have the same names as in FIG. 1 and FIG. 15 also have the same functions, so their description is omitted here. FIG. 18 is a flowchart of the processing in this embodiment; steps S5002 to S5009 are the same as steps S4002 to S4009 of FIG. 16.

【０１０４】本実施形態は前述の第５の実施形態と最も
近く、異なる点は本実施形態ではLPC係数が与えられて
いない点にある。この場合、第４の実施形態と同様に、
入力端子５０７から与えられる音声スペクトルを用いて
振幅スペクトル概形算出部５０２で振幅スペクトル概形
を求める（ステップＳ5001）。This embodiment is closest to the fifth embodiment described above; it differs in that no LPC coefficients are supplied. In this case, as in the fourth embodiment, the amplitude spectrum outline calculation unit 502 derives the amplitude spectrum outline from the speech spectrum supplied from the input terminal 507 (step S5001).

【０１０５】［第７の実施形態］図１９は、本発明の第
７の実施形態として本発明に係るスペクトル強調装置を
音声符号化／復号化システムにおける音声復号化装置に
適用した例である。本実施形態では、音声符号化／復号
化システムとしてCELP方式を用いた場合について説明を
行うが、これに限定されるものではない。例えば、MBE
のような周波数領域で音声符号化を行う方法にも適用で
きる。また、本実施形態では第１の実施形態で示したス
ペクトル強調装置を用いた場合について説明を行うが、
これに限定されることはなく、他の実施形態を適用する
ことも可能である。本実施形態の処理の流れを示す図２
０と併せて、本実施形態の構成と動作を説明する。[Seventh Embodiment] FIG. 19 shows, as a seventh embodiment of the present invention, an example in which the spectrum enhancement apparatus according to the invention is applied to a speech decoding apparatus in a speech encoding/decoding system. This embodiment describes the case where the CELP scheme is used for the speech encoding/decoding system, but the invention is not limited to this; for example, it can also be applied to schemes that encode speech in the frequency domain, such as MBE. Likewise, this embodiment uses the spectrum enhancement apparatus of the first embodiment, but the other embodiments can be applied as well. The configuration and operation of this embodiment are described together with FIG. 20, which shows the processing flow.

【０１０６】図１９に示す音声復号化装置は、大きく音
声復号部６０１とスペクトル強調部６０２からなる。ま
た、図２０において、ステップＳ6001〜ステップＳ6007
に示される処理は音声復号処理、ステップＳ6008〜ステ
ップＳ6017に示される処理はスペクトル強調処理を表
す。The speech decoding apparatus shown in FIG. 19 consists broadly of a speech decoding unit 601 and a spectrum enhancement unit 602. In FIG. 20, steps S6001 to S6007 represent the speech decoding process and steps S6008 to S6017 represent the spectrum enhancement process.

[0107] An encoded bitstream representing a speech signal that has been compression-encoded by a speech encoding unit (not shown) is input from input terminal 603 to the speech decoding unit 601. The demultiplexer 604 separates and converts the input bitstream into an LPC coefficient index, an ACB vector index, an SCB vector index, and a gain index (step S6001).

[0108] The LPC coefficient decoding unit 605 decodes the LPC coefficients {αq(i); i = 1 to NP} from the LPC coefficient index (step S6002). The decoded LPC coefficients are supplied to the synthesis filter 612 and to the spectrum enhancement unit 602. The ACB vector decoding unit 606 decodes the ACB vector from the ACB vector index (step S6003), and the SCB vector decoding unit 607 decodes the SCB vector from the SCB vector index (step S6004). Similarly, the gain decoding unit 608 decodes the ACB vector gain and the SCB vector gain from the gain index (step S6005).

[0109] The multiplier 610 multiplies the ACB vector by the ACB vector gain, and the multiplier 609 multiplies the SCB vector by the SCB vector gain. The adder 611 adds the outputs of multipliers 610 and 609 to generate the excitation signal (step S6006). The excitation signal ex(n) is therefore expressed by the following equation.

[0110]

[Equation 17]
$$ex(n) = g_p \, p(n) + g_c \, c(n)$$

[0111] Here, p(n) is the ACB vector, c(n) is the SCB vector, g_p is the ACB vector gain, and g_c is the SCB vector gain.
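The excitation generation of step S6006 can be sketched in Python as follows. Only the relation ex(n) = gp * p(n) + gc * c(n) comes from the text; the function name `build_excitation` and the sample vectors and gains are hypothetical values chosen for illustration.

```python
def build_excitation(p, c, gp, gc):
    """Return ex(n) = gp * p(n) + gc * c(n) for one frame.

    p, c : decoded ACB and SCB vectors (same length)
    gp, gc : decoded ACB and SCB vector gains
    """
    if len(p) != len(c):
        raise ValueError("ACB and SCB vectors must have the same length")
    return [gp * pn + gc * cn for pn, cn in zip(p, c)]


# Hypothetical decoded vectors and gains, for illustration only.
ex = build_excitation([1.0, 0.5, -0.25, 0.0], [0.0, 1.0, 0.0, -1.0], gp=0.5, gc=2.0)
print(ex)  # [0.5, 2.25, -0.125, -2.0]
```

The two multiplications correspond to multipliers 610 and 609, and the addition to adder 611.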

[0112] The excitation signal ex(n) is passed through the synthesis filter 612, which is built from the LPC coefficients {αq(i); i = 1 to NP}, to generate the synthesized signal so(n) (step S6007). The synthesized signal so(n) is obtained according to the following equation.

[0113]

[Equation 18]
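The sign convention of the LPC coefficients is not visible in this passage, so the following is only a sketch of the synthesis step S6007 under an assumed convention: A(z) = 1 + Σ αq(i) z^(-i), which gives so(n) = ex(n) - Σ αq(i) so(n-i) for i = 1 to NP. That convention, the function name `synthesize`, and the example coefficient are assumptions; with the opposite convention the subtraction becomes an addition.

```python
def synthesize(ex, alpha, history=None):
    """All-pole synthesis: so(n) = ex(n) - sum_i alpha[i-1] * so(n-i).

    ASSUMPTION: A(z) = 1 + sum_i alpha(i) z^-i (flip the sign otherwise).
    history holds past outputs, most recent first: [so(-1), so(-2), ...].
    """
    order = len(alpha)
    past = list(history) if history is not None else [0.0] * order
    out = []
    for x in ex:
        y = x - sum(a * s for a, s in zip(alpha, past))
        out.append(y)
        past = ([y] + past)[:order]  # shift the newest output into the memory
    return out


# Impulse response of a hypothetical first-order filter with alpha(1) = -0.5.
so = synthesize([1.0, 0.0, 0.0, 0.0], alpha=[-0.5])
print(so)  # [1.0, 0.5, 0.25, 0.125]
```

Passing the last NP outputs of one frame as `history` for the next keeps the filter state continuous across frames.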

[0114] The synthesized signal so(n) generated in this way is supplied to the spectrum enhancement unit 602. The spectrum enhancement unit 602 has the same configuration as the spectrum enhancement apparatus of the first embodiment shown in FIG. 1.

[0115] In the speech spectrum enhancement unit 602, the LPC spectrum calculation unit 102 uses the LPC coefficients {αq(i); i = 1 to NP} supplied from the speech decoding unit 601, which carry information on the amplitude spectrum of the speech signal, to calculate the LPC spectrum, that is, the amplitude spectrum envelope, in the same way as in the first embodiment (step S6008).
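The LPC spectrum of step S6008 can be pictured as the magnitude response of the all-pole model evaluated on a uniform frequency grid. The formula of the first embodiment is not repeated in this passage, so the convention A(z) = 1 + Σ αq(i) z^(-i), the choice of grid, and the name `lpc_spectrum` are assumptions of this sketch.

```python
import cmath
import math


def lpc_spectrum(alpha, num_bins):
    """Return 1 / |A(e^{jw})| at w = pi * k / num_bins, k = 0 .. num_bins - 1.

    ASSUMPTION: A(z) = 1 + sum_i alpha(i) z^-i.
    """
    envelope = []
    for k in range(num_bins):
        w = math.pi * k / num_bins
        a = 1.0 + sum(coef * cmath.exp(-1j * w * (i + 1))
                      for i, coef in enumerate(alpha))
        envelope.append(1.0 / abs(a))
    return envelope


# Hypothetical first-order coefficient giving a low-pass shaped envelope.
env = lpc_spectrum([-0.5], num_bins=8)
print(round(env[0], 3))  # 2.0
```

Because the envelope is derived from the decoded LPC coefficients alone, it is available to the enhancement unit without analysing the synthesized signal again.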

[0116] Next, the convex frequency/concave frequency determination unit 103 detects the convex frequencies and concave frequencies from the LPC spectrum (step S6009), and the convex band/concave band determination unit 104 determines the convex bands and concave bands from those frequencies (step S6010).
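Steps S6009 and S6010 can be pictured as picking the local maxima and minima of the envelope and then widening each one into a band. In this sketch the convex frequencies are taken to be local maxima and the concave frequencies local minima, and every band spans a fixed number of bins on either side of its center. The fixed half-width is a placeholder assumption, since the band rules of the earlier embodiments are not repeated in this passage, and the function names are hypothetical.

```python
def find_peaks_and_valleys(env):
    """Return (peaks, valleys): bin indices of local maxima and local minima."""
    peaks, valleys = [], []
    for k in range(1, len(env) - 1):
        if env[k - 1] < env[k] >= env[k + 1]:
            peaks.append(k)
        elif env[k - 1] > env[k] <= env[k + 1]:
            valleys.append(k)
    return peaks, valleys


def bands_around(centers, half_width, num_bins):
    """Return (low, high) bin ranges of +/- half_width bins, clipped to the grid."""
    return [(max(0, c - half_width), min(num_bins - 1, c + half_width))
            for c in centers]


# Hypothetical envelope with two convex parts and one concave part.
env = [1, 2, 5, 2, 1, 0.5, 1, 2, 6, 2, 1]
peaks, valleys = find_peaks_and_valleys(env)
print(peaks, valleys)                      # [2, 8] [5]
print(bands_around(peaks, 1, len(env)))    # [(1, 3), (7, 9)]
print(bands_around(valleys, 1, len(env)))  # [(4, 6)]
```

With a fixed half-width, neighbouring bands can overlap when a convex and a concave frequency lie close together; a practical rule would have to resolve that case.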

[0117] The convex magnification/concave magnification determination unit 105 determines the convex magnification and the concave magnification using the LPC coefficients {αq(i); i = 1 to NP} (step S6011). The filter construction unit 106 then constructs the spectrum enhancement filter from the convex and concave bands and the convex and concave magnifications (step S6012).
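A minimal sketch of the filter construction in step S6012 is shown below: each frequency bin receives the convex magnification if it lies in a convex band and the concave magnification if it lies in a concave band. The magnification values 1.2 and 0.8, the neutral gain of 1.0 for bins outside every band, and the function name are assumptions of this sketch; how unit 105 derives the magnifications from the LPC coefficients is not repeated in this passage.

```python
def build_enhancement_filter(num_bins, convex_bands, concave_bands,
                             convex_gain, concave_gain, neutral_gain=1.0):
    """Return one gain per frequency bin.

    convex_bands, concave_bands : lists of inclusive (low, high) bin ranges
    neutral_gain : gain for bins in neither kind of band (assumed value)
    """
    gains = [neutral_gain] * num_bins
    for low, high in convex_bands:
        for k in range(low, high + 1):
            gains[k] = convex_gain   # emphasize the convex band uniformly
    for low, high in concave_bands:
        for k in range(low, high + 1):
            gains[k] = concave_gain  # attenuate the concave band uniformly
    return gains


gains = build_enhancement_filter(11, [(1, 3), (7, 9)], [(4, 6)],
                                 convex_gain=1.2, concave_gain=0.8)
print(gains)
# [1.0, 1.2, 1.2, 1.2, 0.8, 0.8, 0.8, 1.2, 1.2, 1.2, 1.0]
```

Using a single gain across each band is what keeps the position of a convex or concave frequency from moving after filtering.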

[0118] Next, the time-frequency conversion unit 108 converts the synthesized signal so(n) into a frequency-domain signal (step S6013), the multiplier 109 multiplies this frequency-domain signal by the spectrum enhancement filter (step S6014), and the frequency-time conversion unit 110 converts the product back into a time-domain signal (step S6015).
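Steps S6013 to S6015 amount to transform, multiply, and inverse transform. The sketch below uses a direct DFT written with the standard library; the choice of transform, the absence of windowing or frame overlap, and the function names are assumptions made to keep the example short, not details taken from the text.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform (O(N^2), for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def apply_filter(so, gains):
    """Multiply the spectrum of so(n) by real per-bin gains and transform back.

    gains covers bins 0 .. N/2 and is mirrored onto the upper bins so that
    the filtered signal stays real.
    """
    n = len(so)
    if len(gains) != n // 2 + 1:
        raise ValueError("gains must have N // 2 + 1 entries")
    spec = dft(so)
    filtered = [spec[k] * gains[min(k, n - k)] for k in range(n)]
    return idft(filtered)


# Hypothetical frame: a cosine on bin 1, with that bin attenuated by 0.5.
y = apply_filter([1.0, 0.0, -1.0, 0.0], gains=[1.0, 0.5, 1.0])
print([round(v, 6) for v in y])
```

Because the gains are real and symmetric, the filter changes only the magnitude of each bin and leaves its phase untouched.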

[0119] Next, the gain calculation unit 111 calculates a correction gain from the synthesized signal so(n) and the output signal of the frequency-time conversion unit 110 (step S6016), the multiplier 112 multiplies the output signal of the frequency-time conversion unit 110 by the correction gain (step S6017), and the result is output from output terminal 613.
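The formula used by the gain calculation unit 111 is not repeated in this passage. One common choice, assumed here purely for illustration, is to scale the enhanced signal so that its frame energy matches that of the synthesized signal so(n). The function names and sample values are hypothetical.

```python
import math


def correction_gain(so, enhanced):
    """Return g such that g * enhanced has the same energy as so.

    ASSUMPTION: energy matching; the source may define the gain differently.
    """
    energy_in = sum(s * s for s in so)
    energy_out = sum(v * v for v in enhanced)
    if energy_out == 0.0:
        return 1.0  # nothing to scale; avoid dividing by zero
    return math.sqrt(energy_in / energy_out)


def apply_gain(signal, gain):
    """Scale every sample by the correction gain (multiplier 112)."""
    return [gain * v for v in signal]


g = correction_gain([3.0, 4.0], [6.0, 8.0])
print(g)                          # 0.5
print(apply_gain([6.0, 8.0], g))  # [3.0, 4.0]
```

Some correction of this kind is needed because emphasizing convex bands and attenuating concave bands generally changes the overall level of the frame.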

[0120] In this way, the speech spectrum enhancement unit 602 according to the present invention applies spectrum enhancement to the decoded speech signal output from the speech decoding unit 601.

[0121]

[Effects of the Invention] As described above, according to the present invention, the convex frequencies and concave frequencies are obtained from the amplitude spectrum envelope of the speech signal, the amplitude spectrum is emphasized for frequency components in the convex bands, and it is attenuated for frequency components in the concave bands. As a result, good spectrum enhancement can be performed without, in principle, producing an inappropriate spectral tilt in which the low band or the high band is excessively emphasized.

[0122] Furthermore, because each convex frequency and its vicinity are emphasized by the same magnification, and each concave frequency and its vicinity are attenuated by the same magnification, the convex and concave frequencies do not shift.

[0123] In addition, because the magnifications are determined so that convex frequencies are emphasized and concave frequencies are attenuated, the problem of a concave frequency sandwiched between two closely spaced convex frequencies being emphasized is avoided.

[0124] Thus, the present invention enables ideal spectrum enhancement that produces no inappropriate spectral tilt and neither shifts the convex parts of the spectrum nor emphasizes its concave parts.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a speech spectrum enhancement apparatus according to an embodiment of the present invention.

FIG. 2 is a flowchart showing the processing procedure of the same embodiment.

FIG. 3 is a diagram showing an example of the distribution of convex frequencies and concave frequencies, for explaining the operation of the same embodiment.

FIG. 4 is a diagram schematically showing an example of the relationship between convex bands and concave bands, for explaining the operation of the same embodiment.

FIG. 5 is a diagram schematically showing another example of the relationship between convex bands and concave bands, for explaining the operation of the same embodiment.

FIG. 6 is a diagram showing the relationship between the convex and concave bands and the convex and concave magnifications, for explaining the operation of the filter construction unit in the same embodiment.

FIG. 7 is a diagram showing an example of a method for determining the convex magnification and the concave magnification in the same embodiment.

FIG. 8 is a diagram showing another example of the method for determining the convex magnification and the concave magnification in the same embodiment.

FIG. 9 is a diagram showing a method for determining the convex band in the second embodiment of the present invention.

FIG. 10 is a diagram showing another example of the method for determining the convex band in the same embodiment.

FIG. 11 is a block diagram showing the configuration of a speech spectrum enhancement apparatus according to the third embodiment of the present invention.

FIG. 12 is a flowchart showing the processing procedure of the same embodiment.

FIG. 13 is a block diagram showing the configuration of a speech spectrum enhancement apparatus according to the fourth embodiment of the present invention.

FIG. 14 is a flowchart showing the processing procedure of the same embodiment.

FIG. 15 is a block diagram showing the configuration of a speech spectrum enhancement apparatus according to the fifth embodiment of the present invention.

FIG. 16 is a flowchart showing the processing procedure of the same embodiment.

FIG. 17 is a block diagram showing the configuration of a speech spectrum enhancement apparatus according to the sixth embodiment of the present invention.

FIG. 18 is a flowchart showing the processing procedure of the same embodiment.

FIG. 19 is a block diagram showing the configuration of a speech decoding apparatus according to the seventh embodiment of the present invention.

FIG. 20 is a flowchart showing the processing procedure of the same embodiment.

FIG. 21 is a diagram showing the LPC spectrum of a synthesis filter and the spectral characteristic of a spectrum enhancement filter, for explaining a first prior art.

FIG. 22 is a diagram showing the LPC spectrum of a synthesis filter and the spectral characteristic of a spectrum enhancement filter, for explaining a second prior art.

[Explanation of symbols]

101, 201, 401: LPC coefficient input terminal
102, 202, 402, 502: LPC spectrum calculation unit
302: amplitude spectrum envelope calculation unit
103, 203, 303, 403, 503: convex frequency/concave frequency determination unit
104, 204, 304, 404, 504: convex band/concave band determination unit
105, 205, 305, 405, 505: convex magnification/concave magnification determination unit
106, 206, 306, 406, 506: filter construction unit
107, 207, 307: speech input terminal
407, 507: speech spectrum input terminal
108, 308: time-frequency conversion unit
109, 309: multiplier
110, 310, 410, 510: frequency-time conversion unit
210: filtering unit
111, 211, 311, 411, 511: gain calculation unit
112, 212, 312, 412, 512: multiplier
113, 213, 313, 413, 513, 613: speech output terminal
601: speech decoding unit
602: speech spectrum enhancement unit
603: encoded speech bitstream input terminal

Continued from the front page. F-terms (reference): 5D045 CA01 CB01 5J064 AA01 BB03 BB08 BB12 BB13 BC02 BC09 BC11 BC27 BD01 CA01 CB13

Claims

[Claims]

1. A speech spectrum enhancement method comprising: obtaining an amplitude spectrum envelope of a speech signal; determining a convex band and a concave band that include a convex frequency and a concave frequency of the envelope, respectively; constructing a filter having a characteristic of emphasizing the amplitude spectrum of frequency components included in the convex band and attenuating the amplitude spectrum of frequency components included in the concave band; and filtering the speech signal with the filter.

2. A speech spectrum enhancement apparatus comprising: means for obtaining an amplitude spectrum envelope of a speech signal; means for obtaining a convex frequency and a concave frequency of the amplitude spectrum envelope; means for determining, from the convex frequency and the concave frequency, a convex band and a concave band that include the convex frequency and the concave frequency, respectively; means for constructing a filter having a characteristic of emphasizing the amplitude spectrum of frequency components included in the convex band and attenuating the amplitude spectrum of frequency components included in the concave band; and means for filtering the speech signal with the filter.

3. A speech spectrum enhancement apparatus comprising: means for obtaining an amplitude spectrum envelope of a speech signal; means for obtaining a convex frequency and a concave frequency of the amplitude spectrum envelope; means for determining a convex band and a concave band that include the convex frequency and the concave frequency, respectively; means for constructing a filter that emphasizes the amplitude spectrum of frequency components included in the convex band by multiplying it by a predetermined convex magnification, attenuates the amplitude spectrum of frequency components included in the concave band by multiplying it by a predetermined concave magnification, and multiplies the amplitude spectrum of frequency components included in neither the convex band nor the concave band by a magnification set to the maximum of the convex magnification or more and the minimum of the concave magnification or more; and means for filtering the speech signal with the filter.

4. The speech spectrum enhancement apparatus according to claim 2, wherein the filtering means performs the filtering process in the frequency domain.

5. The speech spectrum enhancement apparatus according to claim 2 or 3, wherein the filtering means performs the filtering process in the time domain.

6. The speech spectrum enhancement apparatus according to claim 2, wherein the means for obtaining the amplitude spectrum envelope obtains an LPC spectrum as the amplitude spectrum envelope.

7. The speech spectrum enhancement apparatus according to claim 2 or 3, wherein the means for determining the convex band and the concave band determines at least one of the width and the frequency position of the convex band based on the positional relationship between the convex frequency and the concave frequencies located on both sides of it.

8. The speech spectrum enhancement apparatus according to claim 2, further comprising means for determining the convex magnification and the concave magnification based on the amplitude spectrum envelope.

9. A speech decoding apparatus comprising: a speech decoding unit that decodes encoded data of a speech signal and outputs a decoded speech signal and a parameter including at least information on the amplitude spectrum of the speech signal; and a spectrum enhancement unit that receives the decoded speech signal and the parameter from the speech decoding unit and is constituted by the speech spectrum enhancement apparatus according to any one of claims 2 to 8, wherein the spectrum enhancement unit obtains the amplitude spectrum envelope from the parameter and performs the filtering process on the decoded speech signal.