JP2002091498A

JP2002091498A - Audio signal encoding device

Info

Publication number: JP2002091498A
Application number: JP2000283171A
Authority: JP
Inventors: Katsuyoshi Nishitani; 勝義西谷
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2000-09-19
Filing date: 2000-09-19
Publication date: 2002-03-27

Abstract

PROBLEM TO BE SOLVED: To perform section determination on MDCT coefficients before the coefficients are quantized. SOLUTION: The device comprises a psychoacoustic analyzer 1 that calculates psychoacoustic parameters from an audio signal; an MDCT unit 2 that converts the audio signal into MDCT coefficients; a scale factor calculator 3 that calculates the scale factor of each scale factor band of the MDCT coefficients so that the quantization noise of the MDCT coefficients does not exceed the permissible quantization noise power given by the psychoacoustic parameters; a section determiner 9 that divides the set of scale factor bands into a plurality of sections; a quantizer 4 that quantizes the MDCT coefficients using the scale factors and the overall number of quantization steps; a variable-length encoder 6 that variable-length encodes the quantized values; a bit count determiner 7 that determines whether the number of encoded bits is within the number of usable bits; and a bit stream generator 8 that assembles the encoded data and generates a bit stream.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to an audio signal encoding device that applies the MPEG-2 AAC scheme and is configured to perform section determination on the MDCT coefficients before the MDCT coefficients are quantized.

[0002]

[Prior art] There are various audio signal encoding methods. One example of a method that converts an audio signal input on the time axis into an audio signal on the frequency axis and performs high-efficiency encoding is the MPEG-2 AAC (Moving Picture Image Coding Experts Group Phase 2 Advanced Audio Coding) scheme.

[0003] FIG. 4 is a block diagram for explaining a conventional audio signal encoding device, and FIG. 5 is a diagram for explaining an example of sectioning of MDCT coefficients.

[0004] The conventional audio signal encoding device shown in FIG. 4 applies the MPEG-2 AAC scheme described above.

[0005] As shown in FIG. 4, in the conventional audio signal encoding device, the time-domain audio signal is input to both the psychoacoustic analyzer 1 and the MDCT (Modified Discrete Cosine Transform) unit 2.

[0006] First, the psychoacoustic analyzer 1 applies FFT processing to the input time-domain audio signal to obtain a frequency spectrum and calculates masking from that spectrum. It then computes, as psychoacoustic parameters, the permissible quantization noise power for each frequency band preset on the basis of human auditory characteristics, and outputs this per-band permissible quantization noise power to the scale factor calculator 3.

[0007] Meanwhile, the MDCT unit 2 converts the input time-domain audio signal into frequency-domain MDCT coefficients (spectral data) and outputs these MDCT coefficients to the scale factor calculator 3 and the quantizer 4. The transform is executed with successive computation blocks overlapped by 50%; for example, 2048 samples are converted into 1024 MDCT coefficients.
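The 50% overlap and the 2048-to-1024 mapping can be illustrated with a minimal Python sketch. It assumes the textbook MDCT definition evaluated directly and omits the analysis window that a real AAC encoder applies; the function names `mdct` and `overlapped_frames` are illustrative and do not come from the patent.

```python
import math


def mdct(frame):
    """Direct O(N^2) MDCT: a frame of 2N samples yields N coefficients.

    Assumption: textbook definition, no analysis window applied.
    """
    two_n = len(frame)
    n = two_n // 2
    phase = (n + 1) / 2.0  # the usual n + 1/2 + N/2 time offset
    return [
        sum(
            frame[t] * math.cos(math.pi / n * (t + phase) * (k + 0.5))
            for t in range(two_n)
        )
        for k in range(n)
    ]


def overlapped_frames(signal, frame_len=2048):
    """Split a signal into frames that overlap their neighbours by 50%."""
    hop = frame_len // 2  # each new frame advances by half a frame
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]
```

With `frame_len=2048`, each call to `mdct` returns 1024 coefficients, and every input sample (apart from those at the very ends of the signal) falls into two consecutive frames.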

[0008] Next, the scale factor calculator 3 divides the 1024 MDCT coefficients into a plurality of scale factor bands, one for each frequency band defined on the basis of human auditory characteristics. It calculates the scale factor (number of quantization steps) of each scale factor band so that the quantization noise calculated for that band does not exceed the permissible quantization noise power of the corresponding frequency band calculated by the psychoacoustic analyzer 1, and outputs the scale factors to the quantizer 4 and the variable-length encoder 6.

[0009] Next, the quantizer 4 quantizes the MDCT coefficients from the MDCT unit 2 one scale factor band at a time. It quantizes the MDCT coefficients in each scale factor band using the scale factor of that band calculated by the scale factor calculator 3 and the overall number of quantization steps, and outputs the quantized MDCT coefficients to the section generator 5. The overall number of quantization steps is controlled so that the number of bits required after quantization stays within the number of usable bits.

[0010] Next, the section generator 5 performs sectioning by dividing the quantized MDCT coefficients output from the quantizer 4 into a plurality of blocks (hereinafter called sections), as shown in FIG. 5. Encoding uses a plurality of Huffman codebooks, and the codebook to be used is changed according to the maximum quantized value within each scale factor band. Within a section, the codebook to be used is determined by the maximum value among the scale factor bands the section contains. Sectioning is one technique for improving coding efficiency: consecutive scale factor bands are grouped into one section and the whole section is encoded with a single Huffman codebook. This reduces the code amount spent on identifying Huffman codebooks and improves coding efficiency, particularly at low coding rates.

[0011] In the example of sectioning of MDCT coefficients shown in FIG. 5, sfbN denotes the N-th scale factor band. section0, consisting of sfb0 to sfb2, uses codebook0; section1, consisting of sfb3 to sfb6, uses codebook1; section2, consisting of sfb7 to sfb8, uses codebook2; and likewise sectionM, consisting of sfbM-1 to sfbM+1, uses codebookM.

[0012] For example, when a stereo signal with a sampling frequency of 48 kHz is encoded at a coding rate of 64 kbps, the average number of bits allocated per frame per channel is 682. If a different Huffman codebook is used for each scale factor band, 441 bits are needed just to identify the Huffman codebooks, so fewer bits remain for the quantized values and sound quality deteriorates. Sectioning therefore reduces the code amount for the Huffman codebooks and allocates more code to the quantized values, which improves sound quality. In addition, a plurality of section patterns are prepared and sent to the variable-length encoder 6.
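The two figures quoted above can be reproduced with simple arithmetic. The sketch below is one reading that matches the numbers, not a breakdown given in the patent: it assumes 1024 new samples per frame, 64 kbps taken as 64000 bit/s, 49 scale factor bands for a long block at 48 kHz, and 9 bits of side information per section (a 4-bit codebook index plus a 5-bit section length). The function names are illustrative.

```python
def bits_per_frame_per_channel(bitrate_bps, sample_rate_hz, channels,
                               samples_per_frame=1024):
    """Average bit budget for one frame of one channel (rounded down)."""
    return (bitrate_bps * samples_per_frame) // (sample_rate_hz * channels)


def codebook_side_info_bits(num_sections, bits_per_section=9):
    """Bits spent on naming a codebook and a length for every section.

    Assumption: 4-bit codebook index + 5-bit section length = 9 bits.
    """
    return num_sections * bits_per_section


budget = bits_per_frame_per_channel(64000, 48000, channels=2)
worst_case = codebook_side_info_bits(num_sections=49)  # one section per band
print(budget, worst_case)
```

Under these assumptions, one section per scale factor band consumes roughly two thirds of the frame budget, which is why merging bands into fewer sections matters most at low coding rates.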

[0013] Next, the variable-length encoder 6 variable-length encodes the quantized MDCT coefficients output from the quantizer 4 section by section, as determined by the section generator 5, using the same Huffman codebook throughout each section. The scale factors are also variable-length encoded to reduce redundancy. From among the plurality of section patterns, the pattern that generates the smallest code amount is selected and output to the bit count determiner 7.

[0014] Next, the bit count determiner 7 determines whether the number of bits produced by the variable-length encoder 6 is within the number of usable bits. If the result exceeds the number of usable bits, processing returns to the quantizer 4 and quantization is performed again; this is repeated until the number of generated bits no longer exceeds the number of usable bits. The variable-length encoded data that satisfies the bit budget is then multiplexed by the bit stream generator 8 with encoding parameters such as the sampling frequency and coding rate, and transmitted as a bit stream.

[0015]

[Problems to be solved by the invention] Although the conventional audio signal encoding device described above applies the MPEG-2 AAC scheme, no best method for sectioning the MDCT coefficients has been established. Moreover, every time the MDCT coefficients are quantized, the quantized values must be sectioned again and the division giving the smallest number of bits must be selected from several candidate section divisions. As a result, audio signal encoding takes a long time to process.

[0016]

[Means for solving the problems] The present invention has been made in view of the above problems. A first invention is an audio signal encoding device that receives and encodes an audio signal, comprising: psychoacoustic analysis means for calculating psychoacoustic parameters based on human auditory characteristics from the input audio signal; MDCT conversion means for converting the input audio signal into MDCT coefficients (spectral data); scale factor calculation means for dividing the MDCT coefficients obtained by the MDCT conversion means into a plurality of scale factor bands and calculating the scale factor of each scale factor band so that the quantization noise calculated from the MDCT coefficients of that band does not exceed the permissible quantization noise power of the psychoacoustic parameters calculated by the psychoacoustic analysis means; section determination means for obtaining the maximum value of the MDCT coefficients for each scale factor band, normalizing this value based on the scale factor calculated by the scale factor calculation means, and dividing the set of scale factor bands into a plurality of sections; quantization means for quantizing the MDCT coefficients one scale factor band at a time using the scale factor of each band calculated by the scale factor calculation means and the overall number of quantization steps; variable-length encoding means for variable-length encoding the quantized values from the quantization means, for each section defined by the section determination means, using the Huffman codebook corresponding to that section; bit count determination means for determining whether the number of bits encoded by the variable-length encoding means is within the number of usable bits; and bit stream generation means that is supplied with the output of the scale factor calculation means and the output of the bit count determination means and assembles the encoded data to generate a bit stream.

[0017] A second invention is the audio signal encoding device of the first invention, wherein the section determination means obtains and normalizes the maximum value of the MDCT coefficients for each scale factor band, determines the section division according to the coding rate, and, for each section, obtains the maximum normalized value of the section from the normalized values of its scale factor bands and determines the Huffman codebook to be used.

[0018] A third invention is the audio signal encoding device of the first invention, wherein the section determination means obtains and normalizes the maximum value of the MDCT coefficients for each scale factor band, selects the highest normalized values of the scale factor bands, the number selected depending on the coding rate, determines the number of scale factor bands included in each section based on the selected normalized values to determine the section division, and, for each section, obtains the maximum normalized value of the section from the normalized values of its scale factor bands and determines the Huffman codebook to be used.

[0019]

[Embodiments of the invention] An embodiment of the audio signal encoding device according to the present invention is described in detail below with reference to FIGS. 1 to 3.

[0020] FIG. 1 is a block diagram for explaining the audio signal encoding device according to the present invention. FIG. 2 is a diagram for explaining an example of section determination of MDCT coefficients by the section determiner in that device. FIG. 3 is a diagram for explaining a modification of section determination of MDCT coefficients by the section determiner in that device.

[0021] For convenience of explanation, components identical to those shown in the conventional example are given the same reference numerals and described as needed, and components that differ from the conventional example are given new reference numerals.

[0022] Like the conventional example, the audio signal encoding device according to the present invention shown in FIG. 1 applies the MPEG-2 AAC scheme.

[0023] In the conventional example described with reference to FIG. 4, sectioning of the MDCT coefficients (spectral data) is performed after quantization. In contrast, the audio signal encoding device according to the present invention performs section determination on the MDCT coefficients before they are quantized, which makes it possible to shorten the processing time of audio signal encoding. That is, in the conventional example the section generator 5 is placed after the quantizer 4 as shown in FIG. 4, whereas in the present invention, as shown in FIG. 1, a section determiner 9 is newly provided before the quantizer 4 in place of the section generator 5.

[0024] As shown in FIG. 1, in the audio signal encoding device according to the present invention, the time-domain audio signal is input to the psychoacoustic analyzer 1 and the MDCT unit 2. The psychoacoustic analyzer 1 calculates from the input signal, as psychoacoustic parameters, the permissible quantization noise power for each frequency band preset on the basis of human auditory characteristics, and outputs it to the scale factor calculator 3. The MDCT unit 2 converts the input time-domain audio signal into frequency-domain MDCT coefficients (spectral data) and outputs them to the scale factor calculator 3 and the section determiner 9.

[0025] Next, the scale factor calculator 3 divides the 1024 MDCT coefficients into a plurality of scale factor bands, one for each frequency band defined on the basis of human auditory characteristics. It calculates the scale factor (number of quantization steps) of each scale factor band so that the quantization noise calculated for that band does not exceed the permissible quantization noise power of the corresponding frequency band calculated by the psychoacoustic analyzer 1, and outputs the scale factors to the section determiner 9, the quantizer 4, and the bit stream generator 8.

[0026] The section determiner 9, provided between the MDCT unit 2 and scale factor calculator 3 on one side and the quantizer 4 on the other, is the main part of the present invention.

[0027] The section determiner 9 obtains the maximum value of the MDCT coefficients for each scale factor band and normalizes this value with the scale factor sent from the scale factor calculator 3, as in equation (1).

[0028]

$$y(sfb) = |x|^{3/4} \times 2^{\frac{3}{16} \times sf(sfb)} \qquad \text{(1)}$$

Here, sfb is the scale factor band number, sf(sfb) is the scale factor of the sfb-th scale factor band, and x is the MDCT coefficient with the largest absolute value within the sfb-th scale factor band.
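Equation (1) can be sketched in Python as follows. The layout of the inputs is an assumption made for illustration: `band_offsets` holds the starting coefficient index of each scale factor band plus one final end index, and `scale_factors` holds one value per band. Neither name appears in the patent.

```python
def normalized_band_maxima(coeffs, band_offsets, scale_factors):
    """Apply equation (1) to the peak MDCT coefficient of every band.

    band_offsets[i] .. band_offsets[i + 1] delimits band i (assumed layout).
    """
    normalized = []
    for sfb, sf in enumerate(scale_factors):
        lo, hi = band_offsets[sfb], band_offsets[sfb + 1]
        peak = max(abs(c) for c in coeffs[lo:hi])  # |x| in equation (1)
        normalized.append(peak ** 0.75 * 2.0 ** (3.0 / 16.0 * sf))
    return normalized
```

Only one coefficient per band enters the calculation, so this step costs far less than quantizing the whole spectrum.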

[0029] Furthermore, the section determiner 9 divides the set of scale factor bands into a plurality of sections based on a table that defines the section division pattern for each coding rate. As the simplest section pattern, FIG. 2 shows an example in which every section contains the same number of scale factor bands.

[0030] The number of scale factor bands within a section is increased or decreased according to the coding rate: the lower the coding rate, the more scale factor bands are placed in one section and the fewer sections there are, which reduces the code amount used to identify the Huffman codebooks. From the normalized values of the scale factor bands calculated by equation (1), the maximum normalized value of each section is obtained and output to the quantizer 4 together with the section pattern.
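The equal-division pattern of FIG. 2 and the per-section maximum can be sketched as below. The patent does not publish its rate table, so `SECTIONS_BY_BITRATE` and its values are purely hypothetical placeholders; only the trend (lower rate, fewer sections) follows the text. Sections are represented as half-open `(start, end)` ranges of band indices.

```python
# Hypothetical table: (upper bitrate bound in bit/s, number of sections).
SECTIONS_BY_BITRATE = [(64000, 7), (96000, 12), (128000, 16)]
DEFAULT_SECTIONS = 24  # hypothetical value for higher rates


def sections_for_bitrate(bitrate_bps):
    """Look up how many sections to use; lower rates get fewer sections."""
    for limit, count in SECTIONS_BY_BITRATE:
        if bitrate_bps <= limit:
            return count
    return DEFAULT_SECTIONS


def equal_sections(num_bands, num_sections):
    """Split band indices 0..num_bands-1 into near-equal (start, end) ranges."""
    if not 0 < num_sections <= num_bands:
        raise ValueError("need between 1 and num_bands sections")
    base, extra = divmod(num_bands, num_sections)
    sections, start = [], 0
    for i in range(num_sections):
        size = base + (1 if i < extra else 0)  # spread the remainder
        sections.append((start, start + size))
        start += size
    return sections


def section_maxima(band_values, sections):
    """Largest normalized band value inside each section."""
    return [max(band_values[a:b]) for a, b in sections]
```

For instance, `equal_sections(49, sections_for_bitrate(64000))` yields seven sections of seven bands each under the placeholder table.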

[0031] In the above, every section contains the same number of scale factor bands, but the invention is not limited to this. For example, a weighted pattern table is also conceivable in which sections in the low-frequency region contain fewer scale factor bands and sections toward the high-frequency region contain progressively more.

[0032] Next, the quantizer 4 performs quantization one scale factor band at a time. It quantizes the MDCT coefficients in each scale factor band using the scale factor of that band calculated by the scale factor calculator 3 and the overall number of quantization steps. In addition, the maximum normalized value of each section output from the section determiner 9 is quantized according to equation (2) below and output to the variable-length encoder 6.

[0033]

$$z'(sec) = z(sec) \times 2^{-\frac{3}{16} \times common\_scale} \qquad \text{(2)}$$

Here, sec is the section number, common_scale is the overall number of quantization steps, z(sec) is the maximum normalized value of the sec-th section, and z'(sec) is the quantized value of the sec-th section.
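Equation (2) is a single scaling step, sketched here in Python. The patent does not state how the result is rounded to an integer, so this sketch returns the unrounded value and leaves rounding to the caller; the function name is illustrative.

```python
def quantize_section_max(z, common_scale):
    """Equation (2): scale a section's peak normalized value by common_scale.

    Returns the unrounded value; the rounding rule is not given in the text.
    """
    return z * 2.0 ** (-3.0 / 16.0 * common_scale)
```

Because the normalized maxima are computed once per frame, a change in `common_scale` only requires repeating this multiplication per section rather than re-deriving the sections.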

[0034] For each section defined by the section determiner 9, the variable-length encoder 6 determines the Huffman codebook from the maximum quantized value in the section calculated by equation (2), and variable-length encodes the quantized MDCT coefficients output from the quantizer 4 using that same Huffman codebook throughout the section. The scale factors are also variable-length encoded to reduce redundancy. From among the plurality of section patterns, the pattern that generates the smallest code amount is selected and output to the bit count determiner 7.
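Choosing a codebook from a section's maximum quantized value can be sketched as a table lookup. The table below is an assumption drawn from the commonly described grouping of the AAC spectral codebooks by largest representable absolute value (1, 2, 4, 7, 12, then an escape codebook); the patent itself does not list these values, and this sketch always picks the lower-numbered codebook of each pair rather than comparing their bit costs.

```python
# (largest absolute value, codebook index) pairs; assumed AAC-style grouping.
LAV_TO_CODEBOOK = [(0, 0), (1, 1), (2, 3), (4, 5), (7, 7), (12, 9)]
ESCAPE_CODEBOOK = 11  # assumed codebook for values above 12


def pick_codebook(max_abs_quantized):
    """Return the first codebook whose range covers the section maximum."""
    for lav, codebook in LAV_TO_CODEBOOK:
        if max_abs_quantized <= lav:
            return codebook
    return ESCAPE_CODEBOOK
```

A real encoder would typically also compare the two codebooks in each pair and keep whichever encodes the section in fewer bits.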

[0035] Next, the bit count determiner 7 determines whether the number of bits produced by the variable-length encoder 6 is within the number of usable bits. If the result exceeds the number of usable bits, processing returns to the quantizer 4 and quantization is performed again; this is repeated until the number of generated bits no longer exceeds the number of usable bits. The variable-length encoded data that satisfies the bit budget is then passed to the bit stream generator 8, which is supplied with the output of the scale factor calculator 3 and the output of the bit count determiner 7, multiplexes the data with encoding parameters such as the sampling frequency and coding rate, and transmits the result as a bit stream.
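The re-quantization loop can be sketched as follows. Because the sections are fixed before quantization, each pass only needs to re-quantize and re-count bits. The callback `count_bits`, the starting value, and the step of 1 per pass are assumptions for illustration; the patent does not specify how the overall number of quantization steps is adjusted between passes.

```python
def fit_to_budget(count_bits, available_bits, common_scale=0,
                  max_iterations=256):
    """Coarsen quantization until the encoded frame fits the bit budget.

    count_bits(common_scale) is a caller-supplied function (hypothetical)
    that quantizes, encodes, and returns the number of bits produced.
    """
    for _ in range(max_iterations):
        used = count_bits(common_scale)
        if used <= available_bits:
            return common_scale, used
        common_scale += 1  # assumed step: one coarser setting per pass
    raise RuntimeError("bit budget not met within max_iterations")
```

A toy call such as `fit_to_budget(lambda cs: 1000 - 50 * cs, 682)` shows the loop stepping until the count first drops to or below the budget.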

[0036] In the embodiment described above, when the section determiner 9 performs section determination on the MDCT coefficients, it divides the set of scale factor bands into a plurality of sections based on a table that defines the section division pattern for each coding rate. Instead of sectioning with such a fixed, predefined table, the sectioning of the MDCT coefficients may be allowed to vary dynamically, as in the modification shown in FIG. 3. In this case, the section determiner 9 first selects the several highest values, the number depending on the coding rate, from among the normalized values calculated for the scale factor bands. The lower the coding rate, the fewer normalized values are selected and the fewer sections result, which reduces the code amount used to identify the Huffman codebooks.

[0037] The modification of section determination of MDCT coefficients by the section determiner 9 (FIG. 1) is shown in FIG. 3, where the vertical axis represents the normalized value and the horizontal axis represents the sequence of scale factor bands (up to N scale factor bands along the horizontal axis). In FIG. 3 it is assumed, for example, that the six highest normalized values have been selected.

[0038] In FIG. 3, straight lines of slope |a| are drawn to the left and right from each selected normalized value (y0, y1, ..., y5 in FIG. 3), shown as thin dotted lines. Where the lines from adjacent normalized values intersect (P0, P1, P2, P3 in FIG. 3), a perpendicular is dropped from the intersection, and the scale factor band where it meets the horizontal axis becomes a section boundary. Where the lines do not intersect, as in region B of FIG. 3, the midpoint between the points where the two lines meet the horizontal axis is taken as the section boundary (Q0 in FIG. 3). The end regions A and C in FIG. 3 are included in the sections nearest to them. The section pattern obtained in this way and the selected top normalized values are output to the quantizer 4, quantized in the same manner as described above, and variable-length encoded by the variable-length encoder 6.
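The FIG. 3 procedure can be sketched in Python under one reading of the text. Assumptions not stated in the patent: peaks sit at integer band indices on the horizontal axis, the band containing the boundary position is assigned to the left-hand section, boundaries are clamped so each section keeps its own peak, and the slope `a` is positive. The function names are illustrative.

```python
import math


def boundary_between(x0, y0, x1, y1, a):
    """Boundary position between adjacent peaks (x0, y0) and (x1, y1)."""
    # Crossing point of the line falling rightwards from the left peak
    # and the line falling leftwards from the right peak.
    xi = (x0 + x1) / 2.0 + (y0 - y1) / (2.0 * a)
    yi = y0 - a * (xi - x0)
    if yi >= 0.0:
        x = xi  # lines cross on or above the axis (points P in FIG. 3)
    else:
        left_foot = x0 + y0 / a   # where the left line meets the axis
        right_foot = x1 - y1 / a  # where the right line meets the axis
        x = (left_foot + right_foot) / 2.0  # midpoint (point Q in FIG. 3)
    return min(max(x, x0), x1)  # assumed clamp between the two peaks


def peak_sections(values, num_peaks, a):
    """Split bands into half-open (start, end) sections around top peaks."""
    by_height = sorted(range(len(values)), key=lambda i: values[i],
                       reverse=True)
    peaks = sorted(by_height[:num_peaks])
    starts = [0]  # end regions join the nearest section
    for p, q in zip(peaks, peaks[1:]):
        x = boundary_between(p, values[p], q, values[q], a)
        starts.append(min(int(math.floor(x)) + 1, q))
    ends = starts[1:] + [len(values)]
    return list(zip(starts, ends))
```

Under this straight-line reading, the crossing point and the midpoint of the two axis intercepts work out to the same horizontal position, so the two branches differ only in which geometric case of FIG. 3 they correspond to.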

[0039] Here, section determination is performed using straight lines of slope |a| drawn from each normalized value, but the slope |a| is an arbitrary value and may be varied according to the characteristics of the input audio signal. A logarithmic or exponential function may also be used instead of a straight line.

[0040]

[Effects of the invention] According to the audio signal encoding device of the present invention described in detail above, determining the sectioning of the MDCT coefficients before the MDCT coefficients (spectral data) are quantized makes it possible to shorten processing time. Furthermore, at low coding rates, reducing the code amount of the Huffman codebooks allows more code to be allocated to the audio signal, so an improvement in sound quality can be expected.

[Brief description of the drawings]

FIG. 1 is a block diagram for explaining the audio signal encoding device according to the present invention.

FIG. 2 is a diagram for explaining an example of section determination of MDCT coefficients by the section determiner in the audio signal encoding device according to the present invention.

FIG. 3 is a diagram for explaining a modification of section determination of MDCT coefficients by the section determiner in the audio signal encoding device according to the present invention.

FIG. 4 is a block diagram for explaining a conventional audio signal encoding device.

FIG. 5 is a diagram for explaining an example of sectioning of MDCT coefficients.

[Explanation of symbols]

1: psychoacoustic analyzer, 2: MDCT unit, 3: scale factor calculator, 4: quantizer, 6: variable-length encoder, 7: bit count determiner, 8: bit stream generator, 9: section determiner.

Claims

[Claims]

1. An audio signal encoding device that receives and encodes an audio signal, comprising: psychoacoustic analysis means for calculating psychoacoustic parameters based on human auditory characteristics from the input audio signal; MDCT conversion means for converting the input audio signal into MDCT coefficients (spectral data); scale factor calculation means for dividing the MDCT coefficients obtained by the MDCT conversion means into a plurality of scale factor bands and calculating the scale factor of each scale factor band so that the quantization noise calculated from the MDCT coefficients of that band does not exceed the permissible quantization noise power of the psychoacoustic parameters calculated by the psychoacoustic analysis means; section determination means for obtaining the maximum value of the MDCT coefficients for each scale factor band, normalizing this value based on the scale factor calculated by the scale factor calculation means, and dividing the set of scale factor bands into a plurality of sections; quantization means for quantizing the MDCT coefficients one scale factor band at a time using the scale factor of each band calculated by the scale factor calculation means and the overall number of quantization steps; variable-length encoding means for variable-length encoding the quantized values from the quantization means, for each section defined by the section determination means, using the Huffman codebook corresponding to that section; bit count determination means for determining whether the number of bits encoded by the variable-length encoding means is within the number of usable bits; and bit stream generation means that is supplied with the output of the scale factor calculation means and the output of the bit count determination means and assembles the encoded data to generate a bit stream.

2. The audio signal encoding device according to claim 1, wherein the section determination means obtains and normalizes the maximum value of the MDCT coefficients for each scale factor band, determines the section division according to the coding rate, and, for each section, obtains the maximum normalized value of the section from the normalized values of its scale factor bands and determines the Huffman codebook to be used.

3. The audio signal encoding device according to claim 1, wherein the section determination means obtains and normalizes the maximum value of the MDCT coefficients for each scale factor band, selects the highest normalized values of the scale factor bands, the number selected depending on the coding rate, determines the number of scale factor bands included in each section based on the selected normalized values to determine the section division, and, for each section, obtains the maximum normalized value of the section from the normalized values of its scale factor bands and determines the Huffman codebook to be used.