JP2002118517A

JP2002118517A - Orthogonal transform apparatus and method, inverse orthogonal transform apparatus and method, transform coding apparatus and method, and decoding apparatus and method

Info

Publication number: JP2002118517A
Application number: JP2000369001A
Authority: JP
Inventors: Kenichi Makino; 堅一牧野; Atsushi Matsumoto; 淳松本; Masayuki Nishiguchi; 正之西口
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2000-07-31
Filing date: 2000-12-04
Publication date: 2002-04-19

Abstract

(57) [Abstract]
[Problem] To provide a transform coding apparatus and method in which the overlap boundaries can be set arbitrarily and overlap-add still reproduces the signal perfectly.
[Solution] A linear/nonlinear prediction analysis unit 3 applies linear or nonlinear prediction analysis to an audio signal input from an input terminal 2 and outputs a prediction residual. A stationarity estimation unit 7 estimates the stationarity of the audio signal. A block length determination unit 8 determines the MDCT block length based on the result estimated by the stationarity estimation unit 7. An MDCT unit 5 applies the MDCT, with the block length determined by the block length determination unit 8, to M time-series samples of the prediction residual input via a buffer 4, and generates MDCT coefficients. A quantization unit 6 quantizes the MDCT coefficients generated by the MDCT unit 5.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an orthogonal transform apparatus and method that orthogonally transform input time-series samples while overlapping them. The invention also relates to an inverse orthogonal transform apparatus and method that inversely transform orthogonal transform coefficients obtained by orthogonally transforming time-series samples while overlapping them. The invention further relates to a transform coding apparatus and method to which the orthogonal transform apparatus and method are applied, and to a decoding apparatus and method to which the inverse orthogonal transform apparatus and method are applied.

[0002]

[Prior Art] Various digital coding schemes for time-series samples such as audio and image signals have been proposed that use an orthogonal transform such as the fast Fourier transform (FFT), the discrete cosine transform (DCT), or the modified discrete cosine transform (Modified DCT: MDCT).

[0003] In particular, the MDCT orthogonally transforms time-series samples while overlapping them, which reduces the noise arising at the seams between blocks compared with the DCT. In recent years it has become a very popular method in transform coding schemes that orthogonally transform and compression-code audio signals.

[0004] The MDCT and its inverse transform, the IMDCT, are defined by equations (1) and (2) below.

[0005]

(Equation 1)

[0006]

(Equation 2)

[0007] In equations (1) and (2), x is the input signal, y is the MDCT coefficient, x~ is the inverse MDCT output, M is the block length, h is the window function for the forward transform, and f is the window function for the inverse transform.

[0008] Substituting equation (2) into equation (1) and rearranging gives equation (3) below.

[0009]

(Equation 3)

[0010] Equation (3) shows that the time-series signal x~(m) obtained by applying the IMDCT after the MDCT contains an aliasing component. This aliasing component can be canceled completely by choosing suitable window functions h(m) and f(m) and transforming the time-series signal with a 50% overlap.
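As a hedged illustration only, one commonly used form of the windowed MDCT pair that is consistent with the symbols defined above (block length M, M/2 coefficients, windows h and f) is sketched below. The phase term M/4 and the scale factor 4/M are assumptions taken from the usual textbook formulation and may differ from the patent's own equations (1) to (3). Under these assumptions, the output of a single block contains a time-reversed aliasing term in each half of the block:

```latex
\begin{align*}
y(k) &= \sum_{m=0}^{M-1} h(m)\,x(m)\,
        \cos\!\left[\frac{2\pi}{M}\left(k+\tfrac12\right)\left(m+\tfrac12+\tfrac{M}{4}\right)\right],
        \qquad 0 \le k < \tfrac{M}{2} \\
\tilde{x}(m) &= \frac{4}{M}\, f(m) \sum_{k=0}^{M/2-1} y(k)\,
        \cos\!\left[\frac{2\pi}{M}\left(k+\tfrac12\right)\left(m+\tfrac12+\tfrac{M}{4}\right)\right],
        \qquad 0 \le m < M \\
\tilde{x}(m) &=
\begin{cases}
 f(m)\left[h(m)\,x(m) - h\!\left(\tfrac{M}{2}-1-m\right)x\!\left(\tfrac{M}{2}-1-m\right)\right], & 0 \le m < \tfrac{M}{2} \\[4pt]
 f(m)\left[h(m)\,x(m) + h\!\left(\tfrac{3M}{2}-1-m\right)x\!\left(\tfrac{3M}{2}-1-m\right)\right], & \tfrac{M}{2} \le m < M
\end{cases}
\end{align*}
```

In this sketch, if h = f is symmetric and satisfies h(m)^2 + h(m + M/2)^2 = 1, the aliasing terms of two blocks overlapped by 50% have opposite signs and cancel when the blocks are overlap-added.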

[0011] FIG. 6 illustrates the MDCT and IMDCT algorithms. It shows the operations on an arbitrary pair of adjacent blocks of the time-series samples x(m). Block j-1 and block j in the figure both have length M and overlap each other by 50%. Each block is windowed with the window function h(m) and then subjected to the linear forward transform, which yields M/2 MDCT coefficients. This completes the forward MDCT. In the IMDCT, the MDCT coefficients are subjected to the linear inverse transform and windowed with the window function f(m), after which adjacent blocks are overlap-added to obtain M/2 time-series samples x~(m).
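The forward and inverse operations described for FIG. 6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the sine window, the M/4 phase term and the 4/M scale factor are a common textbook choice, and all function names are hypothetical.

```python
import math


def sine_window(M):
    """Assumed symmetric window with h(m)^2 + h(m + M/2)^2 = 1."""
    return [math.sin(math.pi * (m + 0.5) / M) for m in range(M)]


def mdct(block, h):
    """Window M samples with h and return M/2 MDCT coefficients."""
    M = len(block)
    return [
        sum(h[m] * block[m]
            * math.cos(2 * math.pi / M * (k + 0.5) * (m + 0.5 + M / 4))
            for m in range(M))
        for k in range(M // 2)
    ]


def imdct(coeffs, f):
    """Inverse-transform M/2 coefficients and window the M outputs with f."""
    M = 2 * len(coeffs)
    return [
        4.0 / M * f[m]
        * sum(coeffs[k]
              * math.cos(2 * math.pi / M * (k + 0.5) * (m + 0.5 + M / 4))
              for k in range(M // 2))
        for m in range(M)
    ]


def analyze_synthesize(x, M):
    """MDCT then IMDCT on blocks overlapped by 50%, followed by overlap-add."""
    hop = M // 2
    w = sine_window(M)
    out = [0.0] * len(x)
    for start in range(0, len(x) - M + 1, hop):
        y = imdct(mdct(x[start:start + M], w), w)
        for m in range(M):
            out[start + m] += y[m]
    return out


M = 16
x = [math.sin(0.3 * n) + 0.1 * n for n in range(64)]

# Samples covered by two overlapping blocks are reconstructed.
recon = analyze_synthesize(x, M)
interior_error = max(abs(recon[n] - x[n]) for n in range(M // 2, len(x) - M // 2))

# A single block on its own still contains the aliasing component.
w = sine_window(M)
single = imdct(mdct(x[:M], w), w)
single_error = max(abs(single[m] - x[m]) for m in range(M))
```

Each block advances by M/2 samples, so every pass through the loop contributes M/2 new output samples once the overlapping halves have been added. The first and last M/2 samples of the signal are covered by only one block and are therefore not reconstructed in this sketch.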

[0012] In audio coding schemes, and in transform coding schemes in particular, the choice of the orthogonal transform block length is one of the factors that characterize sound quality. In general, a longer block length gives higher frequency resolution and a shorter one gives higher time resolution. For input whose signal varies little over time, a longer block length is therefore advantageous in terms of efficiency and other respects; conversely, for a signal that varies strongly over time, a shorter block length is better. For example, if the MDCT is applied with too long a block length to music with sharp attacks, whose signal varies strongly over time, sufficient time resolution is not obtained, so pre-echo and post-echo arise in the reproduced sound and degrade its quality. It is therefore conceivable to incorporate into high-efficiency coding a method that changes the block length adaptively according to the properties of the input signal, and schemes that actually do so have been proposed.

[0013] However, when the block length is to be changed while using the definitions in equations (1) and (2), care is needed, because the aliasing that arises in the time domain must be canceled for x~(m) to match x(m) exactly. For example, reference [1] (Takashi Mochizuki, "Perfect Reconstruction Conditions for Adaptive Blocksize MDCT," IEICE Trans. Fundamentals, vol. E77-A, no. 5, pp. 894-899, May 1994) mixes block lengths while keeping the definitions in equations (1) and (2) unchanged, by choosing windows that cancel the aliasing. FIG. 7 shows an example in which the method of [1] is used to switch the block length from M_1 to M_2 (M_1 < M_2). In FIG. 7, frames j-2 and j-1 have block length M_1, and the block length changes from M_1 to M_2 at frame j.

[0014]

[Problems to Be Solved by the Invention] In frame j of FIG. 7, where the block length changes, the first (M_2 - M_1)/4 coefficients of the window are 0, so the effective range of the window is 3(M_2 - M_1)/4, which is shorter than the MDCT block length M_2. The MDCT is therefore performed on 3(M_2 - M_1)/4 input samples with a block length that is longer than necessary, which is not a good approach in terms of efficiency. In addition, when preprocessing is carried out in the time domain on a block-by-block basis before the MDCT, the phase changes, which makes the preprocessed samples awkward to handle.

[0015] If, as in FIG. 8, the numbers of samples that overlap the preceding and following frames are made to match in frame j, where the block length changes from M_1 to M_2, the effective range of the window coincides with the MDCT block length. Furthermore, if M_2 is an integer multiple of M_1, switching the block length causes no change in phase, so the preprocessed samples become easy to handle.

[0016] However, as long as the MDCT is performed using the definitions in equations (1) and (2), the aliasing contained in x~(m) obtained by the IMDCT cannot be canceled unless a symmetric time window is used and each frame overlaps the preceding and following frames by 50%. With an overlap such as that of frame j in FIG. 8, the original time-series samples therefore cannot be restored.

[0017] The present invention has been made in view of the above circumstances, and its object is to provide an orthogonal transform apparatus and method in which the overlap boundaries can be set arbitrarily. A further object is to provide an inverse orthogonal transform apparatus and method that inversely transform the orthogonal transform coefficients obtained by such an orthogonal transform apparatus and method.

[0018] Another object of the present invention, likewise made in view of the above circumstances, is to provide a transform coding apparatus and method, and a decoding apparatus and method, in which the overlap boundaries can be set arbitrarily and overlap-add reproduces the signal perfectly.

[0019]

[Means for Solving the Problems] To solve the above problems, an orthogonal transform apparatus according to the present invention is an orthogonal transform apparatus that orthogonally transforms input time-series samples while overlapping them, characterized in that, when orthogonally transforming M time-series samples, it performs the orthogonal transform with the sample position α, which forms the boundary at which aliasing arises in the inverse orthogonal transform, determined arbitrarily within the range 0 ≤ α < M.

[0020] To solve the above problems, an orthogonal transform method according to the present invention is an orthogonal transform method that orthogonally transforms input time-series samples while overlapping them, characterized in that, when orthogonally transforming M time-series samples, the orthogonal transform is performed with the sample position α, which forms the boundary at which aliasing arises in the inverse orthogonal transform, determined arbitrarily within the range 0 ≤ α < M.

[0021] To solve the above problems, an inverse orthogonal transform apparatus according to the present invention is an inverse orthogonal transform apparatus that inversely transforms orthogonal transform coefficients obtained by orthogonally transforming time-series samples while overlapping them, characterized in that it inversely transforms orthogonal transform coefficients that were orthogonally transformed with the sample position α, which forms the boundary at which aliasing arises in the inverse orthogonal transform, determined arbitrarily within the range 0 ≤ α < M.

[0022] To solve the above problems, an inverse orthogonal transform method according to the present invention is an inverse orthogonal transform method that inversely transforms orthogonal transform coefficients obtained by orthogonally transforming time-series samples while overlapping them, characterized in that it inversely transforms orthogonal transform coefficients that were orthogonally transformed with the sample position α, which forms the boundary at which aliasing arises in the inverse orthogonal transform, determined arbitrarily within the range 0 ≤ α < M.

[0023] To solve the above problems, a transform coding apparatus according to the present invention is a transform coding apparatus that orthogonally transforms and compression-codes an input signal, comprising: prediction analysis means for taking in the input signal a predetermined number of samples at a time, performing prediction analysis, and outputting a prediction residual; characteristic determination means for determining a characteristic of the input signal for each predetermined number of samples; block length determination means for determining the block length for the orthogonal transform based on the characteristic determined by the characteristic determination means; orthogonal transform means that, with the block length determined by the block length determination means, determines arbitrarily within the range 0 ≤ α < M the sample position α that forms the boundary at which aliasing arises in the inverse orthogonal transform, and applies the orthogonal transform to M time-series samples while overlapping the prediction residual output from the prediction analysis means as the input M time-series samples, thereby generating orthogonal transform coefficients; and quantization means for quantizing the orthogonal transform coefficients generated by the orthogonal transform means.

[0024] Transform coding that switches the orthogonal transform block length according to the properties of the input signal and quantizes the output coefficients of the orthogonal transform can therefore be realized easily.

[0025] To solve the above problems, a transform coding method according to the present invention is a transform coding method that orthogonally transforms and compression-codes an input signal, comprising: a prediction analysis step of taking in the input signal a predetermined number of samples at a time, performing prediction analysis, and outputting a prediction residual; a characteristic determination step of determining a characteristic of the input signal for each predetermined number of samples; a block length determination step of determining the block length for the orthogonal transform based on the characteristic determined in the characteristic determination step; an orthogonal transform step of, with the block length determined in the block length determination step, determining arbitrarily within the range 0 ≤ α < M the sample position α that forms the boundary at which aliasing arises in the inverse orthogonal transform, and applying the orthogonal transform to M time-series samples while overlapping the prediction residual output from the prediction analysis step as the input M time-series samples, thereby generating orthogonal transform coefficients; and a quantization step of quantizing the orthogonal transform coefficients generated in the orthogonal transform step.

[0026] To solve the above problems, a decoding apparatus according to the present invention is a decoding apparatus that decodes quantized data obtained by quantizing orthogonal transform coefficients, the coefficients having been obtained by orthogonally transforming input M time-series samples while overlapping them, with a block length determined according to the characteristics of an input signal and with the sample position α, which forms the boundary at which aliasing arises in the inverse orthogonal transform, determined arbitrarily within the range 0 ≤ α < M. The decoding apparatus comprises: inverse quantization means for inversely quantizing the quantized data; and inverse orthogonal transform means for inversely transforming the orthogonal transform coefficients obtained through inverse quantization by the inverse quantization means, with the block length determined according to the characteristics of the input signal.

[0027] To solve the above problems, a decoding method according to the present invention is a decoding method that decodes quantized data obtained by quantizing orthogonal transform coefficients, the coefficients having been obtained by orthogonally transforming input M time-series samples while overlapping them, with a block length determined according to the characteristics of an input signal and with the sample position α, which forms the boundary at which aliasing arises in the inverse orthogonal transform, determined arbitrarily within the range 0 ≤ α < M. The decoding method comprises: an inverse quantization step of inversely quantizing the quantized data; and an inverse orthogonal transform step of inversely transforming the orthogonal transform coefficients obtained through inverse quantization in the inverse quantization step, with the block length determined according to the characteristics of the input signal.

[0028]

[Embodiments of the Invention] Several embodiments of the present invention are described below with reference to the drawings. The first embodiment is the encoder 1 shown in FIG. 1, a specific example of a transform coding apparatus that compression-codes an audio signal sampled at 16 kHz and input from an input terminal 2, using the MDCT described later.

[0029] In FIG. 1, the encoder 1 comprises: a linear/nonlinear prediction analysis unit 3 that applies linear or nonlinear prediction analysis to the audio signal input from the input terminal 2 and outputs a prediction residual; a stationarity estimation unit 7 that estimates the stationarity of the audio signal; a block length determination unit 8 that determines the MDCT block length based on the result estimated by the stationarity estimation unit 7; an MDCT unit 5 that applies the MDCT, with the block length determined by the block length determination unit 8, to M time-series samples of the prediction residual input via a buffer 4 and generates MDCT coefficients; and a quantization unit 6 that quantizes the MDCT coefficients generated by the MDCT unit 5.

[0030] The linear/nonlinear prediction analysis unit 3 takes in, for example, 1024 samples of the audio signal, performs linear or nonlinear prediction, and outputs the prediction residual to the buffer 4. It also outputs the resulting analysis parameters from an output terminal 9. For example, the audio signal is subjected to 16th-order LPC analysis. The LPC coefficients are converted to LSPs, the LSPs are quantized and then interpolated between frames, and the interpolated coefficients are used to obtain the LPC residual. The optimum pitch lag in the LPC residual is then found, and the optimum gains at this lag and at ±1 point around it are calculated and vector-quantized. A pitch inverse filter is constructed from the quantized pitch gains and used to obtain the pitch residual.
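The LPC step of this analysis can be illustrated with a short Python sketch. It assumes the autocorrelation method with the Levinson-Durbin recursion, which the text does not specify, and it omits the LSP conversion, quantization, interpolation, and pitch analysis. The function names and the second-order test signal are hypothetical; the text itself uses 16th-order analysis.

```python
import math


def autocorrelation(x, order):
    """Autocorrelation r[0..order] of the (zero-extended) sequence x."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order], x[n] ~ sum(a[i] * x[n-i])."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)               # remaining prediction error energy
    return a[1:], err


def lpc_residual(x, coeffs):
    """Prediction residual e[n] = x[n] - sum(coeffs[i] * x[n-1-i])."""
    residual = []
    for n in range(len(x)):
        pred = sum(c * x[n - i - 1] for i, c in enumerate(coeffs) if n - i - 1 >= 0)
        residual.append(x[n] - pred)
    return residual


# Hypothetical test signal: impulse response of a two-pole resonator,
# which a second-order predictor can model almost exactly.
radius, omega = 0.9, 0.5
signal = [radius ** n * math.sin(omega * (n + 1)) / math.sin(omega) for n in range(200)]

order = 2  # kept small for the example
coeffs, err = levinson_durbin(autocorrelation(signal, order), order)
residual = lpc_residual(signal, coeffs)
```

For this test signal the residual is essentially a single impulse at n = 0, which illustrates why a prediction residual is a more compact input to the subsequent transform than the signal itself.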

[0031] The stationarity estimation unit 7 estimates the stationarity of the audio signal. For a transient signal, too long an MDCT block length does not give sufficient time resolution, so pre-echo and post-echo arise in the reproduced sound. A shorter MDCT block length is therefore preferable for such a signal. For a quasi-stationary signal that varies little over time, on the other hand, a longer MDCT block length reduces the number of bits spent on normalization and analysis parameters, so more bits can be assigned to quantizing the MDCT coefficients. For this reason the encoder 1 shown in FIG. 1 switches the block length between two levels, for example LONG and SHORT, according to the properties of the input signal. The stationarity estimation unit 7 is what determines the characteristics of the input signal at that point. Specifically, it obtains the change in the frame power of the input signal from the immediately preceding frame and the change in the LSPs from the immediately preceding frame, and sets a flag on each frame that is judged, against a fixed threshold, to have changed greatly from the previous frame. When no flag is set over several frames before and after the current frame, the signal is judged to be quasi-stationary, with little variation over time.
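The flagging procedure can be sketched in Python as follows. The threshold values, the use of a power ratio and a Euclidean LSP distance, the OR combination of the two tests, and the width of the neighborhood are all assumptions made for illustration; the text only states that a fixed threshold is used and that several frames before and after the current frame are checked.

```python
import math


def flag_transient_frames(frame_powers, lsp_frames,
                          power_ratio_threshold=4.0,
                          lsp_distance_threshold=0.1):
    """Flag each frame whose power or LSPs changed greatly from the previous frame."""
    flags = [False]  # the first frame has no predecessor to compare with
    for i in range(1, len(frame_powers)):
        lo = min(frame_powers[i - 1], frame_powers[i])
        hi = max(frame_powers[i - 1], frame_powers[i])
        power_jump = hi > power_ratio_threshold * max(lo, 1e-12)
        lsp_distance = math.sqrt(sum((a - b) ** 2
                                     for a, b in zip(lsp_frames[i], lsp_frames[i - 1])))
        flags.append(power_jump or lsp_distance > lsp_distance_threshold)
    return flags


def is_quasi_stationary(flags, index, span=2):
    """True when no flag is set within `span` frames before and after `index`."""
    lo = max(0, index - span)
    hi = min(len(flags), index + span + 1)
    return not any(flags[lo:hi])


# Hypothetical frame data: a power burst at frame 3, constant LSPs.
powers = [1.0, 1.1, 1.0, 9.0, 1.2, 1.0, 1.1, 1.0, 0.9, 1.0]
lsps = [[0.1, 0.2, 0.3]] * len(powers)
flags = flag_transient_frames(powers, lsps)
```

With these values, frames 3 and 4 are flagged (the burst and the return from it), so frames near the burst are treated as transient while frame 7 is judged quasi-stationary.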

[0032] When the stationarity estimation unit 7 judges the stationarity to be high, the block length determination unit 8 makes the MDCT block length long (LONG); for a transient signal it makes the block length short (SHORT). The block length information determined by the block length determination unit 8 is output via an output terminal 11.

[0033] FIG. 2, for example, plots the sample sequence of an audio signal; the signal level changes abruptly near the midpoint of the figure. When such a signal is input to the encoder 1 of FIG. 1, it is desirable that a short block length be selected at the sample positions where the abrupt change occurs.

[0034] With the block length determined by the block length determination unit 8, the MDCT unit 5 determines arbitrarily within the range 0 ≤ α < M the sample position α that forms the boundary at which aliasing arises in the IMDCT, and applies the MDCT to M time-series samples while overlapping the prediction residual output from the linear/nonlinear prediction analysis unit 3 as the input M time-series samples, thereby generating MDCT coefficients.

[0035] The quantization unit 6 quantizes the MDCT coefficients generated by the MDCT unit 5 and outputs the resulting index from an output terminal 10. A specific example of the quantization in the quantization unit 6 follows. When the prediction residual output from the linear/nonlinear prediction analysis unit 3 is the pitch residual, the MDCT coefficients generated by the MDCT unit 5 are normalized and then quantized using three kinds of quantization units: 2-dimensional 8-bit, 4-dimensional 8-bit, and 8-dimensional 8-bit. Bit allocation is determined using weights computed solely from the parameters used for analysis and normalization. Compared with a method that determines the optimum bit allocation for each coefficient before quantizing, this requires no parameters such as position information, so a larger share of the bits can be assigned to quantizing the coefficients. The quantization unit 6 outputs the quantized data, as the index, at a rate of 6 kbps to 32 kbps.

[0036] The general operation of the encoder 1 configured as above is described next. An audio signal sampled at 16 kHz is input to the encoder 1 via the input terminal 2. The linear/nonlinear prediction analysis unit 3 takes in 1024 samples of the audio signal, performs linear or nonlinear prediction, and supplies the prediction residual to the buffer 4 and to the stationarity estimation unit 7. The stationarity estimation unit 7 estimates the stationarity of the audio signal based on the supplied prediction residual and supplies the result to the block length determination unit 8. From the characteristics of the input samples estimated by the stationarity estimation unit 7, the block length determination unit 8 sets the MDCT block length to, for example, either 1024 or 2048. A block length of 1024 is selected for sections judged to need higher time resolution, and 2048 for sections where the signal varies little over time and can be regarded as quasi-stationary. The MDCT unit 5 then takes samples from the buffer 4 according to the determined block length and performs the MDCT. The MDCT coefficients are quantized by the quantization unit 6, and the index is output from the output terminal 10 as quantized data at a rate of 6 kbps to 32 kbps. The analysis parameters obtained by the linear/nonlinear prediction analysis unit 3 are output from the output terminal 9, and the block length information obtained by the block length determination unit 8 is output from the output terminal 11.
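The block length decision and the resolution trade-off behind it can be put in numbers with a small Python sketch. The 16 kHz sample rate and the block lengths 1024 and 2048 come from the text; the function names are hypothetical, and the bin spacing assumes that the M/2 coefficients of a block of length M span 0 Hz to half the sample rate.

```python
SAMPLE_RATE_HZ = 16000
SHORT_BLOCK = 1024   # chosen where higher time resolution is needed
LONG_BLOCK = 2048    # chosen for quasi-stationary sections


def choose_block_length(quasi_stationary):
    """Return the MDCT block length for a section of the signal."""
    return LONG_BLOCK if quasi_stationary else SHORT_BLOCK


def block_resolution(block_length, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (block duration in ms, spacing between coefficients in Hz)."""
    duration_ms = 1000.0 * block_length / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / block_length
    return duration_ms, bin_spacing_hz


short_ms, short_hz = block_resolution(choose_block_length(False))
long_ms, long_hz = block_resolution(choose_block_length(True))
```

Under these assumptions the short block spans 64 ms with 15.625 Hz between coefficients, and the long block spans 128 ms with 7.8125 Hz between coefficients, so doubling the block length halves the time resolution and doubles the frequency resolution.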

[0037] Next, the decoder 20 shown in FIG. 3 is described as a second embodiment. The decoder 20 receives the analysis parameters, the index, and the block length information output from the encoder 1 of FIG. 1 and reproduces the audio signal from them.

[0038] The decoder 20 comprises: an inverse quantization unit 22 that inversely quantizes the index received via an input terminal 21; an IMDCT unit 24 that applies the inverse MDCT to the MDCT coefficients obtained through inverse quantization by the inverse quantization unit 22, based on the block length information received via an input terminal 23; and a synthesis unit 26 that synthesizes the audio signal from the time-series samples output by the IMDCT unit 24 and the analysis parameters received via an input terminal 25. The audio signal is output from an output terminal 27.

[0039] The general operation of the decoder 20 configured as above is described next. The inverse quantization unit 22 inversely quantizes the index supplied from the encoder 1 via the input terminal 21. The IMDCT unit 24 applies the IMDCT to the MDCT coefficients obtained by the inverse quantization unit 22, with the block length determined from the block length information supplied via the input terminal 23. The synthesis unit 26 synthesizes and reproduces the audio signal from the analysis parameters supplied via the input terminal 25 and the time-series samples from the IMDCT unit 24.

[0040] The encoder 1 of FIG. 1 has been described above as the first embodiment, and the decoder 20 of FIG. 3 as the second embodiment. Specific examples of the orthogonal transform apparatus and the inverse orthogonal transform apparatus of the present invention are described next.

[0041] In the following, the MDCT unit 5 of the encoder 1 in FIG. 1 is described in detail as a specific example of the orthogonal transform apparatus of the present invention, and the IMDCT unit 24 of the decoder 20 in FIG. 3 as a specific example of the inverse orthogonal transform apparatus. As explained with reference to FIG. 8, the conventional MDCT defined by equations (1) and (2) uses a symmetric time window and overlaps each frame by 50% with the preceding and following frames. With an overlap such as that of frame j in FIG. 8, the aliasing contained in the IMDCT output x~(m) therefore cannot be canceled, and the original time-series samples cannot be restored. The MDCT unit 5 was devised to solve this problem.

[0042] To allow the original time-series samples to be restored completely even when the block length is switched as in FIG. 8, the MDCT unit 5 generalizes the MDCT and IMDCT definitions of equations (1) and (2) and introduces the definitions of equations (4) and (5) below.

[0043]

[Math. 4] (equation (4))

[0044]

[Math. 5] (equation (5))

[0045] In equations (4) and (5), x is the input signal, y is the MDCT coefficient, x~ is the inverse MDCT output, M is the block length, h is the window function for the forward transform, f is the window function for the inverse transform, and α is the aliasing boundary (0 ≦ α ≦ M).

[0046] Equations (4) and (5) introduce a new parameter α. This parameter determines the sample position that forms the aliasing boundary in the samples x~(m) produced by the IMDCT unit 24. Setting α = M/2 gives the conventional MDCT defined by equations (1) and (2).
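
As a minimal sketch of what the parameter α does, the transform pair can be written in matrix form. Equations (4) and (5) are not reproduced in this text, so the phase term (M - α + 1)/2 and the 4/M scaling below are assumptions, chosen so that the aliasing boundary falls at sample α and so that α = M/2 gives the widely used MDCT kernel with phase term M/4 + 1/2. The function names are hypothetical.

```python
import numpy as np

def gmdct_kernel(M, alpha):
    """(M/2) x M cosine kernel; the phase term is an assumed form, not equation (4) itself."""
    k = np.arange(M // 2)[:, None]
    m = np.arange(M)[None, :]
    return np.cos(2 * np.pi / M * (k + 0.5) * (m + (M - alpha + 1) / 2))

def gmdct(x, h, alpha):
    """Forward transform of one block: apply window h, then project onto the kernel."""
    return gmdct_kernel(len(x), alpha) @ (h * x)

def gimdct(y, f, alpha):
    """Inverse transform of one block; the output still contains aliasing."""
    M = 2 * len(y)
    return f * (4.0 / M) * (gmdct_kernel(M, alpha).T @ y)

# With alpha = M/2 the assumed kernel coincides with the conventional MDCT kernel.
M = 16
k = np.arange(M // 2)[:, None]
m = np.arange(M)[None, :]
conventional = np.cos(2 * np.pi / M * (k + 0.5) * (m + 0.5 + M / 4))
print(np.allclose(gmdct_kernel(M, M // 2), conventional))
```

Each block of M samples yields M/2 coefficients in this formulation, whatever value of α is chosen.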

[0047] Substituting equation (5) into equation (4) yields equation (6) below.

[0048]

[Math. 6] (equation (6))

[0049] Here, ξ(l) is defined as follows.

[0050]

[Math. 7] (definition of ξ(l))

[0051] Rewriting equation (6) using ξ(l) gives equation (7) below.

[0052]

[Math. 8] (equation (7))

[0053] Here, ξ(l) is given by equation (8) below.

[0054]

[Math. 9] (equation (8))

[0055] Because 0 ≦ r < M and 0 ≦ m < M in equation (6), only the terms that satisfy the following expression remain.

[0056]

[Math. 10]

[0057] Equation (9) below is therefore obtained.

[0058]

[Math. 11] (equation (9))

[0059] In equation (9), the second term in each case is the aliasing component, and the aliasing has opposite polarity on either side of the α-th sample. Therefore, by choosing the windows f(m) and h(m) appropriately and aligning the aliasing boundaries of adjacent frames, the aliasing can be canceled.
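
The polarity change at the α-th sample can be checked numerically. The sketch below uses rectangular windows and an assumed kernel (phase term (M - α + 1)/2, scaling 4/M), since equations (4) to (9) are not reproduced here. Under that assumption, the inverse transform returns the input minus its mirror image below α and plus its mirror image from α onward.

```python
import numpy as np

def kernel(M, alpha):
    """Assumed cosine kernel whose aliasing boundary lies at sample alpha."""
    k = np.arange(M // 2)[:, None]
    m = np.arange(M)[None, :]
    return np.cos(2 * np.pi / M * (k + 0.5) * (m + (M - alpha + 1) / 2))

M, alpha = 16, 6
rng = np.random.default_rng(1)
x = rng.standard_normal(M)

C = kernel(M, alpha)
x_tilde = (4.0 / M) * (C.T @ (C @ x))  # forward transform followed by inverse

# Expected structure: x(m) - x(alpha-1-m) below alpha, x(m) + x(M+alpha-1-m) above.
expected = x.copy()
expected[:alpha] -= x[:alpha][::-1]
expected[alpha:] += x[alpha:][::-1]
print(np.allclose(x_tilde, expected))
```

Because the two mirror images are confined to either side of α, a neighboring frame whose own boundary lines up with this one can supply the aliasing of opposite sign.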

[0060] Next, the conditions under which the original samples are restored are described. Let M_j be the block length of frame j, α_j its aliasing boundary, h_j(m) its forward transform window, and f_j(m) its inverse transform window. The conditions under which the aliasing is canceled and the original samples are restored completely are equations (10), (11), and (12) below.

[0061]

[Math. 12] (equation (10))

[0062]

[Math. 13] (equation (11))

[0063]

[Math. 14] (equation (12))

[0064] An example of block length switching with the generalized MDCT unit 5 is now given. For simplicity, the forward transform window h(m) and the inverse transform window f(m) are assumed to be identical. In addition, blocks other than the one where the switch takes place use the conventional MDCT (α = M/2) with a symmetric window. That is, for 0 ≦ m < M the window satisfies the following expression.

[0065]

[Math. 15]

[0066] In this case, the conditions for restoring the original samples are met if the following expression holds.

[0067]

[Math. 16]

[0068] Under these conditions, consider switching the block length from M_1 to M_2 (M_1 < M_2) as shown in FIG. 8. First, from the condition of equation (10), the aliasing boundary α of the j-th frame, where the block length is switched, must satisfy equation (13) below.

[0069]

[Math. 17] (equation (13))

[0070] Let h_s(m) be the window used in frames of block length M_1 and h_l(m) the window used in frames of block length M_2. From the condition of equation (11), the window h_t(m) of the j-th frame must be given by equation (14) below.

[0071]

[Math. 18] (equation (14))

[0072] If the conditions of equations (13) and (14) are satisfied, equation (12) is clearly satisfied as well. The original time-series samples can therefore be restored completely even in the block where the switch takes place.
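
The switching example can be tried end to end on a small signal. The sketch below is an illustration under stated assumptions rather than the patent's own equations: it uses an assumed kernel with phase term (M - α + 1)/2, a sine window for the regular frames, a switching frame of length (M_1 + M_2)/2 with α = M_1/2, and a switching window made of the first half of the short window followed by the second half of the long one, which is one reading of equations (13) and (14). All function names are hypothetical.

```python
import numpy as np

def kernel(M, alpha):
    """Assumed cosine kernel whose aliasing boundary lies at sample alpha."""
    k = np.arange(M // 2)[:, None]
    m = np.arange(M)[None, :]
    return np.cos(2 * np.pi / M * (k + 0.5) * (m + (M - alpha + 1) / 2))

def sine_window(M):
    """Symmetric window with h(m)^2 + h(m + M/2)^2 = 1."""
    return np.sin(np.pi * (np.arange(M) + 0.5) / M)

def transition_window(M1, M2):
    """First half of the short window followed by the second half of the long one."""
    return np.concatenate([sine_window(M1)[:M1 // 2], sine_window(M2)[M2 // 2:]])

def code_and_decode(x, frames):
    """Transform every frame, invert it, and overlap-add the windowed outputs."""
    out = np.zeros(len(x))
    start = 0
    for M, alpha, h in frames:
        C = kernel(M, alpha)
        y = C @ (h * x[start:start + M])                   # forward transform
        out[start:start + M] += h * (4.0 / M) * (C.T @ y)  # inverse and overlap-add
        start += alpha  # the next frame starts at this frame's aliasing boundary
    return out

M1, M2 = 8, 16
short = (M1, M1 // 2, sine_window(M1))
switch = ((M1 + M2) // 2, M1 // 2, transition_window(M1, M2))
long_ = (M2, M2 // 2, sine_window(M2))
frames = [short, short, short, switch, long_, long_, long_]

rng = np.random.default_rng(3)
x = rng.standard_normal(48)
out = code_and_decode(x, frames)
# Samples covered by two frames are recovered; the outer edges have no partner frame.
print(np.allclose(out[4:40], x[4:40]))
```

The switching frame here has 12 samples, which is not a power of 2 even though 8 and 16 are.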

[0073] A fast MDCT algorithm is proposed in [2]: Masahiro Iwadare, Takao Nishitani, and Akihiko Sugiyama, "A Study on the MDCT Method and a Fast Algorithm," IEICE Technical Report, Vol. CAS90-9, DSP90-13, pp. 49-54, 1990. Using this algorithm, the generalized MDCT defined by equations (4) and (5) can also be computed quickly. The procedure is given below.

[0074] The forward transform is described first. To begin, xh(m) and x_2(m) are defined as follows.

[0075]

[Math. 19] (equation (15))

[0076] The reordering operation of equation (15) corresponds to equation (11) of [2]; in fact, the two are identical when α = M/2. Rewriting equation (4) in terms of x_2(m) gives equation (16) below.

[0077]

[Math. 20] (equation (16))

[0078] Equation (16) has the same form as equation (12) of [2]. In [2], a fast computation is obtained by transforming that equation. Therefore, by performing the operation of equation (15) of the present application in place of equation (11) of [2] and then following the same steps as [2], the fast algorithm proposed in [2] can be applied to the computation of equation (4). The procedure is summarized below.

[0079] First, the input signal xh(m), to which the forward transform window has been applied, is reordered according to equation (16).

[0080] Next, x_3(m) is formed from x_2(m) according to equation (17) below.

[0081]

[Math. 21] (equation (17))

[0082] Next, x_3(m) is multiplied by exp(-j·(2πm/M)) to form the complex signal z_1(m) of equation (18) below.

[0083]

[Math. 22] (equation (18))

[0084] Next, an M/2-point FFT is applied to z_1(m) to obtain z_2(k) of equation (19) below.

[0085]

[Math. 23] (equation (19))

[0086] Finally, the MDCT coefficients are extracted from the FFT output as in equation (20) below.

[0087]

[Math. 24] (equation (20))
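
The general shape of such a fast forward transform (reorder, pre-twiddle, FFT, extract) can be sketched as follows. This is not the procedure of [2] or equations (15) to (20), which are not reproduced in this text. It is a related textbook route, under the assumptions that the kernel has phase term (M - α + 1)/2, that α is even, and that M is a multiple of 4: the windowed block is folded into M/2 values, and a DCT-IV of that length is computed with one complex FFT. In this formulation the FFT has M/4 points, which may differ from the FFT size under the patent's conventions. All function names are hypothetical.

```python
import numpy as np

def gmdct_direct(xh, alpha):
    """Reference O(M^2) transform of a windowed block (assumed kernel)."""
    M = len(xh)
    k = np.arange(M // 2)[:, None]
    m = np.arange(M)[None, :]
    return np.cos(2 * np.pi / M * (k + 0.5) * (m + (M - alpha + 1) / 2)) @ xh

def fold(xh, alpha):
    """Reorder and sign-flip M windowed samples into M/2 values."""
    M = len(xh)
    N = M // 2
    j = np.arange(M) + (M - alpha) // 2          # shifted sample index
    n = np.where(j < N, j, np.where(j < M, M - 1 - j, j - M))
    sign = np.where(j < N, 1.0, -1.0)
    v = np.zeros(N)
    np.add.at(v, n, sign * xh)                   # accumulate the folded terms
    return v

def dct4_fft(v):
    """Unscaled DCT-IV of even length N using one N/2-point complex FFT."""
    N = len(v)
    p = np.arange(N // 2)
    z = v[0::2] + 1j * v[N - 1::-2]              # pair even samples with reversed odd ones
    T = np.fft.fft(z * np.exp(-1j * np.pi * p / N))       # pre-twiddle, then FFT
    c = T * np.exp(-1j * np.pi * (4 * p + 1) / (4 * N))   # post-twiddle
    y = np.empty(N)
    y[0::2] = c.real
    y[N - 1::-2] = -c.imag
    return y

def gmdct_fast(xh, alpha):
    """Fast transform; requires len(xh) % 4 == 0 and an even alpha."""
    if len(xh) % 4 or alpha % 2:
        raise ValueError("this sketch needs M % 4 == 0 and an even alpha")
    return dct4_fft(fold(xh, alpha))

rng = np.random.default_rng(2)
xh = rng.standard_normal(12)                     # a 12-sample switching frame
print(np.allclose(gmdct_fast(xh, 4), gmdct_direct(xh, 4)))
```

For the 12-sample block the FFT has 3 points, so the FFT routine must accept lengths that are not powers of 2.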

[0088] The fast computation of [2] can be applied to the inverse transform in the same way as to the forward transform. The inverse transform follows the same procedure, except for the polarity inversion and reordering of the time-series samples at the end.

[0089] That is, the coefficients are first reordered as in equation (21) below.

[0090]

[Math. 25] (equation (21))

[0091] Next, y_2(k) is multiplied by exp(-j·(2πk/M)) to form the complex signal z_1(k) of equation (22) below.

[0092]

[Math. 26] (equation (22))

[0093] Next, an M/2-point inverse FFT is applied to z_1(k) to obtain z_2(m) of equation (23) below.

[0094]

[Math. 27] (equation (23))

[0095] Then x_0~(m) is extracted from the inverse FFT output as in equation (24) below.

[0096]

[Math. 28] (equation (24))

[0097] Finally, the polarity of x_0~(m) is inverted and the samples are reordered to obtain the IMDCT output x~(m) of equation (25) below.

[0098]

[Math. 29] (equation (25))

[0099] The number of input points is considered next. When the block length is switched by the method of the present invention, the block length of the frame where the switch takes place may fail to be a power of 2 even if the block length M of the frames using the conventional MDCT is a power of 2. An example is the case where the block lengths and aliasing boundaries of the (j-1)-th and (j+1)-th frames are given by the following expressions.

[0100]

[Math. 30]

[0101]

[Math. 31]

[0102] Then, from the condition of equation (10), the block length M_j of the j-th frame is given by equation (26) below.

[0103]

[Math. 32] (equation (26))

[0104] If a < b here, M_j takes the following form.

[0105]

[Math. 33]

[0106] That is, M_j is not a power of 2. In the j-th frame of this example, the FFT and IFFT of equations (19) and (23) must be performed with a number of points that is not a power of 2. FFT and IFFT implementations usually assume that the number of points is a power of 2 and cannot handle other point counts, so the above computation is impossible with such an FFT device.

[0107] As a way of realizing P×2^Q-point FFTs and IFFTs, where P is an odd number and Q is an integer, there are a fast Fourier transform method and a fast inverse Fourier transform method for which the present applicant has filed an application. Using this method makes fast computation possible even in cases such as the above example.
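
The arithmetic behind this example can be made concrete with the block lengths 1024 and 2048 used earlier in the description. The sketch assumes that equation (10) amounts to matching overlap lengths, that is, M_(j-1) - α_(j-1) = α_j and M_j - α_j = α_(j+1); the equation itself is not reproduced in this text, and the function names are hypothetical.

```python
def transition_block_length(m_prev, alpha_prev, alpha_next):
    """Length of the switching frame so that both overlaps line up (assumed reading of (10))."""
    return (m_prev - alpha_prev) + alpha_next

def split_odd_power_of_two(n):
    """Write a positive integer n as P * 2**Q with P odd; returns (P, Q)."""
    if n <= 0:
        raise ValueError("n must be positive")
    q = 0
    while n % 2 == 0:
        n //= 2
        q += 1
    return n, q

# Short frames of 1024 samples (alpha = 512) followed by long frames of 2048 (alpha = 1024).
m_j = transition_block_length(1024, 512, 1024)
print(m_j, split_odd_power_of_two(m_j), split_odd_power_of_two(m_j // 2))
# prints: 1536 (3, 9) (3, 8)
```

The switching frame therefore has 1536 = 3×2^9 samples, and an FFT over half that length has 768 = 3×2^8 points, which is exactly the P×2^Q form with P = 3.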

[0108] Of these, the fast Fourier transform method takes P×2^Q points of complex data as input, where P is an odd number and Q is an integer, applies a fast Fourier transform to the input data, and outputs P×2^Q points of complex data. Specifically, P points of data are taken out of the N points of the array x at intervals of N/P, where N/P is N divided by the odd number P; a discrete Fourier transform is applied to these P points; the resulting P-point discrete Fourier transform coefficients are multiplied by twiddle factors; the products are written back to the array x; and a 2^Q-point fast Fourier transform is then performed in each of the P regions into which the array is divided.
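
The steps listed above follow the pattern of a mixed-radix Cooley-Tukey decomposition, which can be sketched as follows. This is a generic illustration and not the applicant's referenced method, whose details are not given here. The function name `fft_p_times_2q` is hypothetical, and NumPy's FFT stands in for the 2^Q-point radix-2 stage.

```python
import numpy as np

def fft_p_times_2q(x, P):
    """DFT of N = P * 2**Q complex points via P-point DFTs, twiddles, and 2**Q-point FFTs."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    R = N // P                                   # R = 2**Q
    if P * R != N:
        raise ValueError("len(x) must be a multiple of P")
    a = x.reshape(P, R)                          # a[n2, n1] = x[n1 + R*n2]: P points at stride N/P
    idx = np.arange(P)
    dft_p = np.exp(-2j * np.pi * np.outer(idx, idx) / P)
    b = dft_p @ a                                # P-point DFT of every strided group
    b *= np.exp(-2j * np.pi * np.outer(idx, np.arange(R)) / N)  # twiddle factors
    c = np.fft.fft(b, axis=1)                    # 2**Q-point FFT inside each of the P regions
    return c.T.reshape(N)                        # output index k = k1 + P*k2

rng = np.random.default_rng(4)
x = rng.standard_normal(24) + 1j * rng.standard_normal(24)   # 24 = 3 * 2**3
print(np.allclose(fft_p_times_2q(x, 3), np.fft.fft(x)))
```

The P-point stage is a direct matrix product here, which is cheap as long as P is a small odd number such as 3.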

[0109] FIG. 4 shows an example of the frame layout when the conventional method is used for MDCT block length switching. A short block length is selected for frame j+2, after frame j+1, and a long block length is selected for frame j+4, after frame j+3. As FIG. 4 makes clear, the phase is shifted by 256 samples between frame j+1 and frame j+2, and there is likewise a shift of 256 samples between frame j+2 and frame j+3. For frames j+2 and j+4, the processing that precedes the MDCT (linear/nonlinear prediction) has to take this shift into account, so special handling, such as changing the block length of the preprocessing, is required.

[0110] FIG. 5 shows an example of how the MDCT block windows are placed when the signal of FIG. 2 is input to the encoder 1. A short block length is selected for frame j+2, and a long block length for the other frames. Unlike the case of FIG. 4, no phase shift occurs, so no exceptional processing is needed before the MDCT.

[0111] According to the embodiments described above, in transform coding that applies the MDCT to a signal such as a prediction residual obtained after preprocessing, the MDCT block length can be switched without causing a phase shift, so the block length of the preprocessing can be kept constant. Furthermore, the block length can be switched without a phase shift even in cases where the conventional method would cause a phase shift in the frame where the switch takes place.

[0112]

[Effects of the Invention] According to the orthogonal transform apparatus and method of the present invention, the overlap boundary can be determined arbitrarily.

[0113] According to the inverse orthogonal transform apparatus and method of the present invention, the orthogonal transform coefficients obtained by the above orthogonal transform apparatus and method can be inversely transformed.

[0114] The transform coding apparatus and method of the present invention make it possible to determine the overlap boundary arbitrarily and to reproduce the signal completely by overlap addition.

[0115] The decoding apparatus and method of the present invention can decode data encoded by the above transform coding apparatus and method.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an encoder according to the first embodiment of the present invention.

FIG. 2 is a plot of the sample sequence of an audio signal.

FIG. 3 is a block diagram showing the configuration of a decoder according to the second embodiment of the present invention.

FIG. 4 is a diagram for explaining a specific example of the conventional method of block length switching.

FIG. 5 is a diagram for explaining a specific example of block length switching according to the present invention.

FIG. 6 is a diagram for explaining the MDCT algorithm.

FIG. 7 is a diagram showing a specific example of block length switching in the conventional method.

FIG. 8 is a diagram showing a specific example of block length switching in which the window contains no zeros.

[Explanation of symbols]

1 encoder, 3 linear/nonlinear prediction analysis unit, 5 MDCT unit, 6 quantization unit, 7 stationarity estimation unit, 8 block length determination unit, 20 decoder, 22 inverse quantization unit, 24 IMDCT unit

Continuation of front page. (72) Inventor: Masayuki Nishiguchi, c/o Sony Corporation, 6-7-35 Kita-Shinagawa, Shinagawa-ku, Tokyo. F-terms (reference): 5D045 DA02; 5J064 BA12 BA13 BA16 BB03 BB05 BC02 BC08 BC16 BD01; 5K041 AA01 CC01 CC02 EE11 HH09

Claims

[Claims]

1. An orthogonal transform apparatus that orthogonally transforms input time-series samples while overlapping them, wherein, when M time-series samples are orthogonally transformed, the orthogonal transform is performed with the sample position α, which is the boundary at which aliasing occurs in the inverse orthogonal transform, determined arbitrarily within the range 0 ≦ α < M.

2. The orthogonal transform apparatus according to claim 1, wherein the sample position α is matched between adjacent frames.

3. The orthogonal transform apparatus according to claim 2, wherein the sample position α is matched between adjacent frames by appropriately selecting a window function.

4. The orthogonal transform apparatus according to claim 3, wherein said window function does not include a zero component.

5. An orthogonal transform method for orthogonally transforming input time-series samples while overlapping them, wherein, when M time-series samples are orthogonally transformed, the orthogonal transform is performed with the sample position α, which is the boundary at which aliasing occurs in the inverse orthogonal transform, determined arbitrarily within the range 0 ≦ α < M.

6. An inverse orthogonal transform apparatus that applies an inverse orthogonal transform to orthogonal transform coefficients obtained by orthogonally transforming time-series samples while overlapping them, wherein the inverse orthogonal transform is applied to orthogonal transform coefficients obtained by an orthogonal transform in which the sample position α, which is the boundary at which aliasing occurs in the inverse orthogonal transform, was determined arbitrarily within the range 0 ≦ α < M.

7. An inverse orthogonal transform method for applying an inverse orthogonal transform to orthogonal transform coefficients obtained by orthogonally transforming time-series samples while overlapping them, wherein the inverse orthogonal transform is applied to orthogonal transform coefficients obtained by an orthogonal transform in which the sample position α, which is the boundary at which aliasing occurs in the inverse orthogonal transform, was determined arbitrarily within the range 0 ≦ α < M.

8. A transform coding apparatus that orthogonally transforms an input signal and compression-encodes it, comprising: a prediction analysis unit that takes in the input signal a predetermined number of samples at a time, performs prediction analysis, and outputs a prediction residual; characteristic determining means that determines a characteristic of the input signal for each predetermined number of samples; block length determining means that determines the block length for the orthogonal transform based on the characteristic determined by the characteristic determining means; orthogonal transform means that, using the block length determined by the block length determining means, determines the sample position α, which is the boundary at which aliasing occurs in the inverse orthogonal transform, arbitrarily within the range 0 ≦ α < M, and applies orthogonal transform processing to the prediction residual output from the prediction analysis unit, taken as M input time-series samples, while overlapping them, to generate orthogonal transform coefficients; and quantizing means that quantizes the orthogonal transform coefficients generated by the orthogonal transform means.

9. The transform coding apparatus according to claim 8, wherein said orthogonal transform means matches a sample position α of said time-series M samples to be subjected to an orthogonal transform process between adjacent frames.

10. The transform coding apparatus according to claim 9, wherein said orthogonal transform means matches the sample position α of said M time-series samples subjected to orthogonal transform processing between frames by appropriately selecting a window function.

11. The transform coding apparatus according to claim 10, wherein said window function does not include a zero component.

12. The transform coding apparatus according to claim 8, wherein said characteristic judging means judges the continuity of said input signal for each predetermined sample.

13. The transform coding apparatus according to claim 12, wherein said block length determining means makes the block length used when the characteristic determining means determines that the signal is quasi-stationary with little temporal variation longer than the block length used when the characteristic determining means determines that the temporal variation of the signal is large.

14. The transform coding apparatus according to claim 8, wherein said input signal is a speech signal and/or an audio signal.

15. The transform coding apparatus according to claim 8, wherein the quantized data is output at a rate of 6 kbps to 32 kbps.

16. A transform coding method for orthogonally transforming an input signal and compression-encoding it, comprising: a prediction analysis step of taking in the input signal a predetermined number of samples at a time, performing prediction analysis, and outputting a prediction residual; a characteristic determining step of determining a characteristic of the input signal for each predetermined number of samples; a block length determining step of determining the block length for the orthogonal transform based on the characteristic determined in the characteristic determining step; an orthogonal transform step of, using the block length determined in the block length determining step, determining the sample position α, which is the boundary at which aliasing occurs in the inverse orthogonal transform, arbitrarily within the range 0 ≦ α < M, and applying orthogonal transform processing to the prediction residual output from the prediction analysis step, taken as M input time-series samples, while overlapping them, to generate orthogonal transform coefficients; and a quantization step of quantizing the orthogonal transform coefficients generated in the orthogonal transform step.

17. A decoding apparatus that decodes quantized data obtained by quantizing orthogonal transform coefficients, the coefficients having been obtained by orthogonally transforming M input time-series samples while overlapping them, with a block length determined according to a characteristic of an input signal and with the sample position α, which is the boundary at which aliasing occurs in the inverse orthogonal transform, determined arbitrarily within the range 0 ≦ α < M, the decoding apparatus comprising: inverse quantization means that inversely quantizes the quantized data; and inverse orthogonal transform means that applies an inverse orthogonal transform to the orthogonal transform coefficients obtained by the inverse quantization means, using the block length determined according to the characteristic of the input signal.

18. A decoding method for decoding quantized data obtained by quantizing orthogonal transform coefficients, the coefficients having been obtained by orthogonally transforming M input time-series samples while overlapping them, with a block length determined according to a characteristic of an input signal and with the sample position α, which is the boundary at which aliasing occurs in the inverse orthogonal transform, determined arbitrarily within the range 0 ≦ α < M, the decoding method comprising: an inverse quantization step of inversely quantizing the quantized data; and an inverse orthogonal transform step of applying an inverse orthogonal transform to the orthogonal transform coefficients obtained in the inverse quantization step, using the block length determined according to the characteristic of the input signal.