JPH08179800A

JPH08179800A - Speech coding device

Info

Publication number: JPH08179800A
Application number: JP6322494A
Authority: JP
Inventors: Toshiyuki Morii; 利幸森井
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1994-12-26
Filing date: 1994-12-26
Publication date: 1996-07-12

Abstract

(57) [Abstract]
[Object] To reduce the ROM capacity of a coding device based on the CELP method.
[Constitution] To achieve this object, the invention, when searching the stochastic codebook, uses as code vectors not only the forward-order code vectors stored in the stochastic codebook but also the same vectors read in reverse order.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech coding device that transmits data efficiently by coding acoustic information, held as a digital signal, with a small amount of information, transmitting it, and decoding it.

[0002]

[Prior Art] In digital mobile communications such as mobile phones, low-bit-rate speech compression coding methods are needed to cope with the growing number of subscribers, and research institutes are actively developing them. In Japan, VSELP, a coding method with a bit rate of 11.2 kbps developed by Motorola, was adopted as the standard coding method for digital mobile phones (digital mobile phones using this method went on sale in Japan in the autumn of 1994). In addition, PSI-CELP, a coding method with a bit rate of 5.6 kbps developed by NTT Mobile Communications Network Inc., has been adopted as the standard for the next generation of mobile phones and is now at the product development stage. Both methods are refinements of CELP (Code Excited Linear Prediction; described in M. R. Schroeder, "High Quality Speech at Low Bit Rates," Proc. ICASSP '85, pp. 937-940). CELP separates speech into excitation (sound source) information and vocal tract information. The excitation information is coded as the index of one of a plurality of excitation samples stored in a codebook, and the vocal tract information is coded as LPC (linear prediction coefficients). Its distinguishing feature is that, when the excitation information is coded, each candidate is compared with the input speech after the vocal tract information has been applied to it (A-b-S: Analysis by Synthesis).

[0003] The basic algorithm of the CELP method is as follows. FIG. 2 is a functional block diagram of a CELP coding device. First, the LPC analysis unit 22 performs autocorrelation analysis and LPC analysis on the input speech data 21 to obtain LPC coefficients, codes the LPC coefficients to obtain an LPC code, and decodes the LPC code to obtain decoded LPC coefficients. Next, the addition unit 25 takes excitation samples from the adaptive codebook 23 and the stochastic codebook 24, finds the optimum gain for each, and adds the excitations after scaling their power by those gains to obtain a synthesized excitation. The LPC synthesis unit 26 then filters the synthesized excitation from the addition unit 25 with the decoded LPC coefficients from the LPC analysis unit 22 to obtain synthesized speech. The comparison unit 27 runs the addition unit 25 and the LPC synthesis unit 26 for every excitation sample in the adaptive codebook 23 and the stochastic codebook 24, calculates the distance between each resulting synthesized speech and the input speech, and finds the indexes of the excitation samples that give the smallest distance. The parameter coding unit 28 codes the optimum gains to obtain a gain code and sends the LPC code, the excitation sample indexes, the gain code, and so on together to the transmission path 29. A synthesized excitation is also created from the gain code and the indexes and stored in the adaptive codebook 23, and the oldest excitation samples are discarded at the same time. The LPC synthesis unit 26 also applies perceptual weighting using the linear prediction coefficients, a high-frequency emphasis filter, and long-term prediction coefficients (obtained by long-term prediction analysis of the input speech). The excitation search using the adaptive codebook and the stochastic codebook is carried out on sections (called subframes) obtained by further dividing the analysis interval.
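The analysis-by-synthesis loop described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the coder of FIG. 2: it searches a single codebook, starts the synthesis filter from zero state in every subframe, and omits perceptual weighting. The function names and the sign convention of the LPC coefficients are hypothetical choices made for this illustration.

```python
import math


def lpc_synthesize(excitation, lpc):
    """All-pole synthesis filter, assuming s[n] = e[n] - sum_k lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:  # zero filter memory before the subframe (assumption)
                acc -= a * out[n - k]
        out.append(acc)
    return out


def optimal_gain(target, synth):
    """Least-squares gain that scales synth to match target."""
    energy = sum(y * y for y in synth)
    if energy == 0.0:
        return 0.0
    return sum(t * y for t, y in zip(target, synth)) / energy


def search_codebook(target, codebook, lpc):
    """Return (index, gain, distance) of the excitation closest to target."""
    best = (None, 0.0, math.inf)
    for index, excitation in enumerate(codebook):
        synth = lpc_synthesize(excitation, lpc)
        gain = optimal_gain(target, synth)
        # Squared-error distance between input speech and scaled synthesized speech.
        dist = sum((t - gain * y) ** 2 for t, y in zip(target, synth))
        if dist < best[2]:
            best = (index, gain, dist)
    return best
```

In a full coder this search would be run once for the adaptive codebook and once for the stochastic codebook in each subframe, and the selected indexes and gains would be passed on for parameter coding.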

[0004]

[Problem to Be Solved by the Invention] To mount a coding/decoding device in a small device such as a mobile phone, the ROM capacity should be small. However, conventional methods based on CELP need a large number of code vectors for the fixed excitation. Some methods shift along one long vector and use each shifted segment as a code vector, but because the resulting vectors resemble one another, sound quality is worse than when the same number of distinct code vectors is used. In addition, the code vector has to be computed each time it is generated, so a large amount of computation is required.

[0005]

[Means for Solving the Problem] To solve the above problems, the present invention, during the codebook search, uses as code vectors not only the forward-order code vectors stored in the stochastic codebook but also the same vectors in reverse order.

[0006]

[Operation] With this configuration, the reverse-order version of a code vector differs greatly from the original code vector, so sound quality does not deteriorate. Because it is obtained simply by running the vector's sample pointer in the opposite direction, the amount of computation is almost the same as before. The size of the codebook can therefore be halved with almost no loss of sound quality and no increase in computation.

[0007]

[Embodiment] An embodiment of the present invention, applied to the CELP method as an example, is described below with reference to FIG. 1 and FIG. 2. The basic CELP algorithm in the embodiment is the same as in the conventional example (FIG. 2), so its description is omitted.

[0008] In the stochastic codebook search, as shown in FIG. 1, two code vectors, one in forward order and one in reverse order, are generated from each stored code vector, and both are used as search candidates. The index of the optimum code vector and one bit indicating forward or reverse order are then transmitted as the code. Because the search with reverse-order vectors only requires running the code vector's sample pointer in the opposite direction, it can be carried out with almost the same amount of computation as before. This processing doubles the number of available vectors, so a stochastic codebook half the size of a conventional one gives the same coding performance.
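The forward/reverse search of this paragraph can be sketched in Python as follows. This is an illustration only: `read_sample` and `search_with_reversal` are hypothetical names, the `synthesize` argument stands in for the LPC synthesis filter (identity when omitted), and the direction bit is assumed to be 0 for forward order and 1 for reverse order, a convention the text does not specify.

```python
def read_sample(vector, n, reverse):
    """Read sample n of a stored vector, forwards or backwards, without copying it."""
    return vector[len(vector) - 1 - n] if reverse else vector[n]


def search_with_reversal(target, codebook, synthesize=None):
    """Return (index, direction_bit, gain, distance) of the best candidate.

    direction_bit is 0 for forward order and 1 for reverse order (assumed convention).
    """
    if synthesize is None:
        synthesize = lambda excitation: list(excitation)
    best = (None, 0, 0.0, float("inf"))
    for index, vector in enumerate(codebook):
        # Each stored vector yields two candidates: forward and reverse order.
        for direction_bit in (0, 1):
            excitation = [read_sample(vector, n, direction_bit == 1)
                          for n in range(len(vector))]
            synth = synthesize(excitation)
            energy = sum(y * y for y in synth)
            corr = sum(t * y for t, y in zip(target, synth))
            gain = corr / energy if energy > 0.0 else 0.0
            dist = sum((t - gain * y) ** 2 for t, y in zip(target, synth))
            if dist < best[3]:
                best = (index, direction_bit, gain, dist)
    return best
```

The only difference between the two candidates is the direction in which the stored samples are read, which is why the cost per candidate stays the same as in a conventional search. In fixed-point firmware, `read_sample` would typically correspond to decrementing a pointer instead of incrementing it.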

[0009] When training the code vectors of the stochastic codebook of the present invention, sound quality improves if processing is applied so that no code vector becomes nearly symmetric and so that no code vector, when reversed, comes close to another vector. Examples of specific methods follow.

Method (1): When a VQ (vector quantization) codebook is created with the LBG algorithm (described in IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-28, NO. 1, JANUARY 1980, pp. 84-95), the vectors of the training population can be adjusted before the codebook is created. For every vector, the power of the first half of its elements and the power of the second half are computed, and the vector is reversed if the second-half power is larger. This yields a set of centroids whose power is concentrated in the first half.

Method (2): When the codebook is trained by sequential adaptation, the same training can also be applied to the reverse order. The procedure is as follows.

Step (1): The input speech is coded, and the optimum excitations and optimum gains of the adaptive codebook and the stochastic codebook are obtained.

Step (2): An ideal excitation is obtained by applying to the input speech the inverse of the synthesis filter of the LPC synthesis unit. The optimum excitation of the adaptive codebook is subtracted from the ideal excitation to obtain an ideal stochastic codebook excitation, which is used to train the optimum excitation of the stochastic codebook according to (Equation 1) below.

[0010]

[Equation 1]

[0011] Step (3): Steps (1) and (2) above are carried out with all of the training speech data as input, and this pass is repeated. The learning coefficient is gradually reduced so that the training converges.
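The population adjustment described for the LBG case in paragraph [0009] (reverse every training vector whose second half carries more power than its first half) can be sketched as below. The function names are hypothetical, and the handling of odd-length vectors (the middle sample is ignored when comparing the halves) is an assumption, since the text does not address it.

```python
def orient_front_heavy(vector):
    """Return the vector in the orientation whose first half has the larger power."""
    half = len(vector) // 2
    front_power = sum(x * x for x in vector[:half])
    # For odd lengths the middle sample is left out of both halves (assumption).
    back_power = sum(x * x for x in vector[len(vector) - half:])
    if back_power > front_power:
        return list(reversed(vector))
    return list(vector)


def prepare_population(vectors):
    """Apply the orientation rule to every training vector before LBG training."""
    return [orient_front_heavy(v) for v in vectors]
```

The adjusted population would then be passed to an ordinary LBG training routine. Since every training vector is front-heavy, the resulting centroids tend to be front-heavy as well, so their reversed versions differ clearly from the stored ones.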

[0012]

[Effect of the Invention] As described above, according to the present invention, the reverse-order version of a code vector differs greatly from the original code vector, so sound quality does not deteriorate. Because it is obtained simply by running the vector's sample pointer in the opposite direction, the amount of computation is almost the same as before. The size of the codebook can therefore be halved with almost no loss of sound quality and no increase in computation.
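The halving of the codebook can be checked with simple arithmetic. The sketch below is illustrative only: the codebook size, vector length, and sample width in the usage note are hypothetical figures, not values from this document, and the function assumes the number of candidates is a power of two.

```python
import math


def codebook_budget(num_candidates, vector_length, bits_per_sample=16):
    """Compare ROM size and code length with and without reversed vectors.

    num_candidates is the number of candidates searched (a power of two, >= 2).
    """
    # Conventional codebook: every candidate vector is stored in ROM.
    conventional = {
        "stored_vectors": num_candidates,
        "rom_bits": num_candidates * vector_length * bits_per_sample,
        "code_bits": int(math.log2(num_candidates)),
    }
    # Reversal codebook: half the vectors are stored, and one extra bit in the
    # code says whether the vector is read forwards or backwards.
    stored = num_candidates // 2
    reversal = {
        "stored_vectors": stored,
        "rom_bits": stored * vector_length * bits_per_sample,
        "code_bits": int(math.log2(stored)) + 1,
    }
    return conventional, reversal
```

For example, `codebook_budget(512, 40)` gives 327,680 ROM bits for the conventional codebook and 163,840 for the reversal codebook, while the transmitted code stays at 9 bits in both cases (8 index bits plus 1 direction bit for the reversal codebook).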

[Brief description of drawings]

[FIG. 1] Diagram showing how a code vector is reversed in the present invention.

[FIG. 2] Block diagram of the basic functions of the CELP method.

[Explanation of symbols]

21 Input speech
22 LPC analysis unit
23 Adaptive codebook
24 Stochastic codebook
25 Addition unit
26 LPC synthesis unit
27 Comparison unit
28 Parameter coding unit
29 Transmission path

Claims

[Claims]

1. A speech coding device comprising: LPC analysis means for performing LPC (Linear Prediction Coding) analysis on an input speech signal (input speech) to obtain linear prediction coefficients; an adaptive codebook in which past synthesized excitations are stored; a stochastic codebook in which fixed excitations are stored; excitation adding means for taking excitations from the two codebooks and multiplying them by gains to obtain a synthesized excitation; LPC synthesis means for obtaining synthesized speech by applying a filter using the linear prediction coefficients to the synthesized excitation; and means for searching for the optimum excitations among those stored in the two codebooks by comparing the distance between the synthesized speech and the input speech, characterized in that, in the stochastic codebook search, both the forward-order and the reverse-order versions of the code vectors of the stochastic codebook are used as excitation candidates.