JPH02203400A

JPH02203400A - Voice encoding method

Info

Publication number: JPH02203400A
Application number: JP1023009A
Authority: JP
Inventors: Hideo Osawa; 大沢　英男
Original assignee: Japan Radio Co Ltd
Current assignee: Japan Radio Co Ltd
Priority date: 1989-02-01
Filing date: 1989-02-01
Publication date: 1990-08-13

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a speech encoding method, and in particular to a speech encoding method that uses adaptive predictive coding to encode speech with high efficiency and to transmit the encoded speech information at high speed in real time.

[Prior Art] Speech encoding methods for digitally encoding speech are widely sought in fields such as telephone communication, videophone systems, and speech recognition, and several encoding methods have already been proposed at a practically usable stage.

FIG. 2 shows an example of a conventional speech encoding device. A sequence of sample values of electrically captured speech is supplied to the speech encoding device from input terminal 10, and, in order to perform adaptive predictive coding, the predicted value is first subtracted from it.

For this purpose, the input signal is supplied to linear prediction circuit 12, and linear prediction coefficients corresponding to the input signal are set in predictor 14.

Meanwhile, while the predicted value is being computed, the input signal is delayed by a predetermined time by delay circuit 16 and supplied to subtracter 18 together with the predicted value output by predictor 14, where the predicted value is subtracted from the input signal.
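
The subtraction step can be sketched as follows. This is a minimal illustration, assuming the predictor is a short linear filter over past samples; the function name `prediction_residual` and the example coefficients are hypothetical and are not taken from the patent.

```python
def prediction_residual(samples, coeffs):
    """Return e[n] = x[n] - sum_k coeffs[k] * x[n-1-k].

    Past samples that fall before the start of the sequence count as 0.
    """
    residual = []
    for n, x in enumerate(samples):
        predicted = 0.0
        for k, a in enumerate(coeffs):
            if n - 1 - k >= 0:
                predicted += a * samples[n - 1 - k]
        residual.append(x - predicted)
    return residual


# A ramp is predicted exactly by 2*x[n-1] - x[n-2] once two past samples exist.
print(prediction_residual([1.0, 2.0, 3.0, 4.0], [2.0, -1.0]))
```

When the predictor fits the signal well, the residual is small, which is what makes it cheaper to encode than the raw samples.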

The difference signal produced by subtracter 18 is frequency-transformed by FFT circuit 20. The resulting spectral information is quantized and encoded by quantization encoder 22 and is output from multiplexer 24 via output terminal 26 as a digitally encoded speech signal.
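
As a rough sketch of this stage, the code below transforms one frame of the difference signal and quantizes every spectral line. It assumes a naive discrete Fourier transform in place of the FFT circuit and a uniform quantizer with step `step`; the patent does not specify the quantizer, so both the quantizer and all function names are illustrative assumptions.

```python
import cmath


def dft(frame):
    """Naive O(N^2) discrete Fourier transform (stands in for the FFT circuit)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def quantize(value, step):
    """Uniform quantizer: map a real value to an integer level index."""
    return round(value / step)


def encode_spectrum(frame, step):
    """Quantize the real and imaginary part of every spectral line."""
    return [(quantize(c.real, step), quantize(c.imag, step)) for c in dft(frame)]


# An impulse has a flat spectrum, so every line gets the same level.
print(encode_spectrum([1.0, 0.0, 0.0, 0.0], 0.5))
```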

Such a speech encoding method therefore makes it possible to digitally encode the speech sample value sequence in real time and transmit it.

[Problems to be Solved by the Invention] However, such a conventional speech encoding method has the problem that the amount of digitally encoded information is limited by the transmission coding rate in real-time transmission.

For example, when transmitting at the 3 kbps coding rate used in ordinary transmission, the number of quantization bits available for the frequency-transformed difference signal decreases, and the quality of the reproduced speech on the receiving side deteriorates markedly.
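
To make the constraint concrete, the per-frame bit budget can be worked out as below. The 20 ms frame and the 80 spectral lines per frame (which would correspond to an 8 kHz sampling rate) are assumptions chosen for illustration; the only figure taken from the text is the 3 kbps rate.

```python
def bits_per_line(bit_rate_bps, frame_ms, num_lines):
    """Average number of bits available per spectral line in one frame."""
    bits_per_frame = bit_rate_bps * frame_ms / 1000
    return bits_per_frame / num_lines


# 3 kbps with an assumed 20 ms frame leaves 60 bits per frame.
print(bits_per_line(3000, 20, 80))
```

Under these assumed numbers the budget is 0.75 bit per spectral line, which illustrates why encoding every line at this rate leaves too few quantization bits.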

The present invention has been made in view of the above conventional problem. Its object is to provide an improved speech encoding method that can keep the quality of the received encoded speech at a sufficient level despite constraints on the transmission rate.

[Means for Solving the Problems] To achieve the above object, the present invention is characterized in that, when the frequency-transformed difference signal is encoded, the formant frequencies and the frequency components in their vicinity are preferentially quantized and encoded, and this formant-encoded signal is transmitted either alone or together with the linear prediction coefficients obtained by the linear prediction circuit, so that encoded speech information of sufficiently good quality can be transmitted even with a small number of bits.

[Operation] Accordingly, even when the transmission rate is constrained, the present invention preferentially transmits the signal in the formant frequency bands, which give vowels their characteristics, and cuts other, low-density information, thereby enabling high-quality encoded speech transmission at ordinary transmission rates without loss of quality.

[Embodiment] A preferred embodiment of the present invention is described below with reference to the drawings.

FIG. 1 shows an example of an encoding circuit to which the speech encoding method according to the present invention is applied. Members identical to those of the conventional circuit of FIG. 2 are given the same reference numerals, and their description is omitted.

As in the conventional method, the sample value sequence of electrically captured speech is input from input terminal 10 and then framed at a specific length (for example, 20 to 30 msec). Prediction coefficients are calculated by linear prediction circuit 12, and the difference between the sample data and the predicted value output by the predictor is extracted. After this difference is frequency-transformed, only the information of specific spectral components is quantized, encoded, and transmitted, which makes it possible to reduce the number of quantization bits significantly and achieves highly efficient speech encoding.
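
The framing and coefficient calculation can be sketched as follows. The patent does not say how linear prediction circuit 12 computes its coefficients; this sketch swaps in the widely used autocorrelation method with the Levinson-Durbin recursion as one plausible choice. All function names are hypothetical.

```python
def split_frames(samples, frame_len):
    """Cut the sample sequence into consecutive full frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def lpc_coefficients(frame, order):
    """Levinson-Durbin recursion; returns a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # silent frame: leave the remaining coefficients at zero
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


# At an assumed 8 kHz sampling rate, a 20 ms frame is 160 samples.
frames = split_frames([0.0] * 400, 160)
print(len(frames), lpc_coefficients([1.0, 0.5, -0.3, 0.8, 0.1, -0.6], 2))
```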

The characteristic feature of the present invention is that the difference signal frequency-transformed by FFT circuit 20 is not quantized and encoded in its entirety as before. Instead, only specific frequency components are quantized and transmitted. For this purpose, a frequency thinning circuit 100 is provided between FFT circuit 20 and quantization encoder 22.

Accordingly, only the spectral information in the frequency bands determined by frequency thinning circuit 100 is selectively supplied to quantization encoder 22. If the selected frequency components are restricted to the bands that matter for the quality of the reproduced speech, the important speech information can be transmitted accurately even at a relatively low transmission rate.
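
The thinning operation itself amounts to selecting a subset of spectral lines. A minimal sketch, assuming the selected positions are given as a collection of line indices (`keep_bins`, a hypothetical name):

```python
def thin_spectrum(spectrum, keep_bins):
    """Return only the spectral lines at keep_bins, in ascending index order."""
    return [spectrum[k] for k in sorted(keep_bins)]


# Five lines in, two lines out: only indices 1 and 3 reach the quantizer.
print(thin_spectrum([10, 11, 12, 13, 14], [3, 1]))
```

Emitting the lines in a fixed ascending order means the receiver can tell which value belongs to which position from the order alone.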

In the present invention, the selected frequency bands are the formant frequencies, which are the resonance frequencies of the speech spectrum, and the bands adjacent to them.

As is well known, formant frequencies are the frequency bands in which the dominant components that characterize vowels are found. They are usually called the first, second, and third formant frequencies, and so on, in order from the lowest frequency.

Accordingly, in this embodiment, formant calculation circuit 101 calculates, from the output of linear prediction circuit 12, the formant bandwidths for the speech currently being processed. Using these, frequency thinning circuit 100 acts as a filter that passes only the frequency components of the difference signal near the formant frequencies, thereby selecting the spectral information that is subjected to quantization and encoding.
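
One way to locate formants from linear prediction coefficients is to look for peaks in the spectral envelope 1/|A(e^{jw})| of the prediction filter. The sketch below does only that: it is an assumed stand-in for formant calculation circuit 101, it returns centre frequencies rather than the bandwidths the text mentions, and it assumes the filter has no pole exactly on the unit circle. Function names and the grid size are hypothetical.

```python
import cmath


def lpc_envelope(coeffs, num_points):
    """|1/A(e^{jw})| on num_points frequencies from 0 up to (not including) Nyquist,
    where A(z) = 1 - sum_k coeffs[k-1] * z^{-k}."""
    env = []
    for i in range(num_points):
        w = cmath.pi * i / num_points
        a = 1 - sum(c * cmath.exp(-1j * w * (k + 1)) for k, c in enumerate(coeffs))
        env.append(1.0 / abs(a))
    return env


def formant_peaks(coeffs, sample_rate, num_points=256):
    """Frequencies in Hz of the local maxima of the LPC envelope, lowest first."""
    env = lpc_envelope(coeffs, num_points)
    return [i * (sample_rate / 2) / num_points
            for i in range(1, num_points - 1)
            if env[i - 1] < env[i] >= env[i + 1]]


# Two-pole resonator with pole radius 0.9 at 1 kHz (sampling rate 8 kHz assumed).
print(formant_peaks([1.2727922061357855, -0.81], 8000))
```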

In practice, the formant bandwidths through which spectral information is passed can be calculated from the linear prediction coefficients by formant calculation circuit 101, as described above. A table that maps each formant bandwidth to the positions and number of the spectral lines on the frequency axis that need to be encoded is provided inside formant calculation circuit 101 or frequency thinning circuit 100, so that only the necessary formant-band signal is supplied to quantization and encoding in real time.
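
The mapping held in such a table can be illustrated by computing it directly. The sketch below converts a list of frequency bands into the indices of the spectral lines that fall inside them; a real implementation along the lines of the text would precompute and store these results. The function name, the band limits, and the 8 kHz sampling rate with 160-point transform are assumptions.

```python
def bins_for_bands(bands_hz, sample_rate, n_fft):
    """Sorted DFT line indices (0..n_fft//2) whose frequency lies inside any band."""
    selected = set()
    for low, high in bands_hz:
        for k in range(n_fft // 2 + 1):
            if low <= k * sample_rate / n_fft <= high:
                selected.add(k)
    return sorted(selected)


# 8000 / 160 gives 50 Hz line spacing, so 450 to 600 Hz covers lines 9 to 12.
bins = bins_for_bands([(450, 600)], 8000, 160)
print(bins, len(bins))
```

The returned list gives both pieces of information the text attributes to the table: the positions of the lines and, through its length, their number.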

Of course, a table like the conversion table described above is also provided on the decoder side. The formant calculation circuit on the decoder side looks up the decoding bandwidth from the table just as the encoder does, so accurate decoding is easily achieved without sending position information on the frequency axis for the transmitted spectrum.
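
On the receiving side, the spectrum can be rebuilt from the received values alone once the same line positions have been derived locally. A minimal sketch, assuming the values arrive in ascending position order and that untransmitted lines are simply set to zero (the patent does not say how they are filled); `restore_spectrum` and its parameters are hypothetical names.

```python
def restore_spectrum(values, keep_bins, num_lines):
    """Place received values back at keep_bins; all other lines are set to zero."""
    spectrum = [0j] * num_lines
    for k, v in zip(sorted(keep_bins), values):
        spectrum[k] = v
    return spectrum


# Two received values go back to lines 1 and 3 of a five-line spectrum.
print(restore_spectrum([1 + 1j, 2 + 0j], [3, 1], 5))
```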

Taking the ordinary speech band to be about 0.3 to 3.4 kHz, it suffices to consider about three formant bands, the first through the third. By deleting the rest of the spectrum, the number of spectral lines that need to be encoded can be reduced substantially.

Accordingly, the number of quantization bits per sample of the frequency-transformed difference signal can be increased, and as a result the speech quality can be improved markedly.
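
The gain in bits per spectral line can be estimated with simple counting. In the sketch below, the 50 Hz line spacing, the 20 ms frame, and the three 200 Hz wide formant bands are illustrative assumptions; the 0.3 to 3.4 kHz band and the 3 kbps rate come from the text. Bits spent on side information such as prediction coefficients are ignored.

```python
def count_lines(low_hz, high_hz, spacing_hz):
    """Number of spectral lines at multiples of spacing_hz inside [low_hz, high_hz]."""
    first = -(-low_hz // spacing_hz)  # ceiling division
    last = high_hz // spacing_hz
    return max(0, last - first + 1)


BITS_PER_FRAME = 3000 * 20 // 1000                         # 3 kbps, assumed 20 ms frame
FULL_BAND = count_lines(300, 3400, 50)                     # whole speech band
FORMANT_BANDS = [(400, 600), (1400, 1600), (2400, 2600)]   # hypothetical bands
THINNED = sum(count_lines(lo, hi, 50) for lo, hi in FORMANT_BANDS)

print(FULL_BAND, BITS_PER_FRAME / FULL_BAND)
print(THINNED, BITS_PER_FRAME / THINNED)
```

Under these assumptions the line count drops from 63 to 15, so the same 60 bits per frame allow 4 bits per line instead of less than 1.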

Furthermore, with the encoding method of the present invention, even at a relatively low transmission rate of, for example, about 3 kbps, only the difference signal selected in the formant frequency bands is sent, so transmission with little quality degradation is possible in real time.

In the embodiment, the formant information used in the frequency thinning circuit is updated every frame and therefore changes with the prediction values. For this reason, the prediction values of linear prediction circuit 12 are also output from multiplexer 24, together with the quantized and encoded difference information, through output terminal 26 to the transmission system.

However, if the formant bands are fixed to a region covering specific vowels or all vowels, the output of prediction circuit 12 need not necessarily be sent to the transmission system.

[Effects of the Invention] As explained above, according to the present invention, in a speech encoding method that transmits only the difference signal obtained by adaptive prediction, the difference information is further restricted to the formant frequency bands before transmission, which enables dense speech encoding.

Accordingly, it will be understood that the present invention avoids degradation of speech quality even when the transmission rate is limited, and that when a sufficient transmission rate is allowed it enables encoded transmission with extremely good reproducibility.

[Brief Description of the Drawings]

FIG. 1 is a circuit diagram showing a preferred embodiment of an encoding circuit to which the speech encoding method according to the present invention is applied, and FIG. 2 is an explanatory diagram of a conventional speech encoding circuit.

12 ... Linear prediction circuit
20 ... FFT circuit
22 ... Quantization encoder
100 ... Frequency thinning circuit
101 ... Formant calculation circuit

Claims

A speech encoding method in which the difference between each sampled value of speech information and its predicted value is calculated, and the difference information is quantized, encoded, and transmitted, characterized in that spectral information in the formant bands of the difference signal is selectively encoded and transmitted.