CN101944362A - Integer wavelet transform-based audio lossless compression encoding and decoding method - Google Patents
- Publication number
- CN101944362A (application number CN201010281033XA)
- Authority
- CN
- China
- Prior art keywords
- signal
- frame
- information
- module
- wavelet transform
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses a lossless audio compression encoding and decoding method in the field of source encoding and decoding. The method adaptively divides the signal into frames according to the correlation between preceding and following frames, so that each resulting frame groups together signal segments with similar characteristics. This lets the encoder achieve better compression efficiency and benefits the subsequent integer wavelet transform and linear predictive coding. Because lossless compression requires that the signal be perfectly reconstructible, an integer lifting wavelet transform is used to guarantee perfect reconstruction. Compared with the prior art, introducing the correlation-based adaptive framing module and the integer lifting wavelet decorrelation module allows the redundancy in the original signal to be removed more thoroughly, so the compressed data contains less redundant information. The invention therefore obtains a considerable improvement in compression ratio at a small cost in computational complexity.
Description
Technical field
The invention belongs to the field of source encoding and decoding, and in particular relates to a lossless audio compression encoding and decoding method.
Background art
With the arrival of the digital age, the digitization of audio signals has brought many conveniences, but it has also generated massive amounts of audio data. This poses a great challenge to the storage and transmission of audio signals and has become one of the bottlenecks that hinder people from obtaining and using multimedia information. To solve this problem, audio data must be compressed, and stored and transmitted in compressed, coded form. Practice has shown that compressing multimedia data is both necessary and feasible: multimedia data such as sound and images contain strong redundancy, that is, strong correlation between data values, so compression can be achieved by removing the redundancy (removing the correlation between data values) while retaining the useful audio information. Researching and developing efficient audio coding methods that store and transmit audio in compressed form is therefore an inevitable choice. Moreover, as requirements for audio quality rise, how to compress audio data with the largest possible compression ratio while retaining all of the audio information, and thereby provide truly transparent sound quality, has become the main problem facing audio compression coding today.
As early as the 1970s, broadcasting organizations in the United Kingdom, Japan and other countries began to study lossy compression coding of digital audio. After forty years of development, many excellent lossy audio coding standards have emerged, with MP3, AAC and WMA among the most representative. In many cases these formats achieve good subjective sound quality and a very high compression ratio, but for music with a large frequency dynamic range, such as large-scale symphonic works, the sound quality after lossy coding is less than satisfactory. In addition, in audio editing, re-encoding lossy-coded audio data (that is, converting between two lossy formats) loses further information and introduces greater distortion. To solve these problems and to serve applications that demand high sound quality, lossless compression coding must be used.
At present, research on and applications of lossless compression coding of audio signals are rare compared with lossy compression coding. Lossless compression has not received enough attention because its compression ratio can hardly exceed 3:1, whereas lossy algorithms reach 12:1 or even higher. For lossy algorithms, however, the higher the compression ratio, the worse the resulting audio quality; once the lowest possible data rate is the requirement, a lossy algorithm is the only choice. Music lovers, meanwhile, want to download high-fidelity stereo audio from the Internet to get the best listening experience. Online music promotion will therefore offer audio at higher compression ratios so that different consumers can browse and choose, while enthusiasts who insist on CD-quality audio want a losslessly compressed copy of the original audio signal, a copy that suffers no signal loss whatever the compression algorithm. Besides online download, lossless audio coding can also be used in professional settings for archiving, mixing, studio work and program production with high-fidelity audio data. In these settings, lossless compression avoids the signal loss that repeated editing causes when lossy coding is used.
From the point of view of information theory, an audio signal is a source, and the data describing the source is the sum of its information content (information entropy) and its redundancy. Almost all lossless audio compression methods are based on the same idea: first remove redundancy from the signal, which removes only the redundant part of the data without reducing the information content of the source, and then encode the result with an efficient data coding scheme. Audio signals contain several kinds of redundancy, chiefly the non-uniform distribution of signal amplitudes, the correlation between adjacent samples, and the correlation between periods. The central question for a lossless compression algorithm is therefore how to remove the redundancy in the audio signal effectively.
Well-known lossless audio coding formats include FLAC (Free Lossless Audio Codec), WavPack, TAK (Tom's Audio Kompressor), APE (Monkey's Audio), OFR (OptimFROG), ALAC (Apple Lossless Audio Codec), WMAL (Windows Media Audio Lossless), Shorten, LA (Lossless Audio), TTA (True Audio), LPAC (Lossless Predictive Audio Coder), RAL (RealAudio Lossless) and MPEG-ALS. These algorithms decorrelate the signal before lossless coding in one of two main ways: time-domain linear predictive coding (LPC), or transform-domain techniques such as IntMDCT (Integer Modified Discrete Cosine Transform). The goal of lossless compression is to remove redundancy from the data while allowing the original audio signal to be reconstructed perfectly. Linear predictive coding reduces redundancy and is particularly effective for signals with stationary characteristics. Generally speaking, a stationary sound signal carries a lot of redundancy, whereas an incoherent (noise-like) signal carries little. The value of a given sample is related to its neighbouring samples; in general, the current sample is close to the previous one, and this is especially true for low-frequency signals.
In the mainstream linear predictive coding methods, the main effort goes into the decorrelation stage, which makes the data handed to the entropy coding module better suited to entropy coding so that the entropy coder compresses it more effectively. The basic principle of a linear predictive coder is to exploit the correlation of the sound signal: past samples x[n-1], x[n-2], ... are used to predict the current sample x[n], and the more past samples are used, the higher the prediction accuracy. The predicted value is then subtracted from the current sample and the difference (the prediction error) is encoded. Because the dynamic range of the prediction error is much smaller than that of the original signal, it can be coded with fewer bits even at the quantization step used for the original signal, which compresses the bit rate. For a sound whose amplitude varies gently, for example, the prediction error varies between zero and a small value, the mean of e[n] is much smaller than that of x[n], and adjacent samples of the prediction error e[n] are essentially uncorrelated, giving a flat spectrum. Fewer data bits are therefore needed to represent its value.
The commonly used entropy code is the Rice code. Its encoding and decoding are simple and it does not require prior knowledge of the signal distribution, so it is widely used in lossless audio compression. Data that Rice coding compresses well generally has three properties: first, small amplitudes, because the coded values are ultimately represented with a finite number of bits and small amplitudes need fewer bits; second, little correlation between data values; third, a distribution as close as possible to a geometric distribution. When linear predictive coding alone is used for decorrelation, the redundancy of the original audio signal is not removed completely. The prediction error fed to the entropy coding module still carries redundant information, and some correlation remains between adjacent samples of the error signal, so it can be processed further.
Summary of the invention
The object of the invention is to provide a lossless audio compression encoding and decoding method. Using a framing strategy based on correlation coefficients, the method adaptively divides the signal into frames according to the correlation between preceding and following frames, so that the signal within a frame is strongly correlated. Each resulting frame groups together signal segments with similar characteristics, which lets the encoder achieve better compression efficiency and benefits the subsequent integer wavelet transform and linear predictive coding. To keep the residual amplitude as small as possible, the linear prediction must be as accurate as possible, and linear predictive coding predicts strongly correlated signals well. A wavelet transform is therefore used to split the signal into subbands: the signal within a narrow band is more strongly correlated than the full-band signal, so the samples are easier to decorrelate after the wavelet transform. Because lossless compression requires that the signal be perfectly reconstructible, an integer lifting wavelet transform is used to guarantee perfect reconstruction. With the correlation-based adaptive framing module and the integer lifting wavelet decorrelation module, the redundancy in the original signal is removed more thoroughly and the compressed data contains less redundant information, so a considerable improvement in compression ratio is obtained at a small cost in computational complexity.
The invention comprises a correlation-based adaptive framing technique, a decorrelation technique based on the lifting integer wavelet transform, and the other related techniques involved in encoding and decoding. It provides a higher compression ratio than a lossless audio encoding and decoding system that decorrelates with linear prediction alone.
A lossless audio codec system according to the method of the invention is divided into an encoder subsystem and a decoder subsystem:
The encoder subsystem includes:
Framing module: adaptively divides the input audio signal into frames;
Integer wavelet transform module: splits each frame of the audio signal into subbands;
Linear predictive coding module: applies linear prediction to the signal in each subband to remove the correlation between adjacent samples;
Entropy coding module: performs lossless source coding of the residual signal output by the linear predictive coding module;
Bitstream formation module: packs the entropy-coded stream, frame length information, wavelet level information, LPC parameters and codebook information produced by the above modules into a bitstream in a defined format and writes it to a file.
The decoder subsystem includes:
Bitstream separation module: splits the bitstream of the compressed audio file according to the defined format into the entropy-coded stream, frame length information, wavelet level information, LPC parameters, codebook information and other data;
Entropy decoding module: decodes the entropy-coded stream to regenerate the residual signal exactly;
LPC reconstruction module: reconstructs the wavelet subband signals from the residual signal and the LPC parameters carried in the side information;
Integer lifting wavelet reconstruction module: resynthesizes a complete audio frame from the subband signals of the wavelet decomposition;
Frame merging module: concatenates the reconstructed frames into PCM audio data and writes the WAVE file header to produce the decompressed WAVE file.
The lossless audio encoding and decoding method according to the invention is implemented as follows:
In the lossless encoding process, the audio file is first divided into frames according to the framing strategy, and the framing information (the frame lengths) is transmitted as side information. Each frame is processed separately. A wavelet transform first produces the approximation signal and the detail signals; the number of decomposition levels (the wavelet level information) is chosen by an adaptive rule and is also placed in the side information. The approximation and detail signals pass through the linear prediction module, which yields residual signals and LPC parameters. The residual signals are entropy coded to give the entropy-coded stream, and the LPC parameters and the entropy coder's codebook information are added to the side information. Finally, the streams (the side information and the entropy-coded stream) are multiplexed to form the final compressed bitstream.
The lossless decoding process is the inverse of the encoding process. The side information is decoded first and separated into the entropy coding codebook, the LPC parameters, the wavelet level information and the frame length information. The entropy decoding module uses the codebook information to recover the LPC residual signals from the entropy-coded stream. The LPC reconstruction module uses the LPC parameters to recover the approximation and detail signals of the wavelet decomposition from the residuals. The integer lifting wavelet reconstruction module then reconstructs each frame from the approximation and detail signals according to the wavelet level information. Finally, the frames are concatenated in order according to the framing information, recovering the original audio file without loss.
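The multiplexing of side information and entropy-coded data described above can be illustrated with a small container sketch. The patent does not define a byte layout, so everything here is hypothetical: the `FrameRecord` fields, the field widths, the little-endian order, and the simplification of one LPC parameter set and one Rice parameter per frame (the method itself predicts each subband separately).

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FrameRecord:
    """Hypothetical per-frame record: side information plus coded residual."""
    frame_length: int      # number of samples in the frame
    wavelet_levels: int    # number of wavelet decomposition levels
    lpc_params: List[int]  # quantized predictor coefficients (16-bit here)
    rice_k: int            # stands in for the "codebook information"
    payload: bytes         # entropy-coded residual


def pack_frame(rec: FrameRecord) -> bytes:
    """Multiplex side information and payload into one byte string."""
    header = struct.pack("<IBBB", rec.frame_length, rec.wavelet_levels,
                         len(rec.lpc_params), rec.rice_k)
    coeffs = struct.pack("<{}h".format(len(rec.lpc_params)), *rec.lpc_params)
    return header + coeffs + struct.pack("<I", len(rec.payload)) + rec.payload


def unpack_frame(buf: bytes, pos: int = 0) -> Tuple[FrameRecord, int]:
    """Demultiplex one record starting at pos; return it and the next offset."""
    frame_length, levels, n_coeff, rice_k = struct.unpack_from("<IBBB", buf, pos)
    pos += struct.calcsize("<IBBB")
    coeffs = list(struct.unpack_from("<{}h".format(n_coeff), buf, pos))
    pos += 2 * n_coeff
    (size,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    payload = bytes(buf[pos:pos + size])
    return FrameRecord(frame_length, levels, coeffs, rice_k, payload), pos + size
```

Storing the payload length explicitly lets a decoder step from frame to frame without parsing the entropy-coded data, which is one reasonable design choice among several.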
The lossless audio codec system according to the method of the invention comprises an encoder subsystem and a decoder subsystem. The key techniques used throughout the system are correlation-based adaptive framing, the integer lifting wavelet transform, adaptive linear predictive coding, and Rice entropy coding for geometrically distributed data. Each is described in turn below:
1. Correlation-based adaptive framing
The word "frame" comes from imaging, where a continuous moving picture is divided into individual pictures; a comic strip is a good example. Digital audio borrows the term: after the analog signal is converted to a digital signal, the digital signal is divided into many short segments, each of which is called a frame. Audio signals contain many abrupt changes, so if a fixed frame length is used, the correlation between the samples within each frame suffers considerably, which lowers the compression ratio.
The invention merges strongly correlated signal segments into one frame according to the correlation coefficient of adjacent frames. This improves the compactness of both the wavelet transform and the linear prediction, and so yields higher compression efficiency. The signal is first divided into units of the minimum frame length, and the correlation coefficient between the current unit and the previous one is computed. If the coefficient is below a threshold, the current unit is marked as uncorrelated with the previous one and starts a frame of its own. If the coefficient is above the threshold, the current unit is considered correlated with the previous one, and adjacent correlated units are merged in sequence, subject to a configured maximum frame length: when the merged frame would exceed that maximum, a new frame is started. With this framing strategy, signals with consistent characteristics are processed within a single frame.
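The framing rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the Pearson correlation coefficient is assumed as the "correlation coefficient", the default values of `min_len`, `max_len` and `threshold` are placeholders, and a trailing unit shorter than `min_len` is simply given its own frame.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sample lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # a constant unit has no defined correlation; treat as uncorrelated
    return sum(p * q for p, q in zip(da, db)) / denom


def adaptive_frame_lengths(x, min_len=256, max_len=4096, threshold=0.5):
    """Return the list of frame lengths; they sum to len(x)."""
    lengths = []
    current = min(min_len, len(x))  # length of the frame being built
    pos = current
    while pos < len(x):
        unit = x[pos:pos + min_len]
        previous = x[pos - min_len:pos]
        mergeable = (len(unit) == min_len
                     and current + min_len <= max_len
                     and correlation(previous, unit) > threshold)
        if mergeable:
            current += min_len       # extend the current frame
        else:
            lengths.append(current)  # close it and start a new frame
            current = len(unit)
        pos += len(unit)
    if current:
        lengths.append(current)
    return lengths
```

The returned lengths are exactly the frame length information that the encoder would place in the side information.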
2. Integer wavelet transform
An integer wavelet transform maps integers to integers: the input signal is integer-valued, the wavelet coefficients are also integers, and the original signal is recovered exactly by the inverse transform. A conventional wavelet transform produces floating-point coefficients, which is computationally expensive and prevents lossless compression. If the wavelet transform is computed with the lifting scheme and a quantization operation is added inside the lifting steps, an integer-to-integer wavelet transform is obtained. Integer wavelet transforms are widely used in image compression, where they enable low-complexity embedded coding from lossy to lossless, but they have not yet been applied well to lossless compression of audio signals.
With conventional transforms, whether the fast Fourier transform or the wavelet transform, an integer input yields floating-point coefficients, and rounding errors in computation prevent lossless compression. Now add a quantization operation to each lifting step: if the input vector x is integer-valued, the output y is also integer-valued, and x can be recovered exactly from y. Note that quantization here plays a different role from quantization in data compression: it loses no information and serves only to produce integer outputs. Because it contains the quantization operation, the integer wavelet transform is nonlinear, which makes it more complicated to analyse. In practice, with a suitably chosen form of quantization, the integer wavelet transform can be treated approximately as a linear transform to simplify the analysis.
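To make the idea concrete, here is a minimal sketch of one lifting level with the rounding built into the lifting step. The patent does not name a particular wavelet, so the integer Haar transform (often called the S transform) is assumed here purely for illustration; the function names are also illustrative.

```python
def forward_lift(x):
    """One level of the integer Haar (S) transform computed by lifting."""
    even, odd = x[0::2], x[1::2]
    # Predict step: detail = odd sample minus its prediction (the even neighbour).
    detail = [o - e for e, o in zip(even, odd)]
    # Update step: approximation = even sample + floor(detail / 2).
    # The right shift is the "quantization"; it floors negative values too.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    if len(even) > len(odd):  # odd-length input: carry the unpaired last sample
        approx.append(even[-1])
    return approx, detail


def inverse_lift(approx, detail):
    """Undo forward_lift exactly by reversing the lifting steps."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [v for pair in zip(even, odd) for v in pair]
    if len(approx) > len(detail):
        x.append(approx[-1])
    return x
```

The inverse subtracts exactly the rounded quantity that the forward step added, which is why the rounding causes no information loss.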
From the viewpoint of multiresolution analysis or band-pass filtering, wavelet decomposition is not limited to the single-level decomposition described above. The approximation signal from one level can be decomposed again to remove more of its correlation. Because different signals are distributed differently in frequency, the number of decomposition levels affects the compression result. The method of the invention selects the number of levels adaptively according to the compression achieved after decomposition, so that the result is as good as possible, and records the chosen number of levels in the side information.
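A rough sketch of adaptive level selection follows. It is an illustration under assumptions: the text judges each candidate by the compression actually achieved, whereas this sketch substitutes a crude cost proxy (bits needed for each magnitude plus a sign bit) and, again, the integer Haar lifting step. `s_transform`, `cost` and `choose_levels` are hypothetical names.

```python
def s_transform(x):
    """One integer Haar lifting level; returns (approximation, detail)."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    if len(even) > len(odd):
        approx.append(even[-1])
    return approx, detail


def cost(values):
    """Crude size proxy: magnitude bits plus one sign bit per value."""
    return sum(abs(v).bit_length() + 1 for v in values)


def choose_levels(frame, max_levels=4):
    """Return (best_level, best_cost); level 0 means no decomposition."""
    best_level, best_cost = 0, cost(frame)
    approx, details_cost = list(frame), 0
    for level in range(1, max_levels + 1):
        if len(approx) < 2:
            break
        approx, detail = s_transform(approx)
        details_cost += cost(detail)  # details of all levels so far
        total = details_cost + cost(approx)
        if total < best_cost:
            best_level, best_cost = level, total
    return best_level, best_cost
```

In a full encoder the proxy would be replaced by the size of the LPC-predicted, entropy-coded subbands, and the chosen level would be written to the side information.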
3. Adaptive linear predictive coding
The more accurate the prediction in a lossless audio encoder, the higher the coding efficiency. Most algorithms remove redundancy with some form of modified linear predictor, applied to each frame of data to produce a sequence of prediction errors. The predictor parameters represent the redundancy removed from the signal; in lossless coding, the predictor parameters and the prediction error together represent each frame of the signal.
The basic principle of a linear predictor is to exploit the correlation of the sound signal: past samples x[n-1], x[n-2], ... are used to predict the current sample x[n], and the more past samples are used, the higher the prediction accuracy. The predicted value is then subtracted from the current sample and the difference (the prediction error) is encoded. Because the dynamic range of the prediction error is much smaller than that of the original signal, it can be coded with fewer bits even at the quantization step used for the original signal, which compresses the bit rate. This approach is particularly effective for sound signals with stationary characteristics. For a sound whose amplitude varies gently, for example, the prediction error varies between zero and a small value. If the predictor works well, the prediction error e[n] is uncorrelated and has a flat spectrum. Likewise, the mean of e[n] is smaller than that of x[n], so fewer data bits are needed to represent its value.
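Written out, the prediction and the prediction error take the following form. The predictor order p and the coefficient symbols a_k are notation assumed here for illustration; the text only names x[n] and e[n].

```latex
\hat{x}[n] = \sum_{k=1}^{p} a_k \, x[n-k],
\qquad
e[n] = x[n] - \hat{x}[n]
```

The decoder forms the same prediction from already reconstructed samples and recovers x[n] = e[n] + \hat{x}[n].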
Linear predictors are widely used in speech and audio signal processing. In most cases an FIR filter is used, and the coefficients of the prediction filter A(z) are determined by minimizing the mean squared prediction error. If the quantizer is ignored, the FIR prediction coefficients are obtained by solving a set of linear equations. When an FIR filter is used for lossless audio compression, the coefficients are computed by a deterministic procedure and then quantized, and the decoder uses the same coefficients to rebuild x[n] from e[n]. Because the original signal must be reconstructed with no loss at all, the prediction coefficients (the LPC parameters) must be quantized and coded as part of the lossless audio bitstream. Usually, so that the predictor adapts to changes in the signal, a new set of prediction coefficients is determined for every frame.
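The following sketch shows why quantizing the coefficients keeps the scheme lossless: encoder and decoder compute the same integer prediction from the same quantized coefficients. It is an illustration under assumptions, not the patent's procedure. The Levinson-Durbin recursion is used as one standard way to solve the linear equations, and the names `lpc_encode` and `lpc_decode`, the default order and the 12-bit coefficient scaling `shift` are all hypothetical.

```python
def autocorr(x, order):
    """Autocorrelation r[0..order] of the frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson(r, order):
    """Solve the normal equations; returns a[1..order] of the predictor."""
    a = [0.0] * (order + 1)
    err = float(r[0])
    for i in range(1, order + 1):
        if err <= 0:
            break  # silent or perfectly predictable frame
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1 - k * k
    return a[1:]


def predict(history, q, shift):
    """Integer prediction from past samples (most recent last)."""
    acc = sum(c * history[-1 - k] for k, c in enumerate(q[:len(history)]))
    return acc >> shift  # same rounding in encoder and decoder


def lpc_encode(x, order=2, shift=12):
    """Return (quantized coefficients, integer residual) for one frame."""
    q = [round(c * (1 << shift)) for c in levinson(autocorr(x, order), order)]
    residual = [x[n] - predict(x[:n], q, shift) for n in range(len(x))]
    return q, residual


def lpc_decode(q, residual, shift=12):
    """Rebuild the frame exactly from coefficients and residual."""
    x = []
    for e in residual:
        x.append(e + predict(x, q, shift))
    return x
```

The slicing in `lpc_encode` is quadratic in the frame length and is written for clarity only; a real encoder would keep a running history.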
4. Rice entropy coding for geometrically distributed data
Information theory is the theoretical foundation of data compression. Source coding theory addresses two main questions: (1) the theoretical limit of data compression and (2) the basic approaches to data compression. Information theory shows how to find the best data compression code, and the theoretical limit of data compression is the information entropy, which is the average information content of the source (a measure of uncertainty). Coding that loses no information, that is, coding that preserves the information entropy, is called entropy coding. Entropy coding is a class of lossless coding that compresses a data stream using its statistics, without regard to its semantics. It works from the probability distribution of the messages and removes the redundancy in the error signal without losing information. Commonly used entropy coding methods include run-length encoding (RLE), Shannon coding, Huffman coding and arithmetic coding. As a lossless source code, entropy coding here serves to remove the redundant information in the prediction error signal without any loss of data. Because the residual signal follows a geometric distribution, Rice coding is used to encode it.
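As a small illustration of the entropy limit mentioned above, the empirical entropy of a block of residual values can be estimated from symbol frequencies. This is a generic sketch of the Shannon entropy formula rather than anything specific to the patent; the function name is illustrative, and it treats the values as independent symbols.

```python
from collections import Counter
from math import log2


def empirical_entropy(symbols):
    """Shannon entropy in bits per symbol, estimated from relative frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * log2(c / total) for c in counts.values())
```

Multiplying the result by the number of symbols gives a lower bound on the size, in bits, of any code that treats the symbols independently.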
Rice coding is a Huffman code for a source with a Laplace distribution and has a single parameter k. In fact, the prediction error signals produced by intra-channel decorrelation all approximate a Laplace probability density. A Rice codeword has three parts: (1) a sign bit, (2) a k-bit low-order part, and (3) the remaining high-order bits. The first part gives the sign of e[n]. The second part holds the k least significant bits of the binary representation of |e[n]|. The third part consists of N consecutive zeros, where N is the value represented by the remaining high-order bits of |e[n]|, followed by a 1 as a separator.
To Rice-encode an integer n with parameter k, the codeword is built as follows:
(1) a sign bit (1 for positive, 0 for negative);
(2) $\lfloor |n| / 2^k \rfloor$ consecutive zeros;
(3) a separator bit 1;
(4) the k least significant bits of |n|.
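The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rice_encode` and `rice_decode` are hypothetical, bits are represented as a text string for readability, and zero is treated as positive.

```python
def rice_encode(n: int, k: int) -> str:
    """Encode a signed integer as: sign bit, unary zeros, separator 1, k low bits."""
    sign = "1" if n >= 0 else "0"            # step (1): 1 = positive, 0 = negative
    mag = abs(n)
    zeros = "0" * (mag >> k)                 # step (2): floor(|n| / 2^k) zeros
    low = format(mag & ((1 << k) - 1), f"0{k}b") if k > 0 else ""  # step (4)
    return sign + zeros + "1" + low          # step (3): the separator bit 1


def rice_decode(bits: str, k: int) -> int:
    """Invert rice_encode for a single codeword."""
    sign = 1 if bits[0] == "1" else -1
    q = bits.index("1", 1) - 1               # number of zeros before the separator
    start = q + 2                            # first bit after the separator
    low = int(bits[start:start + k], 2) if k > 0 else 0
    return sign * ((q << k) | low)
```

For example, with k = 2 the value 11 has quotient 2 and low bits 11, so the codeword is sign 1, zeros 00, separator 1, low bits 11. A larger k shortens the run of zeros for large magnitudes at the cost of k fixed bits per sample.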
We ran two sets of experiments comparing the lossless compression coding algorithm described here with two lossless coding formats, MPEG ALS (RM22) and FLAC.
In the first set of experiments, we selected thirteen different music styles for lossless audio compression, to show that the encoder achieves good compression performance on audio signals of different sound quality.
Comparison of compression results for audio files of different styles
In the second set of experiments, we selected a group of audio signals with different sampling rates and quantization precisions to test the proposed coding system, to show that the scheme performs well across combinations of sampling rate and quantization precision.
Comparison of compression results for audio files with different sampling rates and quantization precisions
These two sets of experiments show that the method of the invention achieves good compression on music files of different styles, and that on audio files with different sampling rates and quantization precisions it achieves results comparable to those of current mainstream lossless audio compression software.
Compared with the prior art, the positive effects of the present invention are:
1. The system adaptively frames the signal according to the correlation between neighboring frames, so that the samples within one frame are strongly correlated, which benefits the subsequent wavelet decomposition and LPC prediction.
2. The system uses an integer lifting wavelet transform, which avoids the errors caused by truncating the processed data and the filter coefficients.
3. The system takes the best compression ratio as its optimization objective and adaptively adjusts the number of integer lifting wavelet decomposition levels for each signal.
4. LPC prediction is applied to the wavelet-decomposed signals, which makes the decomposed signals more compact.
5. The residual signal is encoded with a Rice code, which suits a geometric distribution, compressing the signal further.
Description of the drawings
Figure 1: Block diagram of the encoder;
Figure 2: Block diagram of the decoder;
Figure 3: Framing strategy;
Figure 4: Lifting wavelet decomposition;
Figure 5: Lifting wavelet reconstruction;
Figure 6: Integer wavelet transform and its inverse;
Figure 7: Forward predictor, encoding;
Figure 8: Forward predictor, decoding.
Detailed description of the embodiments
The preferred embodiment of the invention is described in more detail below with reference to the accompanying drawings, showing how the technical solution of the invention is implemented:
The lossless audio encoder/decoder system according to the method of the invention comprises two parts, an encoder subsystem and a decoder subsystem. The block diagrams of the system are shown in Figures 1 and 2: Figure 1 is the block diagram of the lossless audio compression encoder subsystem, and Figure 2 is the block diagram of the lossless audio compression decoder subsystem. The system structure is described in detail below with reference to the drawings.
1. Overall scheme:
Encoder: The block diagram of the encoder is shown in Figure 1. The audio file is first divided into frames according to the framing strategy, and the framing information is transmitted as side information. Each frame is then processed separately. A wavelet transform first produces the approximation signal and the detail signals of the frame; the number of decomposition levels is chosen by an adaptive rule and is likewise written to the side information. The approximation and detail signals then pass through the linear prediction module. The residual signal produced by that module is entropy coded, and the LPC parameters are written to the side information. Finally, the individual streams are multiplexed into the final compressed bitstream.
Decoder: The block diagram of the decoder is shown in Figure 2. It is simply the inverse of the encoder. The side information is decoded first and the entropy-coding codebook is extracted from it; the LPC prediction residual is decoded; the approximation and detail signals of the wavelet decomposition are recovered using the LPC parameters; each frame is reconstructed according to the decomposition-level information; and finally the frames are concatenated in order according to the framing information, which recovers the original audio file losslessly.
2. Framing strategy:
Audio signals contain a considerable number of abrupt changes. If a fixed frame length is used for framing, the correlation among the samples within each frame suffers considerably, which lowers the compression ratio. This scheme uses the correlation coefficient of adjacent frames to merge highly correlated signal segments into one frame. This makes both the wavelet transform and the linear prediction more compact and yields higher compression efficiency. The scheme therefore adopts the framing strategy shown in Figure 3, so that signal segments with consistent characteristics are processed within a single frame.
First, taking the minimum frame length as the unit, the correlation coefficient between the current frame and the previous frame is computed. If the coefficient is below the threshold, the current frame is marked as uncorrelated with the previous frame and forms a frame of its own. If the coefficient is above the threshold, the current frame is considered correlated with the previous frame, and adjacent correlated frames are merged in sequence, provided that the merged frame length does not exceed the configured maximum frame length.
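The merging rule above can be illustrated with the following Python sketch. Several details are assumptions, since the text does not fix them: the correlation coefficient is taken to be the Pearson coefficient between two sample-aligned blocks of the minimum frame length, a constant block is given correlation 0, and the names `correlation` and `adaptive_frames` are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length blocks (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def adaptive_frames(signal, min_len, max_len, threshold):
    """Return (start, end) index pairs of variable-length frames."""
    blocks = [(i, min(i + min_len, len(signal)))
              for i in range(0, len(signal), min_len)]
    frames = []
    for start, end in blocks:
        if frames:
            f_start, f_end = frames[-1]
            prev = signal[f_end - min_len:f_end]   # last minimum-length block of the frame
            cur = signal[start:end]
            full = len(cur) == min_len             # a short tail block is never merged
            if (full and end - f_start <= max_len
                    and correlation(prev, cur) > threshold):
                frames[-1] = (f_start, end)        # correlated: extend the current frame
                continue
        frames.append((start, end))                # uncorrelated or too long: new frame
    return frames
```

The returned index pairs are exactly what would need to go into the side information so that the decoder can concatenate the frames in order.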
3. Integer lifting wavelet transform:
Lifting scheme: Sweldens et al. have shown that any existing wavelet transform can be implemented as a cascade of a finite number of lifting steps. In addition, the lifting framework can be used to implement integer wavelet transforms. Figure 4 shows the most basic lifting scheme for a wavelet transform, which consists of three basic steps: split, predict, and update.
Split: by downsampling, x[n] is separated into the odd sequence $x_o[n] = x[2n+1]$ and the even sequence $x_e[n] = x[2n]$. This split is also called the lazy wavelet transform.
Predict: the odd and even sequences are generally strongly correlated, so the even samples can be used to predict the odd samples (P is the prediction operator). This removes the redundancy between them and yields the high-frequency detail of the signal:
$$d[n] = x_o[n] - P(x_e[n]) \tag{1}$$
If the signal is locally smooth, the prediction residual is small.
Update: this step yields the low-frequency information of x[n], that is, the coarse outline of the signal, also called the approximation signal. The prediction residual d[n] is passed through the update operator U and added to the even sequence $x_e[n]$:
$$s[n] = x_e[n] + U(d[n]) \tag{2}$$
The above is the decomposition algorithm of the lifting scheme. The reconstruction algorithm follows from the properties of the lifting matrix and consists of three steps, undo update, undo predict, and merge, as shown in Figure 5.
Undo update:
$$x_e[n] = s[n] - U(d[n]) \tag{3}$$
Undo predict:
$$x_o[n] = d[n] + P(x_e[n]) \tag{4}$$
Merge:
$$x[n] = \mathrm{Merge}(x[2n+1], x[2n]) \tag{5}$$
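Equations (1) to (5) translate almost line for line into code. The sketch below assumes an even-length input and passes the operators P and U in as functions; the particular operators chosen at the end (predict each odd sample by its even neighbor, update with half the detail rounded down) are illustrative assumptions, not the operators of the invention.

```python
def lift_forward(x, P, U):
    """Split, predict, update: returns (s, d) as in equations (1) and (2)."""
    xe, xo = x[0::2], x[1::2]                  # lazy wavelet: even and odd samples
    d = [o - p for o, p in zip(xo, P(xe))]     # (1) detail = odd - P(even)
    s = [e + u for e, u in zip(xe, U(d))]      # (2) approximation = even + U(detail)
    return s, d


def lift_inverse(s, d, P, U):
    """Undo update, undo predict, merge: equations (3) to (5)."""
    xe = [a - u for a, u in zip(s, U(d))]      # (3)
    xo = [b + p for b, p in zip(d, P(xe))]     # (4)
    x = [0] * (len(xe) + len(xo))
    x[0::2], x[1::2] = xe, xo                  # (5) interleave even and odd samples
    return x


# Illustrative operators (assumed, Haar-like).
P = lambda xe: xe
U = lambda d: [v // 2 for v in d]
```

Because the inverse recomputes U(d) and P(x_e) from exactly the values the forward pass used, reconstruction is exact for any choice of P and U, which is the property the integer transform relies on.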
Different wavelet transforms are constructed by choosing different prediction and update operators. These operators can be constant coefficients (simple multiplication) or filter impulse responses (convolution). The prediction operator corresponds to the scaling function of the traditional wavelet transform: different prediction (interpolation) operators give different scaling functions. Likewise, the update operator corresponds to the wavelet function of the traditional wavelet transform.
With traditional transforms, whether the fast Fourier transform or the wavelet transform, an integer input signal yields floating-point transform coefficients. Rounding errors arise when a computer processes them, so lossless data compression cannot be achieved.
By contrast, when the wavelet transform is computed with the lifting scheme, a quantization operation can be inserted into the lifting process. Because the lifting matrix is fully invertible, an integer-to-integer wavelet transform is obtained. Specifically (Q is the quantization operator):
$$y(i) = x(i) + Q[\alpha\, x(j)], \qquad y(k) = x(k),\ k \neq i \tag{6}$$
$$x(i) = y(i) - Q[\alpha\, y(j)], \qquad x(k) = y(k),\ k \neq i \tag{7}$$
From these quantized lifting steps, implementations of the integer wavelet transform and its inverse are obtained, as shown in Figure 6. A quantization operation is added to every lifting step, which guarantees that the output of each step is an integer. First, any existing first-generation wavelet can be implemented as a cascade of a finite number of lifting steps. Second, four additional lifting steps can be used to make the gain factor equal to 1, so that the final outputs LP and BP are also integers. Because every step is a reversible integer transform, the inverse transform recovers the original signal exactly from LP and BP.
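A single quantized lifting step per equations (6) and (7) can be sketched as follows. The quantizer Q is assumed here to be rounding down (floor); the text only calls it a quantization operator. The function names are hypothetical, and i and j must be different positions.

```python
import math


def lift_step(x, i, j, alpha):
    """Equation (6): y(i) = x(i) + Q[alpha * x(j)]; every other sample is unchanged."""
    y = list(x)
    y[i] = x[i] + math.floor(alpha * x[j])    # Q taken to be floor (an assumption)
    return y


def unlift_step(y, i, j, alpha):
    """Equation (7): x(i) = y(i) - Q[alpha * y(j)]; exact because y(j) equals x(j)."""
    x = list(y)
    x[i] = y[i] - math.floor(alpha * y[j])
    return x
```

The step is reversible even though alpha is a real number: sample j is untouched, so the decoder recomputes the identical quantized term and subtracts it. Cascading such steps gives an integer-to-integer transform without any accumulated rounding error.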
From the viewpoint of multiresolution analysis or of band-pass filtering, wavelet decomposition is not limited to the single-level decomposition described above: the approximation signal of a one-level decomposition can be decomposed further to remove more of its correlation. However, because different signals are distributed differently in frequency, the number of decomposition levels affects the compression result. The invention decomposes the signal with several different numbers of levels, selects the number of levels that gives the highest compression ratio, and records the best number of levels in the side information.
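The level selection can be sketched as a simple search. Two substitutions are made here for brevity and are assumptions, not the method of the invention: an integer Haar step (the S-transform) stands in for the wavelet, and a crude bit-count proxy (`cost_bits`) stands in for the actual compressed size that the encoder would compare.

```python
def haar_int(x):
    """One level of integer Haar lifting (S-transform); len(x) must be even."""
    xe, xo = x[0::2], x[1::2]
    d = [o - e for o, e in zip(xo, xe)]
    s = [e + (v >> 1) for e, v in zip(xe, d)]
    return s, d


def decompose(x, levels):
    """Return [approximation, detail_L, ..., detail_1] after `levels` levels."""
    details = []
    s = list(x)
    for _ in range(levels):
        s, d = haar_int(s)                 # keep splitting the approximation signal
        details.insert(0, d)
    return [s] + details


def cost_bits(bands):
    """Size proxy: magnitude bits plus one sign bit per coefficient."""
    return sum(abs(v).bit_length() + 1 for band in bands for v in band)


def best_level(x, max_levels):
    """Try 0..max_levels levels and return (level, cost) with the smallest cost."""
    candidates = []
    for level in range(max_levels + 1):
        if len(x) % (1 << level):          # frame length must be divisible by 2^level
            break
        candidates.append((cost_bits(decompose(x, level)), level))
    cost, level = min(candidates)
    return level, cost
```

In a full encoder the proxy would be replaced by the size of the LPC-predicted, Rice-coded frame, and the chosen level would be written to the side information for the decoder.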
4. LPC prediction:
This scheme uses forward prediction: the value of a discrete-time signal at the current time is assumed to be predictable as a linear combination of the previous K values:
$$\hat{x}[n] = \sum_{k=1}^{K} a_k\, x[n-k] \tag{8}$$
where K is the order of the predictor. If the prediction is close to the current signal, the variance of the residual is closer to zero than that of the original signal, which effectively shortens the code. The structure of the forward predictor is shown in Figure 7, and the corresponding decoding process is shown in Figure 8.
The core problem of LPC prediction is obtaining the predictor coefficients from the input signal. Let the order of the predictor be p. The coefficients then satisfy the Yule-Walker equations, in which $r_x(i)$, $i = 0, 1, 2, \ldots, p$, are the sample autocorrelation values, which can be estimated from the p samples preceding the current sample in the predictor; $a_1, a_2, a_3, \ldots, a_p$ are the predictor coefficients; and $\sigma^2$ is the minimum error power of forward prediction. Given the autocorrelation matrix, the predictor coefficients sought are those that minimize the error power. This scheme solves for them with the classical Levinson-Durbin algorithm.
Let the reflection coefficient be $a_m(m) = k_m$. The Levinson-Durbin algorithm then gives the following recursion for the coefficients:
$$a_m(k) = a_{m-1}(k) - k_m\, a_{m-1}(m-k) \tag{11}$$
Because the algorithm obtains the coefficients of each order recursively, a suitable predictor order and its coefficients can be chosen by comparing orders. The prediction error power obeys $\rho_m = \rho_{m-1}(1 - k_m^2)$, and since $k_m^2$ is greater than 0, $\rho_m < \rho_{m-1}$ always holds; that is, the prediction error decreases order by order as the iteration proceeds.
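A compact Levinson-Durbin sketch consistent with recursion (11) is given below. The formula used for the reflection coefficient k_m is the standard one for the predictor sign convention that (11) implies; it is an assumption here because it does not survive in the text. The function names are hypothetical, and r[0] is assumed to be positive (a non-silent frame).

```python
def autocorr(x, p):
    """Biased sample autocorrelation r(0..p) of a sequence x."""
    n = len(x)
    return [sum(x[t] * x[t - i] for t in range(i, n)) / n for i in range(p + 1)]


def levinson_durbin(r, p):
    """Return (a, ks, rho): a[1..p] predictor coefficients (a[0] unused),
    reflection coefficients k_1..k_p, and error powers rho_0..rho_p."""
    a = [0.0] * (p + 1)
    ks, rho = [], [r[0]]
    for m in range(1, p + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        km = acc / rho[-1]                        # reflection coefficient (assumed form)
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] - km * prev[m - k]     # equation (11)
        a[m] = km                                 # a_m(m) = k_m
        ks.append(km)
        rho.append(rho[-1] * (1.0 - km * km))     # error power never increases
    return a, ks, rho
```

Since every intermediate order is produced along the way, an encoder can stop at the order whose error power no longer drops enough to pay for the extra coefficient. Note that the error power is strictly smaller only when k_m is nonzero; a zero reflection coefficient leaves it unchanged.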
Because the prediction coefficients computed by the optimality criterion above are floating-point numbers, the predictor reflection coefficients stored in the side information are quantized to fixed-point values. This gives up some optimality but guarantees that the signal can be reconstructed exactly.
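Why quantized coefficients preserve exact reconstruction can be seen in a short sketch. For brevity it quantizes the direct-form predictor coefficients rather than the reflection coefficients mentioned above, which is an assumption; the names `quantize`, `predict`, `lpc_encode`, and `lpc_decode` and the parameter `shift` are hypothetical.

```python
def quantize(coeffs, shift):
    """Round floating-point coefficients to integers scaled by 2**shift."""
    return [int(round(c * (1 << shift))) for c in coeffs]


def predict(history, q, shift):
    """Integer prediction from the last len(q) samples (newest sample last)."""
    acc = sum(c * history[-1 - i] for i, c in enumerate(q))
    return acc >> shift                    # same floor division in encoder and decoder


def lpc_encode(x, q, shift):
    """Return the residual; the first len(q) samples are stored unchanged."""
    p = len(q)
    res = list(x[:p])
    for n in range(p, len(x)):
        res.append(x[n] - predict(x[n - p:n], q, shift))
    return res


def lpc_decode(res, q, shift):
    """Rebuild the samples from the residual and the same integer coefficients."""
    p = len(q)
    x = list(res[:p])
    for n in range(p, len(res)):
        x.append(res[n] + predict(x[n - p:n], q, shift))
    return x
```

The encoder and decoder run the identical integer arithmetic on identical past samples, so the prediction matches bit for bit and adding the residual restores each sample exactly. Coarser quantization only makes the residual slightly larger; it never breaks losslessness.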
5. Entropy coding
Because the residual follows a definite distribution, entropy coding is generally applied to it. This scheme encodes the residual signal with a Rice code, which suits data whose distribution is approximately geometric. The distribution is:
$$\Pr\{r_\theta = p\} = (1-\theta)\,\theta^{p}, \qquad \theta \in (0,1) \tag{13}$$
However, Rice coding requires the data to be non-negative. To satisfy this requirement, the residual signal is first passed through a mapping that sends negative values to positive values.
A value p is first divided by M (in general $M = 2^n$), giving a quotient q and the corresponding remainder r, that is,
$$q = \lfloor p / M \rfloor \tag{15}$$
$$r = p - q \cdot M \tag{16}$$
The value is then encoded as follows: q bits of 1 represent the quotient; a single 0 bit marks the boundary between quotient and remainder; and $[\log_2 r]$ bits represent the remainder r.
A suitable value of M is determined by a formula in which the constant $c_1 = 0.97$.
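The coding of a mapped residual can be sketched as follows. Two points are assumptions: the signed-to-unsigned mapping is taken to be the common interleaving 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ..., and the remainder is written with a fixed width of n bits (where M = 2^n), since a fixed width is what lets a decoder know where the codeword ends. The function names are hypothetical.

```python
def to_unsigned(e):
    """Map a signed residual to a non-negative integer (assumed interleaving)."""
    return 2 * e if e >= 0 else -2 * e - 1


def to_signed(p):
    """Inverse of to_unsigned."""
    return p >> 1 if p % 2 == 0 else -((p + 1) >> 1)


def golomb_rice_encode(p, n):
    """q ones, one 0 separator, then the remainder in n bits, with M = 2**n."""
    q, r = p >> n, p & ((1 << n) - 1)       # quotient and remainder, cf. equation (16)
    rem = format(r, f"0{n}b") if n > 0 else ""
    return "1" * q + "0" + rem


def golomb_rice_decode(bits, n):
    """Invert golomb_rice_encode for a single codeword."""
    q = bits.index("0")                      # number of leading ones
    r = int(bits[q + 1:q + 1 + n], 2) if n > 0 else 0
    return (q << n) | r
```

For instance, with n = 2 the value 9 has quotient 2 and remainder 1 and is written as 11, 0, 01. Choosing M close to the mean magnitude of the mapped residuals keeps the unary part short, which is the role of the M selection rule mentioned above.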
Although specific embodiments and drawings of the invention have been disclosed for purposes of illustration, to help the reader understand and practice the invention, those skilled in the art will appreciate that various substitutions, changes, and modifications are possible without departing from the spirit and scope of the invention and the appended claims. The invention should therefore not be limited to what is disclosed in the preferred embodiment and the drawings.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201010281033XA CN101944362B (en) | 2010-09-14 | 2010-09-14 | Integer wavelet transform-based audio lossless compression encoding and decoding method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101944362A (en) | 2011-01-12 |
CN101944362B (en) | 2012-05-30 |
Family
ID=43436323
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201010281033XA Expired - Fee Related CN101944362B (en) | 2010-09-14 | 2010-09-14 | Integer wavelet transform-based audio lossless compression encoding and decoding method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101944362B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6094631A (en) * | 1998-07-09 | 2000-07-25 | Winbond Electronics Corp. | Method of signal compression |
CN1318904A (en) * | 2001-03-13 | 2001-10-24 | 北京阜国数字技术有限公司 | Practical sound coder based on wavelet conversion |
CN1354456A (en) * | 2001-12-21 | 2002-06-19 | 北京阜国数字技术有限公司 | Block effect eliminating method in wavelet voice frequency signal processing |
US6496797B1 (en) * | 1999-04-01 | 2002-12-17 | Lg Electronics Inc. | Apparatus and method of speech coding and decoding using multiple frames |
CN1424713A (en) * | 2003-01-14 | 2003-06-18 | 北京阜国数字技术有限公司 | High frequency coupled pseudo small wave 5-tracks audio encoding/decoding method |
CN1529246A (en) * | 2003-09-28 | 2004-09-15 | 王向阳 | Digital audio-frequency water-print inlaying and detecting method based on auditory characteristic and integer lift ripple |
CN1920950A (en) * | 2006-09-25 | 2007-02-28 | 北京理工大学 | Characteristic waveform decomposition and reconfiguration method based on Haar wavelet exaltation |
Also Published As
Publication number | Publication date |
---|---|
CN101944362B (en) | 2012-05-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120530 |