CN1747330A

CN1747330A - Coding method of symmetric reversible long code

Info

Publication number: CN1747330A
Application number: CN 200410073747
Authority: CN
Inventors: 罗忠; 王静; 霍俊彦; 常义林; 马林华
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2004-09-09
Filing date: 2004-09-09
Publication date: 2006-03-15
Anticipated expiration: 2024-09-09
Also published as: CN100499377C

Abstract

The present invention is a coding method of a symmetrical reversible long code, which includes: step A: according to the occurrence probability of each different symbol in the data, calculate the shortest average code word length that can be obtained theoretically, and construct all codes with the The codeword with the shortest average codeword length is used as the current candidate codeword; step B: judge whether the current candidate codeword satisfies the symmetric condition, select all candidate codewords satisfying the symmetric condition as the symmetric reversible long codeword, and construct Sub-codewords of all candidate codewords that do not satisfy the symmetry condition are used as new current candidate codewords; step C: judge whether N symmetric reversible long-code codewords have been obtained through the selection process of step B, if so , then end this process, otherwise, return to step B until N symmetric reversible long code words are generated; wherein, N is the number of types of different symbols appearing in the data to be encoded, which is a natural number.

Description

A Coding Method of Symmetric Reversible Inverted Length Code

技术领域technical field

本发明属于通信技术中的编码技术领域，尤其涉及一种对称型可逆变长码(RVLC)的编码方法。The invention belongs to the technical field of encoding in communication technology, and in particular relates to an encoding method of a symmetrical reversible reversible long code (RVLC).

背景技术Background technique

熵编码是通信技术中基本的信源编码技术。利用熵编码能够将信源从任意符号转变为二进制的形式，以利于通信发送；同时，熵编码利用熵的本质特性，还能充分消除数据中的冗余性，达到压缩数据，从而节省通信需要的时间或者带宽等资源的目的。在熵编码中，变长码是广泛使用的码字，在文本、数据、语音、图像和视频以及其它各种媒体编码中都有广泛的应用。例如，在静止图像和视频编码标准中，编码系统包括源图像数据输入、DCT编码器和压缩图像数据流的输出三个部分，其中，DCT编码器中包括正向DCT变换器、量化器以及熵编码三部分，还包括量化表以及熵编码表，通过熵编码以及熵编码表实现对源图像数据经过运动预测、DCT变换和量化后的结果进行熵编码，以一方面消除数据的冗余性，另一方面将源图像数据转变为易于通信传输的二进制形式。Entropy coding is the basic information source coding technology in communication technology. The use of entropy coding can change the information source from any symbol to binary form, which is convenient for communication transmission; at the same time, entropy coding uses the essential characteristics of entropy, and can fully eliminate the redundancy in the data and achieve compressed data, thereby saving communication needs The purpose of resources such as time or bandwidth. In entropy coding, variable length codes are widely used codewords, and are widely used in text, data, voice, image and video, and other media coding. For example, in still image and video coding standards, the coding system includes three parts: source image data input, DCT coder and output of compressed image data stream, wherein, DCT coder includes forward DCT transformer, quantizer and entropy The three parts of coding also include quantization table and entropy coding table. Through entropy coding and entropy coding table, entropy coding is performed on the result of motion prediction, DCT transformation and quantization of source image data, so as to eliminate data redundancy on the one hand, On the other hand, the source image data is converted into a binary form that is easy to communicate.

在熵编码技术中，哈夫曼编码是一种经典的熵编码方法，采用该方法，能够在平均码字长度上达到最优，因此就传输效率而言是最高的，从而，在早期的熵编码应用中，广泛采用此种哈夫曼编码方法。但是，随着变长编码的发展，人们逐渐认识到：除了编码传输效率之外，还有很多其它因素决定具体应用中变长编码的优劣，其中的一个重要因素是：由于变长码本身的特性，对误码非常敏感。基于此点分析，当变长码在误码率较高的信道上传输时，传输效率会由于其对误码非常敏感而显著降低，从而不利于实现通信。由于哈夫曼码自身的抗误码性能并不突出，因此，可能会由于自身对误码的敏感而降低传输效率。In the entropy coding technology, Huffman coding is a classic entropy coding method. With this method, the average code word length can be optimized, so it is the highest in terms of transmission efficiency. Therefore, in the early entropy In coding applications, this Huffman coding method is widely used. However, with the development of variable-length codes, people gradually realized that besides the coding transmission efficiency, there are many other factors that determine the pros and cons of variable-length codes in specific applications. One of the important factors is: due to the variable-length code itself characteristics and is very sensitive to bit errors. Based on this analysis, when the variable-length code is transmitted on a channel with a high bit error rate, the transmission efficiency will be significantly reduced due to its sensitivity to bit errors, which is not conducive to the realization of communication. Since the anti-error performance of the Huffman code itself is not outstanding, the transmission efficiency may be reduced due to its own sensitivity to bit errors.

为了不使传输效率因为变长码本身对误码的敏感性而显著降低，当前提出了可逆变长码。该可逆变长码具有出色的抗误码性能，因此已经被近期的视频标准，如H.263+、MPEG-4等所采纳。可逆变长码指从前向和后向两个方向都能够即时解码的变长编码；其中，能够即时解码指的是一组码满足异字头条件(Prefix-free condition)，异字头条件指的是在一组码中，没有任何完整的码字是其它码字的前缀，例如一组即时码C＝(0，10，110，111)就是满足异字头条件的码字。由上述分析可以得到：可逆变长码也就是所有码字同时满足前向异字头条件和后向异字头条件的码字。根据码字自身是否存在对称性，可逆变长码还可分为对称型可逆变长码和非对称型可逆变长码，对称型可逆变长码和同步技术结合使用可取得很好的译码性能。对于可逆变长码，其工作原理为：In order not to reduce the transmission efficiency significantly due to the sensitivity of variable-length codes to bit errors, variable variable-length codes are currently proposed. The variable variable length code has excellent anti-error performance, so it has been adopted by recent video standards, such as H.263+, MPEG-4 and so on. Reversible variable length codes refer to variable-length codes that can be decoded in real time from both forward and backward directions; among them, being able to be decoded in real time refers to a group of codes that meet the prefix-free condition, and the prefix-free condition It means that in a group of codes, no complete codeword is the prefix of other codewords, for example, a group of instant codes C=(0, 10, 110, 111) is exactly a codeword satisfying the different prefix condition. From the above analysis, it can be obtained that a reversible long code is a codeword in which all codewords satisfy both the forward different prefix condition and the backward different prefix condition. According to whether there is symmetry in the code word itself, reversible long codes can also be divided into symmetric reversible long codes and asymmetric reversible long codes. The combination of symmetric reversible long codes and synchronization technology can achieve a lot Good decoding performance. For variable variable length codes, its working principle is:

在各种需要压缩的数据中，以熵来度量这些数据中所含有的不可压缩部分的信息量；同时，在这些数据中，同一个符号按照不同的发生概率反复出现，从而形成了这些压缩数据中的冗余。通过采用可逆变长码这种熵编码方式，压缩掉数据中的冗余而保持数据的熵不变，通过用构造得到的可逆变长码替换数据中的符号本身，一方面将数据中的符号转变为适合传输的二进制形式，另一方面，达到压缩数据长度的目的；其中，此种压缩是遵循数据本身的统计模型进行的，该统计模型是描述数据中各个符号发生概率分布上所具有特性的模型，常见的有指数分布、泊松分布等，由于此种压缩基于这些统计模型进行，因此，在统计意义上能够达到最优。Among all kinds of data that needs to be compressed, entropy is used to measure the amount of information contained in the incompressible part of these data; at the same time, in these data, the same symbol appears repeatedly according to different occurrence probabilities, thus forming these compressed data Redundancy in . By adopting the entropy encoding method of reversible long code, the redundancy in the data is compressed and the entropy of the data is kept unchanged. On the other hand, it achieves the purpose of compressing the data length; among them, this kind of compression is carried out in accordance with the statistical model of the data itself, which describes the occurrence probability distribution of each symbol in the data. Models with characteristics, such as exponential distribution and Poisson distribution, are common. Since this compression is based on these statistical models, it can be optimal in a statistical sense.

在现有技术中，为了能够根据数据中符号的发生概率来构造可逆变长码，通常采用如下两种方式，这两种方式均采用间接构造的方式，首先根据数据中符号的发生概率构造得到哈夫曼码，然后再由哈夫曼码得到对称型可逆变长码：In the prior art, in order to construct variable variable length codes according to the occurrence probability of symbols in the data, the following two methods are usually adopted, both of which adopt the indirect construction method. First, construct according to the occurrence probability of symbols in the data Get the Huffman code, and then get the symmetric reversible long code from the Huffman code:

方式一：method one:

在该种方式中，首先根据数据中符号的发生概率构造得到哈夫曼码，得到如图1所示的哈夫曼码树，这里所用的树，是离散数学，图论等理论中定义的一种抽象的图，或者理解成计算机算法中的数据结构，其实是一种倒的“树”，根节点在最上面，得到该树后，利用该哈夫曼码树，自树根起从上而下的将码字转变为对称型可逆变长码，得到图2所示的对称型可逆变长码的码树；在从哈夫曼码树转变为对称型可逆变长码的码树的过程中，需要在哈夫曼码树的每一层上均进行是否满足可逆变长码条件的判断，而该种方式并没有定义一种高效率的判断方式，从而，在考察一个候选码字时，首先要判断它是否满足对称条件，如果符合，则与已选的码字进行比较，以确定是否满足前向和后向异字头条件，在这些判断均符合的情况下，方能把该候选码字确定为可逆变长码中的码字；可见，该种方式具有如下缺点：In this method, first construct the Huffman code according to the occurrence probability of the symbols in the data, and obtain the Huffman code tree shown in Figure 1. The tree used here is defined in discrete mathematics, graph theory and other theories An abstract graph, or understood as a data structure in a computer algorithm, is actually an inverted "tree" with the root node at the top. After obtaining the tree, use the Huffman code tree to start from the root of the tree. Convert the code word into a symmetric reversible long code from top to bottom, and obtain the code tree of the symmetric reversible long code shown in Figure 2; In the process of the code tree, it is necessary to judge whether the reversible long code condition is satisfied on each layer of the Huffman code tree, but this method does not define a high-efficiency judgment method, so, in When inspecting a candidate codeword, it is first necessary to judge whether it satisfies the symmetric condition, and if so, compare it with the selected codeword to determine whether it satisfies the forward and backward different prefix conditions. Only then can the candidate codeword be determined as the codeword in the reversible variable length code; it can be seen that this method has the following disadvantages:

(1)由于该种方式并没有给判断方式以高效的定义，因此，上述判断过程十分复杂，会大大体高计算的复杂度，不利于实际应用；(1) Since this method does not give an efficient definition of the judgment method, the above judgment process is very complicated, which will greatly increase the complexity of calculation, which is not conducive to practical application;

(2)由于此种方式中所涉及的算法自身的限制，在对哈夫曼码树中一层节点进行判断时，不可避免地会造成某些符合可逆变长码条件的节点被错误地遗漏，从而造成该种方式无法充分利用短码字，造成短码字的浪费，其直接后果就是使得平均码字长度上升，与熵编码应该优先使用短码获得最小平均码字长度这一根本原则相背离；(2) Due to the limitations of the algorithm itself involved in this method, when judging a layer of nodes in the Huffman code tree, it will inevitably cause some nodes that meet the conditions of the variable variable length code to be mistakenly Omission, resulting in the failure of this method to make full use of short codewords, resulting in waste of short codewords, the direct consequence is to increase the average codeword length, and entropy coding should give priority to using short codes to obtain the minimum average codeword length. deviate from

(3)对于具有相同发生概率的符号，如果构造的哈夫曼码不同，则得到的对称型可逆变长码也不相同，从而会造成编码效果的不统一；(3) For symbols with the same probability of occurrence, if the constructed Huffman codes are different, the obtained symmetric reversible long codes are also different, which will cause non-uniform coding effects;

方式二：Method 2:

该种方式对方式一进行了一些改进，对于相同发生概率的符号，即使构造的哈夫曼码不同，所得到的对称型可逆变长码仍然相同，并且，该种方式也可以获得较小的平均码字长度；但是，由于此种方式是对于方式一的改进，因此，仍然无法完全避免对于短码字的不充分使用，从而造成短码字的浪费，另外，该种方式仍然没有解决计算复杂度高这一问题。This method has made some improvements to method 1. For symbols with the same occurrence probability, even if the constructed Huffman codes are different, the obtained symmetric reversible long codes are still the same, and this method can also obtain smaller The average codeword length; however, since this method is an improvement over method 1, it is still impossible to completely avoid the insufficient use of short codewords, resulting in the waste of short codewords. In addition, this method still has no solution high computational complexity.

发明内容Contents of the invention

有鉴于此，本发明的主要目的在于提供一种对称型可逆变长码的编码方法，该方法直接根据数据中符号的发生概率构造得到对称型可逆变长码，该方法能够充分利用短码字，计算复杂度低，同时，该方法能够实现对于相同概率的符号得到同样的对称型可逆变长码。In view of this, the main purpose of the present invention is to provide a coding method of a symmetric reversible long code, which can be directly constructed according to the occurrence probability of symbols in the data to obtain a symmetric reversible long code, which can make full use of short The codeword has low computational complexity, and at the same time, the method can realize the same symmetric reversible long code for symbols with the same probability.

为实现上述目的，本发明提供了一种对称型可逆变长码的编码方法，该方法包括：In order to achieve the above object, the present invention provides a kind of encoding method of symmetrical reversible long code, and this method comprises:

步骤A：根据数据中各个符号的发生概率，计算得到理论上可以获得的最短平均码字长度，构造所有具有该最短平均码字长度的码字作为当前候选码字；Step A: Calculate the theoretically obtainable shortest average codeword length according to the occurrence probability of each symbol in the data, and construct all codewords with the shortest average codeword length as current candidate codewords;

步骤B：判断当前候选码字是否满足对称条件，将满足对称条件的所有候选码字选择作为对称型可逆变长码码字，和构造不满足对称性条件的所有候选码字的子码字作为当前候选码字；Step B: Determine whether the current candidate codeword satisfies the symmetry condition, select all candidate codewords that satisfy the symmetry condition as a symmetric reversible long codeword, and construct subcodewords of all candidate codewords that do not satisfy the symmetry condition as the current candidate codeword;

步骤C：判断是否已经通过步骤B的选择过程获得N个对称型可逆变长码码字，如果是，则结束本流程，否则，返回步骤B，直至获得N个对称型可逆变长码；其中，N为需要进行编码的数据中出现不同符号的种类数，为自然数。Step C: Judging whether N symmetric reversible long code words have been obtained through the selection process of step B, if yes, then end this process, otherwise, return to step B until N symmetric reversible long codes are obtained ; Wherein, N is the number of types of different symbols appearing in the data to be encoded, which is a natural number.

其中，步骤A所述计算得到理论上可以获得的最短平均码字长度包括：Wherein, the calculation in step A obtains the theoretically obtainable shortest average codeword length including:

其中，p(x)表示要进行编码的数据中各个符号所对应的发生概率，运算符号

表示不超过被运算数的最大整数。

Among them, p(x) represents the occurrence probability corresponding to each symbol in the data to be encoded, and the operation symbol

Represents the largest integer that does not exceed the operand.

其中，步骤A所述构造所有具有该最短平均码字长度的码字作为当前候选码字包括：Wherein, the construction described in step A has all the codewords with the shortest average codeword length as the current candidate codewords including:

利用全排列的方法，产生所有长度为所述最短平均码字长度的0、1序列，以这些序列作为所述当前候选码字。Using the full permutation method, all sequences of 0 and 1 whose length is the shortest average codeword length are generated, and these sequences are used as the current candidate codewords.

其中，步骤B包括：Wherein, step B includes:

步骤B1：创建一个队列，以步骤A所述最短平均码字长度作为该队列的起始码字长度，将所述当前候选码字放入该队列；Step B1: Create a queue, use the shortest average codeword length described in step A as the initial codeword length of the queue, and put the current candidate codeword into the queue;

步骤B2：对该队列中的所有当前候选码字分别判断是否满足对称条件，将满足对称条件的所有候选码字选择作为对称型可逆变长码码字，并将这些满足对称条件的候选码字从队列中删除，构造不满足对称条件的候选码字的子码字作为当前候选码字，在清空队列之后，将这些构造出的当前候选码字放入队列中。Step B2: Determine whether all the current candidate codewords in the queue meet the symmetric conditions, select all the candidate codewords that meet the symmetric conditions as symmetric reversible long code codewords, and select these candidate codewords that meet the symmetric conditions The word is deleted from the queue, and the sub-codewords of the candidate codewords that do not satisfy the symmetric condition are constructed as the current candidate codewords. After the queue is cleared, these constructed current candidate codewords are put into the queue.

其中，所述构造不满足对称性条件的所有候选码字的子码字包括：Wherein, the subcodewords of all candidate codewords that do not satisfy the symmetry condition include:

在所述候选码字的后面加上0或者1，形成所述候选码字的子码字。Add 0 or 1 behind the candidate codeword to form a sub-codeword of the candidate codeword.

可见，本发明具有如下有益效果：Visible, the present invention has following beneficial effect:

(1)该算法最大限度地利用了较短码长层中的所有的满足对称性和双向异字头条件的节点，减小了平均码字长度，尤其在一些特定的符号发生概率分布下，该算法得到的平均码字长度与理论熵非常接近，几乎相等，比如说指数分布；(1) The algorithm maximizes the use of all nodes in the shorter code length layer that satisfy the symmetry and bidirectional different header conditions, reducing the average code word length, especially under some specific symbol occurrence probability distributions, The average codeword length obtained by this algorithm is very close to the theoretical entropy, almost equal, such as exponential distribution;

(2)该码字生成时，无需再对其先进行哈夫曼编码，再构造对称码字，从而降低了实现复杂度；(2) When the codeword is generated, there is no need to perform Huffman coding on it first, and then construct a symmetrical codeword, thereby reducing the complexity of implementation;

(3)大大简化了码字选取机制：在目前已有的两种算法，实现起来复杂度很高，而采用本发明后，我们可以发现当某一节点被选为码字后，其子节点对应的码字将不再予以考虑，其余节点不满足对称条件时，将它的子节点对应的码字入队，依次进行判别，也就是说，队列内的所有候选码字都已经自动满足异字头条件，我们只需讨论其对称性即可。(3) The code word selection mechanism is greatly simplified: in the existing two algorithms at present, the complexity of implementation is very high, and after adopting the present invention, we can find that after a certain node is selected as a code word, its child nodes The corresponding codewords will no longer be considered. When the other nodes do not meet the symmetric conditions, the codewords corresponding to its child nodes will be put into the queue and judged in turn. That is to say, all the candidate codewords in the queue have automatically satisfied the different The prefix condition, we only need to discuss its symmetry.

附图说明Description of drawings

图1为哈夫曼码树的示意图；Fig. 1 is the schematic diagram of Huffman code tree;

图2为对称型可逆变长码的码树的示意图；Fig. 2 is the schematic diagram of the code tree of symmetric reversible long code;

图3为本发明的队列的示意图；Fig. 3 is the schematic diagram of queue of the present invention;

图4为实现本发明的流程图。Fig. 4 is a flowchart for realizing the present invention.

具体实施方式Detailed ways

本发明为一种对称型可逆变长码的编码方法，所述的可逆变长码为对称型可逆变长码，本发明方法直接根据数据中符号的发生概率构造得到对称型可逆变长码，该方法具有确定的选取可逆变长码的规则，其计算复杂度低。另外，由于该方法所采用的算法自身特性，因此能够充分利用短码字长度，从而符合熵编码应该优先使用短码字以获得最小平均码字长度这一根本原则，另外，对于相同发生概率的符号，该方法能够构造得到相同的可逆变长码。The present invention relates to a coding method of a symmetrical reversible long code. The reversible long code is a symmetrical reversible long code. The method of the present invention is directly constructed according to the occurrence probability of symbols in data to obtain a symmetrical reversible code. Variable length codes, the method has definite rules for selecting variable variable length codes, and its computational complexity is low. In addition, due to the inherent characteristics of the algorithm used in this method, it can make full use of the short codeword length, thus conforming to the fundamental principle that entropy coding should use short codewords first to obtain the minimum average codeword length. In addition, for the same occurrence probability symbol, this method can construct the same variable variable length code.

下面结合附图对本发明进行详细介绍。The present invention will be described in detail below in conjunction with the accompanying drawings.

在本发明中，引入队列这一概念，队列是数据结构中的术语，是一种操作受限的线性表，图3所示为一个队列的示意图，其中，a_i是队列中的一个元素，该队列只允许在队尾进行插入(入队)，在对头进行删除(出队)。In the present invention, introduce this concept of queue, queue is the terminology in the data structure, is the linear table of a kind of operation restriction, and Fig. 3 shows the schematic diagram of a queue, wherein, a _i is an element in the queue, The queue only allows insertion (entry) at the end of the queue and deletion (exit) at the head.

本发明首先根据数据中符号的发生概率，得到一个码树，该码树与图1所示的哈夫曼码树的形式类似。The present invention first obtains a code tree according to the occurrence probability of symbols in the data, and the code tree is similar to the Huffman code tree shown in FIG. 1 .

在本发明构造可逆变长码之前，首先需要确定要进行编码的数据中所出现的不同符号的种类数N，其中，N为自然数，然后，通过图4所示步骤实现构造可逆变长码：Before the present invention constructs the reversible length code, it is first necessary to determine the number N of different symbols appearing in the data to be coded, where N is a natural number, and then realize the construction of the reversible length code through the steps shown in Figure 4 code:

步骤400：Step 400:

构造一个队列，计算得到该队列的起始码字长度，其中，该队列用于完成构造可逆变长码，采用如下方法计算得到该队列的起始码字长度：Construct a queue and calculate the initial codeword length of the queue. The queue is used to complete the construction of the reversible long code. The initial codeword length of the queue is calculated by the following method:

由数据中各个符号发生概率的分布，计算得到理论上可以获得的最短平均码字长度，然后，以该最短平均码字长度作为该队列的起始码字长度；在本发明中，采用公式(1)计算最短平均码字长度：By the distribution of the occurrence probability of each symbol in the data, calculate the theoretically obtainable shortest average codeword length, then, use the shortest average codeword length as the initial codeword length of the formation; in the present invention, adopt the formula ( 1) Calculate the shortest average codeword length:

公式(1)：

运算符号

表示不超过被运算数的最大整数；Formula 1):

calculating signs

Indicates the largest integer not exceeding the operand;

该公式是一个简写的公式，每个信源符号对应着一个发生概率，用p(x)表示，x分别代表信源中的各个符号；该公式表示的是信息论中熵编码可以达到的最短码长，哈夫曼编码可以达到这个最优下限；任何其它的熵编码的平均码长都不会低于这个值。因此对于我们这里的可逆变长码，其平均码长一定高于这个值；因此将该值作为起始码长是合适的；This formula is a shorthand formula, each source symbol corresponds to a probability of occurrence, denoted by p(x), x represents each symbol in the source; this formula represents the shortest code that can be achieved by entropy coding in information theory Long, Huffman coding can reach this optimal lower limit; the average code length of any other entropy coding will not be lower than this value. Therefore, for our variable variable length code here, its average code length must be higher than this value; therefore, it is appropriate to use this value as the initial code length;

步骤401：构造所有具有步骤400所述队列的起始码字长度的码字，然后，将这些构造出的码字作为当前候选码字放入队列；其中，在本发明实施例中，采用如下方法实现本步骤中所述的构造码字：Step 401: Construct all codewords with the initial codeword length of the queue described in step 400, and then put these constructed codewords into the queue as current candidate codewords; wherein, in the embodiment of the present invention, the following The method implements the construction codeword described in this step:

设队列的起始码字长度为L_Initial，采用全排列的方法，在码字长度为L_Initial这一条件的限制下，对二进制编码0和1进行全排列，得到所有长度为L_Initial的0、1序列，从而构造得到所述的具有队列起始码字长度的码字；其中，不同的具有队列起始码字长度的码字共有2^LInitial个；Let the initial codeword length of the queue be L _Initial , adopt the method of full arrangement, under the condition that the codeword length is L _Initial , perform full arrangement on the binary code 0 and 1, and obtain all 0s whose length is L _Initial , 1 sequence, so as to construct the codeword with the length of the initial codeword of the queue; wherein, different codewords with the length of the initial codeword of the queue have 2 ^LInitial ;

步骤400～步骤401实现构造队列以及初始化队列的过程，从而得到一个以理论上可以获得的最短平均码字长度作为该队列的起始码字长度的队列；其中，参见后续步骤，因为队列过程是输出码字长度增大的过程，也就是说，本发明中先构造短码字，后构造长码字，因此，队列的初始化过程需要确定最短的码长度作为起始码字长度；在本发明中，将长度为L_Initial的所有候选码字入队，也就是将根据数据中符号发生概率所得到的码树中的第L_Initial。层上的所有节点均存入队列中，作为队列的初始状态，从而实现队列的初始化；Steps 400 to 401 realize the process of constructing and initializing the queue, so as to obtain a queue with the theoretically obtainable shortest average codeword length as the initial codeword length of the queue; wherein, refer to the subsequent steps, because the queue process is The process that the length of the output code word increases, that is to say, in the present invention, construct the short code word first, then construct the long code word, therefore, the initialization process of queue needs to determine the shortest code length as the initial code word length; In the present invention , enqueue all candidate codewords with a length of L _Initial , that is, the L _Initial in the code tree obtained according to the occurrence probability of symbols in the data. All nodes on the layer are stored in the queue as the initial state of the queue, thereby realizing the initialization of the queue;

步骤402～步骤405：对该队列中所有当前候选码字分别判断是否满足对称条件，将这些候选码字中所有满足对称条件的码字选择作为对称型可逆变长码字，并且，选择出这些候选码字中所有不满足对称条件的码字，构造得到这些不满足对称条件的码字的子码字，将原队列清空后，将这些新构造子码字作为当前候选码字放入该队列；Steps 402 to 405: judge whether all the current candidate codewords in the queue satisfy the symmetric condition, select all the codewords satisfying the symmetric condition among these candidate codewords as symmetric reversible long codewords, and select All the codewords that do not meet the symmetric conditions in these candidate codewords are constructed to obtain the sub-codewords of these codewords that do not meet the symmetric conditions. After the original queue is cleared, these newly constructed sub-codewords are put into the current candidate codewords as the current candidate codewords. queue;

其中，本发明所要选择的码字需要分别满足对称条件和异字头条件，在采用上述队列操作规则后，可以发现入队的候选码字都满足异字头条件，其原因在于：采用该队列出队入队方式，当且仅当一个候选码字的所有的父节点码字都不在队列中，该候选码字才能够被选入队，因此，该候选码字的任何前缀都不是队列中码字，满足异字头条件；因此，对于候选码字，无需判断是否满足异字头条件，只需要判断是否对称即可，从而大大简化了计算的复杂程度；Wherein, the codewords to be selected in the present invention need to satisfy the symmetric condition and the different prefix condition respectively. After adopting the above-mentioned queue operation rules, it can be found that the candidate codewords entering the team all satisfy the different prefix condition. List the way of enqueuing. If and only if all the parent codewords of a candidate codeword are not in the queue, the candidate codeword can be selected into the queue. Therefore, any prefix of the candidate codeword is not a queue The middle codeword satisfies the different prefix condition; therefore, for the candidate codeword, it is not necessary to judge whether the different prefix condition is satisfied, but only needs to judge whether it is symmetrical, thereby greatly simplifying the complexity of calculation;

步骤406：判断是否N种符号都已经分配到符合条件的码字，如果是，则结束本流程，构造得到对称型可逆变长码，否则，跳至步骤402，直至对所有符号都已经分配到符合条件的码字。Step 406: Judging whether all the N symbols have been allocated to qualified codewords, if so, then end the process and construct a symmetric reversible long code, otherwise, skip to step 402 until all symbols have been allocated to the qualified codeword.

以上所述仅为本发明的较佳实施例而已，并不用以限制本发明，凡在本发明的精神和原则之内，所作的任何修改、等同替换、改进等，均应包含在本发明的保护范围之内。The above descriptions are only preferred embodiments of the present invention, and are not intended to limit the present invention. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present invention shall be included in the scope of the present invention. within the scope of protection.

Claims

1, a kind of coding method of symmetric form reversible long code is characterized in that, this method comprises:

Steps A: be compressed the probability of happening of each symbol in the coded data as required, calculate the shortest average code length, construct all have this code word of short average code length as current candidate codewords;

Step B: judge whether current candidate codewords satisfies symmetric condition, all candidate codewords that satisfy symmetric condition are selected not satisfy the subcode word of all candidate codewords of Symmetry Condition as current candidate codewords as symmetric form reversible long code code word and structure;

Step C: judge whether to have obtained N symmetric form reversible long code code word by the selection course of step B, if, process ends then, otherwise, step B returned, until obtaining N symmetric form reversible long code code word; Wherein, N is natural number for carrying out occurring in the coded data species number of distinct symbols.

2, method according to claim 1 is characterized in that, the described the shortest average code length that can obtain in theory that calculates of steps A comprises:

Wherein, p (x) indicates to carry out the pairing probability of happening of each symbol in the data of compressed encoding; Oeprator

Expression is no more than the maximum integer of operand.

3, method according to claim 1 is characterized in that, the described structure of steps A all have this code word of short average code length comprise as current candidate codewords:

Utilize full method of arranging, produce all length and be 0,1 sequence of described the shortest average code length, with these sequences as described current candidate codewords.

4, method according to claim 1 is characterized in that, step B comprises:

Step B1: create a formation,, described current candidate codewords is put into this formation with the initial code word length of steps A the shortest described average code length as this formation;

Step B2: all the current candidate codewords in this formation are judged whether to satisfy symmetric condition respectively, all candidate codewords that satisfy symmetric condition are selected as symmetric form reversible long code code word, and these candidate codewords that satisfy symmetric condition are deleted from formation, the subcode word of constructing the candidate codewords that does not satisfy symmetric condition is as current candidate codewords, after emptying formation, these current candidate codewords that construct are put into formation.

According to claim 1 or 4 described methods, it is characterized in that 5, the subcode word that described structure does not satisfy all candidate codewords of Symmetry Condition comprises:

Add 0 or 1 in the back of described candidate codewords, form the subcode word of described candidate codewords.