HK1226572B

HK1226572B - Multiple sign bit hiding within a transform unit

Info

Publication number: HK1226572B
Application number: HK16114737.1A
Authority: HK
Inventors: 王竞; 余祥; 何大可
Original assignee: 威勒斯媒体国际有限公司
Priority date: 2012-01-20
Filing date: 2016-12-28
Publication date: 2021-04-09

Description

Multiple sign bit hiding within transform units

The present application is a divisional application of Chinese patent application No. 201310019680.7, filed on January 18, 2013, entitled "Multiple sign bit hiding within a transform unit".

Copyright Notice

A portion of the disclosure of this document and accompanying material contains material for which copyright is claimed. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office files or records, but otherwise reserves all other copyright rights whatsoever.

Technical Field

The present application relates generally to data compression, and more particularly to methods and apparatus for sign bit hiding when encoding and decoding residual video data.

Background Art

Data compression occurs in many contexts. It is very commonly used in communications and computer networking to store, transmit and reproduce information efficiently. It finds particular application in the encoding of images, audio and video. Video presents a considerable challenge for data compression because of the large amount of data required for each video frame and the speed at which encoding and decoding often need to occur. The current state of the art in video coding is the ITU-T H.264/AVC video coding standard. This standard defines a number of different profiles for different applications, including a main profile, a baseline profile, and so on. A next-generation video coding standard called "High Efficiency Video Coding (HEVC)" is currently under development through a joint initiative of MPEG-ITU.

Several standards exist for encoding and decoding images and video, including H.264, which use block-based coding processes. In these processes, an image or frame is divided into blocks, typically 4×4 or 8×8, and the blocks are spectrally transformed into coefficients, quantized, and entropy coded. In many cases, the transformed data is not actual pixel data, but rather residual data after a prediction operation. Prediction can be either intra-frame, i.e., block-to-block within a frame/image, or inter-frame, i.e., between frames (also known as motion prediction). MPEG-H is expected to have these features as well.

When performing a spectral transform on the residual data, many of these standards specify the use of a discrete cosine transform (DCT) or some variation thereof. The resulting DCT coefficients are then quantized using a quantizer to produce quantized transform domain coefficients or indices.

The block or matrix of quantized transform domain coefficients (sometimes called a "transform unit") is then entropy coded using a specific context model. In H.264/AVC and in current development work for MPEG-H, the quantized transform coefficients are encoded by: (a) encoding the last significant coefficient position, which indicates the position of the last non-zero coefficient in the transform unit; (b) encoding a significance map, which indicates the positions in the transform unit (other than the last significant coefficient position) that contain non-zero coefficients; (c) encoding the magnitudes of the non-zero coefficients; and (d) encoding the signs of the non-zero coefficients. Coding the quantized transform coefficients often accounts for 30-80% of the coded data in the bitstream.

Transform units are typically N×N. Common sizes include 4×4, 8×8, 16×16, and 32×32, but other sizes are possible, including non-square sizes such as 8×32 or 32×8 in some embodiments. The sign of each non-zero coefficient in the block is encoded using one sign bit per non-zero coefficient.

Summary of the Invention

This application describes methods and encoders/decoders for encoding and decoding video data using sign bit hiding. In some embodiments, the encoder and decoder may use multi-level significance maps to encode significant-coefficient flags. Using a parity technique, the sign bit of at least one coefficient may be hidden for each coefficient subset in a transform unit (TU). In some cases, the coefficient subsets correspond to the coefficient groups used in multi-level maps, such as those used in significance map encoding and decoding. In at least one case, multi-level maps are used with larger TUs, such as 16×16 and 32×32 TUs. In some cases, multi-level maps are used with 8×8 TUs, non-square TUs, and TUs of other sizes. The sign bit hiding technique may be applied to those coefficient subsets that contain more than a threshold number of non-zero coefficients. In some embodiments, the subset-based sign bit hiding technique may be used for TUs even if they do not use multi-level significance map encoding, particularly when the coding of the significant coefficients of the TU is implemented modularly for subsets of significant-coefficient flags.

In one aspect, the present application describes a method for decoding a bitstream of encoded video by reconstructing coefficients of a transform unit, the bitstream encoding two or more sets of sign bits for the transform unit, each set corresponding to a respective set of coefficients for the transform unit, wherein each sign bit indicates the sign of a corresponding non-zero coefficient within the respective set. The method includes: for each of the two or more sets of sign bits, summing the absolute values of the coefficients in the respective set corresponding to that set of sign bits to obtain a parity value; and assigning a sign to one of the coefficients in the respective set based on whether the parity value is even or odd.
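The decoding aspect above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `infer_hidden_signs`, the choice of the first non-zero coefficient as the one whose sign is hidden, and the convention that even parity means positive and odd parity means negative are all assumptions made for illustration, since the passage above does not fix them.

```python
def infer_hidden_signs(coefficient_sets):
    """Restore one hidden sign per coefficient set from the parity of the set.

    coefficient_sets: list of lists of ints. The coefficient whose sign is
    hidden is stored as a non-negative magnitude; all others are signed.
    """
    restored = []
    for coeffs in coefficient_sets:
        # Parity value: sum of the absolute values of the set, modulo 2.
        parity = sum(abs(c) for c in coeffs) % 2
        out = list(coeffs)
        # Assumption: the hidden sign belongs to the first non-zero coefficient.
        for i, c in enumerate(out):
            if c != 0:
                # Assumption: even parity -> positive, odd parity -> negative.
                out[i] = abs(c) if parity == 0 else -abs(c)
                break
        restored.append(out)
    return restored


if __name__ == "__main__":
    # Two coefficient sets of one transform unit (hypothetical values).
    print(infer_hidden_signs([[3, -1, 0, 2], [2, 0, -1]]))
```

In this example the first set sums to 6 (even), so its first coefficient stays positive; the second set sums to 3 (odd), so its first coefficient is reconstructed as negative.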

In another aspect, the present application describes a method for encoding a bitstream of video by encoding sign bits of coefficients of a transform unit. The method includes: for each of two or more sets of coefficients of the transform unit, summing the absolute values of the coefficients in that set to obtain a parity value; determining that the sign of one of the coefficients in that set does not correspond to the parity value; and adjusting the level of a coefficient in that set by one in order to change the parity value to correspond to the sign of said one of the coefficients.
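A hedged sketch of the encoding side follows. The function name `hide_sign`, the parity convention (even for positive, odd for negative), and the choice of the first non-zero coefficient as the hidden one are assumptions. The sketch also always adjusts the hidden coefficient itself, by increasing its magnitude by one; a real encoder would normally pick the coefficient and direction of the adjustment by a rate-distortion criterion, which is not modelled here.

```python
def hide_sign(coeffs):
    """Return a copy of one coefficient set whose parity encodes a hidden sign."""
    out = list(coeffs)
    # Assumption: the sign to hide is that of the first non-zero coefficient.
    idx = next((i for i, c in enumerate(out) if c != 0), None)
    if idx is None:
        return out  # all-zero set: there is no sign to hide
    parity = sum(abs(c) for c in out) % 2
    # Assumption: even parity (0) signals positive, odd parity (1) negative.
    wanted = 1 if out[idx] < 0 else 0
    if parity != wanted:
        # Simplification: grow the magnitude of the hidden coefficient by one.
        # This flips the parity without changing which coefficient comes first.
        out[idx] += 1 if out[idx] > 0 else -1
    return out


def hide_signs_in_sets(coefficient_sets):
    """Apply hide_sign to each of the coefficient sets of a transform unit."""
    return [hide_sign(s) for s in coefficient_sets]


if __name__ == "__main__":
    print(hide_signs_in_sets([[3, -1, 0, 2], [-2, 0, 2]]))
```

Increasing a magnitude is chosen here only because it can never turn the hidden coefficient into zero; it is not claimed to be the lowest-distortion adjustment.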

In yet another aspect, the present application describes encoders and decoders configured to perform such encoding and decoding methods.

In yet another aspect, the present application describes a non-transitory computer-readable medium storing computer-executable program instructions that, when executed, configure a processor to perform the described encoding and/or decoding methods.

Those skilled in the art will understand other aspects and features of the present application from the following description of examples, read in conjunction with the accompanying drawings.

Brief Description of the Drawings

Reference is now made, by way of example, to the accompanying drawings, which illustrate example embodiments of the present application, and in which:

FIG. 1 shows, in block diagram form, an encoder for encoding video;

FIG. 2 shows, in block diagram form, a decoder for decoding video;

FIG. 3 shows an example of a multi-level scan order for a 16×16 transform unit;

FIG. 4 shows an example 16×16 transform unit partitioned into coefficient groups numbered in reverse group-level scan order;

FIG. 5 shows an example of a transform unit in which groups of four coefficient groups are formed for sign bit hiding;

FIG. 6 shows another example of the grouping of coefficient groups for sign bit hiding;

FIG. 7 shows yet another example of the grouping of coefficient groups for sign bit hiding;

FIG. 8 shows an example of dynamically forming coefficient sets for sign bit hiding;

FIG. 9 shows, in flowchart form, an example process for sign bit hiding;

FIG. 10 shows a simplified block diagram of an example embodiment of an encoder; and

FIG. 11 shows a simplified block diagram of an example embodiment of a decoder.

Similar reference numerals may have been used in different figures to denote similar components.

Detailed Description

In the following description, some example embodiments are described with reference to the H.264 standard for video coding and/or the MPEG-H standard under development. Those skilled in the art will appreciate that this application is not limited to H.264 or MPEG-H but is applicable to other video coding/decoding standards, including possible future standards, multi-view coding standards, scalable video coding standards, and reconfigurable video coding standards.

In the following description, when referring to video or images, the terms frame, picture, slice, tile, and rectangular slice group are used somewhat interchangeably. Those skilled in the art will recognize that, in the case of the H.264 standard, a frame may contain one or more slices. They will also recognize that, depending on the particular requirements or terminology of the applicable image or video coding standard, some encoding/decoding operations are performed frame by frame, some slice by slice, some picture by picture, some tile by tile, and some rectangular slice group by rectangular slice group, as the case may be. In any particular embodiment, the applicable image or video coding standard may determine whether the operations described below are performed with respect to frames and/or slices and/or pictures and/or tiles and/or rectangular slice groups. Accordingly, in light of this disclosure, those skilled in the art will understand whether the particular operations or processes described herein, and particular references to frames, slices, pictures, tiles, or rectangular slice groups, apply to frames, slices, pictures, tiles, rectangular slice groups, or some or all of these, for a given embodiment. The same applies to transform units, coding units, groups of coding units, and so on, as will become apparent from the following description.

The present application describes example processes and apparatus for encoding and decoding the sign bits of the non-zero coefficients of a transform unit. The non-zero coefficients are identified by a significance map. A significance map is a block, matrix, group, or set of flags that maps to, or corresponds to, a transform unit or a defined unit of coefficients (e.g., several transform units, a portion of a transform unit, or a coding unit). Each flag indicates whether the corresponding position in the transform unit or the specified unit contains a non-zero coefficient. In existing standards, these flags may be referred to as significant-coefficient flags. In existing standards, there is one flag per coefficient in scan order from the DC coefficient to the last significant coefficient, and the flag is a bit that is 0 if the corresponding coefficient is 0 and is set to 1 if the corresponding coefficient is non-zero. The term "significance map" as used herein is intended to refer to a matrix or ordered set of significant-coefficient flags for a transform unit (as will be understood from the description below) or for a defined unit of coefficients (as will be clear from the context of the present application).
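As a small illustration of the definition above, the following Python sketch builds a significance map for a block of quantized coefficients. The function name `significance_map` and the sample block are hypothetical; the sketch only shows the flag rule (0 for a zero coefficient, 1 otherwise) and ignores scan order and the last-significant-coefficient position.

```python
def significance_map(block):
    """Return a matrix of significant-coefficient flags for a 2-D block.

    Each flag is 1 where the corresponding coefficient is non-zero, else 0.
    """
    return [[1 if c != 0 else 0 for c in row] for row in block]


if __name__ == "__main__":
    # Hypothetical 4x4 block of quantized transform domain coefficients.
    block = [
        [9, -2, 0, 0],
        [3, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    for row in significance_map(block):
        print(row)
```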

It will be understood from the following description that a multi-level encoding and decoding structure may be applied in particular situations, and that those situations may be determined from side information such as the video content type (normal video or graphics, as identified in a sequence, picture, or slice header). For example, two levels may be used for normal video, and three levels may be used for graphics (which are typically more sparse). Another possibility is to provide a flag in one of the sequence, picture, or slice headers indicating whether the structure has one, two, or three levels, thereby allowing the encoder the flexibility to choose the most appropriate structure for the present content. In another embodiment, the flag may represent a content type, which would be associated with the number of levels. For example, the content type "image" may feature three levels.

Reference is now made to FIG. 1, which shows, in block diagram form, an encoder 10 for encoding video. Reference is also made to FIG. 2, which shows a block diagram of a decoder 50 for decoding video. It will be appreciated that the encoder 10 and decoder 50 described herein may each be implemented on a dedicated or general-purpose computing device containing one or more processing units and memory. The operations performed by the encoder 10 or decoder 50 may be implemented, for example, by an application-specific integrated circuit or by stored program instructions executable by a general-purpose processor, as the case may be. The device may include additional software, including, for example, an operating system for controlling basic device functions. In light of the following description, those skilled in the art will recognize the range of devices and platforms within which the encoder 10 or decoder 50 may be implemented.

The encoder 10 receives a video source 12 and produces an encoded bitstream 14. The decoder 50 receives the encoded bitstream 14 and outputs decoded video frames 16. The encoder 10 and decoder 50 may be configured to operate in conformance with a number of video compression standards. For example, the encoder 10 and decoder 50 may be H.264/AVC compliant. In other embodiments, the encoder 10 and decoder 50 may conform to other video compression standards, including evolutions of the H.264/AVC standard such as MPEG-H.

The encoder 10 includes a spatial predictor 21, a coding mode selector 20, a transform processor 22, a quantizer 24, and an entropy encoder 26. Those skilled in the art will appreciate that the coding mode selector 20 determines the appropriate coding mode for the video source, for example whether the subject frame/slice is of I, P, or B type, and whether particular coding units (e.g., macroblocks, coding units, etc.) within the frame/slice are inter- or intra-coded. The transform processor 22 performs a transform on the spatial domain data. Specifically, the transform processor 22 applies a block-based transform to convert the spatial domain data into spectral components. For example, in many embodiments, a discrete cosine transform (DCT) is used. In some embodiments, other transforms, such as a discrete sine transform, may be used. Depending on the size of the macroblock or coding unit, the block-based transform is performed on a coding unit, macroblock, or sub-block basis. In the H.264 standard, for example, a typical 16×16 macroblock contains sixteen 4×4 transform blocks, and the DCT process is performed on the 4×4 blocks. In some cases, the transform blocks may be 8×8, meaning there are four transform blocks per macroblock. In yet other cases, the transform blocks may be of other sizes. In some cases, a 16×16 macroblock may comprise a non-overlapping combination of 4×4 and 8×8 transform blocks.

Applying a block-based transform to a block of pixel data results in a set of transform domain coefficients. In this context, a "set" is an ordered set in which the coefficients have coefficient positions. In some instances, the set of transform domain coefficients may be considered a "block" or matrix of coefficients. In the description herein, the phrases "set of transform domain coefficients" and "block of transform domain coefficients" are used interchangeably and refer to an ordered set of transform domain coefficients.

The quantizer 24 quantizes the set of transform domain coefficients. The entropy encoder 26 then encodes the quantized coefficients and associated information.

A block or matrix of quantized transform domain coefficients may be referred to herein as a "transform unit" (TU). In some cases, a TU may be non-square, such as a non-square quadrature transform (NSQT).

Intra-coded frames/slices (i.e., type I) are encoded without reference to other frames/slices. In other words, they do not employ temporal prediction. However, intra-coded frames do rely on spatial prediction within the frame/slice, as illustrated in FIG. 1 by the spatial predictor 21. That is, when encoding a particular block, the data in the block may be compared with the data of nearby pixels within blocks already encoded for that frame/slice. Using a prediction algorithm, the source data of the block may be converted into residual data. The transform processor 22 then encodes the residual data. H.264, for example, specifies nine spatial prediction modes for 4×4 transform blocks. In some embodiments, each of the nine modes may be used to process a block independently, and rate-distortion optimization is then used to select the best mode.

The H.264 standard also specifies the use of motion prediction/compensation to take advantage of temporal prediction. Accordingly, the encoder 10 has a feedback loop that includes a dequantizer 28, an inverse transform processor 30, and a deblocking processor 32. The deblocking processor 32 may include a deblocking processor and a filtering processor. These elements mirror the decoding process implemented by the decoder 50 to reproduce the frame/slice. A frame memory 34 is used to store the reproduced frames. In this manner, motion prediction is based on what the reconstructed frames will be at the decoder 50 and not on the original frames, which may differ from the reconstructed frames because of the lossy compression involved in encoding/decoding. A motion predictor 36 uses the frames/slices stored in the frame memory 34 as source frames/slices for comparison with the current frame in order to identify similar blocks. Accordingly, for macroblocks or coding units to which motion prediction is applied, the "source data" that the transform processor 22 encodes is the residual data that comes out of the motion prediction process. For example, it may include information regarding the reference frame, a spatial displacement or "motion vector", and residual pixel data representing the differences (if any) between the reference block and the current block. Information regarding the reference frame and/or motion vector may not be processed by the transform processor 22 and/or quantizer 24, but may instead be supplied to the entropy encoder 26 for encoding as part of the bitstream along with the quantized coefficients.

Those skilled in the art will appreciate the details and possible variations for implementing video encoders.

The decoder 50 includes an entropy decoder 52, a dequantizer 54, an inverse transform processor 56, a spatial compensator 57, and a deblocking processor 60. The deblocking processor 60 may include deblocking and filtering processors. A frame buffer 58 supplies reconstructed frames for use by a motion compensator 62 in applying motion compensation. The spatial compensator 57 represents the operation of recovering the video data for a particular intra-coded block from previously decoded blocks.

The entropy decoder 52 receives and decodes the bitstream 14 to recover the quantized coefficients. Side information may also be recovered during the entropy decoding process, some of which may be supplied to the motion compensation loop for use in motion compensation, if applicable. For example, the entropy decoder 52 may recover motion vectors and/or reference frame information for inter-coded macroblocks.

The dequantizer 54 then dequantizes the quantized coefficients to produce transform domain coefficients, which the inverse transform processor 56 then inverse transforms to recreate the "video data". It will be appreciated that, in some cases, such as for an intra-coded macroblock or coding unit, the recreated "video data" is residual data for use in spatial compensation relative to a previously decoded block within the frame. The spatial compensator 57 generates the video data from the residual data and pixel data from the previously decoded block. In other cases, such as for inter-coded macroblocks or coding units, the recreated "video data" from the inverse transform processor 56 is residual data for use in motion compensation relative to a reference block from a different frame. Both spatial and motion compensation may be referred to herein as "prediction operations".

The motion compensator 62 locates, within the frame buffer 58, the reference block specified for a particular inter-coded macroblock or coding unit. It does so based on the reference frame information and motion vector specified for that inter-coded macroblock or coding unit. The motion compensator 62 then supplies the reference block pixel data for combination with the residual data to arrive at the reconstructed video data for that coding unit/macroblock.

A deblocking/filtering process may then be applied to the reconstructed frame/slice, as indicated by the deblocking processor 60. After deblocking/filtering, the frame/slice is output as a decoded video frame 16, for example for display on a display device. It will be understood that a video playback machine, such as a computer, set-top box, DVD or Blu-ray player, and/or mobile handheld device, may buffer decoded frames in memory before displaying them on an output device.

It is expected that MPEG-H-compliant encoders and decoders will have many of these same or similar features.

Encoding and decoding of quantized transform domain coefficients

As noted above, entropy coding of a block or set of quantized transform domain coefficients includes encoding the significance map (e.g., the set of significant-coefficient flags) for that block or set. The significance map is a binary mapping of the block indicating the positions at which non-zero coefficients occur (from the DC position to the last significant coefficient position). The significance map may be converted into a vector according to a scan order (which may be vertical, horizontal, diagonal, zigzag, or any other scan order prescribed by the applicable coding standard). The scan is typically done in "reverse" order, i.e., starting with the last significant coefficient and working back through the significance map until the significant-coefficient flag at the upper-left corner [0, 0] is reached. In the present description, the term "scan order" is intended to mean the order in which flags, coefficients, or groups (as the case may be) are processed, and may include orders that are referred to colloquially as "reverse scan order".
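The conversion of a significance map into a vector in reverse scan order can be sketched as follows. The sketch assumes one particular diagonal scan (anti-diagonals visited from the upper-left corner outward, each traversed from its lower-left end to its upper-right end); the text allows other scan orders, and the function names `diagonal_scan` and `reverse_scan_flags` are hypothetical.

```python
def diagonal_scan(n):
    """Forward diagonal scan positions (row, col) for an n x n block.

    Assumption: each anti-diagonal runs from lower-left to upper-right.
    """
    order = []
    for d in range(2 * n - 1):
        for row in range(min(d, n - 1), max(0, d - n + 1) - 1, -1):
            order.append((row, d - row))
    return order


def reverse_scan_flags(block):
    """Significant-coefficient flags from the last significant coefficient
    back to position [0, 0], following the reverse of diagonal_scan."""
    order = diagonal_scan(len(block))
    flags = [1 if block[r][c] != 0 else 0 for r, c in order]
    if 1 not in flags:
        return []  # no significant coefficient in this block
    # Index, in forward scan order, of the last significant coefficient.
    last = len(flags) - 1 - flags[::-1].index(1)
    # Walk back from that position to the DC position.
    return flags[last::-1]


if __name__ == "__main__":
    print(reverse_scan_flags([[2, 1], [0, 0]]))
```

For the 2×2 example the forward scan visits (0,0), (1,0), (0,1), (1,1); the last significant coefficient is at (0,1), so the reverse-order vector covers three positions.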

Each significant-coefficient flag is then entropy coded using the applicable context-adaptive coding scheme. For example, in many applications, a context-adaptive binary arithmetic coding (CABAC) scheme may be used.

With 16×16 and 32×32 significance maps, the context for a significant-coefficient flag is (in most cases) based on the values of neighboring significant-coefficient flags. Among the contexts used for 16×16 and 32×32 significance maps, there are certain contexts dedicated to the bit position [0, 0] and (in some example implementations) to neighboring bit positions, but most of the significant-coefficient flags take one of four or five contexts that depend on the cumulative values of neighboring significant-coefficient flags. In these instances, determining the correct context for a significant-coefficient flag depends on determining and summing the values of the significant-coefficient flags at neighboring positions (typically five positions, but in some instances more or fewer).
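The neighbor-sum idea can be sketched in Python as below. The five neighbor offsets, the clamping of the sum to the range 0 to 4 (giving five contexts), and the name `context_index` are all assumptions made for illustration; the passage above does not specify which positions are used or how the sum maps to a context, and the dedicated contexts for position [0, 0] are not modelled.

```python
# Assumed neighbor offsets (row offset, column offset): positions to the
# right of and below the current flag, which a reverse scan has already coded.
NEIGHBOURS = [(0, 1), (0, 2), (1, 0), (2, 0), (1, 1)]


def context_index(sig_map, row, col):
    """Pick a context from the sum of neighboring significant-coefficient flags."""
    n_rows, n_cols = len(sig_map), len(sig_map[0])
    total = 0
    for d_row, d_col in NEIGHBOURS:
        r, c = row + d_row, col + d_col
        # Neighbors outside the block contribute nothing.
        if r < n_rows and c < n_cols:
            total += sig_map[r][c]
    # Assumption: five contexts, indexed 0..4, with the sum clamped at 4.
    return min(total, 4)


if __name__ == "__main__":
    sig_map = [
        [0, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    print(context_index(sig_map, 0, 1))
```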

The levels of those non-zero coefficients may then be encoded. In one example implementation, the levels may be encoded by first encoding a map of those non-zero coefficients whose absolute level is greater than one. Another map may then be encoded of those non-zero coefficients whose absolute level is greater than two. The value or level of any coefficient whose absolute value is greater than two is then encoded. In some cases, the value encoded may be the actual value minus three.
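The three-stage level coding described above can be sketched as follows. The function name `level_syntax` is hypothetical, and the sketch assumes that the greater-than-two map carries one flag only for coefficients already flagged as greater than one; it produces the values to be coded and does not perform any entropy coding.

```python
def level_syntax(levels):
    """Split coefficient levels into the three stages described in the text.

    Returns (greater_than_1 flags, greater_than_2 flags, remaining values),
    computed on absolute values of the non-zero coefficients in input order.
    """
    nonzero = [abs(v) for v in levels if v != 0]
    # One flag per non-zero coefficient: is its absolute level above 1?
    greater_than_1 = [1 if a > 1 else 0 for a in nonzero]
    # Assumption: one flag per coefficient already known to exceed 1.
    greater_than_2 = [1 if a > 2 else 0 for a in nonzero if a > 1]
    # Remaining value for coefficients above 2: absolute level minus three.
    remaining = [a - 3 for a in nonzero if a > 2]
    return greater_than_1, greater_than_2, remaining


if __name__ == "__main__":
    print(level_syntax([5, -1, 0, 2, -3]))
```

For the levels 5, -1, 0, 2, -3 the non-zero magnitudes are 5, 1, 2, 3, so only the magnitudes 5 and 3 reach the last stage, coded as 2 and 0.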

The signs of the non-zero coefficients are also encoded. Each non-zero coefficient has a sign bit indicating whether the level of that coefficient is negative or positive. A proposal has been made to hide the sign bit of the first coefficient in a transform unit: Clare, Gordon, et al., "Sign Data Hiding", JCTVC-G271, 7th Meeting, Geneva, 21-30 November, 2011. Under this proposal, the sign of the first coefficient in the transform unit is encoded by way of the parity of the sum of the quantized coefficients in the transform unit. If the parity does not correspond to the actual sign of the first coefficient, the encoder must adjust the level of one of the coefficients up or down by one in order to adjust the parity. RDOQ is to be used to determine which coefficient to adjust and in which direction.

Previous work has focused on the use of multi-level significance maps. Reference is now made to FIG. 3, which shows a 16×16 transform unit 100 with a multi-level diagonal scan order illustrated. The transform unit 100 is partitioned into sixteen contiguous 4×4 coefficient groups or "sets of significant-coefficient flags". Within each coefficient group, a diagonal scan order is applied within the group, rather than across the whole transform unit 100. The sets or coefficient groups themselves are processed in a scan order, which in this example implementation is also a diagonal scan order. It should be noted that the scan order in this example is illustrated as a "reverse" scan order; that is, the scan order is shown starting at the bottom-right coefficient group and progressing in a diagonal down-left direction toward the upper-left coefficient group. In some implementations, the same scan order may be defined in the other direction, that is, progressing in a diagonal up-right direction, and may be applied in "reverse" when used during encoding or decoding.

The use of multi-level significance maps involves the encoding of an L1 or higher-level significance map that indicates which coefficient groups may be expected to contain non-zero significant-coefficient flags and which coefficient groups contain all-zero significant-coefficient flags. The significant-coefficient flags of the coefficient groups that may be expected to contain non-zero flags are encoded, whereas the coefficient groups that contain all-zero significant-coefficient flags are not encoded (unless they are groups that are encoded because of a special-case exception under which they are presumed to contain at least one non-zero significant-coefficient flag). Each coefficient group has a significant-coefficient-group flag (unless a special case applies in which the flag of that coefficient group has a presumed value, such as the group containing the last significant coefficient, the upper-left group, etc.).

For both encoding and decoding, the use of multi-level significance maps facilitates modular processing of the residual data.

Larger TUs present an opportunity to hide multiple sign bits. A TU may be partitioned or divided into sets of non-zero coefficients, and a sign bit may be hidden for each set using the parity of the sum of the non-zero coefficients in that set. In one embodiment, the sets of non-zero coefficients may be made to correspond to the coefficient groups defined for multi-level significance maps.

A single threshold may be used to determine whether to hide a sign bit for a particular set of non-zero coefficients, irrespective of data type. In one example, the threshold test is based on the number of coefficients between the first non-zero coefficient and the last non-zero coefficient in the set; that is, whether the number of coefficients between the first and last non-zero coefficients of the set is at least a threshold number. In another example, the test may be based on whether the number of non-zero coefficients in the set is at least a threshold number. In yet another embodiment, the test may be based on whether the sum of the absolute values of the non-zero coefficients in the set exceeds a threshold. In yet a further embodiment, a combination of these tests may be applied; that is, there must be at least a minimum number of coefficients in the set and the cumulative absolute value of the coefficients must exceed a threshold. Variations of these threshold tests may also be used.
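The three threshold tests above can be sketched in Python as follows. All function names are hypothetical, and the sketch assumes that "the number of coefficients between the first and last non-zero coefficients" is measured as the difference of their scan positions; other counting conventions are possible.

```python
def nonzero_positions(coeffs):
    """Scan positions of the non-zero coefficients in a set."""
    return [i for i, v in enumerate(coeffs) if v != 0]

def eligible_by_span(coeffs, threshold):
    """Test 1: distance between first and last non-zero positions."""
    pos = nonzero_positions(coeffs)
    return bool(pos) and (pos[-1] - pos[0]) >= threshold

def eligible_by_count(coeffs, threshold):
    """Test 2: number of non-zero coefficients in the set."""
    return len(nonzero_positions(coeffs)) >= threshold

def eligible_by_magnitude(coeffs, threshold):
    """Test 3: sum of absolute values must exceed the threshold."""
    return sum(abs(v) for v in coeffs) > threshold

def eligible_combined(coeffs, min_count, min_magnitude):
    """Combination: enough coefficients and enough cumulative magnitude."""
    return (eligible_by_count(coeffs, min_count)
            and eligible_by_magnitude(coeffs, min_magnitude))
```

Whichever test is chosen, the encoder and decoder must evaluate it identically, since the decoder has to reach the same decision from the decoded magnitudes alone.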

Reference is now made to FIG. 4, which shows an example 16×16 transform unit 120. The transform unit 120 is partitioned into 4×4 coefficient groups, i.e., sixteen sets of coefficients. The coefficient groups are numbered 1, 2, 3, ... 16 in the order in which they are processed (e.g., reverse diagonal scan order).

In a first embodiment, each coefficient group is a set of coefficients for the purposes of sign bit hiding. That is, each coefficient group is tested against a threshold to determine whether the coefficient group is suitable for sign bit hiding. As noted above, the test may be that the coefficient group contains at least a minimum number of coefficients between the first non-zero coefficient and the last non-zero coefficient within that coefficient group.

In a second embodiment, the sets of coefficients for sign bit hiding are formed by grouping coefficient groups. FIG. 5 shows a 16×16 TU 140 on which is illustrated an example grouping of the coefficient groups into four sets of coefficients. In this example, each set of coefficients for the purposes of sign bit hiding contains four coefficient groups. The four coefficient groups in each set are consecutive groups in scan order. For example, the first set of coefficients 142 contains coefficient groups 16, 15, 14, and 13. The second set of coefficients 144 contains coefficient groups 12, 11, 10, and 9. The third set of coefficients 146 contains coefficient groups 8, 7, 6, and 5. Finally, the fourth set of coefficients 148 contains coefficient groups 4, 3, 2, and 1. In this embodiment, a sign bit may be hidden for each set of coefficients. That is, up to four sign bits may be hidden per TU 140.

For each set of coefficients 142, 144, 146, and 148, the number of coefficients between the first and last non-zero coefficients (or the number of non-zero coefficients, or the cumulative total value of those coefficients) is tested against the threshold to determine whether to hide a sign bit for that set. The parity of the sum of the absolute values of the coefficients in the set is the mechanism by which the sign bit is hidden. If the parity does not correspond to the sign to be hidden, the parity is changed by adjusting the level of one of the coefficients in the set.
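The parity mechanism can be illustrated with a short Python sketch. The function names are hypothetical, and the mapping of even parity to a positive sign and odd parity to a negative sign is an assumption borrowed from the decoding process described for FIG. 9; the opposite convention would work equally well if both sides agree on it.

```python
def sign_from_parity(coeff_set):
    """Sign implied by the set: +1 for an even sum of magnitudes, -1 for odd."""
    return 1 if sum(abs(v) for v in coeff_set) % 2 == 0 else -1

def needs_adjustment(coeff_set, hidden_coeff):
    """True when the encoder must change one level by 1 to fix the parity."""
    actual_sign = -1 if hidden_coeff < 0 else 1
    return sign_from_parity(coeff_set) != actual_sign
```

Because changing any single level by 1 flips the parity of the sum, exactly one adjustment always suffices when `needs_adjustment` returns true.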

FIG. 6 shows a third embodiment of sets of coefficients for sign bit hiding with a 16×16 TU 150. In this embodiment, the sets are again formed on the basis of coefficient groups, but the sets do not necessarily contain the same number of coefficients or coefficient groups. For example, five sets of coefficients are defined in this illustration. The first set 152 contains coefficient groups 1 to 6. The second set 154 contains the four coefficient groups 7, 8, 9, and 10. The third set 156 contains coefficient groups 11, 12, and 13. The fourth set 158 contains coefficient groups 14 and 15. The fifth set 159 contains only the upper-left coefficient group 16. It will be appreciated that this embodiment provides larger sets of coefficients in the areas of the transform unit 150 in which there are likely to be fewer non-zero coefficients, and smaller sets of coefficients in the areas of the transform unit 150 in which non-zero coefficients are typically more common. Note that the above embodiments may be applied to TU sizes of 32×32 or larger, as well as to 8×8 TUs, provided a coefficient group structure is applied to those TUs.

FIG. 7 illustrates a fourth embodiment, in which the sets of coefficients for sign bit hiding within an 8×8 transform unit 160 are formed without adhering to the coefficient group structure. An 8×8 transform unit may or may not be partitioned into coefficient groups for the purposes of significance map coding. In any case, in this embodiment a transform-unit-based diagonal scan is used to process the coefficients for sign bit coding and hiding. In this case, the sets of coefficients are formed so as to group consecutive coefficients in scan order. For example, in this illustration the transform unit 160 is grouped into four sets of coefficients, each containing 16 consecutive coefficients in scan order. These sets are labeled 162, 164, 166, and 168 in FIG. 7.

In yet another embodiment, the sets of coefficients may not adhere to the scan order. That is, each set may include some coefficients from higher-frequency positions in the transform unit and some coefficients from lower-frequency positions in the transform unit. The coefficients in such a set are not necessarily all adjacent in scan order.

FIG. 8 shows a fifth embodiment, in which the sets of coefficients for sign bit hiding within a 16×16 TU 170 are formed dynamically using the coefficient group structure and the scan order. In this embodiment, rather than defining fixed sets of coefficients based on the transform unit size and scan order, the encoder and decoder form the sets by following the scan order and tracking whichever quantity is measured against the threshold until the threshold is met. Once the threshold is met, a sign bit is hidden for the coefficient group that the encoder and decoder are then processing.

As an example, FIG. 8 illustrates a last significant coefficient within coefficient group [2, 2]. In scan order, the encoder and decoder then move to coefficient groups [1, 3], [3, 0], and [2, 1], in turn. The threshold is met while processing the coefficients in coefficient group [2, 1]. Accordingly, the sign bit of the last non-zero coefficient to be processed in reverse scan order in coefficient group [2, 1] is hidden in the parity of the cumulative absolute value of the coefficients from the last significant coefficient (LSC) through to and including all coefficients in the current coefficient group [2, 1]. The threshold test in this example may be based on there being at least a minimum number of non-zero coefficients, or on the absolute values of the coefficients exceeding some threshold. Reference numeral 174 indicates a sign bit hiding operation with respect to the "last" or upper-leftmost coefficient in a particular coefficient group.
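The dynamic set formation of the fifth embodiment can be sketched as follows. This is an illustration under stated assumptions, not the method of the disclosure: the function name `form_dynamic_sets` is hypothetical, the tracked quantity is assumed to be the running count of non-zero coefficients, and the counter is assumed to reset once a set is closed.

```python
def form_dynamic_sets(groups, threshold):
    """Form sign-hiding sets by walking coefficient groups in processing order.

    groups: list of coefficient groups (each a list of levels), starting with
    the group that contains the last significant coefficient.
    Returns a list of sets, each a list of group indices. A sign bit would be
    hidden in the final group of each returned set.
    """
    sets, current, count = [], [], 0
    for index, group in enumerate(groups):
        current.append(index)
        count += sum(1 for v in group if v != 0)
        if count >= threshold:
            # Threshold met while processing this group: close the set here.
            sets.append(current)
            current, count = [], 0
    # Trailing groups that never meet the threshold hide no sign bit.
    return sets
```

With four groups holding 1, 0, 1, and 2 non-zero coefficients and a threshold of 3, the threshold is met in the fourth group, mirroring the [2, 2], [1, 3], [3, 0], [2, 1] walk in the example above.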

In a sixth embodiment, sign bit hiding is performed on a coefficient group basis, and the criterion used to determine whether a coefficient group is suitable for sign bit hiding is adjusted dynamically according to previously decoded coefficient groups. As an example, if either the coefficient group immediately to its right or the coefficient group immediately below it has a non-zero coefficient, the current coefficient group is determined to be suitable for sign bit hiding as long as it contains a minimum of two non-zero coefficients. A coefficient group may also be determined to be suitable if it contains at least a minimum number of coefficients between the first non-zero coefficient and the last non-zero coefficient within that coefficient group.
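A minimal Python sketch of the neighbor-dependent criterion follows. The function name and parameters are hypothetical, and the span is again assumed to be the difference between the scan positions of the first and last non-zero coefficients.

```python
def group_is_eligible(group, right_has_nonzero, below_has_nonzero, min_span):
    """Decide whether a coefficient group is suitable for sign bit hiding.

    right_has_nonzero / below_has_nonzero: whether the previously decoded
    groups to the right of and below the current group hold a non-zero level.
    """
    positions = [i for i, v in enumerate(group) if v != 0]
    if not positions:
        return False  # nothing to hide a sign for
    # Relaxed rule when a neighboring group is non-zero: two non-zeros suffice.
    if (right_has_nonzero or below_has_nonzero) and len(positions) >= 2:
        return True
    # Otherwise fall back to the span test used in the other embodiments.
    return positions[-1] - positions[0] >= min_span
```

Because the right and lower neighbors precede the current group in reverse scan order, the decoder already knows their contents when it evaluates this test.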

It will be understood that in some of the foregoing embodiments, sign bit hiding in one coefficient group may be implemented on the basis of a parity value that depends on coefficients in another coefficient group. In other words, the sign value of a coefficient in one coefficient group may be hidden in the parity by way of a level change to a coefficient in another coefficient group.

Furthermore, it will be understood that in some of the foregoing embodiments, the sign bit hidden in a set of coefficients may come from a different syntax element, such as a motion vector difference flag (e.g., mvd_sign_flag).

At the encoder side, a decision is made as to which coefficient to adjust in order to hide a sign bit when the parity value does not correspond to the sign. Where the parity value needs to be changed, a coefficient level must be increased or decreased by 1 in order to change the parity.

In one embodiment, the first step in the process of adjusting a coefficient level is to determine a search range, i.e., a start position and an end position in scan order. The coefficients within that range are then evaluated and one is selected to be changed. In one exemplary embodiment, the search range may run from the first non-zero coefficient to the last coefficient in scan order.

With multi-level significance maps, the end position of the search range for a subset may be changed so as to make use of block-level information. Specifically, if a subset contains the very last non-zero coefficient in the whole TU (the so-called global last, or last significant coefficient), the search range may be established as running from the first non-zero coefficient to the last non-zero coefficient. For other subsets, the search range may be extended to run from the first non-zero coefficient to the end of the current sub-block.
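The choice of search range can be sketched as below. The function name `search_range` is hypothetical, and the sketch assumes the subset is given as a list of quantized levels indexed in scan order, with inclusive start and end positions.

```python
def search_range(subset, contains_global_last):
    """Return the inclusive (start, end) scan positions to search, or None.

    contains_global_last: True when this subset holds the last significant
    coefficient of the whole transform unit.
    """
    nonzero = [i for i, v in enumerate(subset) if v != 0]
    if not nonzero:
        return None  # no non-zero coefficient, so no sign bit to hide
    start = nonzero[0]
    if contains_global_last:
        # Do not search past the last significant coefficient.
        end = nonzero[-1]
    else:
        # Other subsets may use every position up to the end of the sub-block.
        end = len(subset) - 1
    return start, end
```

Stopping at the last non-zero coefficient in the subset that holds the global last avoids moving the position of the last significant coefficient, which would change other coded syntax.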

In one embodiment, the start position may be extended to conditionally include the unquantized coefficients that precede the first non-zero quantized coefficient. Specifically, all coefficients before quantization are considered. Unquantized coefficients that have the same sign as the sign to be hidden are included in the search. For the unquantized coefficients from position 0 up to the position of the first non-zero quantized coefficient, the cost of changing the quantized coefficient from 0 to 1 is evaluated and tested in the search.

Another issue in the process of adjusting a coefficient level is defining the cost calculation used to evaluate the impact of an adjustment. When computational complexity is a concern, the cost may be based on distortion without regard to rate, in which case the search aims to minimize distortion. On the other hand, when computational complexity is not a concern, the cost may include both rate and distortion, so that the rate-distortion cost is minimized.

If RDOQ is enabled, RDOQ may be used to adjust the level. In many cases, however, the computational complexity of RDOQ may be undesirable, and RDOQ may not be enabled. Accordingly, in some embodiments a simplified rate-distortion analysis may be applied at the encoder to implement sign bit hiding.

Each coefficient between the first non-zero coefficient in the set and the last non-zero coefficient in the set may be tested by roughly calculating the distortion caused by increasing the coefficient by 1 and the distortion caused by decreasing the coefficient by 1. In general terms, a coefficient value u has an actual value of u + δ. The distortion is given by (δq)². If the coefficient u is adjusted up by 1 to u + 1, the resulting increase in distortion may be estimated as:

q²(1 - 2δ)

If the coefficient u is adjusted down by 1 to u - 1, the resulting increase in distortion may be estimated as:

q²(1 + 2δ)
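Both estimates follow from expanding the squared reconstruction error. The short derivation below is a sketch that assumes the error of a level is its difference from the actual value u + δ, scaled by the quantization step size q; the notation D(·) is introduced here only for illustration.

```latex
\begin{align*}
D(u)          &= (\delta q)^2 = q^2\delta^2 \\
D(u+1)        &= \bigl((\delta - 1)\,q\bigr)^2 = q^2(\delta^2 - 2\delta + 1) \\
D(u-1)        &= \bigl((\delta + 1)\,q\bigr)^2 = q^2(\delta^2 + 2\delta + 1) \\
D(u+1) - D(u) &= q^2(1 - 2\delta) \\
D(u-1) - D(u) &= q^2(1 + 2\delta)
\end{align*}
```

The two expressions are therefore the change in distortion relative to leaving u unchanged, which is why an upward adjustment is cheapest for coefficients with large positive δ and a downward adjustment is cheapest for coefficients with negative δ.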

It will be appreciated that in the inter-coded case, with RDOQ off, the quantization distortion δ lies in the range [-1/6, +5/6]. In the case of intra-coded blocks, with RDOQ off, the quantization distortion δ lies in the range [-1/3, +2/3]. When RDOQ is on, the range of δ will vary. The above calculation of the distortion increase nevertheless remains valid, whatever the range of δ.

The encoder also makes a rough estimate of the rate cost for each coefficient using a set of logic rules, i.e., predetermined rate cost metrics. For example, in one embodiment the predetermined rate cost metrics may include:

u + 1 (u ≠ 0 and u ≠ -1) → 0.5 bits

u - 1 (u ≠ 0 and u ≠ +1) → -0.5 bits

u = 1 or -1 and changed to 0 → -1 - 0.5 - 0.5 bits

u = 0 and changed to 1 or -1 → 1 + 0.5 + 0.5 bits

where the cost of a sign flag is estimated at 1 bit, the cost of a significant-coefficient flag is estimated at 0.5 bits, and the cost increase from u to u + 1 is estimated at 0.5 bits.
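The four rules can be written as a small lookup function. This Python sketch applies the rules literally as listed; the function name `estimated_rate_change` and the constant names are hypothetical.

```python
SIGN_FLAG_BITS = 1.0         # estimated cost of a sign flag
SIGNIFICANCE_FLAG_BITS = 0.5  # estimated cost of a significant-coefficient flag
LEVEL_STEP_BITS = 0.5        # estimated cost increase from u to u + 1

def estimated_rate_change(u, step):
    """Estimated change in bits when level u is changed by step (+1 or -1)."""
    if step not in (1, -1):
        raise ValueError("step must be +1 or -1")
    full_cost = SIGN_FLAG_BITS + SIGNIFICANCE_FLAG_BITS + LEVEL_STEP_BITS
    if u == 0:
        # A zero becomes non-zero: sign flag, significance flag, level step.
        return full_cost
    if u + step == 0:
        # A level of 1 or -1 becomes zero: all three costs are saved.
        return -full_cost
    # Remaining cases follow the first two rules as listed.
    return LEVEL_STEP_BITS if step == 1 else -LEVEL_STEP_BITS
```

An encoder could combine this estimate with the distortion increase for each candidate and pick the adjustment with the lowest combined cost; how the two terms are weighted is not specified in this passage.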

Other rules or estimates may be used in other embodiments.

Reference is now made to FIG. 9, which shows an example process 200 for decoding video data with coefficient-group-based sign bit hiding. The process 200 is based on the second embodiment described above. After reviewing this description, those skilled in the art will appreciate alternatives and modifications to the process 200 that implement the other embodiments described.

In operation 202, a threshold value is set. In some embodiments, the threshold value may be predetermined or preconfigured within the decoder. In other embodiments, the value may be extracted from the bitstream of encoded video data. For example, the threshold value may be in the picture header, or elsewhere within the bitstream.

In operation 204, the decoder identifies the first non-zero position in the current coefficient group (i.e., set of coefficients) and the last non-zero position in the current coefficient group, in scan order. It then determines the number of coefficients, in scan order, between the first and last non-zero coefficients in the coefficient group.

In operation 206, the decoder decodes sign bits from the bitstream. A sign bit is decoded for every non-zero coefficient in the coefficient group except the upper-leftmost non-zero coefficient in the group (the last non-zero coefficient in reverse scan order). The sign bits are applied to their respective non-zero coefficients. For example, if the applicable convention is that a sign bit of 0 means positive and a sign bit of 1 means negative, then for every sign bit set to 1 the corresponding coefficient level is made negative.

In operation 208, the decoder assesses whether the number of coefficients between the first non-zero coefficient and the last non-zero coefficient in scan order in the coefficient group exceeds the threshold. If not, sign bit hiding was not used at the encoder, so in operation 210 the decoder decodes the sign bit of the upper-leftmost non-zero coefficient (the last in reverse scan order) and applies it to the coefficient level. If the threshold test is satisfied, then in operation 212 the decoder assesses whether the absolute value of the sum of the coefficients in the current coefficient group is even or odd (i.e., its parity). If even, the sign of the upper-leftmost non-zero coefficient is positive and the decoder need not adjust it. If odd, the sign of the upper-leftmost non-zero coefficient is negative, so in operation 214 the decoder makes that coefficient negative.

In operation 216, the decoder determines whether it has finished processing the coefficient groups. If so, the process 200 exits. Otherwise, the decoder moves to the next coefficient group in the group scan order in operation 218 and returns to operation 204.
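Operations 204 to 214 for a single coefficient group can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive decoder: the function name `reconstruct_group` is hypothetical, the group is given as absolute levels in forward scan order (index 0 is the upper-left position), the sign bits arrive in reverse scan order with 0 meaning positive and 1 meaning negative, and the threshold test is taken to be "span at least equal to the threshold".

```python
def reconstruct_group(abs_levels, sign_bits, threshold):
    """Apply explicit and hidden signs to one coefficient group."""
    bits = iter(sign_bits)
    levels = list(abs_levels)
    # Operation 204: locate the first and last non-zero positions.
    nonzero = [i for i, v in enumerate(abs_levels) if v != 0]
    if not nonzero:
        return levels
    first, last = nonzero[0], nonzero[-1]
    # Operation 206: explicit sign bits for all but the upper-leftmost
    # non-zero coefficient, consumed in reverse scan order.
    for i in reversed(nonzero[1:]):
        if next(bits) == 1:
            levels[i] = -levels[i]
    if last - first >= threshold:
        # Operations 212 and 214: infer the hidden sign from the parity.
        if sum(abs_levels) % 2 == 1:
            levels[first] = -levels[first]
    else:
        # Operation 210: no hiding was used, so read the sign bit explicitly.
        if next(bits) == 1:
            levels[first] = -levels[first]
    return levels
```

The parity of the sum of magnitudes is used here; it equals the parity of the absolute value of the signed sum, so either formulation gives the same inferred sign.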

In one embodiment, the size of a set of coefficients may be reduced to a single coefficient. That is, sign bit hiding may be performed on a single-coefficient basis. In this embodiment, each coefficient is tested to determine whether its sign information is to be hidden. One example test is to compare the magnitude of the coefficient level with a given threshold. The sign bit is hidden for coefficients having a level greater than the threshold; otherwise, conventional sign bit encoding and decoding is used.

To apply sign bit hiding in the single-coefficient case, the sign information is compared with the parity of the coefficient level. As an example, even parity may correspond to a positive sign and odd parity to a negative sign. If the level does not correspond to the sign, the encoder adjusts the level. It will be appreciated that this technique implies that, above the threshold, all negative levels are odd and all positive levels are even. In a sense, this may be regarded as in effect a modification of the quantization step size for coefficients whose magnitude is greater than the threshold.
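A minimal sketch of the single-coefficient case follows. The function names are hypothetical, levels are assumed to be non-zero, and the encoder side always adjusts the magnitude upward so that it stays above the threshold; a real encoder would presumably choose the direction by cost while keeping the magnitude above the threshold.

```python
def hide_sign_in_level(level, threshold):
    """Encoder side: return (magnitude, explicit_sign_bit or None)."""
    magnitude = abs(level)
    if magnitude <= threshold:
        # Conventional coding: the sign bit is sent explicitly.
        return magnitude, (1 if level < 0 else 0)
    wanted_parity = 1 if level < 0 else 0  # odd for negative, even for positive
    if magnitude % 2 != wanted_parity:
        magnitude += 1  # simplification: always adjust upward
    return magnitude, None

def recover_level(magnitude, sign_bit, threshold):
    """Decoder side: rebuild the signed level."""
    if magnitude <= threshold:
        return -magnitude if sign_bit == 1 else magnitude
    # Above the threshold the sign is implied by the parity of the level.
    return -magnitude if magnitude % 2 == 1 else magnitude
```

For example, a level of -4 with a threshold of 2 is sent as magnitude 5 with no sign bit, and the decoder recovers -5.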

Example syntax for implementing sign bit hiding is provided below. This example syntax is only one possible implementation. In this example, sign bit hiding is applied on a coefficient group basis, and the threshold test is based on the number of coefficients from the first non-zero coefficient in the coefficient group to the last non-zero coefficient in the coefficient group. A flag denoted sign_data_hiding is sent in the picture header to indicate whether sign bit hiding is enabled. If it is enabled, the picture header also contains the parameter tsig, which is the threshold value. The example syntax is set out below:

The following pseudo-code illustrates one example implementation of coefficient-group-based sign bit hiding:

Reference is now made to FIG. 10, which shows a simplified block diagram of an example embodiment of an encoder 900. The encoder 900 includes a processor 902, memory 904, and an encoding application 906. The encoding application 906 may include a computer program or application stored in memory 904 and containing instructions for configuring the processor 902 to perform the operations described herein. For example, the encoding application 906 may encode and output bitstreams encoded in accordance with the processes described herein. It will be understood that the encoding application 906 may be stored on a computer-readable medium, such as a compact disc, flash memory device, random access memory, hard drive, etc.

Reference is now also made to FIG. 11, which shows a simplified block diagram of an example embodiment of a decoder 1000. The decoder 1000 includes a processor 1002, memory 1004, and a decoding application 1006. The decoding application 1006 may include a computer program or application stored in memory 1004 and containing instructions for configuring the processor 1002 to perform the operations described herein. The decoding application 1006 may include an entropy decoder configured to reconstruct residuals based, at least in part, on reconstructing significant-coefficient flags, as described herein. It will be understood that the decoding application 1006 may be stored on a computer-readable medium, such as a compact disc, flash memory device, random access memory, hard drive, etc.

It will be appreciated that the decoder and/or encoder according to the present application may be implemented in a number of computing devices, including, without limitation, servers, suitably programmed general-purpose computers, audio/video encoding and playback devices, television set-top boxes, television broadcast equipment, and mobile devices. The decoder or encoder may be implemented by way of software containing instructions for configuring a processor to carry out the functions described herein. The software instructions may be stored on any suitable non-transitory computer-readable memory, including CDs, RAM, ROM, flash memory, etc.

It will be understood that the encoder described herein, and the modules, routines, processes, threads, or other software components implementing the described methods and processes for configuring the encoder, may be realized using standard computer programming techniques and languages. The present application is not limited to particular processors, computer languages, computer programming conventions, data structures, or other such implementation details. Those skilled in the art will recognize that the described processes may be implemented as part of computer-executable code stored in volatile or non-volatile memory, as part of an application-specific integrated circuit (ASIC), etc.

Certain adaptations and modifications of the described embodiments can be made. Therefore, the above-discussed embodiments are considered to be illustrative and not restrictive.

Claims

1. A method for decoding a bitstream of encoded video by reconstructing the coefficients of transform units, wherein the transform units are divided into non-overlapping coefficient groups, each non-overlapping coefficient group contains a corresponding set of coefficients, and each non-zero coefficient has a sign bit indicating whether the coefficient is positive or negative, the method comprising:

for each of at least two coefficient groups in the transform unit:

decoding the sign bits of all non-zero coefficients in the coefficient group except for one non-zero coefficient;

summing the absolute values of the coefficients in the coefficient group; and

assigning a sign to the non-zero coefficient in the coefficient group whose sign bit has not been decoded, based on whether the sum of the absolute values is even or odd.

2. The method according to claim 1, further comprising:

For each coefficient group in the transform unit, determine whether to decode the sign bits of all non-zero coefficients in that coefficient group, or whether that coefficient group is one of the at least two coefficient groups.

3. The method of claim 2, wherein determining for each coefficient group further comprises: determining whether the number of coefficients between the first non-zero coefficient and the last non-zero coefficient in the coefficient group in the scanning order exceeds a threshold.

4. The method of claim 2, wherein determining for each coefficient group further comprises: determining whether adjacent coefficient groups contain at least one non-zero coefficient.

5. The method according to claim 4, wherein the adjacent coefficient group includes any one of the coefficient group to the right of the current coefficient group and the coefficient group below the current coefficient group.

6. The method of claim 2, wherein determining comprises: for each coefficient group, if the coefficient group contains at least a threshold number of coefficients between a first non-zero coefficient and a last non-zero coefficient in the coefficient group in scan order within the coefficient group, and wherein the threshold number is determined based on whether adjacent coefficient groups contain at least one non-zero coefficient, then determining that the coefficient group is one of the at least two coefficient groups.

7. The method of claim 6, wherein the adjacent coefficient group includes any one of the coefficient group to the right of the current coefficient group and the coefficient group below the current coefficient group.

8. The method of claim 1, wherein each coefficient group is square.

9. The method of claim 1, wherein each coefficient group is 4×4.

10. A decoder for decoding a bitstream of encoded video by reconstructing the coefficients of transform units, wherein the transform units are divided into non-overlapping coefficient groups, each non-overlapping coefficient group containing a corresponding set of coefficients, and each non-zero coefficient has a sign bit indicating whether the coefficient is positive or negative, the decoder comprising:

A processor;

A memory; and

A decoding application stored in the memory and containing instructions which, when executed, cause the processor to perform the following operations:

For each of at least two sets of coefficients in the transform unit:

Summing the absolute values of the coefficients in the coefficient set; and

11. The decoder of claim 10, wherein the instructions, when executed, further cause the processor to perform the following operations:

12. The decoder of claim 11, wherein the instructions, when executed, further cause the processor to perform the following operations:

For each coefficient group, determining whether the coefficient group is one of the at least two coefficient groups based on whether the number of coefficients between the first non-zero coefficient and the last non-zero coefficient in the coefficient group in the scanning order exceeds a threshold.

13. The decoder of claim 11, wherein the instructions, when executed, further cause the processor to perform the following operations:

For each coefficient group, determining whether the coefficient group is one of the at least two coefficient groups based on whether adjacent coefficient groups contain at least one non-zero coefficient.

14. The decoder of claim 13, wherein the adjacent coefficient group includes any one of the coefficient group to the right of the current coefficient group and the coefficient group below the current coefficient group.

15. The decoder of claim 11, wherein the instructions, when executed, further cause the processor to perform the following operations:

For each coefficient group, determining whether the coefficient group is one of the at least two coefficient groups based on the following:

Whether the coefficient group contains at least a threshold number of coefficients between the first non-zero coefficient and the last non-zero coefficient in the coefficient group in the scanning order, wherein the threshold number is determined based on whether adjacent coefficient groups contain at least one non-zero coefficient.

16. The decoder of claim 15, wherein the adjacent coefficient group includes any one of the coefficient group to the right of the current coefficient group and the coefficient group below the current coefficient group.

17. The decoder of claim 10, wherein each coefficient group is square.

18. The decoder of claim 10, wherein each coefficient group is 4×4.

19. A non-transitory processor-readable medium storing processor-executable instructions, which, when executed, configure one or more processors to perform the method according to claim 1.
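The summing step recited in claim 10 can be illustrated with a short sketch. The mapping from the sum to a sign is an assumption here: the sketch follows the convention used by sign data hiding in HEVC, where an even sum of absolute values implies a positive hidden sign and an odd sum implies a negative one. The function names and the choice of which coefficient carries the hidden sign (`hidden_index`) are hypothetical.

```python
def infer_hidden_sign(levels):
    """Return +1 or -1 from the parity of the sum of absolute values.

    Assumed convention (as in HEVC sign data hiding): even -> +1, odd -> -1.
    """
    total = sum(abs(v) for v in levels)
    return 1 if total % 2 == 0 else -1


def apply_hidden_sign(levels, hidden_index):
    """Return a copy of `levels` in which the coefficient at `hidden_index`
    takes the sign inferred from the parity of the set's absolute sum."""
    sign = infer_hidden_sign(levels)
    out = list(levels)
    out[hidden_index] = sign * abs(out[hidden_index])
    return out
```

Because only magnitudes enter the sum, the decoder can evaluate the parity before the hidden sign is known.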