HK1201110B - Quantization parameter (qp) coding in video coding - Google Patents
Description
The present application claims priority to:
U.S. Provisional Application No. 61/639,015, filed April 26, 2012;
U.S. Provisional Application No. 61/643,821, filed May 7, 2012; and
U.S. Provisional Application No. 61/656,953, filed June 7, 2012, the entire contents of each of which are incorporated herein by reference.
Technical Field
This disclosure relates to video coding, and more particularly, to techniques for quantization parameter coding.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video teleconferencing devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the High Efficiency Video Coding (HEVC) standard currently under development, and extensions of these standards, to transmit, receive, and store digital video information more efficiently.
Video compression techniques include spatial prediction and/or temporal prediction to reduce or remove redundancy inherent to video sequences. For block-based video coding, a video frame or slice may be partitioned into blocks. A video frame may alternatively be referred to as a picture. Each block may be further partitioned. Blocks in an intra-coded (I) frame or slice are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same frame or slice. Blocks in an inter-coded (P or B) frame or slice may use spatial prediction with respect to reference samples of neighboring blocks in the same frame or slice, or temporal prediction with respect to reference samples in other reference frames. Spatial or temporal prediction results in a predictive block for the block to be coded. The residual data represents pixel differences between the original block to be coded and the predictive block.
An inter-coded block is encoded according to a motion vector that points to a block of reference samples that forms a predictive block and residual data that indicates the difference between the coded block and the predictive block. The intra-coded block is encoded according to an intra-coding mode and residual data. For further compression, the residual data may be transformed from the pixel domain to the transform domain, resulting in residual transform coefficients, which may then be quantized. The quantized transform coefficients, initially configured in a two-dimensional array, may be scanned in a particular order to generate a one-dimensional vector of transform coefficients for entropy coding.
Disclosure of Invention
This disclosure generally describes techniques for coding video data. In particular, this disclosure describes techniques for coding delta Quantization Parameter (dQP) values and identifying lossless coding modes in order to avoid potential encoder/decoder mismatch.
In one example, this disclosure describes a method of video decoding comprising: receiving delta quantization parameter (dQP) values for a current quantization block of video data, wherein the dQP values are received regardless of whether non-zero transform coefficients are present in the current quantization block; determining a Quantization Parameter (QP) value for the current quantized block based on the received dQP value and a QP predictor; and decoding the current quantized block using the determined QP value.
In another example, the video decoding method comprises: receiving a dQP value for the current quantization block of video data only if the QP predictor for the current quantization block has a value of 0; and inferring the dQP value to be 0 if the QP predictor for the current quantized block has a non-zero value and there are no non-zero transform coefficients in the current quantized block.
In another example, this disclosure describes a method of video encoding comprising: determining a QP value for a current quantized block of video data; determining a dQP value for the current quantized block based on the QP value and a QP predictor; signaling the dQP value, wherein the dQP value is signaled regardless of whether non-zero transform coefficients are present in the current quantized block; and encoding the current quantized block using the determined QP value.
In another example, the video encoding method comprises: signaling a dQP value for the current quantization block of video data only if the QP predictor for the current quantization block has a value of 0; and inferring the dQP value to be 0 if the QP predictor for the current quantized block has a non-zero value and there are no non-zero transform coefficients in the current quantized block.
This disclosure also describes a video encoder, a video decoder, apparatuses, devices, and computer-readable media storing instructions that may be configured to perform the techniques for coding dQP values described herein.
The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.
Drawings
Fig. 1 is a block diagram illustrating an example video encoding and decoding system.
Fig. 2 is a conceptual diagram illustrating an example quantization group.
Fig. 3 is a conceptual diagram illustrating an example quantization parameter prediction technique.
Fig. 4 is a conceptual diagram illustrating an example of delta quantization parameter signaling when the quantization parameter predictor is zero.
Fig. 5 is a conceptual diagram illustrating an example of delta quantization parameter signaling when a quantization parameter predictor is coded in lossless mode.
Fig. 6 is a block diagram illustrating an example video encoder.
Fig. 7 is a block diagram illustrating an example video decoder.
Fig. 8 is a flow diagram showing an example video encoding method in accordance with the techniques of this disclosure.
Fig. 9 is a flow diagram showing an example video decoding method in accordance with the techniques of this disclosure.
Detailed Description
This disclosure generally describes techniques for coding video data. In particular, this disclosure describes techniques for Quantization Parameter (QP) coding in video encoding and/or decoding processes.
According to some proposals for the HEVC standard, a video block may include a Largest Coding Unit (LCU), which itself may be subdivided into smaller Coding Units (CUs) according to a quadtree partitioning scheme, and possibly further partitioned into Prediction Units (PUs) for purposes of motion estimation and motion compensation. This disclosure describes techniques for coding a change (i.e., a delta) in the quantization parameter (i.e., a delta QP or dQP) for an LCU, CU, or quantization group (or some other block or group of blocks of sufficient size to support quantization changes). In this case, the delta QP may define a change in the QP for a quantization group relative to a predicted QP value for the LCU. For example, the predicted QP value for the LCU may be the QP of a previous quantization group (i.e., one previously coded in the bitstream). Alternatively, the predicted QP value may be determined based on rules. For example, the rules may identify one or more other QP values for other quantization groups, or an average QP value, that should be used as the predictor.
Current techniques for dQP signaling in the HEVC standard present certain problems when used in conjunction with the signaling and use of lossless coding modes. As will be discussed in more detail below, in certain situations encoder/decoder mismatches may occur, particularly when lossless coding is used. Thus, this disclosure presents techniques for dQP and lossless mode coding that avoid these encoder/decoder mismatches.
Fig. 1 is a block diagram illustrating an example video encoding and decoding system 10, which may be configured to utilize techniques for QP coding, according to an example of this disclosure. As shown in fig. 1, system 10 includes a source device 12 that transmits encoded video to a destination device 14 via a communication channel 16. The encoded video data may also be stored on storage medium 34 or file server 36 and may be accessed by destination device 14 when needed. When stored to storage medium 34 or file server 36, video encoder 20 may provide the coded video data to another device, such as a network interface, a Compact Disc (CD), Blu-ray disc, or Digital Video Disc (DVD) burner or stamping facility device, or other device, for storage of the coded video data to the storage medium. Likewise, a device separate from video decoder 30 (e.g., a network interface, a CD or DVD reader, or the like) may retrieve coded video data from a storage medium and provide the retrieved data to video decoder 30.
Source device 12 and destination device 14 may comprise any of a wide variety of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called smartphones, televisions, cameras, display devices, digital media players, video game consoles, or the like. In many cases, these devices may be equipped for wireless communication. Thus, communication channel 16 may comprise a wireless channel, a wired channel, or a combination of wireless and wired channels suitable for transmitting encoded video data. Similarly, the file server 36 may be accessed by the destination device 14 via any standard data connection, including an internet connection. Such a connection may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both, suitable for accessing encoded video data stored on a file server.
According to examples of this disclosure, techniques for QP coding may be applied to video coding to support any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (e.g., via the internet), encoding of digital video for storage on a data storage medium, decoding of digital video stored on a data storage medium, or other applications. In some examples, system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
In the example of fig. 1, source device 12 includes a video source 18, a video encoder 20, a modulator/demodulator 22, and a transmitter 24. In source device 12, video source 18 may include sources such as: a video capture device, e.g., a video camera; a video archive containing previously captured video; a video feed interface to receive video from a video content provider; and/or a computer graphics system for generating computer graphics data as a source video; or a combination of these sources. As one example, if video source 18 is a video camera, source device 12 and destination device 14 may form so-called camera phones or video phones, which may be provided, for example, within a smartphone or tablet computer. However, the techniques described in this disclosure may be applicable to video coding in general, and may be applied to wireless and/or wired applications, or applications in which encoded video data is stored on a local disk.
The captured, pre-captured, or computer generated video may be encoded by video encoder 20. The encoded video information may be modulated by modem 22 according to a communication standard, such as a wired or wireless communication protocol, and transmitted to destination device 14 via transmitter 24. Modem 22 may include various mixers, filters, amplifiers, or other components designed for signal modulation. Transmitter 24 may include circuitry designed for transmitting data, including amplifiers, filters, and one or more antennas in the case of wireless communication.
The captured, pre-captured, or computer-generated video encoded by video encoder 20 may also be stored onto storage medium 34 or file server 36 for later consumption. Storage medium 34 may include a blu-ray disc, DVD, CD-ROM, flash memory, or any other suitable digital storage medium for storing encoded video. The encoded video stored on storage medium 34 may then be accessed by destination device 14 for decoding and playback. Although not shown in fig. 1, in some examples, storage medium 34 and/or file server 36 may store the output of transmitter 24.
File server 36 may be any type of server capable of storing encoded video and transmitting that encoded video to destination device 14. Example file servers include web servers (e.g., for a website), FTP servers, Network Attached Storage (NAS) devices, local disk drives, or any other type of device capable of storing encoded video data and transmitting it to a destination device. The transmission of the encoded video data from the file server 36 may be a streaming transmission, a download transmission, or a combination of both. The file server 36 may be accessed by the destination device 14 via any standard data connection, including an internet connection. Such a connection may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, ethernet, USB, etc.), or a combination of the two, suitable for accessing encoded video data stored on a file server.
In the example of fig. 1, destination device 14 includes a receiver 26, a modem 28, a video decoder 30, and a display device 32. Receiver 26 of destination device 14 receives the information over channel 16, and modem 28 demodulates the information to generate a demodulated bitstream for video decoder 30. The information communicated over channel 16 may include a variety of syntax information generated by video encoder 20 for use by video decoder 30 in decoding the video data. This syntax may also be included with the encoded video data stored on storage medium 34 or file server 36. Each of video encoder 20 and video decoder 30 may form part of a respective encoder-decoder (CODEC) capable of encoding or decoding video data.
The display device 32 may be integrated with the destination device 14 or external to the destination device 14. In some examples, destination device 14 may include an integrated display device and also be configured to interface with an external display device. In other examples, destination device 14 may be a display device. In general, display device 32 displays the decoded video data to a user, and may comprise any of a variety of display devices, such as a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or another type of display device.
In the example of fig. 1, communication channel 16 may include any wireless or wired communication medium, such as a Radio Frequency (RF) spectrum or one or more physical transmission lines, or any combination of wireless and wired media. The communication channel 16 may form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet. Communication channel 16 generally represents any suitable communication medium, or collection of different communication media, including any suitable combination of wired or wireless media, for transmitting video data from source device 12 to destination device 14. Communication channel 16 may include a router, switch, base station, or any other apparatus that may be useful for facilitating communication from source device 12 to destination device 14.
Video encoder 20 and video decoder 30 may operate according to a video compression standard, such as the High Efficiency Video Coding (HEVC) standard currently under development by the Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Motion Picture Experts Group (MPEG). A draft of the HEVC standard, referred to as "HEVC Working Draft 7" or "WD7," is described in Bross et al., "High Efficiency Video Coding (HEVC) Text Specification Draft 7," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 9th Meeting: Geneva, Switzerland, 27 April to 7 May 2012, which as of 25 April 2013 is downloadable from http://phenix.it-sudparis.eu/jct/doc_end_user/documents/9_Geneva/wg11/JCTVC-I1003-v3.zip, the entire contents of which are incorporated herein by reference.
A more recent draft of the HEVC standard, referred to as "HEVC Working Draft 10" or "WD10," is described in document JCTVC-L1003_v34, Bross et al., "High Efficiency Video Coding (HEVC) Text Specification Draft 10," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 12th Meeting: Geneva, Switzerland, 14 to 23 January 2013, which as of 25 April 2013 is downloadable from http://phenix.int-evry.fr/jct/doc_end_user/documents/12_Geneva/wg11/JCTVC-L1003-v34.zip. HEVC WD10 is incorporated herein by reference in its entirety.
Although not shown in fig. 1, in some aspects, video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder, and may include appropriate MUX-DEMUX units or other hardware and software to handle encoding of both audio and video in a common data stream or separate data streams. Where applicable, in some examples, the MUX-DEMUX unit may conform to the ITU h.223 multiplexer protocol, or other protocols such as the User Datagram Protocol (UDP).
Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder circuits, such as one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), discrete logic, software, hardware, firmware, or any combination thereof. When the techniques are implemented in part in software, a device may store instructions for the software in a suitable non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in the respective device.
Video encoder 20 may implement any or all of the techniques of this disclosure for QP coding in a video coding process. Likewise, video decoder 30 may implement any or all of these techniques for QP coding in a video coding process. As described in this disclosure, a video coder may refer to a video encoder or a video decoder. Similarly, a video coding unit may refer to a video encoder or a video decoder. Likewise, video coding may refer to video encoding or video decoding.
In one example of this disclosure, video encoder 20 may be configured to: determine a Quantization Parameter (QP) value for a current quantized block; determine a delta quantization parameter (dQP) value for the current quantized block of video data based on the QP value and a QP predictor; signal the dQP value, wherein the dQP value is signaled regardless of whether non-zero transform coefficients are present in the current quantized block; and encode the current quantized block using the determined QP value.
In another example, video encoder 20 may be configured to: signal a dQP value for a current quantized block of video data only if a QP predictor for the current quantized block has a value of 0; and infer the dQP value to be 0 if the QP predictor for the current quantized block has a non-zero value and there are no non-zero transform coefficients in the current quantized block.
Likewise, video decoder 30 may be configured to: receive a dQP value for a current quantized block of video data, wherein the dQP value is received regardless of whether non-zero transform coefficients are present in the current quantized block; determine a QP value for the current quantized block based on the received dQP value and a QP predictor; and decode the current quantized block using the determined QP value.
In another example, video decoder 30 may be configured to: receive a dQP value for a current quantized block of video data only if a QP predictor for the current quantized block has a value of 0; and infer the dQP value to be 0 if the QP predictor for the current quantized block has a non-zero value and there are no non-zero transform coefficients in the current quantized block.
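The decoder-side behavior of this second example can be sketched as follows. This is an illustrative sketch, not normative decoder syntax; the function and parameter names (e.g., `decode_dqp`, `bitstream_dqp`) are assumptions introduced here, and `bitstream_dqp` simply stands in for the value that would be parsed from the bitstream when a dQP is actually present.

```python
def decode_dqp(qp_predictor, bitstream_dqp, has_nonzero_coeffs):
    """Reconstruct the current QP under the conditional-signaling rule:
    a dQP is read from the bitstream when the QP predictor is 0;
    when the predictor is non-zero and the block has no non-zero
    transform coefficients, dQP is inferred to be 0 (nothing parsed)."""
    if qp_predictor == 0:
        dqp = bitstream_dqp          # dQP explicitly received
    elif not has_nonzero_coeffs:
        dqp = 0                      # dQP inferred; nothing parsed
    else:
        dqp = bitstream_dqp          # ordinary signaling case
    return qp_predictor + dqp        # current QP = dQP + QP predictor
```

For example, with a non-zero predictor and an all-zero block, the decoder never consults the bitstream value and simply reuses the predictor as the current QP.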
Digital video devices implement video compression techniques to more efficiently encode and decode digital video information. Video compression may apply spatial (intra) prediction and/or temporal (inter) prediction techniques to reduce or remove redundancy inherent to video sequences.
JCT-VC is working on developing the HEVC standard, for example, as described by HEVC WD10 discussed above. HEVC standardization efforts are based on an evolution model of the video coding device, referred to as the HEVC test model (HM). The HM leverages a number of additional capabilities of the video coding device relative to existing devices in accordance with, for example, ITU-T h.264/AVC. For example, h.264 provides nine intra-prediction encoding modes, while HM may provide up to thirty-three intra-prediction encoding modes. The following sections will discuss certain aspects of the HM in more detail.
For video coding according to the HEVC standard currently under development, a video frame may be partitioned into coding units. A Coding Unit (CU) generally refers to an image area that serves as a basic unit to which various coding tools are applied for video compression. A CU typically has a luma component, denoted as Y, and two chroma components, denoted as U and V. Depending on the video sampling format, the size of the U and V components may be the same or different than the size of the Y component in terms of the number of samples.
CUs are typically square and may be considered similar to so-called macroblocks, e.g., for use in other video coding standards such as ITU-T h.264. Coding according to some of the presently suggested aspects of the HEVC standard in development will be described in this application for purposes of illustration. However, the techniques described in this disclosure may be useful for other video coding processes, such as those defined in accordance with h.264 or other standard or proprietary video coding processes.
According to the HM, a CU may include one or more Prediction Units (PUs) and/or one or more Transform Units (TUs). Syntax data within the bitstream may define a Largest Coding Unit (LCU), which is a largest CU in terms of the number of pixels. In general, a CU has a similar purpose to a macroblock of H.264, except that a CU does not have a size distinction. Thus, a CU may be split into several sub-CUs. In general, a reference to a CU in this disclosure may refer to a largest coding unit of a picture or a sub-CU of an LCU. An LCU may be split into sub-CUs, and each sub-CU may be further split into sub-CUs. Syntax data for a bitstream may define the maximum number of times an LCU may be split, referred to as CU depth. Accordingly, the bitstream may also define a smallest coding unit (SCU). This disclosure also uses the term "block" or "portion" to refer to any of a CU, PU, or TU. In general, a "portion" may refer to any subset of a video frame.
The LCU may be associated with a quadtree data structure. In general, a quadtree data structure includes one node per CU, where the root node corresponds to an LCU. If a CU is split into four sub-CUs, the node corresponding to the CU includes four leaf nodes, each of the four leaf nodes corresponding to one of the sub-CUs. Each node of the quadtree data structure may provide syntax data for the corresponding CU. For example, a node in a quadtree may include a split flag that indicates whether to split a CU corresponding to the node into sub-CUs. Syntax elements for a CU may be defined recursively and may depend on whether the CU is split into sub-CUs. If a CU is not split further, it is called a leaf CU.
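The quadtree relationships described above can be sketched with a minimal data structure. The `CUNode` class and helper functions are illustrative names introduced here, not part of any HEVC reference software; each node carries a split flag, and unsplit nodes correspond to leaf CUs.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CUNode:
    size: int                                    # CU width/height in pixels
    split: bool = False                          # split flag for this node
    children: List["CUNode"] = field(default_factory=list)

def split_cu(node: CUNode) -> CUNode:
    """Split a CU into four equally sized sub-CUs (one quadtree level)."""
    node.split = True
    node.children = [CUNode(size=node.size // 2) for _ in range(4)]
    return node

def leaf_cus(node: CUNode) -> List[CUNode]:
    """Collect the leaf CUs, i.e., nodes whose split flag is not set."""
    if not node.split:
        return [node]
    leaves = []
    for child in node.children:
        leaves.extend(leaf_cus(child))
    return leaves
```

Splitting a 64x64 LCU once, then splitting one of its 32x32 sub-CUs again, yields seven leaf CUs: four 16x16 CUs and three 32x32 CUs.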
A CU may include one or more Prediction Units (PUs). In general, a PU represents all or part of a corresponding CU, and may include data for retrieving a reference sample for the PU. For example, when a PU is inter-mode encoded, the PU may include data defining a motion vector for the PU. The data defining a motion vector may describe, for example, a horizontal component of the motion vector, a vertical component of the motion vector, a resolution for the motion vector (e.g., one-quarter pixel precision or one-eighth pixel precision), a reference frame to which the motion vector points, and/or a reference list for the motion vector (e.g., list 0 or list 1). The data for a CU that defines the PU(s) may also describe, for example, partitioning of the CU into one or more PUs. The partition mode may differ depending on whether the CU is skip or direct mode encoded, intra-prediction mode encoded, or inter-prediction mode encoded. For intra coding, a PU may be treated the same as a transform unit described below.
The emerging HEVC standard allows for transformation according to Transform Units (TUs), which may be different for different CUs. A TU is typically sized based on the size of a PU within a given CU defined for a partitioned LCU, although this may not always be the case. TUs are typically the same size as a PU, or smaller than a PU. In some examples, residual samples corresponding to a CU may be subdivided into smaller units using a quadtree structure referred to as a "residual quadtree" (RQT). The leaf nodes of the RQT may be referred to as Transform Units (TUs). The pixel difference values associated with TUs may be transformed to produce transform coefficients that may be quantized. A TU includes a luma transform block and two chroma transform blocks. Thus, in practice, any coding process discussed below that applies to a TU may be applied to luma transform blocks and chroma transform blocks.
In general, a PU refers to data related to a prediction process. For example, when a PU is intra-mode encoded, the PU may include data describing an intra-prediction mode for the PU. As another example, when a PU is inter-mode encoded, the PU may include data defining a motion vector for the PU.
In general, TUs are used for the transform process and the quantization process. A given CU with one or more PUs may also include one or more Transform Units (TUs). After prediction, video encoder 20 may calculate residual values from the video blocks identified by the coding node according to the PUs. The coding node is then updated to reference the residual values instead of the original video block. The residual values comprise pixel difference values that may be transformed into transform coefficients, quantized, and scanned using the transform and other transform information specified in the TU to generate serialized transform coefficients for entropy coding. The coding node may again be updated to reference these serialized transform coefficients. This disclosure generally uses the term "video block" to refer to a coding node of a CU. In some particular cases, this disclosure may also use the term "video block" to refer to treeblocks that include coding nodes as well as PUs and TUs, i.e., LCUs or CUs.
A video sequence typically comprises a series of video frames or pictures. A group of pictures (GOP) typically includes a series of one or more of the video pictures. The GOP may include syntax data describing the number of pictures included in the GOP in a header of the GOP, in a header of one or more of the pictures, or elsewhere. Each slice of a picture may include slice syntax data that describes an encoding mode for the respective slice. Video encoder 20 typically operates on video blocks within individual video slices in order to encode the video data. The video block may correspond to a coding node within a CU. Video blocks may have fixed or varying sizes, and may differ in size according to a specified coding standard.
To code a block (e.g., a prediction unit of video data), a predictor for the block is first derived. The predictor, also referred to as a predictive block, may be derived via intra (I) prediction (i.e., spatial prediction) or inter (P or B) prediction (i.e., temporal prediction). Thus, some prediction units may be intra-coded (I) using spatial prediction with respect to reference samples in neighboring reference blocks in the same frame (or slice), and other prediction units may be uni-directionally inter-coded (P) or bi-directionally inter-coded (B) with respect to reference sample blocks in other previously coded frames (or slices). In each case, the reference samples may be used to form a predictive block for the block to be coded.
Upon identifying the predictive block, differences between pixels in the original block of video data and pixels in its predictive block are determined. This difference may be referred to as prediction residual data and indicates a pixel difference between pixel values in the block to be coded and pixel values in a predictive block selected to represent the coded block. To achieve better compression, the prediction residual data may be transformed, for example, using a Discrete Cosine Transform (DCT), an integer transform, a Karhunen-Loeve (K-L) transform, or another transform to produce transform coefficients.
The residual data in a transform block (e.g., TU) may be configured as a two-dimensional (2D) array of pixel difference values residing in the spatial pixel domain. The transform converts the residual pixel values into a two-dimensional array of transform coefficients in a transform domain (e.g., the frequency domain).
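The pixel-difference computation described above can be sketched as follows; `residual_block` is an illustrative helper name introduced here, and the blocks are represented as plain 2-D lists of sample values for simplicity.

```python
def residual_block(original, predicted):
    """Pixel-wise prediction residual: the 2-D array of differences
    between the original block and its predictive block, which is
    what the transform is subsequently applied to."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]
```

A good predictor makes most residual values zero or near zero, which is what lets the subsequent transform and quantization compress the block effectively.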
Quantization may be applied to transform coefficients and generally involves a process that limits the number of bits associated with any given transform coefficient. More specifically, quantization may be applied according to a Quantization Parameter (QP). Rather than signaling the QP itself, a change (i.e., a delta) in the QP may be signaled for an LCU. The delta QP defines a change in the quantization parameter for an LCU relative to a predicted value of the QP for the LCU (e.g., the QP of a previously coded CU, or a QP defined by a previous QP and/or one or more rules). This disclosure relates to generating and signaling delta QPs within an encoded bitstream in a manner that can improve quality (and possibly improve compression) in the HEVC standard.
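The effect of the QP on quantization coarseness can be illustrated with the well-known approximate relationship used in H.264/HEVC-style codecs, in which the quantization step size roughly doubles for every increase of 6 in QP. This is only a sketch: the exact integer scaling tables of the standard are omitted, and `quant_step` is a name introduced here for illustration.

```python
def quant_step(qp):
    """Approximate QP-to-step-size mapping for H.264/HEVC-style codecs:
    the step size doubles for every increase of 6 in QP, with a step
    size of roughly 1 at QP = 4."""
    return 2.0 ** ((qp - 4) / 6.0)
```

Under this mapping, raising the QP from 22 to 28 doubles the step size, halving (roughly) the precision with which each transform coefficient is represented.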
For further compression, the transform coefficients may be quantized prior to entropy coding. The entropy coder then applies entropy coding (e.g., Context Adaptive Variable Length Coding (CAVLC), Context Adaptive Binary Arithmetic Coding (CABAC), probability interval partition entropy coding (PIPE), or the like) to the quantized transform coefficients. In some examples, video encoder 20 may utilize a predefined scan order to scan the quantized transform coefficients to generate a serialized vector that may be entropy encoded. In other examples, video encoder 20 may perform adaptive scanning. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy encode the one-dimensional vector. Video encoder 20 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 30 in decoding the video data.
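The predefined scan of a 2-D coefficient array into a serialized 1-D vector can be sketched with a simple zigzag order. HEVC itself uses diagonal, horizontal, and vertical scans; zigzag is shown here only as a familiar example of a predefined scan order, and `zigzag_scan` is an illustrative name.

```python
def zigzag_scan(block):
    """Serialize an N x N coefficient block in zigzag order: traverse
    anti-diagonals of increasing index, alternating direction so that
    low-frequency coefficients come first."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],                       # diagonal index
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [block[r][c] for r, c in order]
```

Because quantization tends to zero out high-frequency coefficients, such a scan clusters the non-zero values at the front of the vector, which benefits the entropy coder.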
As discussed above, a delta QP (also denoted dQP) may be signaled in an encoded video bitstream to code a QP for a block of video data. Delta QP is defined as the difference between the current QP (i.e., the actual QP for the current block) and a predictor of the current QP (the QP predictor). Based on the signaled dQP, the corresponding current QP value can be reconstructed by adding the dQP to the value of the QP predictor. That is, at video encoder 20, dQP is calculated by subtracting the QP predictor from the actual QP for the current block: dQP = current QP - QP predictor. At video decoder 30, the actual QP for the current block is reconstructed by adding the received dQP to the QP predictor: current QP = dQP + QP predictor. In some examples, the QP predictor is defined as the average of the actual QP value for a block above the current block and the actual QP value for a block to the left of the current block.
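The dQP relationship described above can be sketched as follows. This is a minimal illustration, not the HEVC specification itself; the function names and the averaging predictor are assumptions drawn from the example in the text.

```python
def qp_predictor(qp_left: int, qp_above: int) -> int:
    """One possible predictor: average of the left and above QP values."""
    return (qp_left + qp_above) // 2

def encode_dqp(current_qp: int, qp_left: int, qp_above: int) -> int:
    """Encoder side: dQP = current QP - QP predictor."""
    return current_qp - qp_predictor(qp_left, qp_above)

def decode_qp(dqp: int, qp_left: int, qp_above: int) -> int:
    """Decoder side: current QP = dQP + QP predictor."""
    return dqp + qp_predictor(qp_left, qp_above)
```

For example, with a current QP of 30 and neighboring QPs of 28 and 26, the predictor is 27, the signaled dQP is 3, and the decoder reconstructs 3 + 27 = 30.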
According to the proposals for HEVC and HM, Quantization Groups (QGs) are defined for signaling dQP. QG is the minimum block size for which dQP is signaled. The QG may consist of a single CU or multiple CUs. In many examples, the QG may be smaller than one or more possible CU sizes. For example, QG may be defined and/or signaled as a size of 16 x 16 pixels. In this example, it would be possible to have CUs of size 32 × 32 or 64 × 64.
In various examples, the size of the QG is signaled in a Picture Parameter Set (PPS). However, the QG size may be signaled at other levels of the bitstream, including slice headers, Adaptation Parameter Sets (APSs), Sequence Parameter Sets (SPSs), and the like. If a CU has a size equal to or greater than the size of the QG, a dQP may be signaled for this CU. However, for CUs smaller than the size of the QG, a common dQP is signaled and shared among all CUs inside the QG. In some examples proposed for HEVC, dQP signaling for a certain block (CU or QG) also depends on whether there are non-zero coefficients in the block. That is, the dQP is signaled in a block only after video encoder 20 determines that non-zero coefficients are coded in that block.
FIG. 2 shows an example of a 16 × 16 QG including 7 CUs (CU 0-CU 6). CU 0-CU 2 are 8 × 8 CUs, while CU 3-CU 6 are 4 × 4 CUs. In the example of fig. 2, shaded CU 5 includes at least one non-zero coefficient. Thus, video encoder 20 signals one dQP for the entire QG 102, and video decoder 30 will apply the signaled dQP to decode each of CU 0-CU 6. If no CU in QG 102 contained non-zero coefficients, dQP would not be signaled for that QG. In the current proposal for HEVC, whether dQP is signaled for a certain block (e.g., CU) depends on QG size. If the QG size in the example of fig. 2 were 8 × 8, dQP would not be signaled for CU 0-CU 2 because CU 0-CU 2 do not include non-zero coefficients. However, one dQP would be signaled for CU 3-CU 6 to share, because each of these CUs is within a QG containing at least one non-zero coefficient. Likewise, if the size of the signaled QG were 4 × 4, then dQP would not be signaled for any block other than CU 5 (i.e., the block with non-zero coefficients).
In other words, when there are multiple CUs in a QG, the dQP for that QG is not signaled unless a CU with non-zero coefficients is located within the QG. For example, if there are four CUs in the QG and only the fourth CU has non-zero coefficients (a CU with non-zero coefficients is also referred to as a non-zero CU in the following description), then the dQP is signaled only in the fourth CU. In general, the first non-zero CU in the QG may be located anywhere inside the QG. For example, the first non-zero CU may be the first CU, the last CU, or any other CU in the QG. In some cases, there may not be a non-zero CU in the QG.
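The rule just described, locating the CU in which the shared dQP is signaled for a multi-CU QG, can be sketched as follows. This is an illustrative helper, not part of any standard; the representation of CUs as a list of has-non-zero-coefficients booleans is an assumption.

```python
def dqp_signaling_cu(cus_have_nonzero: list[bool]):
    """Return the index of the first non-zero CU in the QG, in which the
    shared dQP would be signaled, or None if the QG is a zero block
    (in which case no dQP is signaled and the decoder infers dQP = 0)."""
    for index, has_nonzero in enumerate(cus_have_nonzero):
        if has_nonzero:
            return index
    return None
```

In the four-CU example above, only the fourth CU is non-zero, so the dQP would be signaled in the CU at index 3.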
In the following description, the general term "quantization block" is used to refer to a CU having a size greater than or equal to the signaled QG, or to a QG containing a number of CUs. That is, the quantization block is a QG containing one or more CUs, or a CU that is larger than the QG. The term "zero block" will be used to indicate a block that does not have non-zero coefficients (e.g., a TU, or a transform block associated with a CU). When a quantized block is a QG containing multiple CUs, the QG is considered a zero block when all CUs inside the QG are zero blocks. When a quantization block contains at least one non-zero coefficient, a dQP is signaled for each quantization block.
When dQP is not signaled for a quantized block, the value of dQP is assumed to be 0. Thus, at video decoder 30, the value of QP for the current block will be equal to the value of the QP predictor (i.e., current QP = dQP + QP predictor, where dQP is assumed to be 0). According to the current proposals for HEVC and HM, the QP predictor is defined as the average of the QP value for the left quantized block and the QP value for the above quantized block, as shown in fig. 3 (i.e., showing QP prediction for the current quantized block using the left quantized block and the above quantized block). In other examples, the QP predictor may be defined differently. For example, the QP predictor may be defined as the QP value from the left quantized block, as the QP value of a previously coded quantized block in the stream, or otherwise.
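The decoder-side behavior described above, inferring dQP = 0 when no dQP is received, can be sketched as follows. A minimal sketch with illustrative names, assuming the averaging predictor from the text; an absent dQP is modeled as None.

```python
def reconstruct_qp(signaled_dqp, qp_left: int, qp_above: int) -> int:
    """Decoder-side QP reconstruction: current QP = dQP + QP predictor,
    where a dQP that was not signaled is inferred to be 0."""
    dqp = 0 if signaled_dqp is None else signaled_dqp
    qp_predictor = (qp_left + qp_above) // 2  # average of left and above
    return dqp + qp_predictor
```

When no dQP is signaled (the zero-block case), the reconstructed QP collapses to the predictor value itself.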
In the current proposal for HEVC, a lossless coding mode is defined for coding a block (e.g., one or more CUs). According to this lossless coding mode, some coding operations, such as transform, quantization, and in-loop filtering, are skipped. In the current proposal for HEVC, the lossless coding mode may be indicated by video encoder 20 by using a QP value equal to 0 for blocks coded in the lossless coding mode. Thus, if a particular quantized block is coded in lossless mode, the QP value for this quantized block should be set to 0 and signaled to the decoder. In this case, signaling of the QP value is achieved by sending a dQP whose value equals the negative of the QP predictor value (i.e., dQP = actual QP (0 in this case) - QP predictor).
Certain problems may arise when using dQP signaling for quantized blocks in conjunction with a lossless coding mode indicated with a QP value of 0. These problems can lead to encoder/decoder mismatch and potentially to decoder corruption.
To explain the first problem, assume that the QP predictor derived for the current quantized block is 0. For example, both the left and above quantized blocks are coded in the lossless coding mode, and thus each has a QP of 0. The QP predictor may also be 0 when one of the left or above quantized blocks is coded in the lossless coding mode (and thus has a QP of 0) and the other quantized block has a QP of 1. In that case, the QP predictor is the average of the two QPs, i.e., 0.5, which is truncated (rounded down) to 0. Also assume that the current quantized block is coded normally (i.e., not losslessly) using a non-zero QP value. Thus, the transform, quantization, and all in-loop filtering processes are applied to the current quantized block.
In this case, if the quantized block is a zero block (i.e., does not have non-zero coefficients), video encoder 20 will not signal a dQP to video decoder 30 for this quantized block. A problem arises here. At video encoder 20, this example quantized block is coded using a non-zero QP with intra- or inter-prediction. However, because no dQP is signaled, video decoder 30 infers a dQP value of 0. Thus, the current QP for the quantized block will be the QP predictor, as explained above. However, because the QP predictor is also 0 in this example, the current QP value for the quantized block will be reconstructed as 0, which video decoder 30 will interpret as indicating that the current quantized block is to be decoded in the lossless coding mode. Thus, video decoder 30 may try to reconstruct the quantized block using the lossless mode even though it was coded using another prediction mode (e.g., inter- or intra-prediction). This leads to an encoder/decoder mismatch, which can cause poor visual quality or even decoder corruption.
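The mismatch scenario above can be traced numerically. This is an illustration of the arithmetic only, under the assumptions stated in the text (averaging predictor, QP = 0 interpreted as lossless).

```python
# Both neighbors are losslessly coded, so both predictor inputs are 0.
qp_left, qp_above = 0, 0
qp_predictor = (qp_left + qp_above) // 2   # -> 0

# The current block is a zero block, so the encoder signals no dQP
# even though it actually quantized the block with a non-zero QP.
inferred_dqp = 0                            # decoder-side inference

# Decoder reconstruction: current QP = dQP + QP predictor.
reconstructed_qp = inferred_dqp + qp_predictor   # -> 0

# QP == 0 is the lossless-mode indication, so the decoder would wrongly
# treat this lossy-coded block as losslessly coded: the mismatch.
decoder_assumes_lossless = (reconstructed_qp == 0)
```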
A second problem arises when the QP predictor for the current quantized block is non-zero, but the current quantized block is coded in lossless mode and has no residual (i.e., no non-zero values in the residual), and thus no dQP is transmitted. As a result, at video decoder 30, the value of dQP will be inferred to be 0, rather than a dQP value equal to the negative of the QP predictor being received. Thus, video decoder 30 will reconstruct the actual QP value for the quantized block as 0 (the inferred dQP) + QP predictor = QP predictor. As such, video decoder 30 will reconstruct a non-zero actual QP value, and thus is unable to identify the lossless mode for this quantized block.
To address the first issue, according to a first example of this disclosure, video encoder 20 is configured to always signal a dQP for each quantized block regardless of whether the quantized block is a zero block. For example, dQP may always be signaled for blocks larger than the QG (e.g., CUs) or in the first CU of a QG. In this way, dQP will not be incorrectly inferred to be 0, and the automatic inference of the lossless coding mode for a quantized block, which occurs when dQP is inferred to be 0 and the QP predictor is also 0, will be avoided.
According to a second example of this disclosure, to limit the number of instances that dQP is signaled, video encoder 20 is configured to signal dQP for a quantized block if the QP predictor for the quantized block is 0. An example where the QP predictor for the current quantized block is 0 is shown in fig. 4. In the example of fig. 4, the QP predictor for block 104 is equal to 0 because the QP for both the upper block and the left block is 0. Thus, according to this example of this disclosure, dQP is signaled for quantization block 104 regardless of whether it is a zero block. Otherwise, if the QP predictor is non-zero for the quantized block 104, then dQP is not signaled for the quantized block 104 when it is a zero block.
In another example, if one of the QPs used for QP prediction for the current quantized block (e.g., the QP for the above quantized block or the left quantized block) is equal to 0, or if the difference between the QPs used for QP prediction is greater than some threshold, where the threshold may be any integer, then dQP is signaled for the current quantized block regardless of whether it is a zero block. Otherwise, if the quantized block is a zero block, then dQP is not signaled. Of course, if the quantized block contains non-zero coefficients, then dQP is still signaled. The threshold may be fixed or signaled in a header (e.g., in a PPS, slice header, or APS). In other examples, the threshold may be signaled at the LCU or CU level.
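The signaling conditions from the preceding examples can be gathered into one decision function. A sketch under stated assumptions: the names are illustrative, and the exact form of the threshold comparison (here, absolute difference strictly greater than the threshold) is an assumption, since the disclosure leaves it open.

```python
def must_signal_dqp(has_nonzero_coeffs: bool,
                    qp_left: int, qp_above: int,
                    threshold: int) -> bool:
    """Decide whether dQP must be signaled for the current quantized block."""
    if has_nonzero_coeffs:
        return True                       # non-zero block: always signal
    if qp_left == 0 or qp_above == 0:
        return True                       # a predictor input is 0
    if abs(qp_left - qp_above) > threshold:
        return True                       # predictor inputs differ too much
    return False                          # zero block: dQP inferred as 0
```

The threshold itself could be fixed or carried in a PPS, slice header, or APS as the text notes.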
In another example of this disclosure, video encoder 20 may be configured to encode a separate flag or syntax element to indicate that a block is coded with the lossless coding mode, rather than indicating the lossless coding mode for the block by using a zero QP. In one example, this flag may be referred to as a "lossless flag". All of the examples of dQP signaling above still apply when the lossless flag is used.
For example, in the case where the current quantized block is a zero block, if one (or in another example all) of the quantized blocks used for QP prediction (e.g., the left quantized block and the above quantized block in the example above) is coded in the lossless coding mode (i.e., the lossless flag is on for the quantized block used for QP prediction), then dQP is signaled for the current quantized block. However, if a lossless flag is used to identify lossless mode coding for a quantized block, setting QP to 0 for that losslessly coded quantized block may not be the best option, as this QP value will later be used to predict the QP value of a subsequent quantized block. One example of losslessly coded left and above quantized blocks with non-zero assigned QPs is shown in fig. 5 (i.e., an example of non-zero QPs assigned to losslessly coded quantized blocks). As shown in fig. 5, the left (QPleft) and above (QPabove) quantized blocks relative to the current quantized block 106 have non-zero QPs, but each is coded in the lossless coding mode (indicated with a lossless flag = 1).
According to another example of this disclosure, if a quantized block is coded in the lossless coding mode as indicated by the lossless flag (e.g., the blocks labeled QPleft and QPabove in fig. 5), the current QP for that block may be set equal to the QP predictor of that quantized block. For example, for QPabove in fig. 5, instead of using a QP value of 0 for the block because it is coded losslessly, the QP predictor for QPabove (e.g., the average QP of the blocks to the left of and above QPabove) may be used as its current QP. In this way, a non-zero QP value may be used as the QP predictor for a subsequent quantized block, thus avoiding a potential encoder/decoder mismatch. In another example, the QP value for a losslessly coded quantized block indicated using the lossless flag may instead be set equal to the slice QP or any other predefined QP.
In the above description, the lossless flag is described as being signaled at the quantized-block level. As another example of this disclosure, video encoder 20 may be configured to signal a lossless flag for each CU inside the quantized block. In this case, all the techniques described above still apply. For example, in the case that a CU inside the quantized block is coded losslessly, dQP signaling may be skipped and the value of dQP inferred to be 0 at video decoder 30. Thus, the QP for this lossless CU will be derived as its QP predictor and used to predict the QP value for a subsequent block. In this case, dQP signaling is skipped for a lossless CU regardless of whether a non-zero residual is present in the CU.
In another example of this disclosure, a quantized block-level lossless flag may be signaled for the quantized block. When set to, e.g., 1, the quantized block-level lossless flag indicates that at least one CU inside the quantized block is losslessly coded. If the quantized block-level lossless flag has, for example, the value 1, a CU lossless flag is signaled for each CU inside the quantized block, indicating whether that particular CU is losslessly coded. Otherwise, if the quantized block-level lossless flag is 0 (i.e., no CU in the quantized block is losslessly coded), then no CU lossless flag is needed for any CU within the quantized block.
In another example, signaling efficiency may be further improved when the two levels of lossless flags (quantized block-based and CU-based) are used. If the quantized block-level lossless flag is enabled (e.g., it has a value of 1), the last CU is reached, and no CU-level lossless flag equal to 1 was coded before the last CU in the current quantized block, then signaling of the CU lossless flag for the last CU may be skipped. This is because the quantized block-level lossless flag indicates that there is at least one losslessly coded CU, and thus it can be inferred that the last CU in this example must be losslessly coded. Likewise, if the lossless flag for the quantized block is disabled (e.g., it has a value of 0), meaning that there are no losslessly coded CUs inside the quantized block, then CU-based lossless flag signaling may be omitted for the current quantized block.
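The two-level inference just described can be sketched as follows. An illustrative helper, not a normative procedure; flag values follow the 0/1 convention used in the text, and None marks a flag that must still be signaled explicitly.

```python
def last_cu_lossless_flag(qg_lossless_flag: int,
                          earlier_cu_flags: list[int]):
    """Infer the CU lossless flag for the last CU of a quantized block.

    Returns 0 or 1 when the flag can be inferred (and thus need not be
    signaled), or None when it must be signaled explicitly."""
    if qg_lossless_flag == 0:
        # No CU in the quantized block is lossless: all CU flags are
        # omitted and inferred to be 0, including the last one.
        return 0
    if not any(earlier_cu_flags):
        # QG says at least one CU is lossless, but none of the earlier
        # CUs was: the last CU must be the lossless one.
        return 1
    return None  # cannot be inferred; signal the flag for the last CU
```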
In the case where a lossless flag may be inferred as described above, the same techniques for coding a lossless flag and removing signaling redundancy may be applied at frames, slices, LCUs, and/or other defined units or levels of a video coding process. The term "flag" may refer to a one-bit syntax element. In addition, instead of flags, multi-bit syntax elements, variable-length syntax elements, or other types of data structures capable of conveying the information described above for the flags may be used.
To reiterate the second drawback associated with the currently proposed dQP signaling techniques in HEVC, video decoder 30 may not be able to correctly identify the lossless coding mode for a quantized block when the QP predictor for the current quantized block is non-zero but the current quantized block is coded in lossless mode and has no residual (i.e., there are no non-zero values in the residual). In this case, dQP is not transmitted and is inferred to be 0 by video decoder 30. Thus, video decoder 30 will reconstruct a non-zero actual QP value using the zero dQP, and thus cannot identify the lossless mode for this quantized block.
To address this issue, according to another example of this disclosure, video encoder 20 may be configured to eliminate the scenario in which video decoder 30 would otherwise be unable to identify the lossless mode. That is, video encoder 20 is simply not allowed to select the lossless coding mode for a particular quantized block when there is no prediction residual for the current quantized block and the QP predictor is not 0, or when a quantized block used for QP prediction is losslessly coded (e.g., its lossless flag is on). Thus, the situation in which video decoder 30 determines a non-zero QP for a quantized block that is losslessly coded may be avoided.
Another drawback of some proposals for dQP signaling in HEVC relates to the unary binarization coding used. In these proposals for HEVC, dQP is coded using truncated unary binarization. The range of dQP values may be from -(26 + QpBdOffsetY/2) to +(25 + QpBdOffsetY/2), where QpBdOffsetY depends on the input bit depth and may range from 0 to 18. Due to the asymmetric distribution of delta QP values, the cMax of the truncated unary binarization depends on the sign of the dQP value. For example, assuming QpBdOffsetY is equal to zero, cMax equals 26 when the dQP value is negative, and cMax equals 25 for positive values. As a result, video encoder 20 and video decoder 30 must check the sign when coding and parsing dQP. Depending on the sign value, different cMax values are used when coding and parsing dQP values.
To reduce this complexity, this disclosure further proposes removing the sign dependency in the dQP binarization by changing the range of dQP values, so that the cMax value in the truncated unary binarization is the same regardless of sign. For example, the range for positive dQP values may be increased by 1, so that the entire range becomes -(26 + QpBdOffsetY/2) to +(26 + QpBdOffsetY/2). In this case, the same binarization is used to code or parse both positive and negative dQP values, and no checking of the sign value is required. This technique is not limited to truncated unary binarization, but may be applied to any other binarization scheme, such as the Exponential-Golomb coding method.
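The sign-independent binarization described above can be sketched as follows. A hedged illustration only: the helper names are assumptions, and coding the sign as a separate trailing bin after the magnitude is one plausible arrangement, not a quotation of the proposal.

```python
def truncated_unary(value: int, c_max: int) -> str:
    """Truncated unary binarization: `value` ones followed by a
    terminating zero, with the terminator omitted when value == c_max."""
    if value < c_max:
        return "1" * value + "0"
    return "1" * c_max

def binarize_dqp(dqp: int, qp_bd_offset_y: int = 0) -> str:
    """Binarize dQP with a single cMax regardless of sign, per the
    symmetric range -(26 + QpBdOffsetY/2)..+(26 + QpBdOffsetY/2).
    The sign is coded as one extra bin when dQP is non-zero."""
    c_max = 26 + qp_bd_offset_y // 2
    bins = truncated_unary(abs(dqp), c_max)   # magnitude, sign-independent
    sign = "" if dqp == 0 else ("0" if dqp > 0 else "1")
    return bins + sign
```

Because the magnitude binarization no longer depends on the sign, the parser can read the unary prefix without first resolving which cMax applies.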
Fig. 6 is a block diagram illustrating an example of video encoder 20, which may use techniques for dQP coding as described in this disclosure. Video encoder 20 will be described in the context of HEVC coding for purposes of illustration, but this disclosure is not limited in this regard, and the techniques may be applicable to other coding standards or methods. Video encoder 20 may perform intra-coding and inter-coding of CUs within video frames. Intra-coding relies on spatial prediction to reduce or remove spatial redundancy in video data within a given video frame. Inter-coding relies on temporal prediction to reduce or remove temporal redundancy between a current frame and previously coded frames of a video sequence. Intra-mode (I-mode) may refer to any of a number of spatial-based video compression modes. An inter mode, such as uni-directional prediction (P-mode) or bi-directional prediction (B-mode), may refer to any of a number of temporal-based video compression modes.
As shown in fig. 6, video encoder 20 receives a current video block within a video frame to be encoded. In the example of fig. 6, video encoder 20 includes motion compensation unit 44, motion estimation unit 42, intra-prediction processing unit 46, reference frame buffer 64, summer 50, transform processing unit 52, quantization unit 54, and entropy encoding unit 56. The transform processing unit 52 illustrated in fig. 6 is the unit that applies the actual transform or transform combination to a block of residual data, and should not be confused with a block of transform coefficients, which may also be referred to as a Transform Unit (TU) of a CU. For video block reconstruction, video encoder 20 also includes an inverse quantization unit 58, an inverse transform processing unit 60, and a summer 62. A deblocking filter (not shown in fig. 6) may also be included to filter block boundaries to remove blocking artifacts from reconstructed video. If desired, the deblocking filter would typically filter the output of summer 62.
During the encoding process, video encoder 20 receives a video frame or slice to be coded. The frame or slice may be divided into a plurality of video blocks, e.g., Largest Coding Units (LCUs). Motion estimation unit 42 and motion compensation unit 44 perform inter-predictive coding of the received video block relative to one or more blocks in one or more reference frames to provide temporal compression. Intra-prediction processing unit 46 may perform intra-predictive coding of the received video block relative to one or more neighboring blocks in the same frame or slice as the block to be coded to provide spatial compression.
Mode select unit 40 may select one of the coding modes (intra or inter), e.g., based on the error (i.e., distortion) results for each mode, and provide the resulting intra- or inter-predicted block (e.g., a Prediction Unit (PU)) to summer 50 to generate residual block data and to summer 62 to reconstruct the encoded block for use in a reference frame. Summer 62 combines the predicted block with the inverse-quantized, inverse-transformed data from inverse transform processing unit 60 for the block to reconstruct the encoded block, as described in more detail below. Some video frames may be designated as I-frames, where all blocks in an I-frame are encoded in an intra-prediction mode. In some cases, for example, when the motion search performed by motion estimation unit 42 does not result in a sufficient prediction for a block in a P- or B-frame, intra-prediction processing unit 46 may perform intra-prediction encoding of the block.
Motion estimation unit 42 and motion compensation unit 44 may be highly integrated, but are illustrated separately for conceptual purposes. Motion estimation (or motion search) is a process that generates motion vectors, which estimate the motion for a video block. For example, a motion vector may indicate a displacement of a prediction unit in a current frame relative to reference samples of a reference frame. Motion estimation unit 42 calculates a motion vector for a prediction unit of an inter-coded frame by comparing the prediction unit to reference samples of a reference frame stored in reference frame buffer 64. The reference samples may be blocks found to closely match in pixel difference to the portion of the CU that includes the PU being coded, which may be determined by Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), or other difference metrics. The reference samples may occur anywhere within a reference frame or reference slice, and not necessarily at block (e.g., coding unit) boundaries of the reference frame or slice. In some examples, the reference sample may occur at a fractional pixel location.
Motion estimation unit 42 sends the calculated motion vectors to entropy encoding unit 56 and motion compensation unit 44. The portion of the reference frame identified by the motion vector may be referred to as a reference sample. Motion compensation unit 44 may calculate a prediction value for the prediction unit of the current CU, e.g., by retrieving the reference samples identified by the motion vectors for the PUs.
As an alternative to inter-prediction performed by motion estimation unit 42 and motion compensation unit 44, intra-prediction processing unit 46 may intra-predict the received block. Intra-prediction processing unit 46 may predict the received block (assuming a left-to-right, top-to-bottom encoding order for the block) relative to neighboring previously coded blocks (e.g., blocks above, above-right, above-left, or left of the current block). Intra-prediction processing unit 46 may be configured with a variety of different intra-prediction modes. For example, intra-prediction processing unit 46 may be configured with a number of directional prediction modes (e.g., thirty-three directional prediction modes) based on the size of the CU being encoded.
Intra-prediction processing unit 46 may select the intra-prediction mode by, for example, calculating error values for various intra-prediction modes and selecting the mode that results in the lowest error value. A directional prediction mode may include a function for combining values of spatially neighboring pixels and applying the combined values to one or more pixel locations in the PU. Once values for all pixel locations in the PU have been calculated, intra-prediction processing unit 46 may calculate an error value for the prediction mode based on pixel differences between the PU and the received block to be encoded. Intra-prediction processing unit 46 may continue to test intra-prediction modes until an intra-prediction mode that yields an acceptable error value is found. Intra-prediction processing unit 46 may then send the PU to summer 50.
Video encoder 20 forms a residual block by subtracting the prediction data calculated by motion compensation unit 44 or intra-prediction processing unit 46 from the original video block being coded. Summer 50 represents the component that performs this subtraction operation. The residual block may correspond to a two-dimensional matrix of pixel difference values, where the number of values in the residual block is the same as the number of pixels in the PU corresponding to the residual block. The values in the residual block may correspond to the difference, i.e., the error, between the values of the co-located pixels in the PU and the values of the co-located pixels in the original block to be coded. The difference may be a chroma difference or a luma difference depending on the type of block being coded.
In some examples, the residual may be sent directly to entropy encoding unit 56, for example, in a lossless coding mode. Thus, the transform and quantization processes are skipped. In addition, any loop filter process may also be skipped.
Transform processing unit 52 may form one or more Transform Units (TUs) from the residual blocks. The transform processing unit 52 selects a transform from among a plurality of transforms. The transform may be selected based on one or more coding characteristics, such as block size, coding mode, or the like. Transform processing unit 52 then applies the selected transform to the TU, producing a video block comprising a two-dimensional array of transform coefficients.
Transform processing unit 52 may send the resulting transform coefficients to quantization unit 54. Quantization unit 54 may then quantize the transform coefficients. Entropy encoding unit 56 may then perform a scan of the quantized transform coefficients in the matrix according to a scan mode. This disclosure describes entropy encoding unit 56 as performing the scan. However, it should be understood that in other examples, other processing units, such as quantization unit 54, may perform the scan. Quantization unit 54 may be configured to code dQP values in accordance with the techniques described above. Additional description of the function of quantization unit 54 in this regard will be discussed below with reference to fig. 8.
Once the transform coefficients are scanned into a one-dimensional array, entropy encoding unit 56 may apply entropy coding, such as CABAC, syntax-based context-adaptive binary arithmetic coding (SBAC), Probability Interval Partitioning Entropy (PIPE), or another entropy coding method, to the coefficients.
To perform CABAC, entropy encoding unit 56 may select a context model to apply to a certain context used to encode a symbol to be transmitted. For example, the context may relate to whether the neighbor values are non-zero. Entropy encoding unit 56 may also entropy encode syntax elements, e.g., signals representing the selected transform. In accordance with the techniques of this disclosure, entropy encoding unit 56 may select a context model to use to encode these syntax elements based on, for example, the intra-prediction direction used for the intra-prediction mode, the scan positions of the coefficients corresponding to these syntax elements, the block type, and/or the transform type, among other things.
Following entropy coding by entropy encoding unit 56, the resulting encoded video may be transmitted to another device, such as video decoder 30, or archived for later transmission or retrieval.
In some cases, entropy encoding unit 56 or another unit of video encoder 20 may be configured to perform other coding functions in addition to entropy coding. For example, entropy encoding unit 56 may be configured to determine Coded Block Pattern (CBP) values for the CUs and PUs. Also, in some cases, entropy encoding unit 56 may perform run-length coding of the coefficients.
Inverse quantization unit 58 and inverse transform processing unit 60 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block. Motion compensation unit 44 may calculate a reference block by adding the residual block to a predictive block of one of the frames of reference frame buffer 64. Motion compensation unit 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for use in motion estimation. Summer 62 adds the reconstructed residual block to the motion compensated prediction block generated by motion compensation unit 44 to generate a reconstructed video block for storage in reference frame buffer 64. The reconstructed video block may be used by motion estimation unit 42 and motion compensation unit 44 as a reference block to inter-code a block in a subsequent video frame.
Fig. 7 is a block diagram illustrating an example of video decoder 30, video decoder 30 decoding an encoded video sequence. In the example of fig. 7, video decoder 30 includes an entropy decoding unit 70, a motion compensation unit 72, an intra-prediction processing unit 74, an inverse quantization unit 76, an inverse transform unit 78, a reference frame buffer 82, and a summer 80. In some examples, video decoder 30 may perform a decoding pass that is substantially reciprocal to the encoding pass described with respect to video encoder 20 (see fig. 6).
Entropy decoding unit 70 performs an entropy decoding process on the encoded bitstream to retrieve a one-dimensional array of transform coefficients. The entropy decoding process used depends on the entropy coding (e.g., CABAC) used by video encoder 20. The entropy coding process used by the encoder may be signaled in the encoded bitstream, or may be a predetermined process.
In some examples, for example, in the lossless coding mode, the output (i.e., the residual in this example) may be sent directly from entropy decoding unit 70 to summer 80. Thus, the inverse quantization and inverse transform processes are skipped. In addition, any loop filter processes may also be skipped.
In some examples, entropy decoding unit 70 (or inverse quantization unit 76) may scan the received values using a scan that is a mirror of the scan mode used by entropy encoding unit 56 (or quantization unit 54) of video encoder 20. While the scanning of coefficients may be performed in inverse quantization unit 76, the scanning will be described as being performed by entropy decoding unit 70 for purposes of illustration. In addition, although shown as separate functional units for ease of illustration, the structure and functionality of entropy decoding unit 70, inverse quantization unit 76, and other units of video decoder 30 may be highly integrated with one another.
Inverse quantization unit 76 inverse quantizes (i.e., de-quantizes) the quantized transform coefficients provided in the bitstream and decoded by entropy decoding unit 70. The inverse quantization process may include conventional processes, e.g., similar to those proposed for HEVC or defined by the H.264 decoding standard. The inverse quantization process may include using a quantization parameter QP calculated by video encoder 20 for the CU to determine a degree of quantization and, likewise, a degree of inverse quantization that should be applied. Inverse quantization unit 76 may inverse quantize the transform coefficients before or after the coefficients are converted from a one-dimensional array to a two-dimensional array. Inverse quantization unit 76 may be configured to decode the dQP value in accordance with the techniques described above. Additional description of the function of inverse quantization unit 76 in this regard is provided below with reference to FIG. 9.
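As a rough illustration of how a QP value controls the degree of inverse quantization, the following sketch uses the HEVC-style per-(QP mod 6) level-scale table, under which the quantization step size doubles every 6 QP steps. It deliberately omits the bit-depth-dependent shift, rounding offset, and scaling matrices of the full HEVC formula, so it is a simplified sketch rather than a conforming dequantizer:

```python
LEVEL_SCALE = [40, 45, 51, 57, 64, 72]  # HEVC per-(QP % 6) scale factors

def dequantize(level, qp):
    """Scale a quantized coefficient level back toward the transform domain.
    The step size doubles every 6 QP steps (simplified; no rounding/shift)."""
    return (level * LEVEL_SCALE[qp % 6]) << (qp // 6)
```

For example, a level of 1 is scaled by 40 at QP 0 and by 80 at QP 6, reflecting the doubling of the step size.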
The inverse transform processing unit 78 applies an inverse transform to the inverse quantized transform coefficients. In some examples, inverse transform processing unit 78 may determine the transform based on signaling from video encoder 20 or by inferring the inverse transform from one or more coding characteristics, such as block size, coding mode, or the like. In some examples, inverse transform processing unit 78 may determine a transform to apply to the current block based on a signaled transform at a root node of a quadtree for an LCU that includes the current block. Alternatively, the transform may be signaled at the root of the TU quadtree for leaf node CUs in the LCU quadtree. In some examples, the inverse transform processing unit 78 may apply a cascaded inverse transform, where the inverse transform processing unit 78 applies two or more inverse transforms to the transform coefficients of the current block being decoded.
Intra-prediction processing unit 74 may generate prediction data for a current block of a current frame based on a signaled intra-prediction mode and data from previously decoded blocks of the current frame.
Motion compensation unit 72 may retrieve the motion vectors, motion prediction directions, and reference indices from the encoded bitstream. The motion prediction direction indicates whether the inter prediction mode is unidirectional (e.g., P frame) or bidirectional (B frame). The reference index indicates which reference frame the candidate motion vector is based on.
Based on the retrieved motion prediction direction, reference frame index, and motion vector, motion compensation unit 72 generates a motion-compensated block for the current portion. These motion-compensated blocks essentially recreate the predictive blocks used to generate the residual data.
Motion compensation unit 72 may generate motion-compensated blocks, possibly performing interpolation based on interpolation filters. Identifiers for the interpolation filters to be used for motion estimation with sub-pixel precision may be included in the syntax elements. Motion compensation unit 72 may use the same interpolation filters used by video encoder 20 during encoding of the video block to calculate interpolated values for sub-integer pixels of a reference block. Motion compensation unit 72 may determine the interpolation filters used by video encoder 20 from the received syntax information and use those filters to generate the predictive blocks.
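The idea of interpolating sub-integer pixel positions can be sketched with the simplest possible filter, a two-tap bilinear average along one row. This is purely illustrative: real codecs use longer filters (e.g., HEVC's 8-tap luma interpolation filter), and the function name is hypothetical:

```python
def half_pel_interpolate(row):
    """Bilinear half-sample interpolation along one row of integer pixels.
    Each output sample is the rounded average of two neighbors
    (illustrative only; real codecs use longer filters)."""
    return [(row[i] + row[i + 1] + 1) >> 1 for i in range(len(row) - 1)]
```

A motion vector with half-pixel precision would then reference these interpolated samples instead of the integer-position samples.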
Additionally, in the HEVC example, motion compensation unit 72 and intra-prediction processing unit 74 may use some of the syntax information (e.g., provided by a quadtree) to determine the size of the LCU used to encode the frames of the encoded video sequence. Motion compensation unit 72 and intra-prediction processing unit 74 may also use syntax information to determine splitting information that describes how each CU of a frame of the encoded video sequence is split (and likewise, how sub-CUs are split). The syntax information may also include modes indicating how each split is encoded (e.g., intra-prediction or inter-prediction, and for intra-prediction, intra-prediction encoding modes), one or more reference frames for each inter-encoded PU (and/or a reference list containing identifiers for the reference frames), and other information used to decode the encoded video sequence.
Summer 80 combines the residual block with the corresponding prediction block generated by motion compensation unit 72 or intra-prediction processing unit 74 to form a decoded block. Optionally, a deblocking filter may also be applied to filter the decoded blocks in order to remove blocking artifacts. The decoded video blocks are then stored in a reference frame buffer 82 (also referred to as a decoded picture buffer), the reference frame buffer 82 providing the reference blocks for subsequent motion compensation and also generating decoded video for presentation on a display device (e.g., display device 32 of fig. 1).
FIG. 8 is a flow diagram showing an example video encoding method in accordance with the techniques of this disclosure. The techniques of FIG. 8 may be implemented by one or more hardware units of video encoder 20, including quantization unit 56.
In one example of this disclosure, quantization unit 56 may be configured to: determine a quantization parameter (QP) value for a current quantization block (810); determine a delta quantization parameter (dQP) value for the current quantization block of video data based on the QP value and a QP predictor (820); and generate the dQP value, wherein the dQP value is signaled regardless of whether non-zero transform coefficients are present in the current quantization block (830). In various examples of this disclosure, the QP predictor is an average of QP values for an above quantization block and a left quantization block, where the above quantization block is located above the current quantization block, and where the left quantization block is located to the left of the current quantization block. Video encoder 20 may be further configured to: encode the current quantization block using the determined QP value (840).
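The dQP derivation described above can be sketched as follows. The rounding used in the averaging step is an assumption for illustration; the disclosure specifies only that the predictor is an average of the above and left blocks' QP values:

```python
def qp_predictor(qp_above, qp_left):
    """Average of the above and left quantization blocks' QP values,
    with round-to-nearest (the exact rounding rule is an assumption)."""
    return (qp_above + qp_left + 1) >> 1

def encode_dqp(qp_current, qp_above, qp_left):
    """dQP is the difference between the current block's QP and its predictor.
    Per the technique, dQP is signaled whether or not the block has
    non-zero transform coefficients."""
    return qp_current - qp_predictor(qp_above, qp_left)
```

With the above block at QP 26 and the left block at QP 28, the predictor is 27, so a current QP of 30 yields dQP = 3.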
In one example of this disclosure, video encoder 20 may be further configured to: generate a Quantization Group (QG) size, wherein the current quantization block comprises one or more Coding Units (CUs) having a size equal to or less than the QG size or a CU having a size greater than the QG size.
In another example of this disclosure, quantization unit 56 may be configured to: generate a dQP value for a current quantization block of video data only if the QP predictor for the current quantization block has a value of 0; and infer the dQP value to be 0 if the QP predictor for the current quantization block has a non-zero value and there are no non-zero transform coefficients in the current quantization block.
In another example of this disclosure, quantization unit 56 may be configured to: generate a dQP value for a current quantization block of video data only if one QP value used to determine the QP predictor for the current quantization block has a value of 0; and infer the dQP value to be 0 if the QP predictor for the current quantization block has a non-zero value and there are no non-zero transform coefficients in the current quantization block.
In another example of this disclosure, quantization unit 56 may be configured to: generate a dQP value for a current quantization block of video data only if a difference between two QP values used to determine the QP predictor is greater than a threshold; and infer the dQP value to be 0 if the QP predictor for the current quantization block has a non-zero value and there are no non-zero transform coefficients in the current quantization block.
In another example of this disclosure, quantization unit 56 may be configured to: generate a dQP value for a current quantization block of video data only if a difference between two QP values used to determine the QP predictor is less than a threshold; and infer the dQP value to be 0 if the QP predictor for the current quantization block has a non-zero value and there are no non-zero transform coefficients in the current quantization block.
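The alternative signaling conditions in the preceding paragraphs can be summarized in a single decision function. This is a hypothetical sketch: the variant names, the threshold default, and the shortcut of always signaling dQP when non-zero coefficients are present (mirroring the stated inference condition) are assumptions made for illustration:

```python
def dqp_is_signaled(qp_pred, qp_a, qp_b, has_nonzero_coeffs,
                    variant, threshold=0):
    """Decide whether a dQP value is signaled for the current block under
    one of the alternative conditions; otherwise dQP is inferred to be 0.
    qp_a and qp_b are the two QP values used to form the predictor."""
    if has_nonzero_coeffs:
        return True  # inference applies only when no non-zero coefficients exist
    if variant == "pred_zero":        # signal only if the predictor is 0
        return qp_pred == 0
    if variant == "component_zero":   # signal only if one component QP is 0
        return qp_a == 0 or qp_b == 0
    if variant == "diff_gt":          # signal only if |qp_a - qp_b| > threshold
        return abs(qp_a - qp_b) > threshold
    if variant == "diff_lt":          # signal only if |qp_a - qp_b| < threshold
        return abs(qp_a - qp_b) < threshold
    return True
```

Each branch corresponds to one of the examples above; in every variant, the unsignaled case is decoded by inferring dQP = 0.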
In another example of this disclosure, quantization unit 56 may be configured to: generate any of the thresholds described above in one or more of a Picture Parameter Set (PPS), an Adaptation Parameter Set (APS), a slice header, a Largest Coding Unit (LCU) header, and a CU header.
In another example of this disclosure, video encoder 20 may be configured to: generate a lossless flag indicating that the current quantization block is encoded using a lossless coding mode; and encode the current quantization block according to the lossless coding mode.
In another example of this disclosure, quantization unit 56 may be configured to: assign a QP value for the current quantization block to be equal to a QP predictor if a lossless flag is received for the current quantization block, wherein the QP value is used for subsequent QP prediction.
In another example of this disclosure, video encoder 20 may be configured to: in the case that the current quantization block includes two or more CUs, generate a lossless flag at each of the two or more CUs if a lossless flag received at the current quantization block indicates that one or more CUs in the quantization block were encoded using the lossless coding mode.
In another example of this disclosure, video encoder 20 may be configured to: encode the dQP value using a truncated unary binarization technique without checking the sign of the dQP value.
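Truncated unary binarization, referenced above, can be sketched as follows. The sketch covers only the binarization of a non-negative index; how a signed dQP is first mapped to such an index is outside this sketch, consistent with the statement that the sign is not checked during binarization:

```python
def truncated_unary(value, c_max):
    """Truncated unary binarization: `value` one-bits followed by a
    terminating zero-bit; the terminator is omitted when value == c_max,
    since the decoder already knows no larger value is possible."""
    bins = "1" * value
    if value < c_max:
        bins += "0"
    return bins
```

For example, with c_max = 5, the value 3 binarizes to "1110", while the maximal value 5 binarizes to "11111" with no terminator.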
In another example of this disclosure, video encoder 20 may be configured to: not select a lossless coding mode for the current quantization block if the current quantization block has no prediction residual and the QP predictor for the current quantization block is non-zero.
In another example of this disclosure, video encoder 20 may be configured to: not select the lossless coding mode for the current quantization block if the current quantization block has no prediction residual and the one or more quantization blocks used to determine the QP predictor for the current quantization block are coded in the lossless coding mode.
FIG. 9 is a flow diagram showing an example video decoding method in accordance with the techniques of this disclosure. The techniques of FIG. 9 may be implemented by one or more hardware units of video decoder 30, including inverse quantization unit 76.
In one example of this disclosure, inverse quantization unit 76 may be configured to: receive a delta quantization parameter (dQP) value for a current quantization block of video data, wherein the dQP value is received regardless of whether non-zero transform coefficients are present in the current quantization block (910); and determine a quantization parameter (QP) value for the current quantization block based on the received dQP value and a QP predictor (920). In various examples of this disclosure, the QP predictor is an average of QP values for an above quantization block and a left quantization block, where the above quantization block is located above the current quantization block, and where the left quantization block is located to the left of the current quantization block. Video decoder 30 may be further configured to: decode the current quantization block using the determined QP value (930).
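The decoder-side QP derivation mirrors the encoder: the same predictor is formed and the signaled dQP is added back. As before, the rounding in the averaging step is an assumption for illustration:

```python
def decode_qp(dqp, qp_above, qp_left):
    """Reconstruct the current block's QP from the received dQP value.
    The predictor is the average of the above and left blocks' QPs
    (round-to-nearest assumed), matching the encoder-side derivation."""
    qp_pred = (qp_above + qp_left + 1) >> 1
    return qp_pred + dqp
```

Continuing the earlier example: with neighbors at QP 26 and 28 and a received dQP of 3, the decoder reconstructs QP 30.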
In another example of this disclosure, inverse quantization unit 76 may be further configured to: receive a Quantization Group (QG) size, wherein the current quantization block comprises one or more Coding Units (CUs) having a size equal to or less than the QG size or a CU having a size greater than the QG size.
In another example of this disclosure, inverse quantization unit 76 may be further configured to: receive a dQP value for a current quantization block of video data only if the QP predictor for the current quantization block has a value of 0; and infer the dQP value to be 0 if the QP predictor for the current quantization block has a non-zero value and there are no non-zero transform coefficients in the current quantization block.
In another example of this disclosure, inverse quantization unit 76 may be further configured to: receive a dQP value for a current quantization block of video data only if one QP value used to determine the QP predictor for the current quantization block has a value of 0; and infer the dQP value to be 0 if the QP predictor for the current quantization block has a non-zero value and there are no non-zero transform coefficients in the current quantization block.
In another example of this disclosure, inverse quantization unit 76 may be further configured to: receive a dQP value for a current quantization block of video data only if a difference between two QP values used to determine the QP predictor is greater than a threshold; and infer the dQP value to be 0 if the QP predictor for the current quantization block has a non-zero value and there are no non-zero transform coefficients in the current quantization block.
In another example of this disclosure, inverse quantization unit 76 may be further configured to: receive a dQP value for a current quantization block of video data only if a difference between two QP values used to determine the QP predictor is less than a threshold; and infer the dQP value to be 0 if the QP predictor for the current quantization block has a non-zero value and there are no non-zero transform coefficients in the current quantization block.
In another example of this disclosure, inverse quantization unit 76 may be further configured to: receive any of the above thresholds in one or more of a Picture Parameter Set (PPS), an Adaptation Parameter Set (APS), a slice header, a Largest Coding Unit (LCU) header, and a CU header.
In another example of this disclosure, video decoder 30 may be further configured to: receive a lossless flag indicating that the current quantization block is encoded using a lossless coding mode; and decode the current quantization block according to the lossless coding mode.
In another example of this disclosure, inverse quantization unit 76 may be further configured to: assign a QP value for the current quantization block to be equal to a QP predictor if a lossless flag is received for the current quantization block, wherein the QP value is used for subsequent QP prediction.
In another example of this disclosure, video decoder 30 may be further configured to: in the case that the current quantization block includes two or more CUs, receive a lossless flag at each of the two or more CUs if a lossless flag received at the current quantization block indicates that one or more CUs in the quantization block were encoded using the lossless coding mode.
In another example of this disclosure, video decoder 30 may be further configured to: decode the dQP value using a truncated unary binarization technique without checking the sign of the dQP value.
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or program code on a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media corresponding to tangible media, such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, such as according to a communication protocol. In this manner, a computer-readable medium may generally correspond to (1) a tangible computer-readable storage medium that is not transitory, or (2) a communication medium such as a signal or carrier wave. A data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, program code, and/or data structures for implementation of the techniques described herein. The computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the instructions are transmitted from a website, server, or other remote source using a coaxial cable, optical cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, optical cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but instead pertain to non-transitory, tangible storage media. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The instructions may be executed by one or more processors, such as one or more Digital Signal Processors (DSPs), general purpose microprocessors, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, as used herein, the term "processor" may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an Integrated Circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit, or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
Various examples have been described. These and other examples are within the scope of the following claims.
Claims (36)
1. A method of decoding video data, the method comprising:
receiving a delta quantization parameter, dQP, value for a current quantization block of video data, wherein the dQP value is received regardless of whether non-zero transform coefficients are present in the current quantization block such that the dQP value is not inferred;
determining a Quantization Parameter (QP) value for the current quantized block based on the received dQP value and a QP predictor; and
decoding the current quantized block using the determined QP value.
2. The method of claim 1, further comprising:
receiving a Quantization Group (QG) size, wherein the current quantization block comprises one or more Coding Units (CUs) having a size equal to or less than the QG size or CUs having a size greater than the QG size.
3. The method of claim 1, wherein the QP predictor is an average of QP values for an above quantization block and a left quantization block, wherein the above quantization block is located above the current quantization block, and wherein the left quantization block is located to the left of the current quantization block.
4. The method of claim 1, further comprising:
receiving a lossless flag indicating that the current quantization block is encoded using a lossless coding mode; and
decoding the current quantized block according to the lossless coding mode.
5. The method of claim 4, further comprising:
assigning a QP value for the current quantization block to be equal to a QP predictor if the lossless flag is received for the current quantization block, wherein the QP value is used for subsequent QP prediction.
6. The method of claim 4, in the case that the current quantization block includes two or more CUs, the method further comprising:
receiving a lossless flag at each of the two or more CUs if a lossless flag received at the current quantization block indicates that one or more CUs in the quantization block were encoded using the lossless coding mode.
7. The method of claim 1, further comprising:
decoding the dQP value using a truncated unary binarization technique without checking the sign of the dQP value.
8. A method of encoding video data, the method comprising:
determining a quantization parameter (QP) value for a current quantization block of video data;
determining a delta quantization parameter (dQP) value for the current quantization block based on the QP value and a QP predictor;
generating the dQP value, wherein the dQP value is signaled regardless of whether non-zero transform coefficients are present in the current quantization block; and
encoding the current quantization block using the determined QP value.
9. The method of claim 8, further comprising:
generating a Quantization Group (QG) size, wherein the current quantization block comprises one or more Coding Units (CUs) having a size equal to or less than the QG size or CUs having a size greater than the QG size.
10. The method of claim 8, wherein the QP predictor is an average of QP values for an above quantization block and a left quantization block, wherein the above quantization block is located above the current quantization block, and wherein the left quantization block is located to the left of the current quantization block.
11. The method of claim 8, further comprising:
generating a lossless flag indicating that the current quantization block is encoded using a lossless coding mode; and
encoding the current quantization block according to the lossless coding mode.
12. The method of claim 11, further comprising:
assigning a QP value for the current quantization block to be equal to a QP predictor if the lossless flag is received for the current quantization block, wherein the QP value is used for subsequent QP prediction.
13. The method of claim 11, in the case that the current quantization block includes two or more CUs, the method further comprising:
generating a lossless flag at each of the two or more CUs if a lossless flag received at the current quantization block indicates that one or more CUs in the quantization block were encoded using the lossless coding mode.
14. The method of claim 8, further comprising:
encoding the dQP value using a truncated unary binarization technique without checking the sign of the dQP value.
15. The method of claim 8, further comprising:
not selecting a lossless coding mode for the current quantization block if the current quantization block has no prediction residual and a QP predictor for the current quantization block is non-zero.
16. The method of claim 8, further comprising:
not selecting a lossless coding mode for the current quantization block if the current quantization block has no prediction residual and one or more quantization blocks used to determine a QP predictor for the current quantization block are coded in a lossless coding mode.
17. An apparatus configured to decode video data, the apparatus comprising:
a video decoder configured to:
receive a delta quantization parameter (dQP) value for a current quantization block of video data, wherein the dQP value is received regardless of whether non-zero transform coefficients are present in the current quantization block such that the dQP value is not inferred;
determine a Quantization Parameter (QP) value for the current quantization block based on the received dQP value and a QP predictor; and
decode the current quantization block using the determined QP value.
18. The apparatus of claim 17, wherein the video decoder is further configured to:
receiving a Quantization Group (QG) size, wherein the current quantization block comprises one or more Coding Units (CUs) having a size equal to or less than the QG size or CUs having a size greater than the QG size.
19. The apparatus of claim 17, wherein the QP predictor is an average of QP values for an above quantization block and a left quantization block, wherein the above quantization block is located above the current quantization block, and wherein the left quantization block is located to the left of the current quantization block.
20. The apparatus of claim 17, wherein the video decoder is further configured to:
receive a lossless flag indicating that the current quantization block is encoded using a lossless coding mode; and
decode the current quantization block according to the lossless coding mode.
21. The apparatus of claim 20, wherein the video decoder is further configured to:
assign a QP value for the current quantization block to be equal to a QP predictor if the lossless flag is received for the current quantization block, wherein the QP value is used for subsequent QP prediction.
22. The apparatus of claim 20, wherein, if the current quantization block includes two or more CUs, the video decoder is further configured to:
receive a lossless flag at each of the two or more CUs if a lossless flag received at the current quantization block indicates that one or more CUs in the quantization block were encoded using the lossless coding mode.
23. The apparatus of claim 17, wherein the video decoder is further configured to:
decode the dQP value using a truncated unary binarization technique without checking the sign of the dQP value.
24. An apparatus configured to encode video data, the apparatus comprising:
a video encoder configured to:
determine a quantization parameter (QP) value for a current quantization block of video data;
determine a delta quantization parameter (dQP) value for the current quantization block based on the QP value and a QP predictor;
generate the dQP value, wherein the dQP value is generated regardless of whether non-zero transform coefficients are present in the current quantization block; and
encode the current quantization block using the determined QP value.
25. The apparatus of claim 24, wherein the video encoder is further configured to:
generating a Quantization Group (QG) size, wherein the current quantization block comprises one or more Coding Units (CUs) having a size equal to or less than the QG size or CUs having a size greater than the QG size.
26. The apparatus of claim 24, wherein the QP predictor is an average of QP values for an above quantization block and a left quantization block, wherein the above quantization block is located above the current quantization block, and wherein the left quantization block is located to the left of the current quantization block.
27. The apparatus of claim 24, wherein the video encoder is further configured to:
generate a lossless flag indicating that the current quantization block is encoded using a lossless coding mode; and
encode the current quantization block according to the lossless coding mode.
28. The apparatus of claim 27, wherein the video encoder is further configured to:
assign a QP value for the current quantization block to be equal to a QP predictor if the lossless flag is received for the current quantization block, wherein the QP value is used for subsequent QP prediction.
29. The apparatus of claim 27, wherein, if the current quantization block includes two or more CUs, the video encoder is further configured to:
generate a lossless flag at each of the two or more CUs if a lossless flag received at the current quantization block indicates that one or more CUs in the quantization block were encoded using the lossless coding mode.
30. The apparatus of claim 24, wherein the video encoder is further configured to:
encode the dQP value using a truncated unary binarization technique without checking the sign of the dQP value.
31. The apparatus of claim 24, wherein the video encoder is further configured to:
not select a lossless coding mode for the current quantization block if the current quantization block has no prediction residual and a QP predictor for the current quantization block is non-zero.
32. The apparatus of claim 24, wherein the video encoder is further configured to:
not select a lossless coding mode for the current quantization block if the current quantization block has no prediction residual and one or more quantization blocks used to determine a QP predictor for the current quantization block are coded in a lossless coding mode.
33. An apparatus configured to decode video data, the apparatus comprising:
means for receiving a delta quantization parameter, dQP, value for a current quantization block of video data, wherein the dQP value is received regardless of whether non-zero transform coefficients are present in the current quantization block such that the dQP value is not inferred;
means for determining a Quantization Parameter (QP) value for the current quantized block based on the received dQP value and a QP predictor; and
means for decoding the current quantized block using the determined QP value.
34. An apparatus configured to encode video data, the apparatus comprising:
means for determining a quantization parameter QP value for a current quantized block of video data;
means for determining a delta quantization parameter (dQP) value for the current quantization block based on the QP value and a QP predictor;
means for generating the dQP value, wherein the dQP value is signaled regardless of whether non-zero transform coefficients are present in the current quantized block; and
means for encoding the current quantized block using the determined QP value.
35. A computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device configured to decode video data to:
receive a delta quantization parameter (dQP) value for a current quantization block of video data, wherein the dQP value is received regardless of whether non-zero transform coefficients are present in the current quantization block such that the dQP value is not inferred;
determine a Quantization Parameter (QP) value for the current quantization block based on the received dQP value and a QP predictor; and
decode the current quantization block using the determined QP value.
36. A computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device configured to encode video data to:
determine a quantization parameter (QP) value for a current quantization block of video data;
determine a delta quantization parameter (dQP) value for the current quantization block based on the QP value and a QP predictor;
generate the dQP value, wherein the dQP value is generated regardless of whether non-zero transform coefficients are present in the current quantization block; and
encode the current quantization block using the determined QP value.
Applications Claiming Priority (9)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201261639015P | 2012-04-26 | 2012-04-26 | |
| US61/639,015 | 2012-04-26 | ||
| US201261643821P | 2012-05-07 | 2012-05-07 | |
| US61/643,821 | 2012-05-07 | ||
| US201261656953P | 2012-06-07 | 2012-06-07 | |
| US61/656,953 | 2012-06-07 | ||
| US13/870,149 | 2013-04-25 | ||
| US13/870,149 US9521410B2 (en) | 2012-04-26 | 2013-04-25 | Quantization parameter (QP) coding in video coding |
| PCT/US2013/038383 WO2013163526A1 (en) | 2012-04-26 | 2013-04-26 | Quantization parameter (qp) coding in video coding |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1201110A1 HK1201110A1 (en) | 2015-08-21 |
| HK1201110B true HK1201110B (en) | 2018-07-13 |