
HK1201003B - Method and device for encoding and decoding video data - Google Patents


Info

Publication number
HK1201003B
HK1201003B
Authority
HK
Hong Kong
Prior art keywords
syntax elements
group
intra
cabac
prediction mode
Prior art date
Application number
HK15101460.2A
Other languages
Chinese (zh)
Other versions
HK1201003A1 (en)
Inventor
Wei-Jung Chien
Jianle Chen
Muhammed Zeyd Coban
Marta Karczewicz
Original Assignee
Qualcomm Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/839,855 external-priority patent/US9538172B2/en
Application filed by Qualcomm Incorporated filed Critical Qualcomm Incorporated
Publication of HK1201003A1 publication Critical patent/HK1201003A1/en
Publication of HK1201003B publication Critical patent/HK1201003B/en

Description

Method and apparatus for encoding and decoding video data
Related application
The present application claims the benefit of:
U.S. Provisional Application No. 61/623,004, filed April 11, 2012; and
U.S. Provisional Application No. 61/639,836, filed April 27, 2012, the entire contents of each of which are hereby incorporated by reference.
Technical Field
This disclosure relates to video coding, and more particularly, to techniques for coding syntax elements in video coding.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, digital media players, video gaming devices, video gaming consoles, cellular or satellite radio telephones, so-called "smart" telephones, tablet computers, video teleconferencing devices, and the like. Video coding standards include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including their Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. Additionally, High Efficiency Video Coding (HEVC) is a video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG).
Video compression techniques perform spatial prediction and/or temporal prediction to reduce or remove redundancy inherent in video sequences. For block-based predictive video coding, a video frame or a slice of a video frame may be partitioned into video blocks, which may be referred to as macroblocks or coding units. Each video block may be further partitioned. Video blocks in an intra-coded (I) frame or slice are encoded using spatial prediction with respect to neighboring video blocks. Video blocks in an inter-coded (P or B) frame or slice may use spatial prediction with respect to neighboring video blocks in the same frame or slice, or temporal prediction with respect to other reference frames.
Inter-coded blocks may be coded using motion vectors that indicate groups of reference samples used to form predictive blocks. An intra-coded block may be coded using intra-modes that point to blocks of reference samples used to form a predictive block. The difference between the video block and the predictive block may be referred to as residual data. For further compression, the residual data may be transformed from the pixel domain to the transform domain, producing residual transform coefficients that may then be quantized. In addition, the quantized transform coefficients and predictive information may be entropy coded for further compression.
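As a concrete illustration of the residual step above, a minimal sketch (not part of the patent text; the block values are invented for illustration):

```python
# Illustrative sketch: residual data is the per-pixel difference between
# a coded video block and its predictive block.
def residual_block(block, prediction):
    """Return the pixel-domain residual for a block."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]

# A 2x2 example; the residual is what would subsequently be transformed
# to the transform domain and quantized.
block = [[104, 110], [98, 101]]
prediction = [[100, 108], [100, 100]]
residual = residual_block(block, prediction)
```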
Disclosure of Invention
In general, various techniques are described for arranging syntax elements for predictive video coding. These techniques may reduce pipeline overhead that may affect entropy coding throughput. In one example, the order in which syntax elements are CABAC (context adaptive binary arithmetic coding) coded or bypass coded may be optimized. Bypass coding may refer to arithmetic coding of binary bits without using adaptive context. In one example, syntax elements that indicate intra-prediction modes of Predictive Units (PUs) within a Coding Unit (CU) may be grouped together based on the syntax elements being to be bypass coded or CABAC coded. This grouping may reduce the total number of cycles required to entropy code the syntax elements.
In one example, this disclosure describes a method for decoding video data, the method comprising: receiving an entropy encoded bitstream that includes a first group of syntax elements and a second group of syntax elements for Prediction Units (PUs) of a Coding Unit (CU), wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein one syntax element of the first group of syntax elements indicates whether an intra-prediction mode for a respective PU in the CU is based on an index to a list of most probable modes; context Adaptive Binary Arithmetic Coding (CABAC) decode the first group of syntax elements; after CABAC decoding the first group of syntax elements, bypass decoding the second group of syntax elements; and reconstructing the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements.
In another example, this disclosure describes an apparatus for decoding video data, the apparatus comprising a video decoder configured to: receiving an entropy encoded bitstream that includes a first group of syntax elements and a second group of syntax elements for Prediction Units (PUs) of a Coding Unit (CU), wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein one syntax element of the first group of syntax elements indicates whether an intra-prediction mode for a respective PU in the CU is based on an index to a list of most probable modes; context Adaptive Binary Arithmetic Coding (CABAC) decode the first group of syntax elements; after CABAC decoding the first group of syntax elements, bypass decoding the second group of syntax elements; and reconstructing the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements.
In another example, this disclosure describes a computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors of a device for decoding video data to: receiving an entropy encoded bitstream that includes a first group of syntax elements and a second group of syntax elements for Prediction Units (PUs) of a Coding Unit (CU), wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein one syntax element of the first group of syntax elements indicates whether an intra-prediction mode for a respective PU in the CU is based on an index to a list of most probable modes; context Adaptive Binary Arithmetic Coding (CABAC) decode the first group of syntax elements; after CABAC decoding the first group of syntax elements, bypass decoding the second group of syntax elements; and reconstructing the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements.
In another example, this disclosure describes an apparatus for decoding video data, the apparatus comprising: means for receiving an entropy encoded bitstream that includes a first group of syntax elements and a second group of syntax elements for Prediction Units (PUs) of a Coding Unit (CU), wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein one syntax element of the first group of syntax elements indicates whether an intra-prediction mode for a respective PU in the CU is based on an index to a list of most probable modes; means for Context Adaptive Binary Arithmetic Coding (CABAC) decoding the first group of syntax elements; means for bypass decoding the second group of syntax elements after CABAC decoding the first group of syntax elements; and means for reconstructing the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements.
In another example, this disclosure describes a method for encoding video data, the method comprising: generating a first group of syntax elements, wherein one syntax element of the first group of syntax elements indicates whether intra-prediction modes of respective Prediction Units (PUs) in a Coding Unit (CU) are based on an index to a list of most probable modes; generating a second group of syntax elements, wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements; context Adaptive Binary Arithmetic Coding (CABAC) encode the first group of syntax elements; bypass encoding the second group of syntax elements after CABAC encoding the first group of syntax elements; and outputting the video data including the encoded first group of syntax elements and the encoded second group of syntax elements.
In another example, this disclosure describes an apparatus for encoding video data, the apparatus comprising a video encoder configured to: generating a first group of syntax elements, wherein one syntax element of the first group of syntax elements indicates whether intra-prediction modes of respective Prediction Units (PUs) in a Coding Unit (CU) are based on an index to a list of most probable modes; generating a second group of syntax elements, wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements; context Adaptive Binary Arithmetic Coding (CABAC) encode the first group of syntax elements; bypass encoding the second group of syntax elements after CABAC encoding the first group of syntax elements; and outputting the video data including the encoded first group of syntax elements and the encoded second group of syntax elements.
In another example, this disclosure describes a computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors of a device for encoding video data to: generating a first group of syntax elements, wherein one syntax element of the first group of syntax elements indicates whether intra-prediction modes of respective Prediction Units (PUs) in a Coding Unit (CU) are based on an index to a list of most probable modes; generating a second group of syntax elements, wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements; context Adaptive Binary Arithmetic Coding (CABAC) encode the first group of syntax elements; bypass encoding the second group of syntax elements after CABAC encoding the first group of syntax elements; and outputting the video data including the encoded first group of syntax elements and the encoded second group of syntax elements.
In another example, this disclosure describes an apparatus for encoding video data, the apparatus comprising: means for generating a first group of syntax elements, wherein one syntax element of the first group of syntax elements indicates whether intra-prediction modes of respective Prediction Units (PUs) in a Coding Unit (CU) are based on an index to a list of most probable modes; means for generating a second group of syntax elements, wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements; means for Context Adaptive Binary Arithmetic Coding (CABAC) encoding the first group of syntax elements; means for bypass encoding the second group of syntax elements after CABAC encoding the first group of syntax elements; and means for outputting the video data including the encoded first group of syntax elements and the encoded second group of syntax elements.
The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.
Drawings
Fig. 1 is a conceptual diagram illustrating an example of a prediction unit structure and associated intra-prediction coding modes.
FIG. 2 is a block diagram illustrating an example video encoding and decoding system that may be configured to utilize techniques in accordance with this disclosure.
Fig. 3 is a block diagram illustrating an example of a video encoder that may implement techniques for coding predictive syntax elements according to this disclosure.
Fig. 4 is a block diagram illustrating an example of an entropy encoder that may implement techniques for entropy encoding predictive syntax elements in accordance with this disclosure.
Fig. 5 is a flow diagram illustrating encoding of a predictive syntax element in accordance with the techniques of this disclosure.
Fig. 6 is a block diagram illustrating an example of a video decoder that may implement techniques for decoding an encoded video sequence that includes predictive syntax elements, according to this disclosure.
Fig. 7 is a block diagram illustrating an example of an entropy decoder that may implement techniques for entropy decoding predictive syntax elements in accordance with this disclosure.
Fig. 8 is a flow diagram illustrating decoding of a predictive syntax element in accordance with the techniques of this disclosure.
Detailed Description
Predictive syntax elements within a group of syntax elements (e.g., syntax elements that may indicate intra-prediction modes for one or more prediction units) may be either regular Context Adaptive Binary Arithmetic Coding (CABAC) coded or bypass coded. The order in which these predictive syntax elements are received by the CABAC coder may affect the total number of processing cycles required to entropy code the group of syntax elements. This may be due to setup overhead required to restart the CABAC coding engine when coding transitions between bypass coding and CABAC coding. In the techniques described in this disclosure, these syntax elements may be arranged in a manner such that the total number of processing cycles required to entropy code a group of syntax elements is reduced.
For example, in some other techniques, for intra prediction, a video encoder signals syntax elements that specify an intra prediction mode for luma samples and signals syntax elements for chroma samples. Some of the syntax elements for luma samples and some of the syntax elements for chroma samples are CABAC coded, and other syntax elements for luma samples and chroma samples are bypass coded. In some cases, CABAC-coded and bypass-coded syntax elements are interleaved, resulting in sub-optimal entropy coding throughput because of pipeline overhead.
For example, bypass mode typically offers much higher throughput than CABAC mode. As one example, a bypass engine may process 6 bins in one cycle without adding significant complexity relative to a bypass engine that processes 1 bin per cycle, whereas a CABAC engine may process only 2 bins per cycle and is already much more complex to design than a CABAC engine that processes 1 bin per cycle. This is due to the context switching inherent in CABAC mode.
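The figures above can be turned into a rough cycle count. The sketch below assumes the illustrative throughputs from the text (6 bypass bins and 2 CABAC bins per cycle) plus a hypothetical one-cycle restart penalty per mode switch; the penalty value is an assumption for illustration, not taken from the disclosure:

```python
import math

CABAC_BINS_PER_CYCLE = 2   # illustrative figure from the text
BYPASS_BINS_PER_CYCLE = 6  # illustrative figure from the text
SWITCH_PENALTY = 1         # assumed engine-restart overhead per mode switch

def cycles(runs):
    """Estimate cycles to code a list of ('cabac'|'bypass', run_length) runs."""
    total, prev = 0, None
    for mode, length in runs:
        if prev is not None and mode != prev:
            total += SWITCH_PENALTY  # restart overhead on each mode transition
        rate = CABAC_BINS_PER_CYCLE if mode == 'cabac' else BYPASS_BINS_PER_CYCLE
        total += math.ceil(length / rate)
        prev = mode
    return total

# Four PUs, each with 1 CABAC-coded bin and 3 bypass-coded bins:
interleaved = [('cabac', 1), ('bypass', 3)] * 4   # seven mode switches
grouped = [('cabac', 4), ('bypass', 12)]          # a single mode switch
```

Grouping removes six of the seven mode switches and lets the bypass engine pack multiple bins per cycle, so `cycles(grouped)` is well below `cycles(interleaved)` under these assumptions.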
To increase entropy coding throughput, it may be beneficial to group CABAC-coded syntax elements together and group bypass-coded syntax elements together. As one example, CABAC coded syntax elements and bypass coded syntax elements are not interleaved together. As described in more detail, the techniques described in this disclosure provide various example techniques to group CABAC-coded syntax elements together and to group bypass-coded syntax elements together, such grouping having the potential to increase entropy coding throughput.
As one example, a video encoder encodes and signals all CABAC-coded bins of luma intra mode and chroma intra mode together and all bypass-coded bins of luma intra mode and chroma intra mode together in a coded bitstream, and a video decoder receives and decodes all CABAC-coded bins of luma intra mode and chroma intra mode together and all bypass-coded bins of luma intra mode and chroma intra mode together from the coded bitstream. As another example, a video encoder encodes and signals all CABAC-coded bins of intra mode and all bypass-coded bins of intra mode for all blocks in a coding unit together in a coded bitstream, and a video decoder receives and decodes all CABAC-coded bins of intra mode and all bypass-coded bins of intra mode for all blocks in the coding unit together from the coded bitstream. As another example, a video encoder encodes and signals together all CABAC-coded bins of both luma intra mode and chroma intra mode and all bypass-coded bins of luma intra mode and chroma intra mode for all blocks in a coding unit in a coded bitstream, and a video decoder receives and decodes together all CABAC-coded bins of both luma intra mode and chroma intra mode and all bypass-coded bins of luma intra mode and chroma intra mode for all blocks in the coding unit from the coded bitstream. In some cases, some additional memory will be needed to store the partially decoded syntax for the video decoder.
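One way to picture the signaling orders described above is as a reordering of per-PU bins: all CABAC-coded bins for the CU are emitted first, followed by all bypass-coded bins. The sketch below is a hypothetical illustration (the bin values and per-PU split are invented, not the normative binarization):

```python
# Hypothetical sketch of the grouped signaling order: for every PU in a CU,
# all CABAC-coded bins are emitted first, then all bypass-coded bins,
# instead of interleaving the two kinds per PU.
def group_bins(per_pu_bins):
    """per_pu_bins: list of (cabac_bins, bypass_bins) tuples, one per PU.
    Returns a single bitstream-order list: all CABAC bins, then all bypass bins."""
    cabac_part = [b for cabac, _ in per_pu_bins for b in cabac]
    bypass_part = [b for _, bypass in per_pu_bins for b in bypass]
    return cabac_part + bypass_part

# An INTRA_NxN CU with four PUs: one CABAC-coded flag bin and a few
# bypass-coded bins (e.g., an MPM index) per PU.
pus = [([1], [0, 1]), ([0], [1, 1, 0]), ([1], [0]), ([1], [1, 0])]
ordered = group_bins(pus)
```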
Digital video devices implement video compression techniques to more efficiently encode and decode digital video information. Video compression techniques may be defined according to a video coding standard. HEVC standardization efforts are based on a model of a video coding device referred to as the HEVC Test Model (HM). The HM presumes several capability improvements of current video coding devices over devices available during the development of previous video coding standards, such as ITU-T H.264/AVC. For example, whereas H.264 provides nine intra-prediction coding modes, HEVC provides as many as thirty-five intra-prediction coding modes. A recent Working Draft (WD) of HEVC, referred to as "HEVC Working Draft 6" or "WD6," is described in document JCTVC-H1003 of Bross et al., "WD6: Working Draft 6 of High Efficiency Video Coding (HEVC)," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 8th Meeting: San Jose, CA, USA, February 2012.
In addition, another recent working draft of HEVC, Working Draft 8, is described in document JCTVC-J1003_d7 of Bross et al., "High Efficiency Video Coding (HEVC) Text Specification Draft 8," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 10th Meeting: Stockholm, Sweden, July 2012. A more recent working draft of HEVC, Working Draft 9, is available, as of June 3, 2013, from http://phenix.int-evry.fr/jct/doc_end_user/documents/11_Shanghai/wg11/JCTVC-K1003-v10.zip. The HEVC standard may also be referred to as ISO/IEC 23008-HEVC, which is intended to be the standard number for the delivered version of HEVC. Although the techniques of this disclosure are described with reference to the ITU-T H.264 standard and the HEVC standard, the techniques of this disclosure are generally applicable to any video coding standard.
A video sequence typically includes a series of video frames (also referred to as pictures). A group of pictures (GOP) generally includes a series of one or more video frames. The GOP may include syntax data in a header of the GOP, a header of one or more frames of the GOP, or elsewhere, the syntax data describing a number of frames included in the GOP. Each frame may include frame syntax data that describes an encoding mode for the respective frame. Each video frame may include multiple slices. Each slice may include a plurality of video blocks. Video encoders typically operate on video blocks within individual video frames in order to encode video data. Video blocks may have fixed or varying sizes, and may differ in size according to a specified coding standard.
As an example, the ITU-T H.264 standard supports intra prediction for various block sizes (e.g., 16 × 16, 8 × 8, or 4 × 4 for luma components, and 8 × 8 for chroma components), and inter prediction for various block sizes (e.g., 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, and 4 × 4 for luma components, and corresponding scaled sizes for chroma components). In this disclosure, "N × N" and "N by N" are used interchangeably to refer to the pixel size of a block in terms of the vertical and horizontal dimensions, e.g., 16 × 16 pixels or 16 by 16 pixels. In general, a 16 × 16 block will have 16 pixels in the vertical direction (y = 16) and 16 pixels in the horizontal direction (x = 16). The pixels may also be referred to as samples. Likewise, an N × N block typically has N pixels in the vertical direction and N pixels in the horizontal direction, where N represents a non-negative integer value. The pixels in a block may be arranged in rows and columns. Furthermore, a block need not have the same number of pixels in the horizontal direction as in the vertical direction. For example, a block may comprise N × M pixels, where M is not necessarily equal to N.
A video block may comprise a block of pixel data in the pixel domain, or a block of transform coefficients in the transform domain. The transform coefficients may be generated after applying a transform, such as a Discrete Cosine Transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform, to residual video block data representing pixel differences between coded video blocks and predictive video blocks. In some cases, the video block may comprise a block of quantized transform coefficients in the transform domain.
Smaller video blocks may provide better resolution and may be used for locations that include video frames of a high level of detail. In general, blocks and various partitions (sometimes referred to as sub-blocks) may be considered video blocks. In addition, a slice may be considered to be a plurality of video blocks, such as blocks and/or sub-blocks. Each slice may be an independently decodable unit of the video frame. Alternatively, the frame itself may be a decodable unit, or other portions of the frame may be defined as decodable units. The term "coded unit" may refer to any independently decodable unit of a video frame, such as an entire frame, a slice of a frame, a group of pictures (GOP), also referred to as a sequence, or another independently decodable unit defined in accordance with applicable coding techniques.
After intra-predictive or inter-predictive coding is performed to generate predictive data and residual data, and after any transform (e.g., the 4 × 4 or 8 × 8 integer transform used in H.264/AVC, or a discrete cosine transform (DCT)) is performed to generate transform coefficients, quantization of the transform coefficients may be performed. Quantization generally refers to the process of quantizing transform coefficients to possibly reduce the amount of data used to represent the coefficients. The quantization process may reduce the bit depth associated with some or all of the coefficients. For example, an n-bit value may be reduced to an m-bit value during quantization, where n is greater than m.
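The n-bit to m-bit reduction can be sketched with a simple right shift. Real codecs use scaled division with rounding offsets, so this only illustrates the bit-depth reduction described above, not an actual quantizer design:

```python
# Minimal sketch of bit-depth reduction by quantization: an n-bit
# coefficient magnitude is mapped to an m-bit level by a right shift.
def quantize(coeff, n_bits, m_bits):
    """Reduce an n-bit coefficient to an m-bit level (illustration only)."""
    return coeff >> (n_bits - m_bits)

def dequantize(level, n_bits, m_bits):
    """Scale the level back up; the shifted-out precision is lost."""
    return level << (n_bits - m_bits)

coeff = 363                      # a 9-bit coefficient magnitude
level = quantize(coeff, 9, 5)    # reduced to a 5-bit level
recon = dequantize(level, 9, 5)  # lossy reconstruction
```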
After quantization, entropy coding of the quantized data may be performed, for example, according to Content Adaptive Variable Length Coding (CAVLC), Context Adaptive Binary Arithmetic Coding (CABAC), probability interval partitioning entropy coding (PIPE), or another entropy coding method. Entropy coding of syntax elements (e.g., syntax elements that define prediction modes) may also be performed. A processing unit configured for entropy coding, or another processing unit, may perform other processing functions, such as zero run length coding of quantized coefficients, and/or generation of syntax information, such as a Coded Block Pattern (CBP) value, a macroblock type, a coding mode, a maximum macroblock size of a coded unit (e.g., a frame, slice, macroblock, or sequence), or the like.
HEVC refers to a block of video data as a Coding Unit (CU). In general, the techniques of this disclosure relate to transforming, quantizing, scanning, and entropy coding the data of a CU. A CU may include one or more Prediction Units (PUs) and/or one or more Transform Units (TUs). A PU may also be referred to as a "prediction partition." Syntax data within the bitstream may define a Largest Coding Unit (LCU), which is the largest coding unit in terms of the number of pixels. In general, LCUs have a purpose similar to H.264 macroblocks. However, an LCU is not necessarily limited to a particular size and may include one or more CUs. An LCU typically contains 64 × 64 pixels. A CU within an LCU typically contains 32 × 32, 16 × 16, or 8 × 8 pixels. Thus, an LCU may be split into sub-CUs, and each sub-CU may be further split into sub-CUs. Each pixel in a CU may include a luma (Y) component, a chroma U (U) component, and a chroma V (V) component. In some examples, the chroma components may be referred to as the Cr and Cb components. In general, a reference to a CU in this disclosure may refer to a Largest Coding Unit (LCU) of a picture or a sub-CU of an LCU. Syntax data for a bitstream may define a maximum number of times an LCU may be split, referred to as CU depth. Accordingly, the syntax data may also define a Smallest Coding Unit (SCU). This disclosure also uses the term "block" to refer to any of a CU, PU, or TU.
The LCU may be associated with a quadtree data structure. In general, the quadtree data structure includes one node for each CU, where the root node corresponds to an LCU. If a CU is split into four sub-CUs, the node corresponding to the CU includes four leaf nodes, each of the four leaf nodes corresponding to one of the sub-CUs. Each node of the quadtree data structure may provide syntax data for a corresponding CU. For example, a node in the quadtree may include a split flag, indicating whether the CU corresponding to the node is split into sub-CUs. Syntax elements of a CU may be defined recursively, and may depend on whether the CU is split into sub-CUs. If a CU is not split further, it is called a leaf CU. In this disclosure, the four sub-CUs of a leaf CU will also be referred to as leaf CUs, although there is no explicit splitting of the original leaf CU. For example, if a CU of size 16 × 16 is not further split, then although the 16 × 16 CU is never split, the four 8 × 8 sub-CUs will also be referred to as leaf CUs.
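The split-flag recursion described above can be sketched as a depth-first parse. The flag order, sizes, and minimum-CU handling below are illustrative, not the normative HEVC parsing process:

```python
# Sketch of the CU quadtree: each node carries a split flag; a set flag
# splits the CU into four sub-CUs, recursively, down to leaf CUs. No split
# flag is consumed once the assumed minimum CU size is reached.
def parse_cu_quadtree(flags, size, min_size=8):
    """Consume split flags depth-first; return the list of leaf-CU sizes."""
    if size <= min_size or not flags or flags.pop(0) == 0:
        return [size]            # not split: this CU is a leaf CU
    leaves = []
    for _ in range(4):           # four sub-CUs per split
        leaves.extend(parse_cu_quadtree(flags, size // 2, min_size))
    return leaves

# A 64x64 LCU whose first 32x32 sub-CU splits again into four 16x16 leaf CUs:
leaf_sizes = parse_cu_quadtree([1, 1, 0, 0, 0, 0, 0, 0, 0], 64)
```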
Furthermore, the TUs of a leaf-CU may also be associated with respective quadtree data structures. That is, a leaf-CU may include a quadtree indicating how the leaf-CU is partitioned into TUs. This disclosure refers to the quadtree indicating how an LCU is partitioned as a CU quadtree, and to the quadtree indicating how a leaf-CU is partitioned into TUs as a TU quadtree. The root node of a TU quadtree generally corresponds to a leaf-CU, while the root node of a CU quadtree generally corresponds to an LCU. TUs of the TU quadtree that are not split are referred to as leaf-TUs.
A leaf CU may include one or more Prediction Units (PUs). In general, a PU represents all or a portion of a corresponding CU, and may include data for retrieving a reference sample for the PU. For example, when a PU is inter-mode encoded, the PU may include data defining a motion vector for the PU. The data defining the motion vector may describe, for example, a horizontal component of the motion vector, a vertical component of the motion vector, a resolution of the motion vector (e.g., one-quarter pixel precision or one-eighth pixel precision), a prediction direction that identifies a reference picture list (e.g., list 0 or list 1) for the motion vector, and/or a reference picture index value that indicates a reference frame within the reference picture list to which the motion vector points. In a similar manner, when a CU is coded in intra-prediction mode, a PU may include data defining an intra-prediction mode (e.g., angular prediction, DC prediction, or planar prediction, etc.).
Data of a leaf CU defining a PU may also describe, for example, partitioning of the CU into one or more PUs. The partitioning of a CU into one or more PUs may be referred to as a partition mode. The partition modes available to a CU may differ depending on whether the CU is skip or direct mode coded, intra-prediction mode encoded, or inter-prediction mode encoded. For intra coding, a PU may be treated the same as the transform units described below.
A leaf-CU may include one or more Transform Units (TUs). In general, this disclosure uses the terms CU and TU to refer to a leaf-CU and a leaf-TU, respectively, unless noted otherwise. As discussed above, transform units may be specified using a TU quadtree structure. That is, a split flag may indicate whether a leaf-CU is split into four transform units. Each transform unit may then be further split into four sub-TUs. When a TU is not split further, it may be referred to as a leaf-TU. In general, a split flag may indicate whether a leaf-TU is split into square-shaped TUs. To indicate that a TU is split into non-square TUs, other syntax data may be included, e.g., syntax data indicating that the TU is to be partitioned according to a non-square quadtree transform (NSQT).
In general, for intra coding, all leaf-TUs belonging to a leaf-CU share the same intra prediction mode. That is, the same intra prediction mode is generally applied to calculate predicted values for all TUs of a leaf-CU. For intra coding, a video encoder may calculate a residual value for each leaf-TU using the intra prediction mode, as the difference between the portion of the CU corresponding to the TU's predicted values and the original block. The residual values may be transformed, quantized, and scanned as described above. For inter coding, a video encoder may perform prediction at the PU level and may calculate a residual for each PU. The residual values corresponding to a leaf-CU may be transformed, quantized, and scanned. For inter coding, a leaf-TU may be larger or smaller than a PU. For intra coding, a PU may be collocated with a corresponding leaf-TU. In some examples, the maximum size of a leaf-TU may be the size of the corresponding leaf-CU.
In HEVC, the number of intra prediction modes used to generate the predictive blocks of a CU is determined by the PU structure. The INTRA_2N×2N PU structure indicates that a CU contains only one PU. The INTRA_N×N PU structure indicates that a CU includes four PUs. In addition, each PU within a CU may have one luma intra prediction mode, and the PUs within a CU may share one chroma prediction mode. For example, a CU with the INTRA_2N×2N PU structure may have one luma intra prediction mode and one chroma intra prediction mode. A CU with the INTRA_N×N PU structure may have four luma intra prediction modes (i.e., one luma intra prediction mode per PU) and one chroma intra prediction mode for the entire CU (i.e., the four PUs share one chroma prediction mode). Syntax elements may be generated to signal the intra prediction modes.
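The mode counts just described can be summarized in a small lookup (the structure names follow the PU structures above; the dictionary shape is an illustrative choice, not API from the standard):

```python
# Per-CU intra prediction mode counts as determined by the PU structure:
# INTRA_2Nx2N has one PU (one luma mode); INTRA_NxN has four PUs, each with
# its own luma mode, while all four share a single chroma mode for the CU.
def intra_mode_counts(pu_structure):
    counts = {
        'INTRA_2Nx2N': {'pus': 1, 'luma_modes': 1, 'chroma_modes': 1},
        'INTRA_NxN':   {'pus': 4, 'luma_modes': 4, 'chroma_modes': 1},
    }
    if pu_structure not in counts:
        raise ValueError('unsupported intra PU structure: ' + pu_structure)
    return counts[pu_structure]
```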
Fig. 1 is a conceptual diagram illustrating an example of a prediction unit structure and associated intra-prediction coding modes. As illustrated in Fig. 1, the INTRA_2N×2N Prediction Unit (PU) structure includes only one PU, while the INTRA_N×N PU structure includes four PUs. As illustrated in Fig. 1, the INTRA_N×N PU structure is symmetric and includes four square PUs of equal size. As described above, each of the four PUs illustrated in Fig. 1 may have its own luma intra prediction mode and may share a chroma intra prediction mode. It should be noted that HEVC defines other PU structures for use with inter mode prediction. Some of these PU structures are asymmetric and/or contain rectangular PUs. Although the techniques of this disclosure are described with respect to the INTRA_2N×2N and INTRA_N×N PU structures, the techniques are generally applicable to additional PU structures available for intra prediction modes. For example, the techniques of this disclosure may be applicable to the case where a PU structure with four rectangular PUs is associated with an intra-prediction coding mode.
As described above, when a CU is coded using intra-prediction modes, the intra-prediction modes may be signaled using syntax elements. In some examples, the luma intra prediction mode for the PU is signaled using three syntax elements and the chroma intra prediction mode is signaled using one syntax element. The syntax elements are shown in bold in table 1.
prev_intra_luma_pred_flag[x0][y0]
if(prev_intra_luma_pred_flag[x0][y0])
mpm_idx[x0][y0]
else
rem_intra_luma_pred_mode[x0][y0]
intra_chroma_pred_mode[x0][y0]
TABLE 1
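The branching in Table 1 can be sketched as a small parsing routine (the `ToyBitstream` reader and the dictionary layout are illustrative assumptions, not the normative parsing process):

```python
class ToyBitstream:
    """Minimal stand-in for a bitstream reader; values are pre-parsed."""

    def __init__(self, values):
        self.values = list(values)

    def read(self):
        return self.values.pop(0)


def parse_luma_intra_syntax(bs):
    """Follow the Table 1 structure: the flag selects between an MPM
    index and an explicit remaining-mode value."""
    flag = bs.read()
    if flag == 1:
        return {"prev_intra_luma_pred_flag": 1, "mpm_idx": bs.read()}
    return {"prev_intra_luma_pred_flag": 0,
            "rem_intra_luma_pred_mode": bs.read()}
```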
Syntax elements prev_intra_luma_pred_flag[x0][y0], mpm_idx[x0][y0], and rem_intra_luma_pred_mode[x0][y0] specify the intra prediction mode for luma samples. The array indices x0, y0 specify the position of the top-left luma sample of the prediction block relative to the top-left luma sample of the picture. The syntax element prev_intra_luma_pred_flag[x0][y0] indicates a true or false condition and may have a value of 1 or 0. In one example, when prev_intra_luma_pred_flag[x0][y0] is equal to 1, the intra prediction mode of the current PU is inferred from neighboring intra predicted PUs.
For example, as indicated in table 1, if prev_intra_luma_pred_flag[x0][y0] is equal to 1, the video encoder signals an mpm_idx[x0][y0] syntax element in the encoded bitstream and the video decoder receives the mpm_idx[x0][y0] syntax element from the encoded bitstream. mpm_idx[x0][y0] is an index to the list of most probable modes (MPM). For example, a video encoder and a video decoder may each be configured to construct a list of most probable modes. The list of most probable modes identifies candidate intra prediction modes.
The video decoder may implicitly construct a list of most probable modes (i.e., receive no signaling from the video encoder indicating which intra-prediction modes are to be included in the list of most probable modes), or may construct the list of most probable modes based on signaling received from the video encoder. In either example, the list of most probable modes at the video encoder side and the list of most probable modes at the video decoder side may identify the same intra-prediction modes, in the same order. In some examples, the video encoder and video decoder may construct a list of most probable modes for each intra-predicted PU of the CU, and the lists of most probable modes for two or more PUs of the CU may be different (i.e., there may be respective lists of most probable modes for the PUs of the CU). However, the techniques described in this disclosure are not so limited.
There may be various ways in which a video encoder and a video decoder may construct a list of most probable modes. As one example, a video encoder and a video decoder are configured to identify a fixed number of intra-prediction modes in the list of most probable modes (e.g., three intra-prediction modes). The video encoder and video decoder evaluate the intra prediction modes of a first neighboring PU and a second neighboring PU that neighbor the current PU (e.g., the PU that is being intra predicted). Examples of the first neighboring PU and the second neighboring PU include the left, top-left, right, and bottom neighboring PUs, and substantially any other neighboring PU.
The video encoder and video decoder include the intra prediction modes of the first neighboring PU and the second neighboring PU in the list of most probable modes. If one or both of the first neighboring PU and the second neighboring PU are not intra predicted, as one example, the video encoder and video decoder may substitute a DC intra prediction mode at the corresponding position of that neighboring PU within the list of most probable modes.
To determine a third intra-prediction mode in the list of most probable modes, the video encoder and video decoder may determine whether the intra-prediction modes of the first neighboring PU and the second neighboring PU are the same (including whether the DC intra-prediction mode was substituted for a PU that is not intra predicted). If the intra prediction modes of the first neighboring PU and the second neighboring PU are the same, the video encoder and video decoder may implement a first technique to identify the third intra prediction mode in the list of most probable modes. In some examples, if the intra prediction modes of the first neighboring PU and the second neighboring PU are the same, the video encoder and video decoder may identify one of those intra prediction modes in the list of most probable modes and use the first technique (or another technique instead of the first technique) to identify the second intra prediction mode and the third intra prediction mode in the list of most probable modes. If the intra prediction modes of the first neighboring PU and the second neighboring PU are different, the video encoder and the video decoder may implement a second technique to identify the third intra prediction mode.
Various examples of first and second techniques for identifying a third intra-prediction mode may exist, and the techniques described in this disclosure are not limited to any one particular technique. In some examples, the third intra prediction mode based on the first or second techniques does not necessarily need to be an intra prediction mode of a neighboring PU. In some examples, the third intra prediction mode based on the first technique or the second technique may be based on intra prediction modes of neighboring PUs.
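As one concrete, deliberately simplified sketch of such a construction, assuming a three-entry list, DC substitution for non-intra neighbors, and planar/DC/vertical as illustrative fallback modes (the tie-breaking rules here are illustrative assumptions, not the exact HEVC derivation):

```python
DC_MODE = 1  # DC intra prediction mode number in HEVC

def build_mpm_list(left_mode, above_mode):
    """Build a three-entry most-probable-mode list from two neighbors.

    A neighbor that is not intra predicted is passed as None and
    replaced by the DC mode, as described above.
    """
    a = left_mode if left_mode is not None else DC_MODE
    b = above_mode if above_mode is not None else DC_MODE
    if a == b:
        # Neighbors agree: keep one copy and derive two more modes
        # (planar and DC chosen here purely for illustration).
        candidates = [a, 0, DC_MODE]
    else:
        # Neighbors differ: use both, then add one more mode.
        candidates = [a, b, 0]
    # Deduplicate while keeping order, padding from a fixed fallback set.
    mpm = []
    for m in candidates + [0, DC_MODE, 26]:  # 26: vertical, as a fallback
        if m not in mpm:
            mpm.append(m)
        if len(mpm) == 3:
            break
    return mpm
```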
mpm_idx[x0][y0] is an index to the list of most probable modes that the video encoder signals and the video decoder receives. In this example, from the mpm_idx[x0][y0] value, the video decoder may be configured to determine the intra prediction mode for the current PU. In some examples, because the list of most probable modes is based on the intra-prediction modes of neighboring PUs (e.g., a first neighboring PU and a second neighboring PU), in some cases, the mpm_idx[x0][y0] syntax element may be used to identify a particular neighboring PU to be used to infer the intra-prediction mode of the current PU. For example, if mpm_idx[x0][y0] is an index to a list of most probable modes that identifies the intra prediction mode associated with the first neighboring PU, the video decoder may be considered to identify the first neighboring PU as the PU from which to infer the intra prediction mode of the current PU.
For example, referring to FIG. 1, for the INTRA_NxN example, the intra prediction mode of PU1 may be inferred to be equal to the intra prediction mode of PU0. The syntax element mpm_idx[x0][y0] may indicate from which neighboring PU, within the set of neighboring intra-predicted PUs, the intra-prediction mode of the current PU may be inferred. In this example, if prev_intra_luma_pred_flag[x0][y0] is equal to 0, the intra prediction mode is not inferred from neighboring PUs, but is instead indicated by the syntax element rem_intra_luma_pred_mode[x0][y0].
The rem_intra_luma_pred_mode[x0][y0] syntax element may specify a particular intra prediction mode for the current PU. In some examples, the rem_intra_luma_pred_mode[x0][y0] syntax element may specify an intra prediction mode of the current PU that is not included among the intra prediction modes identified by the list of most probable modes. For example, the rem_intra_luma_pred_mode[x0][y0] syntax element identifies one of the remaining intra prediction modes from all possible intra prediction modes, excluding the intra prediction modes identified in the list of most probable modes. However, the techniques of this disclosure are not so limited.
The syntax element intra_chroma_pred_mode[x0][y0] specifies the intra prediction mode for the chroma samples of the CU. The chroma intra prediction mode may be based on the luma intra prediction mode. The syntax element chroma_pred_from_luma_enabled_flag may indicate whether the intra prediction mode for chroma is based on the intra prediction mode for luma.
In HEVC WD6, the possible intra prediction modes include a planar prediction mode (predMode = 0), DC prediction (predMode = 1), 33 angular prediction modes (predMode = 2, ..., 34), and a prediction mode for the chroma component (predMode = 35) that infers the chroma intra prediction mode from the luma prediction mode. Thus, for the luma component, depending on the value of prev_intra_luma_pred_flag, the mpm_idx syntax element may indicate an index to a list of most probable modes, where each entry in the list of most probable modes identifies one of 35 possible intra-prediction modes, or the rem_intra_luma_pred_mode syntax element may indicate one of the 35 possible intra-prediction modes (excluding, in some examples, those identified in the list of most probable modes). In one example, the syntax element rem_intra_luma_pred_mode may have integer values 0 to 31 and may be coded using a fixed-length binary coding method, and the syntax element mpm_idx may have integer values 0 to 2 (e.g., to identify one of the three entries in the list of most probable modes) and may be coded using a truncated unary coding method. In addition, for the chroma component, intra_chroma_pred_mode and one or more other syntax elements may indicate one of 36 possible intra prediction modes. In one example, the syntax element intra_chroma_pred_mode may have integer values of 0 to 5. A more detailed description of how each of the syntax elements mpm_idx, rem_intra_luma_pred_mode, and intra_chroma_pred_mode identifies a particular intra prediction mode is provided in HEVC WD6 and is not repeated herein for the sake of brevity. However, it should be noted that the techniques described herein may be generally applied to expected variations of the syntax elements mpm_idx, rem_intra_luma_pred_mode, and intra_chroma_pred_mode.
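One common way to interpret a remaining-mode value is to count through the intra prediction modes not present in the list of most probable modes. The sketch below follows that idea; it is an illustration, not the normative HEVC WD6 derivation:

```python
def mode_from_rem(rem_value, mpm_list):
    """Map a remaining-mode value (0..31) to one of the 35 luma intra
    prediction modes (0..34), skipping the 3 modes in the MPM list.

    Walking the sorted MPM list and bumping the value past each
    smaller-or-equal entry reproduces "count the non-MPM modes".
    """
    mode = rem_value
    for m in sorted(mpm_list):
        if mode >= m:
            mode += 1
    return mode
```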
As described above, a video coding standard may entropy code syntax elements according to CABAC techniques. To apply CABAC coding to syntax elements, a video coder may perform binarization on the syntax elements. Binarization refers to the process of converting a syntax element value into a series of one or more bits. These bits may be referred to as "binary bits". Binarization is a lossless process and may include one or a combination of the following coding techniques: fixed-length coding, unary coding, truncated Rice coding, Golomb coding, exponential Golomb coding, and Golomb-Rice coding. For example, binarization may include representing an integer value of 5 as 00000101 using an 8-bit fixed-length technique or as 11110 using a unary coding technique.
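A sketch of two of these binarization techniques follows (note that unary conventions vary between descriptions; this sketch uses n ones followed by a terminating zero, which is one common convention):

```python
def fixed_length_binarize(value, num_bits):
    """Fixed-length binarization: value as num_bits binary digits."""
    return format(value, "0{}b".format(num_bits))


def unary_binarize(value):
    """Unary binarization: value ones followed by a terminating zero.

    Some descriptions omit the terminator for the largest value
    (truncated unary) or count differently, so treat this as one
    illustrative convention rather than the normative one.
    """
    return "1" * value + "0"
```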
After binarization, the video coder may identify a coding context. The coding context may identify the probability of coding a binary bit having a particular value. For example, a coding context may indicate a probability of 0.7 for coding a 0-valued binary bit and a probability of 0.3 for coding a 1-valued binary bit. After identifying the coding context, the video coder may arithmetically code the binary bit based on that context. Binary bits coded using regular CABAC coding may be referred to as "CABAC binary bits".
In addition, instead of performing regular CABAC encoding on all syntax element bins, the video coder may code some syntax elements using bypass CABAC coding. Bypass coding refers to the process of arithmetically encoding a bin without using an adaptive context. That is, the bypass coding engine does not select a context and can assume a probability of 0.5 for two symbols (0 and 1). Although bypass coding may not be as bandwidth efficient as CABAC coding, performing bypass coding on binary bits may be computationally less costly than performing regular CABAC coding on binary bits. In addition, performing bypass coding may enable a higher degree of parallelism and throughput. The binary bits that are decoded using bypass decoding may be referred to as "bypass binary bits".
When a CABAC coder is implemented, the coder may include a regular coding engine that performs regular CABAC coding and a bypass coding engine that performs bypass coding. If a bin is CABAC coded, this bin is coded using a regular CABAC coding engine. A regular CABAC encoding engine may require more than two processing cycles to code a single binary bit. However, with proper pipeline design, a regular CABAC encoding engine may require only n + M cycles to encode n binary bits, where M is the overhead to start the pipeline. M is typically greater than 0.
At the beginning of the CABAC coding process (i.e., at each switch from bypass mode to regular mode), pipeline overhead is introduced. If a bin is bypass coded, the bypass coding engine is used to code this bin. It is contemplated that the bypass coding engine requires only one cycle to code an n-bit syntax element, where n may be greater than one. Thus, if all bypass binary bits within a set of bypass binary bits and CABAC binary bits are coded together and all CABAC binary bits within the set are coded together, the total number of cycles used to code the set may be reduced. In particular, coding the bypass binary bits together before or after transitioning to CABAC coding may save the additional overhead required to restart the CABAC coding engine.
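The savings can be illustrated with a toy cost model based on the figures above: each regular CABAC binary bit costs one cycle, plus M pipeline-restart cycles whenever the coder enters CABAC mode, and each contiguous run of bypass binary bits costs one cycle. All of these costs are illustrative assumptions, not measured hardware figures:

```python
def estimated_cycles(bins, m_overhead=2):
    """Toy cycle estimate for a sequence of 'cabac'/'bypass' bins.

    Assumptions (illustrative only): a CABAC bin costs 1 cycle plus
    m_overhead restart cycles per entry into CABAC mode; a contiguous
    run of bypass bins costs 1 cycle in total.
    """
    cycles = 0
    prev = None
    for kind in bins:
        if kind == "cabac":
            if prev != "cabac":
                cycles += m_overhead  # pipeline restart
            cycles += 1
        else:  # bypass
            if prev != "bypass":
                cycles += 1  # whole bypass run costs one cycle
        prev = kind
    return cycles


# Four CABAC bins and four bypass bins: interleaved vs grouped.
interleaved = ["cabac", "bypass"] * 4
grouped = ["cabac"] * 4 + ["bypass"] * 4
```

Under this model the interleaved order pays the restart overhead four times, while the grouped order pays it once, so grouping always costs fewer cycles.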
As described above, a CU with an INTRA_NxN PU structure may have four luma intra prediction modes and thus four sets of the syntax elements prev_intra_luma_pred_flag, mpm_idx, and rem_intra_luma_pred_mode. In one example, among these three syntax elements, only prev_intra_luma_pred_flag is CABAC coded, and both mpm_idx and rem_intra_luma_pred_mode are bypass coded. Furthermore, bypass coding may refer to coding of binary bits by non-CABAC processes (i.e., outside of the CABAC engine). Examples of non-CABAC coding that may be used in bypass coding include Golomb coding, exponential Golomb coding, and Golomb-Rice coding.
As described above, a CU with an INTRA_NxN PU structure may have one chroma intra prediction mode for all four PUs. The chroma intra prediction mode may be based on the luma intra prediction mode. As one example, table 2 shows the binarization of the syntax element intra_chroma_pred_mode. Binarization results in one or more binary bits representing the corresponding value of intra_chroma_pred_mode. In this example, the underlined binary bits are CABAC coded and the remaining binary bits are bypass coded. As illustrated in table 2, the binarization of intra_chroma_pred_mode, and whether a particular binary bit is CABAC coded or bypass coded, is based on the value of chroma_pred_from_luma_enabled_flag. Thus, the binarization of a syntax element, and whether particular binary bits are CABAC coded or bypass coded, may be based on the value of another syntax element (e.g., chroma_pred_from_luma_enabled_flag).
Table 2: Binarization of intra_chroma_pred_mode dependent on chroma_pred_from_luma_enabled_flag
Tables 3 and 4 below represent coding structures of the syntax elements used to identify the luma prediction modes and chroma prediction mode of a CU in some examples. Table 3 illustrates the coding of syntax elements for an INTRA_2Nx2N PU structure, and table 4 illustrates the coding for an INTRA_NxN PU structure. In tables 3 and 4, the syntax element intra_chroma_pred_mode is represented by the elements chroma_mode_cabac_bins and chroma_mode_bypass_bins, where chroma_mode_cabac_bins represents the CABAC-coded binary bits and chroma_mode_bypass_bins represents the bypass-coded binary bits, as provided in table 2. Tables 3 and 4 identify whether a syntax element is coded using CABAC mode or bypass mode. The syntax elements are output to the CABAC coding engine and are typically entropy coded in the order in which they are presented in tables 3 and 4. In addition, as illustrated in table 4, there are four sets of prev_intra_luma_pred_flag, mpm_idx, and rem_intra_luma_pred_mode (each set corresponding to one of the four PUs) and one chroma mode indication shared by all four PUs.
Table 3: INTRA_2Nx2N intra mode decoding structure
Table 4: intra _ NxN intra mode decoding structure
As illustrated in tables 3 and 4, interleaving CABAC-coded syntax elements with bypass-coded syntax elements (as described above) may increase the total number of cycles required to entropy encode all syntax elements in tables 3 and 4, e.g., due to setup overhead of restarting the CABAC coding engine. As illustrated in table 4, if a CU contains multiple PUs, the problem is compounded. To reduce the overall pipeline overhead and the total number of cycles required to encode syntax elements that identify intra-prediction modes, certain adjustments to the coding order may be made in accordance with this disclosure.
Fig. 2 is a block diagram illustrating an example video encoding and decoding system 10 that may utilize techniques in accordance with this disclosure. As shown in fig. 2, system 10 includes a source device 12 that signals encoded video data. In some examples, source device 12 may signal the encoded video data to destination device 14 via communication channel 16. Source device 12 and destination device 14 may comprise any of a wide range of devices. In some cases, source device 12 and destination device 14 may comprise wireless communication devices, such as wireless handsets, so-called cellular or satellite radiotelephones, or any wireless devices that may communicate video information over communication channel 16, in which case communication channel 16 is wireless. However, the techniques of this disclosure are not necessarily limited to wireless applications or settings. For example, these techniques may be applied to over-the-air television broadcasts, cable television transmissions, satellite television transmissions, internet video transmissions, digital video encoded onto a storage medium, or other scenarios. Accordingly, communication channel 16 may comprise any combination of wireless, wired, or storage media suitable for transmitting or storing encoded video data for later retrieval by destination device 14.
In the example of fig. 2, source device 12 includes a video source 18, a video encoder 20, a modulator/demodulator (modem) 22, and a transmitter 24. Destination device 14 includes a receiver 26, a modem 28, a video decoder 30, and a display device 32. In accordance with this disclosure, video encoder 20 of source device 12 may be configured to apply techniques for entropy encoding in a manner that reduces pipeline overhead. In other examples, the source device and destination device may include other components or arrangements. For example, source device 12 may receive video data from an external video source 18 (e.g., an external camera). Likewise, destination device 14 may interface with an external display device, rather than including an integrated display device.
The illustrated system 10 of fig. 2 is merely one example. The techniques for entropy encoding the set of flags and the respective syntax elements for each flag in a manner that reduces pipeline overhead may be performed by any digital video encoding and/or decoding device. Although the techniques of this disclosure are generally performed by a video coding device, the techniques may also be performed by a video encoder/decoder (commonly referred to as a "CODEC"). Furthermore, the techniques of this disclosure may also be performed by a video preprocessor. Source device 12 and destination device 14 are merely examples of these coding devices in which source device 12 generates coded video data for transmission to destination device 14. In some examples, devices 12, 14 may operate in a substantially symmetric manner such that each of devices 12, 14 includes video encoding and decoding components. Thus, system 10 may support one-way or two-way video transmission between video devices 12, 14, e.g., for video streaming, video playback, video broadcasting, or video telephony.
Video source 18 of source device 12 may include a video capture device (e.g., a video camera), a video archive containing previously captured video, and/or a video feed from a video content provider. As another alternative, video source 18 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. In some cases, if video source 18 is a video camera, source device 12 and destination device 14 may form so-called camera phones or video phones. However, as mentioned above, the techniques described in this disclosure may be generally applicable to video coding, and may be applied to wireless and/or wired applications. In each case, the captured, pre-captured, or computer-generated video may be encoded by video encoder 20. The encoded video information may then be modulated by modem 22 according to a communication standard and transmitted to destination device 14 via transmitter 24. Modem 22 may include various mixers, filters, amplifiers, or other components designed for signal modulation. Transmitter 24 may include circuitry designed for transmitting data, including amplifiers, filters, and one or more antennas.
Receiver 26 of destination device 14 receives information over channel 16, and modem 28 demodulates the information. Again, the video encoding process may implement one or more of the techniques described herein to entropy encode syntax elements, reducing pipeline overhead. The information communicated over channel 16 may include syntax information defined by video encoder 20 (also used by video decoder 30) that includes syntax elements that describe characteristics and/or processing of macroblocks and other coded units, such as groups of pictures (GOPs). Display device 32 displays the decoded video data to a user, and may comprise any of a variety of display devices, such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or another type of display device.
In the example of fig. 2, communication channel 16 may include any wireless or wired communication medium, such as a Radio Frequency (RF) spectrum or one or more physical transmission lines, or any combination of wireless and wired media. The communication channel 16 may form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the internet. Communication channel 16 generally represents any suitable communication medium or collection of different communication media, including any suitable combination of wired or wireless media, for transmitting video data from source device 12 to destination device 14. Communication channel 16 may include a router, switch, base station, or any other equipment that may be useful for facilitating communication from source device 12 to destination device 14. In other examples, source device 12 may store the encoded data onto a storage medium instead of transmitting the data. Likewise, destination device 14 may be configured to retrieve encoded data from the storage medium. In this manner, source device 12 may be configured to generate a computer program product, where the computer program product includes a video file encoded according to the techniques described herein.
Video encoder 20 and video decoder 30 may operate in accordance with a video compression standard such as those described herein. However, the techniques of this disclosure are not limited to any particular coding standard. Although not shown in fig. 2, in some aspects, video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder, and may include appropriate MUX-DEMUX units, or other hardware and software, to handle encoding of both audio and video in a common data stream or separate data streams. The MUX-DEMUX unit may conform to the ITU h.223 multiplexer protocol or other protocols such as the User Datagram Protocol (UDP), where applicable.
Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder or decoder circuits, such as one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), wireless communication devices including video coding devices (e.g., encoders or decoders), discrete logic, software, hardware, firmware or any combinations thereof. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, any of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective camera, computer, mobile device, user device, broadcast device, set-top box, server, or the like. An apparatus including video encoder 20 and/or video decoder 30 may comprise an integrated circuit, a microprocessor, and/or a wireless communication device (e.g., a cellular telephone).
Video encoder 20 and video decoder 30 may be configured to implement techniques in accordance with this disclosure. These techniques may reduce pipeline overhead, thereby potentially increasing throughput. For example, one reason for pipeline overhead is the overhead associated with the cycles required to start Context Adaptive Binary Arithmetic Coding (CABAC). In the techniques described in this disclosure, video encoder 20 entropy encodes the first group of syntax elements using CABAC encoding (e.g., by a regular CABAC engine), and entropy encodes the second group of syntax elements using bypass encoding (e.g., by a bypass engine). Similarly, video decoder 30 entropy decodes the first group of syntax elements using CABAC decoding (e.g., by a regular CABAC engine), and entropy decodes the second group of syntax elements using bypass decoding (e.g., by a bypass decoding engine).
In this way, in the bitstream, the syntax elements of the CABAC-encoded first group are not interleaved with the syntax elements of the bypass-encoded second group. If the CABAC-encoded syntax elements were interleaved with the bypass-encoded syntax elements, video encoder 20 and video decoder 30 would have to repeatedly restart the CABAC engine.
For example, assume that a first syntax element is to be CABAC encoded and precedes a second syntax element to be bypass encoded, and a third syntax element to be CABAC encoded follows the second syntax element to be bypass encoded. In this example where the syntax element to be CABAC encoded is interleaved with the syntax element to be bypass encoded, video encoder 20 would need to start the CABAC engine for the first syntax element and then restart the CABAC engine for the third syntax element after encoding the second syntax element by the bypass engine, which wastes cycles.
In the techniques described in this disclosure, by CABAC encoding the first group of syntax elements and bypass encoding the second group of syntax elements, video encoder 20 may reduce the number of times the CABAC engine is to be started, thereby reducing the number of wasted cycles. Similarly, by CABAC decoding the first group of syntax elements and bypass decoding the second group of syntax elements, video decoder 30 may reduce the number of times the CABAC engine is to be started, thereby reducing the number of wasted cycles.
As described in more detail, each syntax element of the first group of syntax elements is associated with one PU of the CU, and each syntax element of the second group of syntax elements corresponds to a syntax element of the first group of syntax elements. Each of the first group of syntax elements and the second group of syntax elements includes a plurality of syntax elements. Each of the first group of syntax elements indicates whether an intra-prediction mode of a respective PU (e.g., the PU corresponding to the syntax element of the first group of syntax elements) was inferred from neighboring PUs. In other words, each of the first group of syntax elements indicates whether the intra-prediction mode of the respective PU is based on an index to a list of most probable modes.
For example, video encoder 20 and video decoder 30 may be configured to construct a list of most probable modes in the example manner described above. Each syntax element in the first group of syntax elements may correspond to one of the PUs in the CU. In this example, a syntax element in the first group of syntax elements (i.e., one of the syntax elements in the plurality of syntax elements of the first group) indicates whether an intra-prediction mode of a PU in the CU to which the syntax element corresponds is based on an index to a list of most probable modes.
Syntax elements of a second group of syntax elements associated with the syntax elements of the first group of syntax elements identify an index to a list of most probable modes if the syntax elements of the first group of syntax elements indicate that an intra-prediction mode of a PU with which they are associated is based on the index to the list of most probable modes. Syntax elements of a second group of syntax elements associated with syntax elements of a first group of syntax elements indicate intra-prediction modes if the syntax elements of the first group of syntax elements indicate that the intra-prediction modes of the PU with which they are associated are not based on an index to a list of most probable modes.
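The first-group/second-group ordering just described can be sketched as follows; the tuple layout and engine labels are illustrative, and `pu_syntax` stands in for the per-PU syntax of an INTRA_NxN CU:

```python
def grouped_order(pu_syntax):
    """Emit all CABAC-coded flags for the CU's PUs first, then all
    bypass-coded mode elements, instead of interleaving them per PU.

    pu_syntax: list of dicts, one per PU, each containing
    'prev_intra_luma_pred_flag' and either 'mpm_idx' or
    'rem_intra_luma_pred_mode'.
    """
    # First group: one CABAC-coded flag per PU.
    order = [("cabac", "prev_intra_luma_pred_flag",
              pu["prev_intra_luma_pred_flag"]) for pu in pu_syntax]
    # Second group: the corresponding bypass-coded element per PU.
    for pu in pu_syntax:
        if pu["prev_intra_luma_pred_flag"]:
            order.append(("bypass", "mpm_idx", pu["mpm_idx"]))
        else:
            order.append(("bypass", "rem_intra_luma_pred_mode",
                          pu["rem_intra_luma_pred_mode"]))
    return order
```

With this ordering, the coder switches from the regular engine to the bypass engine exactly once per CU, regardless of how many PUs the CU contains.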
Fig. 3 is a block diagram illustrating an example of video encoder 20 that may implement the techniques for encoding video data described in this disclosure. In one example, video encoder 20 may be configured to generate a first group of syntax elements and a second group of syntax elements, wherein the syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein each of the syntax elements of the first group of syntax elements indicates whether an intra-prediction mode of a respective PU in the CU is based on an index to a list of most probable modes. Video encoder 20 may Context Adaptive Binary Arithmetic Coding (CABAC) encode the first group of syntax elements (e.g., using a regular CABAC engine), bypass encode the second group of syntax elements (e.g., using a bypass encoding engine) after entropy encoding the first group of syntax elements, and output video data that includes the encoded first group of syntax elements and the encoded second group of syntax elements. Each of the first group and the second group of syntax elements includes a plurality of syntax elements.
Video encoder 20 may perform intra-coding (i.e., intra-prediction encoding) and inter-coding (i.e., inter-prediction encoding) of video blocks within a video slice. As shown in fig. 3, video encoder 20 receives video data, which may be a current video block within a video frame to be encoded, and outputs an encoded video bitstream. Video encoder 20 may further send syntax data, such as block-based syntax data, frame-based syntax data, and GOP-based syntax data, for example, in a frame header, a block header, a slice header, or a GOP header. The GOP syntax data may describe a number of frames in a respective GOP, and the frame syntax data may indicate an encoding/prediction mode used to encode the corresponding frame.
In the example of fig. 3, video encoder 20 includes mode select unit 40, reference frame memory 64, summer 50, transform processing unit 52, quantization processing unit 54, and entropy encoding unit 56. Mode select unit 40, in turn, includes motion estimation unit 42, motion compensation unit 44, intra-prediction unit 46, and partition unit 48. For video block reconstruction, video encoder 20 also includes an inverse quantization processing unit 58, an inverse transform processing unit 60, and a summer 62.
During the encoding process, video encoder 20 receives a video frame or slice to be coded. The frame or slice may be divided into a plurality of video blocks. Motion estimation unit 42 and motion compensation unit 44 perform inter-predictive coding (inter-prediction encoding) of the received video block relative to one or more blocks in one or more reference frames to provide temporal prediction. Intra-prediction unit 46 may alternatively perform intra-predictive coding (intra-prediction encoding) of the received video block relative to one or more neighboring blocks in the same frame or slice as the block to be coded to provide the spatial prediction. Video encoder 20 may perform multiple coding passes, for example, to select an appropriate coding mode for each block of video data.
Furthermore, partition unit 48 may partition a block of video data into a plurality of sub-blocks based on an evaluation of previous partition schemes in previous coding passes. For example, partition unit 48 may initially partition a frame or slice into a plurality of LCUs, and partition each of the LCUs into a plurality of sub-CUs based on a rate-distortion analysis (e.g., rate-distortion optimization). Mode select unit 40 may further generate a quadtree data structure that indicates partitioning of the LCU into sub-CUs. Leaf-node CUs of a quadtree may include one or more PUs and one or more TUs.
Mode select unit 40 may select one of the coding modes (intra or inter), e.g., based on the error results, and provide the resulting intra-coded or inter-coded block to summer 50 to generate residual block data and to summer 62 to reconstruct the encoded block for use as a reference frame. Mode select unit 40 also provides syntax elements, such as motion vectors, intra-mode indicators, partition information, and other such syntax information, to entropy encoding unit 56. For example, mode select unit 40 may provide the syntax elements prev_intra_luma_pred_flag, mpm_idx, rem_intra_luma_pred_mode, intra_chroma_pred_mode, and chroma_pred_from_luma_enable_flag described above to entropy encoding unit 56. In addition, mode select unit 40 may be configured to arrange the syntax elements prev_intra_luma_pred_flag, mpm_idx, rem_intra_luma_pred_mode, and intra_chroma_pred_mode such that the syntax elements are provided to entropy encoding unit 56 in a manner that reduces the number of cycles required to entropy encode them.
For example, video encoder 20, via entropy encoding unit 56, entropy encodes the prev_intra_luma_pred_flag for each PU within the CU using CABAC encoding. In this example, the first group of syntax elements includes a plurality of prev_intra_luma_pred_flags (i.e., one prev_intra_luma_pred_flag per PU in the CU). Video encoder 20, via entropy encoding unit 56, entropy encodes, using bypass encoding, the mpm_idx syntax element if the corresponding prev_intra_luma_pred_flag indicates that the intra-prediction mode of the PU is based on an index to the list of most probable modes, or the rem_intra_luma_pred_mode syntax element if the corresponding prev_intra_luma_pred_flag indicates that the intra-prediction mode of the PU is not based on an index to the list of most probable modes. In this example, the second group of syntax elements includes at least one of one or more mpm_idx syntax elements and one or more rem_intra_luma_pred_mode syntax elements.
In other words, one instance of a syntax element in the first group of syntax elements is the prev_intra_luma_pred_flag, one instance of a syntax element in the second group of syntax elements is the mpm_idx syntax element, and another instance of a syntax element in the second group of syntax elements is the rem_intra_luma_pred_mode syntax element.
Again, prev_intra_luma_pred_flag indicates whether the intra prediction mode of the luma samples of a PU of the CU is based on an index to the list of most probable modes. If prev_intra_luma_pred_flag indicates that the intra prediction mode of the luma samples of the PU is based on an index to the list of most probable modes, the mpm_idx syntax element indicates the index to the list of most probable modes. If prev_intra_luma_pred_flag indicates that the intra prediction mode of the PU is not based on an index to the list of most probable modes, then the rem_intra_luma_pred_mode syntax element indicates the intra prediction mode of the luma samples of the PU. The rem_intra_luma_pred_mode syntax element may indicate an intra prediction mode that is not identified in the list of most probable modes.
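The relationship among these three syntax elements can be sketched as follows. This is an illustrative simplification: the function and argument names are hypothetical, and the mapping of rem_intra_luma_pred_mode onto the non-MPM modes is shown in its simplest ascending-order form rather than the exact HEVC derivation.

```python
def luma_intra_mode(prev_intra_luma_pred_flag, value, mpm_list):
    """Recover the luma intra prediction mode of a PU from the parsed
    syntax elements (illustrative sketch, not the HEVC specification)."""
    if prev_intra_luma_pred_flag:
        # value carries mpm_idx: an index to the list of most probable modes
        return mpm_list[value]
    # value carries rem_intra_luma_pred_mode: an index among the 35 HEVC
    # intra modes that are *not* in the most probable mode list
    remaining = [m for m in range(35) if m not in mpm_list]
    return remaining[value]
```

For example, with a most probable mode list of [0, 1, 26], a flag of 1 with mpm_idx 2 selects mode 26, while a flag of 0 with rem_intra_luma_pred_mode 0 selects mode 2, the smallest mode absent from the list.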
Motion estimation unit 42 and motion compensation unit 44 may be highly integrated, but are illustrated separately for conceptual purposes. Motion estimation, performed by motion estimation unit 42, is the process of generating motion vectors that estimate the motion of video blocks. For example, a motion vector may indicate the displacement of a PU of a video block within a current video frame or picture relative to a predictive block within a reference frame (or other coded unit) relative to a current block being coded within the current frame (or other coded unit). A predictive block is a block that is found to closely match the block to be coded in terms of pixel differences, which may be determined by Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), or other difference metrics. In some examples, video encoder 20 may calculate values for sub-integer pixel positions of reference pictures stored in reference frame memory 64. For example, video encoder 20 may interpolate values for a quarter-pixel position, an eighth-pixel position, or other fractional-pixel positions of a reference picture. Thus, motion estimation unit 42 may perform a motion search relative to full pixel positions and fractional pixel positions and output motion vectors with fractional pixel precision.
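The difference metrics mentioned above can be computed as in the following sketch; it is illustrative only, with blocks represented as flat lists of pixel values rather than two-dimensional arrays.

```python
def sad(block, ref):
    # Sum of Absolute Differences between a candidate block and a reference block
    return sum(abs(a - b) for a, b in zip(block, ref))

def ssd(block, ref):
    # Sum of Squared Differences; penalizes large pixel differences more heavily
    return sum((a - b) ** 2 for a, b in zip(block, ref))
```

A motion search would evaluate one of these metrics at each candidate displacement and keep the displacement yielding the smallest value.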
Motion estimation unit 42 calculates a motion vector for a PU of a video block in an inter-coded slice by comparing the location of the PU to the location of a predictive block of a reference picture. The reference picture may be selected from a first reference picture list (list 0) or a second reference picture list (list 1), each of list 0 or list 1 identifying one or more reference pictures stored in reference frame memory 64. Motion estimation unit 42 sends the calculated motion vectors to entropy encoding unit 56 and motion compensation unit 44.
The motion compensation performed by motion compensation unit 44 may involve obtaining or generating a predictive block based on the motion vectors determined by motion estimation unit 42. Again, in some examples, motion estimation unit 42 and motion compensation unit 44 may be functionally integrated. Upon receiving the motion vector for the PU of the current video block, motion compensation unit 44 may locate the predictive block to which the motion vector points in one of the reference picture lists. As discussed below, summer 50 forms a residual video block by subtracting pixel values of the predictive block from pixel values of the current video block being coded, forming pixel difference values. In general, motion estimation unit 42 performs motion estimation with respect to the luma component, and motion compensation unit 44 uses motion vectors calculated based on the luma component for both the chroma and luma components. Mode select unit 40 may also generate syntax elements associated with the video blocks and the video slice for use by video decoder 30 in decoding the video blocks of the video slice.
In lieu of inter-prediction performed by motion estimation unit 42 and motion compensation unit 44 (as described above), intra-prediction unit 46 may intra-predict the current block. In particular, intra-prediction unit 46 may determine the intra-prediction mode used to encode the current block. In some examples, intra-prediction unit 46 may encode the current block using various intra-prediction modes, e.g., during separate encoding passes, and intra-prediction unit 46 (or mode selection unit 40, in some examples) may select an appropriate intra-prediction mode to use from the tested modes, e.g., based on rate-distortion analysis. As described above, possible intra prediction modes may include a planar prediction mode, DC prediction, an angular prediction mode, and a prediction mode that infers a chroma component of the intra prediction mode from a luma prediction mode.
In addition, in one example, intra-prediction unit 46 may calculate rate-distortion values using a rate-distortion analysis of the various tested intra-prediction modes, and select the intra-prediction mode having the best rate-distortion characteristics among the tested modes. Rate-distortion analysis generally determines an amount of distortion (or error) between an encoded block and the original, unencoded block that was encoded to produce the encoded block, as well as the bitrate (i.e., number of bits) used to produce the encoded block. Intra-prediction unit 46 may calculate ratios from the distortions and rates for the various encoded blocks to determine which intra-prediction mode exhibits the best rate-distortion value for the block.
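Such a selection is commonly formalized as minimizing a Lagrangian cost J = D + λ·R over the candidate modes. A minimal sketch follows; the candidate values and the λ weights are hypothetical.

```python
def select_intra_mode(candidates, lam):
    """candidates maps a mode name to (distortion, rate_in_bits).
    Returns the mode minimizing the Lagrangian cost D + lam * R."""
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])

modes = {"DC": (100.0, 10), "planar": (80.0, 25)}
select_intra_mode(modes, 1.0)  # planar wins: 80 + 25 = 105 < 100 + 10 = 110
select_intra_mode(modes, 2.0)  # DC wins: 100 + 20 = 120 < 80 + 50 = 130
```

The example shows how a larger λ shifts the choice toward the cheaper-to-code mode even at higher distortion.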
After selecting the intra-prediction mode for the block, intra-prediction unit 46 may provide information (e.g., syntax elements) to entropy encoding unit 56 that indicates the selected intra-prediction mode for the block. Entropy encoding unit 56 may encode information indicating the selected intra-prediction mode. Video encoder 20 may include in the transmitted bitstream configuration data, which may include a plurality of intra-prediction mode index tables and a plurality of modified intra-prediction mode index tables (also referred to as codeword mapping tables), definitions of the encoding contexts for the various blocks and indications of the most probable intra-prediction mode, intra-prediction mode index table, and modified intra-prediction mode index table to be used for each of the contexts.
As described above, the intra prediction syntax elements prev_intra_luma_pred_flag, mpm_idx, rem_intra_luma_pred_mode, intra_chroma_pred_mode, and chroma_pred_from_luma_enable_flag may indicate one or more luma intra prediction modes of the PU structure and one chroma intra prediction mode of the PU structure. In addition, as described above with respect to tables 3 and 4, the arrangement of the intra-prediction syntax elements prev_intra_luma_pred_flag, mpm_idx, rem_intra_luma_pred_mode, intra_chroma_pred_mode, and chroma_pred_from_luma_enable_flag within the coding structure may determine the total number of cycles needed to entropy encode all syntax elements of the PU structure. Thus, in addition to outputting intra-prediction syntax elements to entropy encoding unit 56 as described above with respect to tables 3 and 4, intra-prediction unit 46 may be configured to arrange the intra-prediction syntax elements such that they are output to entropy encoding unit 56 in a manner that reduces the total number of cycles required to entropy encode them. Likewise, with this arrangement, the total number of cycles required to entropy decode the syntax elements at the decoder is reduced.
In one example, intra-prediction unit 46 may be configured to arrange the CABAC-coded bins of chroma intra-mode syntax elements and the CABAC-coded bins of luma intra-mode syntax elements such that the bins are coded together. In addition, intra-prediction unit 46 may arrange the bypass-coded bins of luma intra-mode syntax elements and the bypass-coded bins of chroma intra-mode syntax elements such that the bins may be coded together. Table 5 provides an example coding structure for this arrangement for the INTRA_2Nx2N PU structure. Table 6 provides an example coding structure for this arrangement for the INTRA_NxN PU structure. In tables 5 and 6, the syntax elements may be defined as described above with respect to tables 3 and 4. It should be noted, however, that the syntax elements in tables 5 and 6 do not necessarily need to have the same value ranges as described above. For example, if video encoder 20 includes fewer or more than 33 intra-prediction modes, the value ranges of the syntax elements may be adjusted accordingly. As illustrated in tables 5 and 6, the syntax element chroma_mode_cabac_bins is arranged such that it is coded before mpm_idx and rem_intra_luma_pred_mode.
Table 5: intra-2 Nx 2N intra-mode decoding structure
Table 6: intra _ NxN intra mode decoding structure
In another example, intra-prediction unit 46 may be configured to arrange the CABAC-coded bins of luma intra-mode syntax elements for all PUs within a CU so that they may be coded together. In addition, intra-prediction unit 46 may arrange the bypass-coded bins of luma intra-mode syntax elements so that they may be coded together. Table 7 provides an example coding structure for this arrangement for the INTRA_2Nx2N PU structure. Table 8 provides an example coding structure for this arrangement for the INTRA_NxN PU structure. In tables 7 and 8, syntax elements may be defined as described above with respect to tables 5 and 6.
As illustrated in table 8, for an INTRA_NxN PU structure, the prev_intra_luma_pred_flag syntax element is coded for each of the four PUs, followed by the respective mpm_idx and rem_intra_luma_pred_mode syntax elements for each of the PUs. The four prev_intra_luma_pred_flag syntax elements may be referred to as a first group of syntax elements for the PUs. Accordingly, intra-prediction unit 46 may be configured to output, to entropy encoding unit 56, the first group of syntax elements before the respective mpm_idx and rem_intra_luma_pred_mode syntax elements associated with each of the flags.
Table 7: intra-2 Nx 2N intra-mode decoding structure
Table 8: intra _ NxN intra mode decoding structure
In yet another example, intra-prediction unit 46 may be configured to arrange all CABAC-coded bins of both luma intra-mode syntax elements and chroma intra-mode syntax elements for all PUs in the coding unit so that they may be coded together. In addition, intra-prediction unit 46 may be configured to arrange all bypass-coded bins for both luma intra mode and chroma intra mode so that they may be coded together. Table 9 provides an example coding structure for this arrangement for the INTRA_2Nx2N PU structure. Table 10 provides an example coding structure for this arrangement for the INTRA_NxN PU structure. In tables 9 and 10, syntax elements may be defined as described above with respect to tables 5 and 6.
Table 9: intra-2 Nx 2N intra-mode decoding structure
Table 10: intra _ NxN intra mode decoding structure
It should be noted that in tables 6-10, the rows in which the syntax elements are presented in the tables may correspond to the order in which the syntax elements were entropy encoded by the entropy encoder. In some cases, no intermediate entropy coding may occur between adjacent rows in tables 6-10. The terms "group" and "grouping" herein may generally refer to placing elements closely together, e.g., within a syntax table within a video coding unit such as a picture, slice, CU, or the like. In the case of performing operations on syntax elements included in a group, the elements may in some cases be placed closely together such that no operations are performed on elements outside the group until the operations are completed for all elements within the group (i.e., no interleaving between different groups). In some cases, the binary bits in a group may be presented consecutively and coded in order or sequence (i.e., one after the other).
Referring again to fig. 3, video encoder 20 forms a residual video block by subtracting prediction data from mode select unit 40 from the original video block being coded. Summer 50 represents one or more components that perform this subtraction operation. Transform processing unit 52 applies a transform, such as a Discrete Cosine Transform (DCT) or a conceptually similar transform, to the residual block, producing a video block that includes residual transform coefficient values. Transform processing unit 52 may perform other transforms that are conceptually similar to DCT. Wavelet transforms, integer transforms, sub-band transforms, or other types of transforms may also be used. In any case, transform processing unit 52 applies a transform to the residual block, producing a block of residual transform coefficients. The transform may convert the residual information from a pixel value domain to a transform domain (e.g., frequency domain). Transform processing unit 52 may send the resulting transform coefficients to quantization unit 54. Quantization processing unit 54 quantizes the transform coefficients to further reduce the bit rate. The quantization process may reduce the bit depth associated with some or all of the coefficients. The quantization level may be modified by adjusting the quantization parameter. In some examples, quantization processing unit 54 may then perform a scan of a matrix that includes quantized transform coefficients. Alternatively, entropy encoding unit 56 may perform the scan.
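The quantization step described above can be illustrated with a simple sketch. This shows a uniform quantizer with a hypothetical step size; HEVC's actual quantizer derives the step size from a quantization parameter and applies rounding offsets and scaling not shown here.

```python
def quantize(coeffs, qstep):
    # Uniform quantization: a larger qstep gives coarser levels and a lower bit rate
    return [round(c / qstep) for c in coeffs]

def dequantize(levels, qstep):
    # Inverse quantization reconstructs approximate coefficient values
    return [l * qstep for l in levels]
```

Note that dequantizing the quantized levels does not recover the original coefficients exactly; the residual error is the distortion introduced by reducing the bit depth of the coefficients.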
Video encoder 20 may be configured to perform inverse quantization and inverse transformation to store decoded blocks for use as references for predicting subsequent blocks in the same frame or multiple frames to be temporally predicted, for example. Inverse quantization processing unit 58 and inverse transform processing unit 60 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block. Motion compensation unit 44 may calculate a reference block by adding the residual block to a predictive block of one of the frames of reference frame memory 64. Motion compensation unit 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for use in motion estimation. Summer 62 adds the reconstructed residual block to the motion compensated prediction block generated by motion compensation unit 44 to generate a reconstructed video block for storage in reference frame memory 64.
After quantization, entropy encoding unit 56 entropy encodes the quantized transform coefficients. In addition, entropy encoding unit 56 entropy encodes syntax elements, such as the predictive syntax elements described above. For example, entropy encoding unit 56 may perform Context Adaptive Variable Length Coding (CAVLC), Context Adaptive Binary Arithmetic Coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), Probability Interval Partition Entropy (PIPE) coding, or another entropy coding technique. In the case of context-based entropy coding, the context may be based on neighboring blocks. After entropy encoding by entropy encoding unit 56, the encoded bitstream may be transmitted to another device (e.g., video decoder 30) or archived for later transmission or retrieval.
FIG. 4 is a block diagram illustrating an example entropy encoding unit 56 that may implement the techniques described in this disclosure. In one example, entropy encoding unit 56 illustrated in fig. 4 may be a CABAC encoder. Example entropy encoding unit 56 may include binarization unit 502, arithmetic encoding unit 510, which includes bypass encoding engine 504 and regular encoding engine 508, and context modeling unit 506.
Entropy encoding unit 56 may receive one or more syntax elements, such as the intra-prediction syntax elements prev_intra_luma_pred_flag, mpm_idx, rem_intra_luma_pred_mode, intra_chroma_pred_mode, and chroma_pred_from_luma_enable_flag described above. The order in which the syntax elements are received by entropy encoding unit 56 may be defined according to a coding structure (e.g., the example coding structures described above with respect to tables 3-10).
Binarization unit 502 receives a syntax element and generates a bin string (i.e., a string of binary values). For example, binarization unit 502 may generate a bin string using any one or a combination of the following techniques: fixed-length coding, unary coding, truncated Rice coding, Golomb coding, exponential Golomb coding, and Golomb-Rice coding. Additionally, in some cases, binarization unit 502 may receive a syntax element as a bin string and simply pass the bin values through. In one example, binarization unit 502 receives the syntax element intra_chroma_pred_mode and generates a bin string based on the value of chroma_pred_from_luma_enable_flag according to the example described above with respect to table 2.
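Two of the binarization techniques mentioned above can be sketched as follows. This is illustrative only, with bins written as '0'/'1' strings; the HEVC specification defines these codes formally, including truncated variants not shown here.

```python
def unary(v):
    # Unary code: v ones followed by a terminating zero
    return "1" * v + "0"

def exp_golomb(v):
    # 0th-order exponential-Golomb code for a non-negative integer:
    # a zero prefix of (len - 1) bins followed by the binary form of v + 1
    code = bin(v + 1)[2:]
    return "0" * (len(code) - 1) + code
```

For instance, the value 3 binarizes to "1110" under unary coding and to "00100" under 0th-order exponential-Golomb coding.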
Arithmetic encoding unit 510 is configured to receive the string of binary bits from binarization unit 502 and perform arithmetic encoding on the string. As shown in fig. 4, arithmetic encoding unit 510 may receive binary bit values from a bypass path or a regular coding path. Binary bit values that follow the bypass path may be identified as bypass-coded binary bit values, and binary bit values that follow the regular coding path may be identified as CABAC-coded binary bit values. Consistent with the CABAC process described above, in the case where arithmetic encoding unit 510 receives a binary bit value from the bypass path, bypass encoding engine 504 may perform arithmetic encoding on the binary bit value without utilizing the adaptive context assigned to the binary bit value. In one example, bypass encoding engine 504 may assume that the possible values of a binary bit are equally probable.
In the case where arithmetic encoding unit 510 receives binary bit values over the regular path, context modeling unit 506 may provide context variables (e.g., context states) such that regular encoding engine 508 may perform arithmetic encoding based on the context assignments provided by context modeling unit 506. The context assignments may be defined according to a video coding standard, such as HEVC. The context models may be stored in a memory. Context modeling unit 506 may include a series of indexed tables and/or utilize mapping functions to determine the context and context variables for a particular binary bit. After encoding the binary bit value, regular encoding engine 508 may update the context based on the actual binary bit value.
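The adaptive behavior of such a context can be sketched with a toy probability estimator. This is only an illustration: real CABAC uses a finite-state table of probability states with a most-probable-symbol flag, not the count-based estimate shown here.

```python
class ToyContext:
    """Toy adaptive context model: estimates P(bin == 1) from observed counts."""

    def __init__(self):
        self.counts = [1, 1]  # Laplace-style initialization for bins 0 and 1

    def p_one(self):
        # Current probability estimate used to code the next bin
        return self.counts[1] / sum(self.counts)

    def update(self, bin_val):
        # After coding a bin, adapt the estimate toward the observed value
        self.counts[bin_val] += 1
```

Coding a run of 1-valued bins drives the estimate of P(1) upward, which is what makes context-coded bins cheaper than bypass bins when the source is skewed.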
Additionally, in one example, entropy encoding unit 56 may be configured to encode bins based on the order in which it receives the syntax elements from mode select unit 40. As described above, the order may be defined according to a coding structure (e.g., the example coding structures described above with respect to tables 3-10). The order in which entropy encoding unit 56 receives syntax elements within a set of syntax elements may determine the total number of cycles needed for entropy encoding unit 56 to encode the set of syntax elements.
In one example, regular encoding engine 508 may require more than two cycles to code a single binary bit. Additionally, in one example, regular encoding engine 508 may require n + M cycles to encode n binary bits, where M is the overhead required to start the pipeline. M is typically greater than 0. At the start of the CABAC encoding process (e.g., when switching from bypass mode to regular mode), starting the pipeline introduces M additional cycles of overhead. In one example, bypass encoding engine 504 may require one cycle to code an n-bit syntax element, where n is greater than zero.
Thus, the total number of cycles required for arithmetic encoding unit 510 to encode a set of bypass-coded and CABAC-coded binary bits may be based on the number of times the M-cycle pipeline start-up overhead is incurred. For example, if arithmetic encoding unit 510 encodes the predictive syntax elements as arranged in table 4, the pipeline overhead may be incurred five times, and arithmetic encoding unit 510 may require a minimum of 5 × M overhead cycles to encode the binary bits of the syntax elements. However, if arithmetic encoding unit 510 encodes the predictive syntax elements as arranged in table 8, the pipeline overhead may be incurred only twice, and the pipeline overhead of arithmetic encoding unit 510 may be reduced to 2 × M cycles.
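The benefit of grouping under this cycle model can be illustrated with a small sketch. The model is a deliberate simplification of the description above: each maximal run of CABAC-coded bins costs its length plus M start-up cycles, and each maximal run of bypass-coded bins costs one cycle.

```python
def total_cycles(bin_kinds, m):
    """bin_kinds is a sequence of 'cabac' / 'bypass' labels in coding order."""
    cycles, i = 0, 0
    while i < len(bin_kinds):
        j = i
        while j < len(bin_kinds) and bin_kinds[j] == bin_kinds[i]:
            j += 1  # extend the current run of same-kind bins
        cycles += (j - i) + m if bin_kinds[i] == "cabac" else 1
        i = j
    return cycles

interleaved = ["cabac", "bypass"] * 4      # flag, index, flag, index, ...
grouped = ["cabac"] * 4 + ["bypass"] * 4   # all flags first, then all indices
total_cycles(interleaved, 3)  # 4 * (1 + 3) + 4 = 20
total_cycles(grouped, 3)      # (4 + 3) + 1 = 8
```

With four PUs and M = 3, interleaving the flags with their indices incurs the start-up overhead four times, whereas grouping incurs it once, matching the qualitative reduction described above.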
It should be noted that video frames coded according to HEVC may include on the order of tens of thousands of PUs. Thus, the order in which the intra-prediction syntax elements are arranged in the coding structure may significantly affect the number of cycles required for a video coder to code the video data. Additionally, it should be noted that although example entropy encoding unit 56 illustrated in fig. 4 is described as switching from a bypass encoding operation to a regular encoding operation, in some cases these operations may be performed in parallel. Even in this case, however, the order in which entropy encoding unit 56 receives the bypass-coded and CABAC-coded syntax elements may still determine the total number of cycles needed to entropy encode the syntax elements, because the overhead comes from context switching: a bin coded in bypass mode may be followed by any of several different syntax elements for the next CABAC-coded bin, so it is difficult to pre-fetch the necessary context (i.e., load the context buffer) in order to reduce the pipeline overhead.
Fig. 5 is a flow diagram illustrating an example of encoding video data in accordance with the techniques of this disclosure. Although the process in fig. 5 is described below as being performed generally by video encoder 20, the process may be performed by any combination of video encoder 20, mode select unit 40, and/or entropy encoding unit 56.
As illustrated in fig. 5, video encoder 20 generates a first group of syntax elements (602). In one example, a syntax element within the first group indicates whether a prediction mode of a respective Prediction Unit (PU) is based on an index to a list of most probable modes. In one example, the first group of syntax elements includes four prev_intra_luma_pred_flag syntax elements corresponding to the PUs in an INTRA_NxN PU structure. Video encoder 20 generates a second group of syntax elements (604), in which each syntax element corresponds to a respective syntax element of the first group. In one example, as described above, a syntax element of the second group may be either of the syntax elements mpm_idx or rem_intra_luma_pred_mode.
Video encoder 20 CABAC encodes the first group of syntax elements (e.g., using a regular CABAC engine) (606). After CABAC encoding the first group of syntax elements, video encoder 20 bypass encodes the second group of syntax elements (e.g., using a bypass coding engine) (608). Video encoder 20 may intra-encode the PUs of the CU based on the first group and the second group of syntax elements (610). Video encoder 20 may output, in the bitstream, video data that includes the encoded first group of syntax elements and the encoded second group of syntax elements (612). Video encoder 20 also outputs the residual information for the intra-coded PUs in the bitstream.
As described above, an example of the first group of syntax elements includes a plurality of flags (e.g., the prev_intra_luma_pred_flag syntax elements). A flag of the plurality of flags may be a syntax element of the first group of syntax elements that indicates whether the intra prediction mode of the luma samples of the respective PU is based on an index to a list of most probable modes.
In some examples, the second group of syntax elements includes a first syntax element that indicates an index to the list of most probable modes if a corresponding syntax element in the first group of syntax elements indicates that an intra-prediction mode of luma samples of the respective PU is based on an index to the list of most probable modes. For example, if the prev_intra_luma_pred_flag syntax element of the respective PU indicates that the intra prediction mode of the respective PU is based on an index to the list of most probable modes, the corresponding mpm_idx syntax element of the respective PU indicates the index to the list of most probable modes. In this example, the mpm_idx syntax element is considered to be the first syntax element in the second group of syntax elements.
In some examples, the second group of syntax elements includes a second syntax element that indicates an intra-prediction mode for the respective PU if a corresponding syntax element in the first group of syntax elements indicates that the intra-prediction mode for luma samples of the respective PU is not based on an index to a list of most probable modes. For example, if the prev_intra_luma_pred_flag syntax element of the respective PU indicates that the intra prediction mode of the respective PU is not based on an index to the list of most probable modes, the corresponding rem_intra_luma_pred_mode syntax element of the respective PU indicates the intra prediction mode. In this example, the rem_intra_luma_pred_mode syntax element is considered to be the second syntax element in the second group of syntax elements.
As described above, the list of most probable modes constructed by video encoder 20 identifies one or more intra prediction modes of one or more neighboring PUs. In some examples, video encoder 20 may output all syntax elements of the first group of syntax elements of the PUs of the CU before outputting any syntax elements of the second group of syntax elements of the PUs of the CU. In some examples, video encoder 20 may CABAC encode all syntax elements of the first group of syntax elements before bypass encoding any syntax elements of the second group of syntax elements.
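The group-first output order can be sketched as follows. The helper and its field names are hypothetical: each PU record carries a use_mpm flag (the first-group element) and the corresponding index or remaining-mode value (the second-group element).

```python
def emit_order(pus):
    """Emit all first-group flags (CABAC-coded) before any second-group
    elements (bypass-coded), in the Table 8 style arrangement."""
    first = [("prev_intra_luma_pred_flag", pu["use_mpm"]) for pu in pus]
    second = [
        ("mpm_idx", pu["value"]) if pu["use_mpm"]
        else ("rem_intra_luma_pred_mode", pu["value"])
        for pu in pus
    ]
    return first + second
```

The returned sequence places every flag before any mpm_idx or rem_intra_luma_pred_mode element, so the regular engine runs once over the flags and the bypass engine once over the remainder.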
It should be noted that in some examples, the entropy encoder may be an entropy encoder included within video encoder 20, such as entropy encoding unit 56. In this case, the term "output" may refer to one component within video encoder 20 outputting data to another component within video encoder 20. Additionally, in other examples, the entropy encoder may be external to video encoder 20. In one example, video encoder 20 outputs the set of flags and the respective syntax elements such that the group of syntax elements is arranged before the respective syntax elements in the sequence. In one example, video encoder 20 may arrange the syntax elements according to the coding structure illustrated in table 8. A video decoder may receive an entropy encoded bitstream and reconstruct video data using the entropy encoded bitstream.
Fig. 6 is a block diagram illustrating an example of video decoder 30, which decodes an encoded video sequence. In one example, video decoder 30 may be configured to perform the following operations: receiving an entropy encoded bitstream comprising a set of flags and a respective syntax element corresponding to each flag, wherein each flag indicates whether a prediction mode of a respective Prediction Unit (PU) is based on an index to a list of most probable modes; CABAC decoding the set of flags; bypass decoding the respective syntax elements; and reconstructing video data based on the values of each flag and the respective syntax elements.
In the example of fig. 6, video decoder 30 includes an entropy decoding unit 70, a motion compensation unit 72, an intra prediction unit 74, an inverse quantization processing unit 76, an inverse transform processing unit 78, a reference frame memory 82, and a summer 80. In some examples, video decoder 30 may perform a decoding pass that is substantially reciprocal to the encoding pass described with respect to video encoder 20 (fig. 3).
Entropy decoding unit 70 receives the entropy encoded bitstream and decodes syntax elements from the bitstream according to a process that is reciprocal to the entropy encoding process used to encode the syntax elements. In one example, the entropy encoding process used to encode the syntax elements may be any of the entropy encoding processes described above.
FIG. 7 is a block diagram illustrating an example entropy decoding unit 70 that may implement the techniques described in this disclosure. Entropy decoding unit 70 receives an entropy encoded bitstream and decodes syntax elements from the bitstream. The syntax elements may include the intra prediction syntax elements prev_intra_luma_pred_flag, mpm_idx, rem_intra_luma_pred_mode, intra_chroma_pred_mode, and chroma_pred_from_luma_enable_flag described above. The order in which syntax elements are decoded by entropy decoding unit 70 may be defined according to a coding structure (e.g., the example coding structures described above with respect to tables 3-10). The example entropy decoding unit 70 in fig. 7 includes an arithmetic decoding unit 702, which may include a bypass decoding engine 704 and a regular decoding engine 706. Example entropy decoding unit 70 also includes a context modeling unit 708 and an inverse binarization unit 710. Example entropy decoding unit 70 may perform the reciprocal function of example entropy encoding unit 56 described with respect to fig. 4. In this way, entropy decoding unit 70 may perform entropy decoding based on the techniques described herein.
The arithmetic decoding unit 702 receives the encoded bitstream. As shown in fig. 7, the arithmetic decoding unit 702 may process the encoded binary bit values according to a bypass path or a regular coding path. An indication of whether the encoded binary bit values should be processed according to the bypass path or the regular path may be signaled in the bitstream with higher level syntax. Consistent with the CABAC process described above, in the case where the arithmetic decoding unit 702 receives binary bit values from the bypass path, the bypass decoding engine 704 may perform arithmetic decoding of the binary bit values without utilizing a context assigned to the binary bit values. In one example, the bypass decoding engine 704 may assume that the possible values of a binary bit are equally probable.
In the case where the arithmetic decoding unit 702 receives binary bit values through the regular path, the context modeling unit 708 may provide context variables such that the rule decoding engine 706 may perform arithmetic decoding based on the context assignments provided by the context modeling unit 708. The context assignments may be defined according to a video coding standard such as HEVC. The context models may be stored in a memory. The context modeling unit 708 may include a series of indexed tables and/or utilize a mapping function to determine the context and context variables for a portion of the encoded bitstream. After decoding a binary bit value, rule decoding engine 706 may update the context based on the decoded binary bit value. Additionally, the inverse binarization unit 710 may perform inverse binarization on the binary bit values and use a binary bit matching function to determine whether the binary bit values are valid. The inverse binarization unit 710 may also update the context modeling unit based on the match determination. Thus, the inverse binarization unit 710 outputs syntax elements according to a context adaptive decoding technique.
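For purposes of illustration, the context update performed after each decoded bin can be modeled with a toy adaptive estimator. This is not the table-driven finite state machine that HEVC actually specifies; the update rate of 0.05 is an assumed illustrative value. It shows only the general principle that a context's probability estimate moves toward each observed bin value:

```python
def update_context(p_one, bin_val, rate=0.05):
    """Toy context adaptation: nudge the estimated probability of a '1'
    bin toward the value just decoded, so the context tracks local bin
    statistics. (Illustrative only; HEVC uses a table-driven finite
    state machine instead of this floating-point update.)"""
    return p_one + rate * (bin_val - p_one)
```

Feeding a run of identical bins drives the estimate toward that value, which is why a well-chosen context can represent such bins at well under one bit each, at the cost of the per-bin context lookup and update that the bypass path avoids.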
The order in which entropy decoding unit 70 receives syntax elements within the set of syntax elements may determine the total number of cycles needed for entropy decoding unit 70 to decode the set of syntax elements. In one example, rule decoding engine 706 may require more than two cycles to decode a single binary bit. Additionally, in one example, rule decoding engine 706 may require n + M cycles to decode n binary bits, where M is the overhead of starting the pipeline. M is typically greater than zero. Each time the CABAC decoding process starts (e.g., on a switch from bypass mode to regular mode), the pipeline start-up consumes M extra cycles. In one example, the bypass decoding engine 704 may require one cycle to decode an n-bit syntax element. Thus, the total number of cycles required for arithmetic decoding unit 702 to decode a set of bypass-coded and CABAC-coded binary bits may depend on the number of times the pipeline start-up overhead M is incurred. As such, the order of arrangement of the intra-prediction syntax elements in the coding structure may determine the number of cycles needed for video decoder 30 to decode the video data. Additionally, it should be noted that although example entropy decoding unit 70 illustrated in fig. 7 is described as switching from a bypass decoding operation to a regular decoding operation, in some cases, these operations may be performed in parallel. In this case, however, the order in which entropy decoding unit 70 receives the bypass-coded and CABAC-coded syntax elements may still determine the total number of cycles needed to entropy decode the syntax elements.
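The cycle counts described above can be made concrete with a small model. The helper below and the value M = 2 are illustrative assumptions (the text only requires M > 0); the run lists mirror the four-PU INTRA_NxN case, with one CABAC-coded flag and one bypass-coded mode element per PU:

```python
def decode_cycles(runs, m=2):
    """Estimate entropy-decoding cycles for a sequence of bin runs.

    runs holds (mode, n) pairs in bitstream order. Per the model in the
    text, a regular-CABAC run of n bins costs n + M cycles, where M is
    the pipeline start-up overhead paid on each switch into regular
    mode; a bypass run of n syntax elements costs one cycle per
    element. M = 2 is an assumed illustrative value, not normative.
    """
    return sum((n + m) if mode == 'cabac' else n for mode, n in runs)

# Interleaved order: flag (CABAC) then mode data (bypass), per PU, four PUs.
interleaved = [('cabac', 1), ('bypass', 1)] * 4
# Grouped order: all four CABAC flags first, then all four bypass elements.
grouped = [('cabac', 4), ('bypass', 4)]
```

Under these assumptions the interleaved order costs 4 × (1 + 2 + 1) = 16 cycles, while the grouped order costs (4 + 2) + 4 = 10 cycles, because the pipeline start-up overhead M is paid once instead of four times.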
Referring again to fig. 6, motion compensation unit 72 may generate prediction data based on the motion vectors received from entropy decoding unit 70. Motion compensation unit 72 may identify a prediction block in a reference frame in reference frame memory 82 using a motion vector received in the bitstream. Motion compensation unit 72 generates motion compensated blocks, possibly performing interpolation based on interpolation filters. An identifier of an interpolation filter to be used for motion estimation with sub-pixel precision may be included in the syntax elements. Motion compensation unit 72 may calculate interpolated values for sub-integer pixels of a reference block using interpolation filters as used by video encoder 20 during encoding of the video block. Motion compensation unit 72 may determine the interpolation filters used by video encoder 20 from the received syntax information and use the interpolation filters to generate predictive blocks.
Motion compensation unit 72 uses some of the syntax information to determine the size of the macroblocks used to encode the frames of the encoded video sequence, partition information that describes how to partition each macroblock of the frames of the encoded video sequence, a mode that indicates how to encode each partition, one or more reference frames (and a list of reference frames) for each inter-coded macroblock or partition, and other information to decode the encoded video sequence.
Intra-prediction unit 74 may form a prediction block from spatially neighboring blocks using an intra-prediction mode received in the bitstream. The intra-prediction modes may include the intra-prediction modes described above. Intra prediction unit 74 may determine the intra prediction mode to be used according to the syntax elements prev_intra_luma_pred_flag, mpm_idx, rem_intra_luma_pred_mode, intra_chroma_pred_mode, and chroma_pred_from_luma_enable_flag as described above.
Inverse quantization unit 76 inverse quantizes (i.e., de-quantizes) the quantized block coefficients provided in the bitstream and decoded by entropy decoding unit 70. The inverse quantization process may comprise, for example, a conventional process as defined by the H.264 decoding standard. The inverse quantization process may also include use of a quantization parameter QPY, calculated by the video encoder for each macroblock, to determine a degree of quantization and, likewise, a degree of inverse quantization that should be applied.
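The relationship between QPY and the degree of quantization can be sketched as follows. In H.264-style designs the quantizer step size approximately doubles for every increase of 6 in QP; the base step of 0.625 is an assumed illustrative constant, and real codecs use integer scaling tables rather than floating point:

```python
def qstep(qp):
    # Step size roughly doubles every 6 QP (0.625 is an assumed base value).
    return 0.625 * 2 ** (qp / 6)

def dequantize(levels, qp):
    """Inverse quantization sketch: scale each decoded coefficient
    level by the step size implied by the quantization parameter."""
    step = qstep(qp)
    return [round(level * step) for level in levels]
```

A larger QP thus means coarser quantization at the encoder and a correspondingly larger scaling factor during inverse quantization at the decoder.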
The inverse transform unit 78 applies an inverse transform (e.g., an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process) to the transform coefficients in order to generate a residual block in the pixel domain. Summer 80 sums the residual block with the corresponding prediction block produced by motion compensation unit 72 or intra-prediction unit 74 to form a decoded block. In this way, video decoder 30 reconstructs video blocks from the encoded bitstream.
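The summation performed by summer 80 can be sketched as follows; clipping to the valid sample range (assumed 8-bit unless stated otherwise) keeps the reconstructed samples in the pixel domain:

```python
def reconstruct_block(residual, prediction, bit_depth=8):
    """Form a decoded block: add each residual sample to the co-located
    prediction sample and clip the sum to [0, 2**bit_depth - 1]."""
    lo, hi = 0, (1 << bit_depth) - 1
    return [max(lo, min(hi, r + p)) for r, p in zip(residual, prediction)]
```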
Fig. 8 is a flow diagram illustrating an example of decoding video data in accordance with the techniques of this disclosure. Although the process in fig. 8 is described below as being performed generally by video decoder 30, the process may be performed by any combination of video decoder 30, intra-prediction unit 74, and/or entropy decoding unit 70.
As illustrated in fig. 8, video decoder 30 receives a first group of syntax elements (802). Video decoder 30 receives a second group of syntax elements (804), wherein the syntax elements of the second group of syntax elements correspond to the respective syntax elements of the first group of syntax elements. In other words, video decoder 30 receives an entropy encoded bitstream that includes a first group of syntax elements and a second group of syntax elements for a Prediction Unit (PU) of a Coding Unit (CU).
In one example, a syntax element within the first group indicates whether a prediction mode of a respective Prediction Unit (PU) is based on an index to a list of most probable modes. In one example, the first group includes four prev_intra_luma_pred_flag syntax elements corresponding to four respective PUs in the INTRA_NxN PU structure. In one example, the syntax elements in the second group of syntax elements may be any of the syntax elements mpm_idx or rem_intra_luma_pred_mode, as described above. In one example, the first group of syntax elements and the second group of syntax elements may be arranged according to the coding structure described with respect to table 8. In this way, the order in which video decoder 30 receives and performs operations on the set of flags and the respective syntax elements may be defined according to the coding structure.
Video decoder 30 CABAC decodes the first group of syntax elements (806). In one example, video decoder 30 decodes the first group of syntax elements using a regular Context Adaptive Binary Arithmetic Coding (CABAC) engine (e.g., rule decoding engine 706 described with respect to fig. 7). Video decoder 30 bypass decodes the second group of syntax elements (808). In one example, video decoder 30 may entropy decode the second group of syntax elements after entropy decoding the first group of syntax elements. Additionally, in one example, video decoder 30 may entropy decode the second group of syntax elements using a bypass decoding engine (e.g., bypass decoding engine 704 described with respect to fig. 7). Video decoder 30 decodes and reconstructs the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements (810). In one example, video decoder 30 reconstructs the video data by summing residual blocks of the video data with corresponding prediction blocks of the video data associated with the respective syntax elements. For example, video decoder 30 intra-predictively decodes a PU of a CU based on the first group of syntax elements and the decoded second group of syntax elements.
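The grouped parse order of steps 806 and 808 can be sketched as follows. The two callables stand in for the rule and bypass engines, and the bin counts (a short mpm_idx code, a fixed-length remainder) are illustrative rather than normative:

```python
def parse_intra_nxn_modes(decode_cabac_bin, decode_bypass_bins, num_pus=4):
    """Grouped parsing for an INTRA_NxN CU: read every CABAC-coded
    prev_intra_luma_pred_flag first, then all bypass-coded mode data,
    so the decoder switches engines only once."""
    # Pass 1: regular CABAC - one flag per PU.
    prev_flags = [decode_cabac_bin() for _ in range(num_pus)]
    # Pass 2: bypass - mpm_idx or rem_intra_luma_pred_mode per PU,
    # selected by the flag decoded for that PU in pass 1.
    modes = []
    for flag in prev_flags:
        if flag:
            modes.append(('mpm_idx', decode_bypass_bins(2)))
        else:
            modes.append(('rem_mode', decode_bypass_bins(5)))
    return prev_flags, modes
```

Note that the conditional structure of pass 2 is unchanged relative to an interleaved layout; only the position of the bins in the bitstream differs.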
Reiterating, similar to the description above with respect to video encoder 20 and fig. 5, an example of the first group of syntax elements includes a plurality of flags (e.g., prev_intra_luma_pred_flag syntax elements). A flag of the plurality of flags may be a syntax element of the first group of syntax elements that indicates whether the intra prediction mode of the luma samples of the respective PU is based on an index to a list of most probable modes.
In some examples, the second group of syntax elements includes a first syntax element that indicates an index to the list of most probable modes if a corresponding syntax element in the first group of syntax elements indicates that an intra-prediction mode of luma samples of the respective PU is based on an index to the list of most probable modes. For example, if the prev_intra_luma_pred_flag syntax element of the respective PU indicates that the intra prediction mode of the respective PU is based on an index to the list of most probable modes, the corresponding mpm_idx syntax element of the respective PU indicates an index to the list of most probable modes. In this example, the mpm_idx syntax element is considered to be the first syntax element in the second group of syntax elements.
In some examples, the second group of syntax elements includes a second syntax element that indicates an intra-prediction mode for the respective PU if a corresponding syntax element in the first group of syntax elements indicates that the intra-prediction mode for luma samples of the respective PU is not based on an index to a list of most probable modes. For example, if the prev_intra_luma_pred_flag syntax element of the respective PU indicates that the intra prediction mode of the respective PU is not based on an index to the list of most probable modes, the corresponding rem_intra_luma_pred_mode syntax element of the respective PU indicates the intra prediction mode. In this example, the rem_intra_luma_pred_mode syntax element is considered to be the second syntax element in the second group of syntax elements.
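Taken together, the flag and the two conditional syntax elements determine the luma mode roughly as follows (a sketch consistent with the HEVC-style derivation, in which rem_intra_luma_pred_mode numbers only the modes absent from the most probable mode list):

```python
def derive_luma_mode(prev_flag, candidates, mpm_idx=None, rem_mode=None):
    """Recover the luma intra prediction mode from the decoded syntax
    elements: either index the most-probable-mode list directly, or map
    rem_intra_luma_pred_mode past the list entries it skips."""
    if prev_flag:
        return candidates[mpm_idx]
    mode = rem_mode
    # rem_mode skips the modes in the MPM list, so step over each list
    # entry at or below it, taking the entries in ascending order.
    for cand in sorted(candidates):
        if mode >= cand:
            mode += 1
    return mode
```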
As described above, the list of most probable modes constructed by video decoder 30 identifies one or more intra prediction modes of one or more neighboring PUs. In some examples, video decoder 30 may receive all syntax elements of the first group of syntax elements of the PUs of the CU before receiving any syntax elements of the second group of syntax elements of the PUs of the CU. In some examples, video decoder 30 may CABAC decode all syntax elements of the first group of syntax elements before bypass decoding any syntax elements of the second group of syntax elements.
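A sketch of the HEVC-style construction of that list from the two neighboring modes follows (callers substitute DC for an unavailable or non-intra neighbor; the constants assume the 35-mode numbering with Planar = 0, DC = 1, Vertical = 26):

```python
PLANAR, DC, VERTICAL = 0, 1, 26

def build_mpm_list(left_mode, above_mode):
    """Build the 3-entry most probable mode list from the luma intra
    modes of the left and above neighboring PUs."""
    if left_mode == above_mode:
        if left_mode < 2:  # Planar or DC from both neighbors
            return [PLANAR, DC, VERTICAL]
        # Same angular mode twice: add its two angular neighbors,
        # wrapping within the angular range.
        return [left_mode,
                2 + ((left_mode + 29) % 32),
                2 + ((left_mode - 1) % 32)]
    mpm = [left_mode, above_mode]
    # Third entry: the first of Planar, DC, Vertical not already present.
    for cand in (PLANAR, DC, VERTICAL):
        if cand not in mpm:
            mpm.append(cand)
            break
    return mpm
```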
In the above example, when the prediction mode of a CU is INTRA_NxN, the CU includes four PUs, each PU has its own luma intra prediction mode, and the entire CU has only one chroma intra prediction mode. This design, however, may have some potential drawbacks. As one example, prediction performance may be inefficient because four PUs with different texture patterns are forced to use the same chroma prediction mode. As another example, mode signaling may be inconsistent between the luma component and the chroma component.
Some examples are described below to address the drawbacks associated with the requirement that there be only one chroma intra prediction mode for the entire CU. These examples may be combined with the techniques described above, or may be applied separately from them.
To address the potential problem of inefficient prediction performance introduced above, the techniques of this disclosure include signaling one chroma intra prediction mode for each PU (as is done for luma intra prediction modes); i.e., in the context of the current HEVC standard, four chroma_intra_prediction_mode syntax elements are signaled for a CU with a prediction mode such as INTRA_NxN. With this design, one "luma_intra_prediction_mode" syntax element and one "chroma_intra_prediction_mode" syntax element are therefore signaled to indicate the intra prediction mode of each PU.
In some examples, this intra-mode signaling scheme may be applied unless the current CU size is equal to 8×8 and the minimum transform unit size is 4×4. More generally, this exception exists when the current CU size is twice the minimum transform unit size in each dimension, which means that the chroma component of the current CU is equal to the minimum transform unit size and cannot be further split into four PUs (assuming the original format is YUV 4:2:0). In this case, only one chroma_intra_prediction_mode is signaled for the current CU.
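The exception can be expressed as a one-line predicate (a sketch; cu_size and min_tu_size are assumed to be luma dimensions in samples):

```python
def signals_single_chroma_mode(cu_size, min_tu_size):
    """True when only one chroma_intra_prediction_mode is signaled for
    the whole CU: at twice the minimum TU size in each dimension, the
    CU's chroma block (YUV 4:2:0) already equals the minimum TU size
    and cannot be split four ways."""
    return cu_size == 2 * min_tu_size
```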
According to these examples, the coding structure for intra modes may be implemented as shown in the examples of table 11 and table 12.
prev_intra_luma_pred_flag[x0][y0]    CABAC
if(prev_intra_luma_pred_flag[x0][y0])
    mpm_idx[x0][y0]    Bypass path
else
    rem_intra_luma_pred_mode[x0][y0]    Bypass path
chroma_mode_cabac_bins[x0][y0]    CABAC
chroma_mode_bypass_bins[x0][y0]    Bypass path
Table 11: intra-2 Nx 2N intra-mode decoding structure
Table 12: intra _ NxN intra mode decoding structure
The following describes examples in which intra prediction modes are defined for the luma samples of each PU and the chroma samples of each PU of a CU, rather than only one intra prediction mode for the chroma samples of the entire CU. Tables 13 and 14 define manners for arranging the intra prediction syntax for intra_2Nx2N and intra_NxN intra mode coding structures, respectively, in which all CABAC-coded bins of the luma intra prediction mode and the chroma intra prediction mode are coded together, and then all bypass-coded bins of the luma intra prediction mode and the chroma intra prediction mode are coded together. Tables 15 and 16 define manners for arranging the intra prediction syntax for intra_2Nx2N and intra_NxN intra mode coding structures, respectively, in which all CABAC-coded bins of the intra prediction modes of all PUs in the CU are coded together, and then all bypass-coded bins of the intra prediction modes are coded together. Tables 17 and 18 define manners for arranging the intra prediction syntax for intra_2Nx2N and intra_NxN intra mode coding structures, respectively, in which all CABAC-coded bins for both the luma and chroma intra prediction modes of all PUs in the CU are coded together, and then all bypass-coded bins for the luma and chroma intra prediction modes are coded together.
Table 13: intra_2Nx2N intra mode decoding structure
Table 14: intra_NxN intra mode decoding structure
Table 15: intra_2Nx2N intra mode decoding structure
Table 16: intra_NxN intra mode decoding structure
Table 17: intra_2Nx2N intra mode decoding structure
Table 18: intra_NxN intra mode decoding structure
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media (which corresponds to tangible media such as data storage media) or communication media, including any medium that facilitates transfer of a computer program from one place to another, such as in accordance with a communication protocol. In this manner, the computer-readable medium may generally correspond to (1) a tangible computer-readable storage medium that is not transitory, or (2) a communication medium such as a signal or carrier wave. A data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. The computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The instructions may be executed by one or more processors, such as one or more Digital Signal Processors (DSPs), general purpose microprocessors, Application Specific Integrated Circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Thus, the term "processor," as used herein, may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. Additionally, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses including a wireless handset, an Integrated Circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit, or provided by a set of interoperative hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware. Various examples have been described. These and other examples are within the scope of the following claims.

Claims (47)

1. A method of decoding video data, the method comprising:
receiving an entropy encoded bitstream that includes a first group of syntax elements and a second group of syntax elements that include a plurality of flags of a Prediction Unit (PU) of a Coding Unit (CU), wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein one flag of the plurality of flags of the first group of syntax elements indicates whether an intra-prediction mode of luma samples of a respective PU in the CU is based on an index to a list of most probable modes;
Context Adaptive Binary Arithmetic Coding (CABAC) decoding the first group of syntax elements;
after CABAC decoding the first group of syntax elements, bypass decoding the second group of syntax elements, wherein the second group of syntax elements comprises one of:
a first syntax element that indicates the index to the list of most probable modes if the flag in the first group of syntax elements indicates that the intra-prediction mode of the luma samples of the respective PU is based on the index to the list of most probable modes; and
a second syntax element that indicates the intra prediction mode for the luma samples of the respective PU if the flag in the first group of syntax elements indicates that the intra prediction mode for the luma samples of the respective PU is not based on the index to the list of most probable modes; and
reconstructing the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements.
2. The method of decoding video data according to claim 1, further comprising:
CABAC decoding a first set of bins to indicate a chroma intra prediction mode separate from the CABAC decoding of the first group of syntax elements; and
bypass decoding a second set of bins to indicate the chroma intra prediction mode separately from the bypass decoding of the second group of syntax elements.
3. The method of decoding video data according to claim 1, wherein the list of most probable modes identifies one or more intra-prediction modes of one or more neighboring PUs.
4. The method of decoding video data according to claim 1, wherein receiving comprises receiving all syntax elements of the first group of syntax elements for the PUs of the CU prior to receiving any syntax elements of the second group of syntax elements for the PUs of the CU.
5. The method of decoding video data according to claim 1, wherein CABAC decoding comprises CABAC decoding all syntax elements of the first group of syntax elements before bypass decoding any syntax elements of the second group of syntax elements.
6. The method of decoding video data according to claim 1, wherein the first group of syntax elements includes four flags, and wherein each of the four flags is associated with a respective PU of an INTRA nxn PU structure of the CU.
7. The method of decoding video data according to claim 1, wherein reconstructing comprises:
intra-prediction decode the PU of the CU based on the first group of syntax elements and the second group of syntax elements.
8. An apparatus for decoding video data, the apparatus comprising a video decoder configured to:
receive an entropy encoded bitstream that includes a first group of syntax elements and a second group of syntax elements that include a plurality of flags of a Prediction Unit (PU) of a Coding Unit (CU), wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein one flag of the plurality of flags of the first group of syntax elements indicates whether an intra-prediction mode of luma samples of a respective PU in the CU is based on an index to a list of most probable modes;
Context Adaptive Binary Arithmetic Coding (CABAC) decode the first group of syntax elements;
after CABAC decoding the first group of syntax elements, bypass decode the second group of syntax elements, wherein the second group of syntax elements comprises one of:
a first syntax element that indicates the index to the list of most probable modes if the flag in the first group of syntax elements indicates that the intra-prediction mode of the luma samples of the respective PU is based on the index to the list of most probable modes; and
a second syntax element that indicates the intra prediction mode for the luma samples of the respective PU if the flag in the first group of syntax elements indicates that the intra prediction mode for the luma samples of the respective PU is not based on the index to the list of most probable modes; and
reconstruct the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements.
9. The apparatus for decoding video data according to claim 8, wherein said video decoder is configured to:
CABAC decoding a first set of bins to indicate a chroma intra prediction mode separate from the CABAC decoding of the first group of syntax elements; and
bypass decoding a second set of bins to indicate the chroma intra prediction mode separately from the bypass decoding of the second group of syntax elements.
10. The apparatus for decoding video data according to claim 8, wherein the list of most probable modes identifies one or more intra prediction modes of one or more neighboring PUs.
11. The apparatus for decoding video data according to claim 8, wherein to receive, the video decoder is configured to receive all syntax elements of the first group of syntax elements for the PUs of the CU prior to receiving any syntax elements of the second group of syntax elements for the PUs of the CU.
12. The apparatus for decoding video data according to claim 8, wherein, for CABAC decoding, the video decoder is configured to CABAC decode all syntax elements of the first group of syntax elements before bypass decoding any syntax elements of the second group of syntax elements.
13. The apparatus for decoding video data according to claim 8, wherein the first group of syntax elements includes four flags, and wherein each of the four flags is associated with a respective PU of an INTRA nxn PU structure of the CU.
14. The apparatus for decoding video data according to claim 8, wherein, to reconstruct, said video decoder is configured to:
intra-prediction decode the PU of the CU based on the first group of syntax elements and the second group of syntax elements.
15. The apparatus for decoding video data according to claim 8, wherein said apparatus comprises at least one of:
an integrated circuit;
a microprocessor; and
a wireless communication device including the video decoder.
16. A computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors of a device for decoding video data to:
receive an entropy encoded bitstream that includes a first group of syntax elements and a second group of syntax elements that include a plurality of flags of a Prediction Unit (PU) of a Coding Unit (CU), wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein one flag of the plurality of flags of the first group of syntax elements indicates whether an intra-prediction mode of luma samples of a respective PU in the CU is based on an index to a list of most probable modes;
Context Adaptive Binary Arithmetic Coding (CABAC) decode the first group of syntax elements;
after CABAC decoding the first group of syntax elements, bypass decode the second group of syntax elements, wherein the second group of syntax elements comprises one of:
a first syntax element that indicates the index to the list of most probable modes if the flag in the first group of syntax elements indicates that the intra-prediction mode of the luma samples of the respective PU is based on the index to the list of most probable modes; and
a second syntax element that indicates the intra prediction mode for the luma samples of the respective PU if the flag in the first group of syntax elements indicates that the intra prediction mode for the luma samples of the respective PU is not based on the index to the list of most probable modes; and
reconstruct the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements.
17. The computer-readable storage medium of claim 16, further comprising instructions that cause the one or more processors to:
CABAC decoding a first set of bins to indicate a chroma intra prediction mode separate from the CABAC decoding of the first group of syntax elements; and
bypass decoding a second set of bins to indicate the chroma intra prediction mode separately from the bypass decoding of the second group of syntax elements.
18. The computer-readable storage medium of claim 16, wherein the list of most probable modes identifies one or more intra prediction modes of one or more neighboring PUs.
19. The computer-readable storage medium of claim 16, wherein the instructions that cause the one or more processors to receive comprise instructions that cause the one or more processors to: receiving all syntax elements of the first group of syntax elements for the PUs of the CU prior to receiving any syntax elements of the second group of syntax elements for the PUs of the CU.
20. The computer-readable storage medium of claim 16, wherein the instructions that cause the one or more processors to CABAC decode comprise instructions that cause the one or more processors to: CABAC decode all syntax elements of the first group of syntax elements before bypass decoding any syntax elements of the second group of syntax elements.
21. The computer-readable storage medium of claim 16, wherein the first group of syntax elements includes four flags, and wherein each of the four flags is associated with a respective PU of an INTRA nxn PU structure of the CU.
22. The computer-readable storage medium of claim 16, wherein the instructions that cause the one or more processors to reconstruct comprise instructions that cause the one or more processors to:
intra-prediction decode the PU of the CU based on the first group of syntax elements and the second group of syntax elements.
23. An apparatus for decoding video data, the apparatus comprising:
means for receiving an entropy encoded bitstream that includes a first group of syntax elements comprising a plurality of flags and a second group of syntax elements for a Prediction Unit (PU) of a Coding Unit (CU), wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein one flag of the plurality of flags of the first group of syntax elements indicates whether an intra-prediction mode for luma samples of a respective PU in the CU is based on an index to a list of most probable modes;
means for Context Adaptive Binary Arithmetic Coding (CABAC) decoding the first group of syntax elements;
means for bypass decoding the second group of syntax elements after CABAC decoding the first group of syntax elements, wherein the second group of syntax elements comprises one of:
a first syntax element that indicates the index to the list of most probable modes if the flag in the first group of syntax elements indicates that the intra-prediction mode of the luma samples of the respective PU is based on the index to the list of most probable modes; and
a second syntax element that indicates the intra prediction mode for the luma samples of the respective PU if the flag in the first group of syntax elements indicates that the intra prediction mode for the luma samples of the respective PU is not based on the index to the list of most probable modes; and
means for reconstructing the video data based on the decoded first group of syntax elements and the decoded second group of syntax elements.
24. The apparatus for decoding video data according to claim 23, further comprising:
means for CABAC decoding a first set of bins to indicate a chroma intra prediction mode separately from the CABAC decoding of the first group of syntax elements; and
means for bypass decoding a second set of bins to indicate the chroma intra prediction mode separately from the bypass decoding of the second group of syntax elements.
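The two-pass ordering recited in claims 19–24 — CABAC decoding all group-1 flags before bypass decoding any group-2 elements — can be illustrated with a short sketch. The element roles mirror the HEVC syntax elements (prev_intra_luma_pred_flag, mpm_idx, rem_intra_luma_pred_mode), but the plain lists standing in for CABAC- and bypass-decoded bins are illustrative assumptions, not an arithmetic decoder.

```python
# Sketch of the claimed decode ordering for the four PUs of an INTRA_NxN CU.
# Pass 1 consumes all CABAC-coded flags (group 1); pass 2 consumes all
# bypass-coded values (group 2). Lists stand in for real decoded bins.

def decode_cu_intra_modes(flag_bits, bypass_values, mpm_lists):
    """Recover luma intra modes for the 4 PUs of an INTRA_NxN CU.

    flag_bits     -- CABAC-decoded flag per PU (group 1)
    bypass_values -- bypass-decoded value per PU, in PU order (group 2)
    mpm_lists     -- most-probable-mode list per PU
    """
    # Pass 1: all group-1 flags are (conceptually) CABAC decoded first.
    flags = [flag_bits[i] for i in range(4)]

    # Pass 2: all group-2 elements are bypass decoded afterwards.
    modes = []
    for i in range(4):
        if flags[i]:
            # bypass value is an index into the MPM list (mpm_idx)
            modes.append(mpm_lists[i][bypass_values[i]])
        else:
            # bypass value directly signals the remaining mode (simplified)
            modes.append(bypass_values[i])
    return modes
```

Grouping all context-coded bins ahead of all bypass bins lets a hardware decoder switch the arithmetic engine into bypass mode once per CU instead of toggling per syntax element.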
25. A method of encoding video data, the method comprising:
generating a first group of syntax elements comprising a plurality of flags, wherein one flag of the plurality of flags of the first group of syntax elements indicates whether an intra-prediction mode of luma samples of a respective Prediction Unit (PU) in a Coding Unit (CU) is based on an index to a list of most probable modes;
generating a second group of syntax elements, wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein the second group of syntax elements comprises one of:
a first syntax element that indicates the index to the list of most probable modes if the flag in the first group of syntax elements indicates that the intra-prediction mode of the luma samples of the respective PU is based on the index to the list of most probable modes; and
a second syntax element that indicates the intra prediction mode for the luma samples of the respective PU if the flag in the first group of syntax elements indicates that the intra prediction mode for the luma samples of the respective PU is not based on the index to the list of most probable modes;
Context Adaptive Binary Arithmetic Coding (CABAC) encoding the first group of syntax elements;
bypass encoding the second group of syntax elements after CABAC encoding the first group of syntax elements; and
outputting the video data including the encoded first group of syntax elements and the encoded second group of syntax elements.
26. The method of encoding video data of claim 25, further comprising:
CABAC encoding a first set of bins to indicate a chroma intra prediction mode separately from the CABAC encoding of the first group of syntax elements; and
bypass encoding a second set of bins to indicate the chroma intra prediction mode separately from the bypass encoding of the second group of syntax elements.
27. The method of encoding video data of claim 25, wherein the list of most probable modes identifies one or more intra-prediction modes of one or more neighboring PUs.
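The list of most probable modes recited in claim 27 can be sketched as below. This is a deliberately simplified construction — neighbor modes first, padded with Planar/DC/vertical defaults — whereas the actual HEVC derivation has additional special cases (e.g., when both neighbors carry the same angular mode); the default mode numbers follow HEVC's numbering but are assumptions of this sketch.

```python
def build_mpm_list(left_mode, above_mode, planar=0, dc=1, vertical=26):
    """Build a 3-entry most-probable-mode list from neighboring PU modes.

    Simplified: deduplicated neighbor modes first, then default modes
    (Planar=0, DC=1, vertical=26 in HEVC's numbering) as padding.
    """
    mpm = []
    for m in (left_mode, above_mode, planar, dc, vertical):
        if m is not None and m not in mpm:
            mpm.append(m)
        if len(mpm) == 3:
            break
    return mpm
```

With such a list, a hit on a neighbor's mode costs only a flag plus a short index instead of a full mode codeword.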
28. The method of encoding video data of claim 25, wherein outputting comprises outputting the video data that includes all syntax elements of the encoded first group of syntax elements before outputting any syntax elements of the second group of syntax elements.
29. The method of encoding video data of claim 25, wherein CABAC encoding comprises CABAC encoding all syntax elements of the first group of syntax elements before bypass encoding any syntax elements of the second group of syntax elements.
30. The method of encoding video data of claim 25, wherein the first group of syntax elements includes four flags, and wherein each of the four flags is associated with a respective PU of an INTRA_NxN PU structure of the CU.
31. The method of encoding video data of claim 25, further comprising:
intra-prediction encoding the PU of the CU based on the first group of syntax elements and the second group of syntax elements.
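The encoder-side grouping of claims 25–31 — every CABAC-coded flag generated before any bypass-coded element is emitted — can be sketched as follows. The flag/index split mirrors the claimed first and second groups; representing the remaining mode as the raw mode number is a simplification of this sketch, not the claimed binarization.

```python
def encode_cu_intra_modes(pu_modes, mpm_lists):
    """Split luma intra-mode signaling for 4 PUs into the two claimed groups.

    Returns (flags, bypass): all group-1 CABAC-coded flags, followed by
    all group-2 bypass-coded values, in PU order.
    """
    flags, bypass = [], []
    for mode, mpm in zip(pu_modes, mpm_lists):
        if mode in mpm:
            flags.append(1)                  # group 1: CABAC-coded flag
            bypass.append(mpm.index(mode))   # group 2: index into MPM list
        else:
            flags.append(0)
            bypass.append(mode)              # group 2: remaining mode (simplified)
    # All group-1 flags precede all group-2 elements in the output order.
    return flags, bypass
```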
32. An apparatus for encoding video data, the apparatus comprising a video encoder configured to:
generate a first group of syntax elements comprising a plurality of flags, wherein one flag of the plurality of flags of the first group of syntax elements indicates whether an intra-prediction mode of luma samples of a respective Prediction Unit (PU) in a Coding Unit (CU) is based on an index to a list of most probable modes;
generate a second group of syntax elements, wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, and wherein the second group of syntax elements comprises one of:
a first syntax element that indicates the index to the list of most probable modes if the flag in the first group of syntax elements indicates that the intra-prediction mode of the luma samples of the respective PU is based on the index to the list of most probable modes; and
a second syntax element that indicates the intra prediction mode for the luma samples of the respective PU if the flag in the first group of syntax elements indicates that the intra prediction mode for the luma samples of the respective PU is not based on the index to the list of most probable modes;
Context Adaptive Binary Arithmetic Coding (CABAC) encode the first group of syntax elements;
bypass encode the second group of syntax elements after CABAC encoding the first group of syntax elements; and
output the video data including the encoded first group of syntax elements and the encoded second group of syntax elements.
33. The apparatus for encoding video data according to claim 32, wherein said video encoder is configured to:
CABAC encode a first set of bins to indicate a chroma intra prediction mode separately from the CABAC encoding of the first group of syntax elements; and
bypass encode a second set of bins to indicate the chroma intra prediction mode separately from the bypass encoding of the second group of syntax elements.
34. The apparatus for encoding video data of claim 32, wherein the list of most probable modes identifies one or more intra prediction modes of one or more neighboring PUs.
35. The apparatus for encoding video data according to claim 32, wherein to output, said video encoder is configured to output the video data that includes all syntax elements of the encoded first group of syntax elements before outputting any syntax elements of the second group of syntax elements.
36. The apparatus for encoding video data according to claim 32, wherein to CABAC encode, said video encoder is configured to CABAC encode all syntax elements of the first group of syntax elements before bypass encoding any syntax elements of the second group of syntax elements.
37. The apparatus for encoding video data of claim 32, wherein the first group of syntax elements includes four flags, and wherein each of the four flags is associated with a respective PU of an INTRA_NxN PU structure of the CU.
38. The apparatus for encoding video data according to claim 32, wherein said video encoder is configured to:
intra-prediction encode the PU of the CU based on the first group of syntax elements and the second group of syntax elements.
39. A computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors of a device for encoding video data to:
generate a first group of syntax elements comprising a plurality of flags, wherein one flag of the plurality of flags of the first group of syntax elements indicates whether an intra-prediction mode of luma samples of a respective Prediction Unit (PU) in a Coding Unit (CU) is based on an index to a list of most probable modes;
generate a second group of syntax elements, wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, wherein the second group of syntax elements comprises one of:
a first syntax element that indicates the index to the list of most probable modes if the flag in the first group of syntax elements indicates that the intra-prediction mode of the luma samples of the respective PU is based on the index to the list of most probable modes; and
a second syntax element that indicates the intra prediction mode for the luma samples of the respective PU if the flag in the first group of syntax elements indicates that the intra prediction mode for the luma samples of the respective PU is not based on the index to the list of most probable modes;
Context Adaptive Binary Arithmetic Coding (CABAC) encode the first group of syntax elements;
bypass encode the second group of syntax elements after CABAC encoding the first group of syntax elements; and
output the video data including the encoded first group of syntax elements and the encoded second group of syntax elements.
40. The computer-readable storage medium of claim 39, further comprising instructions that cause the one or more processors to:
CABAC encode a first set of bins to indicate a chroma intra prediction mode separately from the CABAC encoding of the first group of syntax elements; and
bypass encode a second set of bins to indicate the chroma intra prediction mode separately from the bypass encoding of the second group of syntax elements.
41. The computer-readable storage medium of claim 39, wherein the list of most probable modes identifies one or more intra prediction modes of one or more neighboring PUs.
42. The computer-readable storage medium of claim 39, wherein the instructions that cause the one or more processors to output comprise instructions that cause the one or more processors to: outputting the video data that includes all syntax elements of the encoded first group of syntax elements before outputting any syntax elements of the second group of syntax elements.
43. The computer-readable storage medium of claim 39, wherein the instructions that cause the one or more processors to CABAC encode comprise instructions that cause the one or more processors to: CABAC encode all syntax elements of the first group of syntax elements before bypass encoding any syntax elements of the second group of syntax elements.
44. The computer-readable storage medium of claim 39, wherein the first group of syntax elements includes four flags, and wherein each of the four flags is associated with a respective PU of an INTRA_NxN PU structure of the CU.
45. The computer-readable storage medium of claim 39, further comprising instructions that cause the one or more processors to:
intra-prediction encode the PU of the CU based on the first group of syntax elements and the second group of syntax elements.
46. An apparatus for encoding video data, the apparatus comprising:
means for generating a first group of syntax elements comprising a plurality of flags, wherein one flag of the plurality of flags of the first group of syntax elements indicates whether an intra-prediction mode of luma samples of a respective Prediction Unit (PU) in a Coding Unit (CU) is based on an index to a list of most probable modes;
means for generating a second group of syntax elements, wherein syntax elements of the second group of syntax elements correspond to respective syntax elements of the first group of syntax elements, wherein the second group of syntax elements comprises one of:
a first syntax element that indicates the index to the list of most probable modes if the flag in the first group of syntax elements indicates that the intra-prediction mode of the luma samples of the respective PU is based on the index to the list of most probable modes; and
a second syntax element that indicates the intra prediction mode for the luma samples of the respective PU if the flag in the first group of syntax elements indicates that the intra prediction mode for the luma samples of the respective PU is not based on the index to the list of most probable modes;
means for Context Adaptive Binary Arithmetic Coding (CABAC) encoding the first group of syntax elements;
means for bypass encoding the second group of syntax elements after CABAC encoding the first group of syntax elements; and
means for outputting the video data including the encoded first group of syntax elements and the encoded second group of syntax elements.
47. The apparatus for encoding video data as recited in claim 46, further comprising:
means for CABAC encoding a first set of bins to indicate a chroma intra prediction mode separately from the CABAC encoding of the first group of syntax elements; and
means for bypass encoding a second set of bins to indicate the chroma intra prediction mode separately from the bypass encoding of the second group of syntax elements.
HK15101460.2A 2012-04-11 2013-04-05 Method and device for encoding and decoding video data HK1201003B (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US201261623004P 2012-04-11 2012-04-11
US61/623,004 2012-04-11
US201261639836P 2012-04-27 2012-04-27
US61/639,836 2012-04-27
US13/839,855 2013-03-15
US13/839,855 US9538172B2 (en) 2012-04-11 2013-03-15 Grouping bypass coded syntax elements in video coding
PCT/US2013/035465 WO2013154939A1 (en) 2012-04-11 2013-04-05 Grouping bypass coded syntax elements in video coding

Publications (2)

Publication Number Publication Date
HK1201003A1 HK1201003A1 (en) 2015-08-14
HK1201003B true HK1201003B (en) 2019-08-09
