
HK1216571B - Methods, devices, and computer-readable storage mediums for video encoding and decoding - Google Patents


Info

Publication number: HK1216571B
Authority: HK (Hong Kong)
Prior art keywords: sub, reference block, current, block, motion
Application number: HK16104546.3A
Other languages: Chinese (zh)
Other versions: HK1216571A1 (en)
Inventors: Li Zhang (张莉), Ying Chen (陈颖), Vijayaraghavan Thirumalai (维贾伊拉加哈万.提鲁马莱), Hongbin Liu (刘鸿彬)
Original Assignee: QUALCOMM Incorporated (高通股份有限公司)
Priority claimed from: PCT/CN2013/001639 (WO2015010226A1)
Publication of HK1216571A1
Publication of HK1216571B


Description

Methods, devices, and computer-readable storage mediums for video encoding and decoding

This application claims the benefit of U.S. Provisional Application No. 61/872,540, filed August 30, 2013, and U.S. Provisional Application No. 61/913,031, filed December 6, 2013, each of which is incorporated herein by reference in its entirety.

Technical Field

The present invention relates to video encoding and decoding.

Background Art

Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones (so-called "smartphones"), video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the High Efficiency Video Coding (HEVC) standard currently under development, and extensions of such standards. By implementing such video compression techniques, video devices can more efficiently transmit, receive, encode, decode, and/or store digital video information.

Video compression techniques perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (i.e., a video frame or a portion of a video frame) may be partitioned into video blocks. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.

Spatial or temporal prediction results in a predictive block for the block to be coded. Residual data represents the pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded based on a motion vector pointing to a block of reference samples forming the predictive block and residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded based on an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, producing residual coefficients that may then be quantized. The quantized coefficients, initially arranged in a two-dimensional array, may be scanned to produce a one-dimensional vector of coefficients, and entropy coding may be applied to achieve even more compression.
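The scan step described above can be sketched as follows. The 4×4 block of quantized coefficients and the anti-diagonal scan pattern are illustrative assumptions for this sketch; actual codecs define several specific scan orders.

```python
# Sketch: scanning a 2-D array of quantized coefficients into a 1-D
# vector, as described in the text. Because quantization concentrates
# nonzero values near the top-left corner, the scan front-loads them,
# which helps the subsequent entropy-coding stage.

def diagonal_scan(block):
    """Scan a square block along its anti-diagonals (row + col == s),
    producing a 1-D coefficient vector."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):
        for row in range(n):
            col = s - row
            if 0 <= col < n:
                out.append(block[row][col])
    return out

# Illustrative quantized coefficients (not from any real bitstream).
quantized = [
    [9, 4, 1, 0],
    [3, 2, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(diagonal_scan(quantized))
# [9, 4, 3, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

Note how the six nonzero coefficients end up at the head of the vector, followed by a run of zeros that entropy coding can represent compactly.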

A multi-view coding bitstream may be generated by, for example, encoding views from multiple viewpoints. Some three-dimensional (3D) video standards have been developed that make use of multi-view coding aspects. For example, different views may transmit left-eye and right-eye views to support 3D video. Alternatively, some 3D video coding processes may apply so-called multi-view plus depth coding. In multi-view plus depth coding, a 3D video bitstream may contain not only texture view components but also depth view components. For example, each view may comprise one texture view component and one depth view component.

Summary of the Invention

In general, this disclosure relates to three-dimensional (3D) video coding based on advanced codecs, including depth coding techniques. For example, some of the techniques of this disclosure relate to advanced motion prediction for 3-dimensional High Efficiency Video Coding (3D-HEVC). In some examples, a video coder determines a candidate for inclusion in a candidate list for a current prediction unit (PU). The candidate is based on motion parameters of a plurality of sub-PUs of the current PU. When generating the candidate, the video coder may process the sub-PUs in a particular order, such as raster scan order. If the reference block corresponding to a sub-PU is not coded using motion compensated prediction, the video coder sets the motion parameters of the sub-PU to default motion parameters. For each respective sub-PU from the plurality of sub-PUs, if the reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that a reference block of any later sub-PU in the order is coded using motion compensated prediction.

In one example, this disclosure describes a method for decoding multi-view video data, the method comprising: partitioning a current prediction unit (PU) into a plurality of sub-PUs, wherein the current PU is in a current picture; determining default motion parameters; processing sub-PUs from the plurality of sub-PUs in a particular order, wherein, for each respective sub-PU from the plurality of sub-PUs, if a reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that a reference block of any later sub-PU in the order is coded using motion compensated prediction, wherein the reference block of at least one of the sub-PUs is not coded using motion compensated prediction, and wherein processing the sub-PUs comprises, for each respective sub-PU from the plurality of sub-PUs: determining a reference block for the respective sub-PU, wherein a reference picture includes the reference block for the respective sub-PU; if the reference block of the respective sub-PU is coded using motion compensated prediction, setting the motion parameters of the respective sub-PU based on the motion parameters of the reference block of the respective sub-PU; and if the reference block of the respective sub-PU is not coded using motion compensated prediction, setting the motion parameters of the respective sub-PU to the default motion parameters; including a candidate in a candidate list for the current PU, wherein the candidate is based on the motion parameters of the plurality of sub-PUs; obtaining, from a bitstream, a syntax element indicating a selected candidate in the candidate list; and using the motion parameters of the selected candidate to reconstruct a predictive block for the current PU.

In another example, this disclosure describes a method of encoding video data, the method comprising: partitioning a current prediction unit (PU) into a plurality of sub-PUs, wherein the current PU is in a current picture; determining default motion parameters; processing sub-PUs from the plurality of sub-PUs in a particular order, wherein, for each respective sub-PU from the plurality of sub-PUs, if a reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that a reference block of any later sub-PU in the order is coded using motion compensated prediction, wherein the reference block of at least one of the sub-PUs is not coded using motion compensated prediction, and wherein processing the sub-PUs comprises, for each respective sub-PU from the plurality of sub-PUs: determining a reference block for the respective sub-PU, wherein a reference picture includes the reference block for the respective sub-PU; if the reference block of the respective sub-PU is coded using motion compensated prediction, setting the motion parameters of the respective sub-PU based on the motion parameters of the reference block of the respective sub-PU; and if the reference block of the respective sub-PU is not coded using motion compensated prediction, setting the motion parameters of the respective sub-PU to the default motion parameters; including a candidate in a candidate list for the current PU, wherein the candidate is based on the motion parameters of the plurality of sub-PUs; and signaling, in a bitstream, a syntax element indicating a selected candidate in the candidate list.

In another example, this disclosure describes a device for coding video data, the device comprising: a memory for storing decoded pictures; and one or more processors configured to: partition a current prediction unit (PU) into a plurality of sub-PUs, wherein the current PU is in a current picture; determine default motion parameters; process sub-PUs from the plurality of sub-PUs in a particular order, wherein, for each respective sub-PU from the plurality of sub-PUs, if a reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that a reference block of any later sub-PU in the order is coded using motion compensated prediction, wherein the reference block of at least one of the sub-PUs is not coded using motion compensated prediction, and wherein processing the sub-PUs comprises, for each respective sub-PU from the plurality of sub-PUs: determining a reference block for the respective sub-PU, wherein a reference picture includes the reference block for the respective sub-PU; if the reference block of the respective sub-PU is coded using motion compensated prediction, setting the motion parameters of the respective sub-PU based on the motion parameters of the reference block of the respective sub-PU; and if the reference block of the respective sub-PU is not coded using motion compensated prediction, setting the motion parameters of the respective sub-PU to the default motion parameters; and include a candidate in a candidate list for the current PU, wherein the candidate is based on the motion parameters of the plurality of sub-PUs.

In another example, this disclosure describes a device for coding video data, the device comprising: means for partitioning a current prediction unit (PU) into a plurality of sub-PUs, wherein the current PU is in a current picture; means for determining default motion parameters; means for processing sub-PUs from the plurality of sub-PUs in a particular order, wherein, for each respective sub-PU from the plurality of sub-PUs, if a reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that a reference block of any later sub-PU in the order is coded using motion compensated prediction, wherein the reference block of at least one of the sub-PUs is not coded using motion compensated prediction, and wherein the means for processing the sub-PUs comprises, for each respective sub-PU from the plurality of sub-PUs: means for determining a reference block for the respective sub-PU, wherein a reference picture includes the reference block for the respective sub-PU; means for setting the motion parameters of the respective sub-PU based on the motion parameters of the reference block of the respective sub-PU if the reference block of the respective sub-PU is coded using motion compensated prediction; and means for setting the motion parameters of the respective sub-PU to the default motion parameters if the reference block of the respective sub-PU is not coded using motion compensated prediction; and means for including a candidate in a candidate list for the current PU, wherein the candidate is based on the motion parameters of the plurality of sub-PUs.

In another example, this disclosure describes a non-transitory computer-readable data storage medium having instructions stored thereon that, when executed, cause a device to: partition a current prediction unit (PU) into a plurality of sub-PUs, wherein the current PU is in a current picture; determine default motion parameters; process sub-PUs from the plurality of sub-PUs in a particular order, wherein, for each respective sub-PU from the plurality of sub-PUs, if a reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that a reference block of any later sub-PU in the order is coded using motion compensated prediction, wherein the reference block of at least one of the sub-PUs is not coded using motion compensated prediction, and wherein processing the sub-PUs comprises, for each respective sub-PU from the plurality of sub-PUs: determining a reference block for the respective sub-PU, wherein a reference picture includes the reference block for the respective sub-PU; if the reference block of the respective sub-PU is coded using motion compensated prediction, setting the motion parameters of the respective sub-PU based on the motion parameters of the reference block of the respective sub-PU; and if the reference block of the respective sub-PU is not coded using motion compensated prediction, setting the motion parameters of the respective sub-PU to the default motion parameters; and include a candidate in a candidate list for the current PU, wherein the candidate is based on the motion parameters of the plurality of sub-PUs.

The details of one or more examples of the disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

Brief Description of the Drawings

FIG. 1 is a block diagram illustrating an example video coding system that may utilize the techniques described in this disclosure.

FIG. 2 is a conceptual diagram illustrating example intra-prediction modes in High Efficiency Video Coding (HEVC).

FIG. 3 is a conceptual diagram illustrating example spatially neighboring blocks relative to a current block.

FIG. 4 is a conceptual diagram illustrating an example multi-view decoding order.

FIG. 5 is a conceptual diagram illustrating an example prediction structure for multi-view coding.

FIG. 6 is a conceptual diagram illustrating example temporal neighboring blocks in neighboring-block-based disparity vector (NBDV) derivation.

FIG. 7 is a conceptual diagram illustrating derivation of a depth block from a reference view to perform backward view synthesis prediction (BVSP).

FIG. 8 is a conceptual diagram illustrating an example derivation of an inter-view predicted motion vector candidate for merge/skip mode.

FIG. 9 is a table indicating an example specification of l0CandIdx and l1CandIdx in 3D-HEVC.

FIG. 10 is a conceptual diagram illustrating an example derivation of a motion vector inheritance candidate for depth coding.

FIG. 11 illustrates an example prediction structure for advanced residual prediction (ARP) in multi-view video coding.

FIG. 12 is a conceptual diagram illustrating an example relationship among a current block, a reference block, and motion compensated blocks.

FIG. 13 is a conceptual diagram illustrating sub-prediction unit (PU) inter-view motion prediction.

FIG. 14 is a block diagram illustrating an example video encoder that may implement the techniques described in this disclosure.

FIG. 15 is a block diagram illustrating an example video decoder that may implement the techniques described in this disclosure.

FIG. 16A is a flowchart illustrating an example operation of a video encoder to encode a coding unit (CU) using inter prediction, in accordance with an example of this disclosure.

FIG. 16B is a flowchart illustrating an example operation of a video decoder to decode a CU using inter prediction, in accordance with an example of this disclosure.

FIG. 17 is a flowchart illustrating an example operation of a video coder to construct a merge candidate list for a current PU in a current view component, in accordance with an example of this disclosure.

FIG. 18 is a flowchart illustrating a continuation of the merge candidate list construction operation of FIG. 17, in accordance with an example of this disclosure.

FIG. 19 is a flowchart illustrating an operation of a video coder to determine an inter-view predicted motion vector candidate or a texture merge candidate, in accordance with an example of this disclosure.

Detailed Description

High Efficiency Video Coding (HEVC) is a newly developed video coding standard. 3D-HEVC is an extension of HEVC for 3D video data. 3D-HEVC provides multiple views of the same scene from different viewpoints. Part of the standardization effort for 3D-HEVC includes the standardization of a multi-view video codec based on HEVC. In 3D-HEVC, inter-view prediction based on reconstructed view components from different views is enabled.

In 3D-HEVC, inter-view motion prediction is similar to the motion compensation used in standard HEVC and may utilize the same or similar syntax elements. Merge mode, skip mode, and advanced motion vector prediction (AMVP) mode are example types of motion prediction. When a video coder performs inter-view motion prediction on a prediction unit (PU), the video coder may use a picture that is in the same access unit as the PU, but in a different view, as a source of motion information. In contrast, other motion compensation methods may only use pictures in different access units as reference pictures. Thus, in 3D-HEVC, the motion parameters of a block in a dependent view may be predicted or inferred based on already-coded motion parameters in other views of the same access unit.

When a video coder performs motion prediction, the video coder may generate a candidate list (e.g., a merge candidate list or an AMVP candidate list) when the motion information of a current PU is signaled using merge mode, skip mode, or AMVP mode. To implement inter-view motion prediction in 3D-HEVC, the video coder may include inter-view predicted motion vector candidates (IPMVCs) in merge candidate lists and AMVP candidate lists. The video coder may use an IPMVC in the same manner as other candidates in a candidate list. An IPMVC may specify the motion information of a PU (i.e., a reference PU) in an inter-view reference picture. The inter-view reference picture may be in the same access unit as the current PU, but in a different view than the current PU.

In some examples, an IPMVC may specify motion parameters (e.g., motion vectors, reference indices, etc.) of multiple sub-PUs of the current PU. In general, each sub-PU of a PU may be associated with a different, equally sized sub-block of a prediction block of the PU. For example, if the luma prediction block of a PU is 32×32 and the sub-PU size is 4×4, the video coder may partition the PU into 64 sub-PUs associated with different 4×4 sub-blocks of the luma prediction block of the PU. In this example, the sub-PUs may also be associated with corresponding sub-blocks of the chroma prediction blocks of the PU. Thus, the IPMVC may specify multiple sets of motion parameters. In such examples, if the IPMVC is the selected candidate in the candidate list, the video coder may determine a predictive block for the current PU based on the multiple sets of motion parameters specified by the IPMVC.
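The partitioning arithmetic in the example above can be checked directly: the sub-PU count is the product of the prediction-block dimensions divided by the sub-PU size in each direction. The function name is an illustrative assumption for this sketch.

```python
# Sketch of the sub-PU partitioning arithmetic from the text: a 32x32
# luma prediction block split at a 4x4 sub-PU size yields 64 sub-PUs,
# each associated with a distinct 4x4 sub-block.

def count_sub_pus(pu_width, pu_height, sub_pu_size):
    """Number of equally sized sub-PUs a prediction block splits into."""
    # The text assumes the block dimensions are multiples of the sub-PU size.
    assert pu_width % sub_pu_size == 0 and pu_height % sub_pu_size == 0
    return (pu_width // sub_pu_size) * (pu_height // sub_pu_size)

print(count_sub_pus(32, 32, 4))  # 64, matching the example in the text
```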

To determine an IPMVC that specifies motion parameters of the sub-PUs of the current PU, the video coder may process each of the sub-PUs according to a raster scan order. When the video coder processes a sub-PU (i.e., the current sub-PU), the video coder may determine a reference block corresponding to the sub-PU based on a disparity vector of the current PU. The reference block may be in the same time instance as the current picture, but in a different view. If the reference block corresponding to the current sub-PU is coded using motion compensated prediction (e.g., the reference block has one or more motion vectors, reference indices, etc.), the video coder may set the motion parameters of the current sub-PU to the motion parameters of the reference block corresponding to the sub-PU. Otherwise, if the reference block corresponding to the current sub-PU is not coded using motion compensated prediction (e.g., the reference block is coded using intra prediction), the video coder may identify, in the raster scan order, the closest sub-PU whose corresponding reference block is coded using motion compensated prediction. The video coder may then set the motion parameters of the current sub-PU to the motion parameters of the reference block corresponding to the identified sub-PU.

In some cases, the identified sub-PU occurs later in the raster scan order of the sub-PUs than the current sub-PU. Thus, when determining the motion parameters of the current sub-PU, the video coder may scan forward to find a sub-PU whose corresponding reference block is coded using motion compensated prediction. Alternatively, the video coder may delay determining the motion parameters of the current sub-PU until the video coder encounters, during the processing of the sub-PUs, a sub-PU whose corresponding reference block is coded using motion compensated prediction. Either of these situations adds complexity and coding delay.

In accordance with one or more techniques of this disclosure, a video coder may partition the current PU into a plurality of sub-PUs. Furthermore, the video coder may determine default motion parameters. Additionally, the video coder may process the sub-PUs from the plurality of sub-PUs in a particular order. In some cases, the video coder may determine the default motion parameters prior to processing any of the sub-PUs. For each respective sub-PU of the current PU, the video coder may determine a reference block for the respective sub-PU. If the reference block of the respective sub-PU is coded using motion compensated prediction, the video coder may set the motion parameters of the respective sub-PU based on the motion parameters of the reference block of the respective sub-PU. However, if the reference block of the respective sub-PU is not coded using motion compensated prediction, the video coder may set the motion parameters of the respective sub-PU to the default motion parameters.

In accordance with one or more techniques of this disclosure, if the reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that the reference block of any later sub-PU in the particular order is coded using motion compensated prediction. Thus, when the video coder processes a sub-PU, the video coder may not need to scan forward to find a sub-PU whose corresponding reference block is coded using motion compensated prediction, or to delay determining the motion parameters of the respective sub-PU until a sub-PU whose corresponding reference block is coded using motion compensated prediction is encountered during the processing of the sub-PUs. Advantageously, this may reduce complexity and coding delay.
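The single-pass behavior described above can be sketched as follows. The data layout (a list of per-sub-PU reference-block motion parameters in processing order, with None marking a reference block that was not coded with motion compensated prediction) and all names, including the (motion vector, reference index) tuple shape, are illustrative assumptions for this sketch, not the draft-text syntax.

```python
# Sketch of the single-pass sub-PU processing described in the text.
# Each entry in ref_block_motion holds the motion parameters of the
# reference block corresponding to one sub-PU, in processing order, or
# None when that reference block was not coded with motion-compensated
# prediction (e.g., it was intra coded).

def derive_sub_pu_motion(ref_block_motion, default_motion):
    """One pass, in order: copy the reference block's motion parameters
    when present, otherwise fall back to the default motion parameters.
    No forward scan for a later motion-compensated sub-PU is needed."""
    sub_pu_motion = []
    for motion in ref_block_motion:
        if motion is not None:
            # Reference block was coded with motion-compensated prediction.
            sub_pu_motion.append(motion)
        else:
            # Fall back immediately; never revisit this sub-PU later.
            sub_pu_motion.append(default_motion)
    return sub_pu_motion

default = ((0, 0), 0)  # (motion vector, reference index), illustrative
refs = [((1, 2), 0), None, ((3, -1), 1), None]
print(derive_sub_pu_motion(refs, default))
# [((1, 2), 0), ((0, 0), 0), ((3, -1), 1), ((0, 0), 0)]
```

The point of the design is visible in the second and fourth entries: they take the default parameters at the moment they are processed, rather than waiting on or searching for a later motion-compensated sub-PU.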

FIG. 1 is a block diagram illustrating an example video coding system 10 that may utilize the techniques of this disclosure. As used herein, the term "video coder" refers generically to both video encoders and video decoders. In this disclosure, the terms "video coding" or "coding" may refer generically to video encoding or video decoding.

As shown in FIG. 1, video coding system 10 includes a source device 12 and a destination device 14. Source device 12 generates encoded video data. Accordingly, source device 12 may be referred to as a video encoding device or a video encoding apparatus. Destination device 14 may decode the encoded video data generated by source device 12. Accordingly, destination device 14 may be referred to as a video decoding device or a video decoding apparatus. Source device 12 and destination device 14 may be examples of video coding devices or video coding apparatuses.

源装置12及目的地装置14可包括广泛范围的装置,包含桌上型计算机、移动计算装置、笔记型(例如,膝上型)计算机、平板计算机、机顶盒、例如所谓的“智能”电话的电话手持机、电视机、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机(in-car computer)或其类似者。Source device 12 and destination device 14 may comprise a wide range of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called “smart” phones, televisions, cameras, display devices, digital media players, video game consoles, in-car computers, or the like.

目的地装置14可经由信道16从源装置12接收经编码视频数据。信道16可包括能够将经编码视频数据从源装置12移动到目的地装置14的一或多个媒体或装置。在一个实例中,信道16可包括使得源装置12能够实时地将经编码视频数据直接发射到目的地装置14的一或多个通信媒体。在此实例中,源装置12可根据通信标准(例如,无线通信协议)来调制经编码视频数据,且可将经调制视频数据发射到目的地装置14。一或多个通信媒体可包含无线及/或有线通信媒体,例如射频(RF)频谱或一或多个物理发射线。一或多个通信媒体可形成分组网络的部分,例如局域网、广域网或全球网络(例如,因特网)。一或多个通信媒体可包含路由器、交换器、基站或促进从源装置12到目的地装置14的通信的其它设备。Destination device 14 may receive encoded video data from source device 12 via channel 16. Channel 16 may include one or more media or devices capable of moving the encoded video data from source device 12 to destination device 14. In one example, channel 16 may include one or more communication media that enable source device 12 to transmit the encoded video data directly to destination device 14 in real time. In this example, source device 12 may modulate the encoded video data according to a communication standard (e.g., a wireless communication protocol) and may transmit the modulated video data to destination device 14. The one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet network, such as a local area network, a wide area network, or a global network (e.g., the Internet). The one or more communication media may include routers, switches, base stations, or other equipment that facilitates communication from source device 12 to destination device 14.

在另一实例中,信道16可包含存储由源装置12产生的经编码视频数据的存储媒体。在此实例中,目的地装置14可例如经由磁盘接入或卡接入来接入存储媒体。存储媒体可包含多种本地接入的数据存储媒体,例如蓝光光盘、DVD、CD-ROM、快闪存储器或用于存储经编码视频数据的其它合适数字存储媒体。In another example, channel 16 may include a storage medium that stores the encoded video data generated by source device 12. In this example, destination device 14 may access the storage medium, for example, via disk access or card access. The storage medium may include a variety of locally accessible data storage media such as Blu-ray discs, DVDs, CD-ROMs, flash memory, or other suitable digital storage media for storing encoded video data.

在另一实例中,信道16可包含存储由源装置12产生的经编码视频数据的文件服务器或另一中间存储装置。在此实例中,目的地装置14可经由串流或下载来接入存储于文件服务器或其它中间存储装置处的经编码视频数据。文件服务器可为能够存储经编码视频数据并将经编码视频数据发射到目的地装置14的服务器类型。实例文件服务器包含网络服务器(例如,用于网站)、文件传输协议(FTP)服务器、网络附接存储(NAS)装置及本地磁盘驱动器。In another example, channel 16 may include a file server or another intermediate storage device that stores the encoded video data generated by source device 12. In this example, destination device 14 may access the encoded video data stored at the file server or other intermediate storage device via streaming or downloading. The file server may be a type of server capable of storing and transmitting the encoded video data to destination device 14. Example file servers include a network server (e.g., for a website), a File Transfer Protocol (FTP) server, a Network Attached Storage (NAS) device, and a local disk drive.

目的地装置14可通过标准数据连接(例如,因特网连接)来接入经编码视频数据。数据连接的实例类型可包含无线信道(例如,Wi-Fi连接)、有线连接(例如,数字订户线(DSL)、电缆调制解调器等)或适于接入存储在文件服务器上的经编码视频数据的两者的组合。经编码视频数据从文件服务器的发射可为串流发射、下载发射或两者的组合。Destination device 14 may access the encoded video data through a standard data connection, such as an Internet connection. Example types of data connections may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., a digital subscriber line (DSL), a cable modem, etc.), or a combination of both suitable for accessing encoded video data stored on a file server. Transmission of the encoded video data from the file server may be a streaming transmission, a download transmission, or a combination of both.

本发明的技术不限于无线应用或设定。所述技术可应用于视频译码以支持多种多媒体应用,例如空中电视广播、有线电视发射、卫星电视发射、串流视频发射(例如,经由因特网)、编码视频数据以存储于数据存储媒体上、解码存储于数据存储媒体上的视频数据,或其它应用。在一些实例中,视频译码系统10可经配置以支持单向或双向视频发射以支持例如视频串流、视频回放、视频广播及/或视频电话的应用。The techniques of this disclosure are not limited to wireless applications or settings. The techniques may be applied to video coding to support a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding video data for storage on a data storage medium, decoding video data stored on a data storage medium, or other applications. In some examples, video coding system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.

图1仅仅为实例,且本发明的技术可适用于未必包含编码装置与解码装置之间的任何数据通信的视频译码设定(例如,视频编码或视频解码)。在其它实例中,数据可从本地存储器检索、在网络上串流或其类似者。视频编码装置可编码数据并将数据存储到存储器,及/或视频解码装置可从存储器检索数据并解码数据。在许多实例中,由并不彼此通信而是仅编码数据到存储器及/或从存储器检索数据且解码数据的装置执行编码及解码。FIG1 is merely an example, and the techniques of this disclosure may be applicable to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between the encoding and decoding devices. In other examples, data may be retrieved from local memory, streamed over a network, or the like. A video encoding device may encode data and store the data to memory, and/or a video decoding device may retrieve data from memory and decode the data. In many examples, encoding and decoding are performed by devices that do not communicate with each other, but merely encode data to memory and/or retrieve data from memory and decode the data.

在图1的实例中,源装置12包含视频源18、视频编码器20及输出接口22。在一些实例中,输出接口22可包含调制器/解调器(调制解调器)及/或发射器。视频源18可包含视频俘获装置(例如,摄像机)、含有先前俘获的视频数据的视频存档、用以从视频内容提供者接收视频数据的视频馈入接口,及/或用于产生视频数据的计算机图形系统,或视频数据的此类来源的组合。1 , source device 12 includes a video source 18, a video encoder 20, and an output interface 22. In some examples, output interface 22 may include a modulator/demodulator (modem) and/or a transmitter. Video source 18 may include a video capture device (e.g., a camera), a video archive containing previously captured video data, a video feed interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.

视频编码器20可编码来自视频源18的视频数据。在一些实例中,源装置12经由输出接口22将经编码视频数据直接发射到目的地装置14。在其它实例中,经编码视频数据也可存储到存储媒体或文件服务器上以供稍后由目的地装置14接入以用于解码及/或回放。Video encoder 20 may encode video data from video source 18. In some examples, source device 12 transmits the encoded video data directly to destination device 14 via output interface 22. In other examples, the encoded video data may also be stored on a storage medium or a file server for later access by destination device 14 for decoding and/or playback.

在图1的实例中,目的地装置14包含输入接口28、视频解码器30及显示装置32。在一些实例中,输入接口28包含接收器及/或调制解调器。输入接口28可经由信道16接收经编码视频数据。视频解码器30可解码经编码视频数据。显示装置32可显示经解码视频数据。显示装置32可与目的地装置14集成或可在目的地装置外部。显示装置32可包括多种显示装置,例如液晶显示器(LCD)、等离子显示器、有机发光二极管(OLED)显示器或另一类型的显示装置。In the example of FIG1 , destination device 14 includes an input interface 28, a video decoder 30, and a display device 32. In some examples, input interface 28 includes a receiver and/or a modem. Input interface 28 may receive encoded video data via channel 16. Video decoder 30 may decode the encoded video data. Display device 32 may display the decoded video data. Display device 32 may be integrated with destination device 14 or may be external to the destination device. Display device 32 may include a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.

视频编码器20及视频解码器30各自可实施为例如以下各者的多种合适电路中的任一者:一或多个微处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)、离散逻辑、硬件或其任何组合。当部分地以软件实施技术时,装置可将软件的指令存储于合适的非暂时性计算机可读存储媒体中且可使用一或多个处理器以硬件执行指令以执行本发明的技术。可将前述内容中的任一者(包含硬件、软件、硬件与软件的组合等)视为一或多个处理器。视频编码器20及视频解码器30中的每一者可包含在一或多个编码器或解码器中,所述编码器或解码器中的任一者可集成为相应装置中的组合编码器/解码器(编解码器)的部分。Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable circuits, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. When the techniques are implemented partially in software, the device may store the instructions for the software in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered to be one or more processors. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in the respective device.

本发明通常可指视频编码器20将某些信息“用信号通知”到另一装置(例如,视频解码器30)。术语“用信号通知”可通常指用以解码经压缩视频数据的语法元素及/或其它数据的传达。此传达可实时或几乎实时发生。替代性地,此传达可历时时间跨度发生,例如当在编码时以经编码位流将语法元素存储到计算机可读存储媒体时,可发生此传达,接着,在存储到此媒体之后可由解码装置在任何时间处检索所述语法元素。This disclosure may generally refer to video encoder 20 "signaling" certain information to another device (e.g., video decoder 30). The term "signaling" may generally refer to the communication of syntax elements and/or other data used to decode compressed video data. This communication may occur in real time or near real time. Alternatively, this communication may occur over a time span, such as when syntax elements are stored in an encoded bitstream to a computer-readable storage medium at encoding time, which may then be retrieved by a decoding device at any time after storage to such medium.

在一些实例中,视频编码器20及视频解码器30根据视频压缩标准而操作,例如ISO/IEC MPEG-4视觉及ITU-T H.264(也称为ISO/IEC MPEG-4 AVC),包含其可缩放视频译码(SVC)扩展、多视图视频译码(MVC)扩展及基于MVC的3DV扩展。在一些情况下,符合H.264/AVC的基于MVC的3DV扩展的任何位流始终含有顺应H.264/AVC的MVC扩展的子位流。此外,正致力于产生H.264/AVC的三维视频(3DV)译码扩展,即基于AVC的3DV。在其它实例中,视频编码器20及视频解码器30可根据ITU-T H.261、ISO/IEC MPEG-1视觉、ITU-T H.262或ISO/IEC MPEG-2视觉及ITU-T H.264、ISO/IEC视觉来操作。In some examples, video encoder 20 and video decoder 30 operate according to a video compression standard, such as ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) extension, multi-view video coding (MVC) extension, and MVC-based 3DV extension. In some cases, any bitstream conforming to the MVC-based 3DV extension of H.264/AVC always contains a sub-bitstream compliant with the MVC extension of H.264/AVC. In addition, work is underway to create a three-dimensional video (3DV) coding extension to H.264/AVC, i.e., AVC-based 3DV. In other examples, video encoder 20 and video decoder 30 may operate according to ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262, or ISO/IEC MPEG-2 Visual and ITU-T H.264, ISO/IEC Visual.

在其它实例中,视频编码器20及视频解码器30可根据由ITU-T视频译码专家组(VCEG)及ISO/IEC动画专家组(MPEG)的视频译码联合合作小组(JCT-VC)开发的高效率视频译码(HEVC)标准操作。HEVC标准的草案(被称作“HEVC工作草案10”)描述于布洛斯等人的“高效率视频译码(HEVC)文本规范草案10(用于FDIS及许可(Consent))”中(ITU-T SG16 WP3及ISO/IEC JTC1/SC29/WG11的视频译码联合合作小组(JCT-VC),第12次会议,瑞士日内瓦,2013年1月)(在下文中称为“HEVC工作草案10”或“HEVC基本规范”)。此外,正致力于产生对HEVC的可缩放视频译码扩展。HEVC的可缩放视频译码扩展可被称为SHEVC或SHVC。In other examples, video encoder 20 and video decoder 30 may operate according to the High Efficiency Video Coding (HEVC) standard developed by the Joint Collaboration Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Motion Picture Experts Group (MPEG). A draft of the HEVC standard, referred to as "HEVC Working Draft 10," is described in "High Efficiency Video Coding (HEVC) Textual Specification Draft 10 (for FDIS and Consent)" by Bross et al. (Joint Collaboration Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 12th Meeting, Geneva, Switzerland, January 2013) (hereinafter referred to as "HEVC Working Draft 10" or the "HEVC Baseline Specification"). In addition, work is underway to create a scalable video coding extension to HEVC. The scalable video coding extension of HEVC may be referred to as SHEVC or SHVC.

此外,VCEG及MPEG的3D视频译码联合合作小组(JCT-3V)当前正开发HEVC的多视图译码扩展(即,MV-HEVC)。泰克等人的“MV-HEVC草案文本4”(ITU-T SG 16 WP 3及ISO/IEC JTC 1/SC 29/WG 11的3D视频译码扩展开发联合合作小组,第4次会议:韩国仁川,2013年4月)(在下文中称为“MV-HEVC测试模型4”)为MV-HEVC的草案。在MV-HEVC中,可仅存在高层级语法(HLS)改变,使得HEVC中在CU或PU层级处的模块不需要再设计。此情况可允许经配置以用于HEVC的模块再用于MV-HEVC。换句话说,MV-HEVC仅提供高层级语法改变而不提供低层级语法改变,例如CU/PU层级处的改变。Furthermore, the Joint Collaborative Team on 3D Video Coding (JCT-3V) of VCEG and MPEG is currently developing a multi-view coding extension for HEVC (i.e., MV-HEVC). "MV-HEVC Draft Text 4" by Tech et al. (Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 4th Meeting: Incheon, South Korea, April 2013) (hereinafter referred to as "MV-HEVC Test Model 4") is a draft for MV-HEVC. In MV-HEVC, only high-level syntax (HLS) changes may be present, so that modules at the CU or PU level in HEVC do not need to be redesigned. This allows modules configured for HEVC to be reused in MV-HEVC. In other words, MV-HEVC only provides high-level syntax changes and not low-level syntax changes, such as those at the CU/PU level.

另外,VCEG及MPEG的JCT-3V正基于HEVC开发3DV标准,其中标准化努力的部分包含基于HEVC的多视图视频编解码器的标准化(MV-HEVC)及基于HEVC的另一3D视频译码部分(3D-HEVC)。对于3D-HEVC,可包含且支持用于纹理及深度视图两者的新译码工具(包含CU及/或PU层级处的那些译码工具)。到2013年12月17日为止,用于3D-HEVC的软件(例如,3D-HTM)可从以下链接下载:[3D-HTM版本7.0]:https://hevc.hhi.fraunhofer.de/svn/svn_3DVCSoftware/tags/HTM-7.0/。In addition, VCEG and MPEG's JCT-3V are developing a 3DV standard based on HEVC, with parts of the standardization effort including the standardization of a multi-view video codec based on HEVC (MV-HEVC) and another 3D video coding part based on HEVC (3D-HEVC). For 3D-HEVC, new coding tools for both texture and depth views (including those at the CU and/or PU level) may be included and supported. As of December 17, 2013, software for 3D-HEVC (e.g., 3D-HTM) is available for download at the following link: [3D-HTM Version 7.0]: https://hevc.hhi.fraunhofer.de/svn/svn_3DVCSoftware/tags/HTM-7.0/.

参考软件描述以及3D-HEVC的工作草案可如下获得:格哈德泰克等人的“3D-HEVC测试模型4”(JCT3V-D1005_spec_v1,ITU-T SG 16 WP 3及ISO/IEC JTC 1/SC 29/WG 11的3D视频译码扩展开发联合合作小组,第4次会议:韩国仁川,2013年4月)(在下文中称为“3D-HEVC测试模型4”),到2013年12月17日为止所述测试模型可从以下链接下载:http://phenix.it-sudparis.eu/jct2/doc_end_user/documents/2_Shanghai/wg11/JCT3V-B1005-v1.zip。泰克等人的“3D-HEVC草案文本3”(ITU-T SG 16 WP 3及ISO/IEC JTC 1/SC 29/WG 11的3D视频译码扩展开发联合合作小组,第3次会议:瑞士日内瓦,2013年1月,文件第JCT3V-C1005号)(在下文中称为“3D-HEVC测试模型3”)(到2013年12月17日为止可从http://phenix.it-sudparis.eu/jct2/doc_end_user/current_document.php?id=706获得)为3D-HEVC的参考软件描述的另一版本。3D-HEVC也描述于泰克等人的“3D-HEVC草案文本2”中(ITU-T SG 16 WP 3及ISO/IEC JTC 1/SC 29/WG 11的3D视频译码扩展开发联合合作小组,第6次会议:瑞士日内瓦,2013年10月25日到11月1日,文件第JCT3V-F1001-v2号)(在下文中称为“3D-HEVC草案文本2”)。视频编码器20及视频解码器30可根据SHEVC、MV-HEVC及/或3D-HEVC而操作。A reference software description and a working draft of 3D-HEVC are available as follows: “3D-HEVC Test Model 4” by Gerhard Tech et al. (JCT3V-D1005_spec_v1, Joint Collaboration Group on Development of 3D Video Coding Extensions of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 4th meeting: Incheon, South Korea, April 2013) (hereinafter referred to as “3D-HEVC Test Model 4”), which as of December 17, 2013, was available for download from the following link: http://phenix.it-sudparis.eu/jct2/doc_end_user/documents/2_Shanghai/wg11/JCT3V-B1005-v1.zip. "3D-HEVC Draft Text 3" by Tech et al. (Joint Collaborative Group on Development of 3D Video Coding Extensions of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 3rd Meeting: Geneva, Switzerland, January 2013, Document No. JCT3V-C1005) (hereinafter referred to as "3D-HEVC Test Model 3") (available as of December 17, 2013 at http://phenix.it-sudparis.eu/jct2/doc_end_user/current_document.php?id=706) is another version of the reference software description for 3D-HEVC. 3D-HEVC is also described in "3D-HEVC Draft Text 2" by Tech et al. 
(Joint Collaborative Group on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 6th Meeting: Geneva, Switzerland, October 25-November 1, 2013, document No. JCT3V-F1001-v2) (hereinafter referred to as "3D-HEVC Draft Text 2"). Video encoder 20 and video decoder 30 may operate according to SHEVC, MV-HEVC, and/or 3D-HEVC.

在HEVC及其它视频译码规范中,视频序列通常包含一系列图片。图片也可被称作“帧”。图片可包含三个样本阵列,表示为SL、SCb及SCr。SL是明度样本的二维阵列(即,块)。SCb是Cb色度样本的二维阵列。SCr是Cr色度样本的二维阵列。色度样本在本文中还可被称为“色度”样本。在其它情况下,图片可为单色的且可仅包含明度样本阵列。In HEVC and other video coding specifications, a video sequence typically comprises a series of pictures. A picture may also be referred to as a "frame." A picture may comprise three sample arrays, denoted as SL, SCb, and SCr. SL is a two-dimensional array (i.e., a block) of luma samples. SCb is a two-dimensional array of Cb chroma samples. SCr is a two-dimensional array of Cr chroma samples. Chrominance samples may also be referred to herein as "chroma" samples. In other cases, a picture may be monochrome and may comprise only a luma sample array.
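As a rough illustration of the three sample arrays named above, the sketch below allocates SL, SCb and SCr planes for one picture (Python; 4:2:0 chroma subsampling is an assumption of this illustration, not something the paragraph specifies):

```python
# Minimal sketch: a picture as three sample arrays SL, SCb, SCr.
# 4:2:0 subsampling (chroma planes at half resolution) is an assumption.
def make_picture(width, height):
    """Allocate zero-filled luma and chroma planes for a 4:2:0 picture."""
    s_l = [[0] * width for _ in range(height)]               # luma plane
    s_cb = [[0] * (width // 2) for _ in range(height // 2)]  # Cb chroma plane
    s_cr = [[0] * (width // 2) for _ in range(height // 2)]  # Cr chroma plane
    return {"SL": s_l, "SCb": s_cb, "SCr": s_cr}
```

A monochrome picture, as noted above, would carry only the SL plane.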

为产生图片的经编码表示,视频编码器20可产生译码树单元(CTU)的集合。CTU中的每一者可包括明度样本的译码树块、色度样本的两个对应译码树块,及用以译码所述译码树块的样本的语法结构。在单色图片或具有三个单独色彩平面的图片中,CTU可包括单个译码树块及用于译码所述译码树块的样本的语法结构。译码树块可为样本的N×N块。CTU也可被称为“树块”或“最大译码单元(LCU)”。HEVC的CTU可广泛地类似于例如H.264/AVC的其它标准的宏块。然而,CTU未必限于特定大小,且可包含一或多个译码单元(CU)。切片可包含按光栅扫描次序连续排序的整数数目个CTU。To generate an encoded representation of a picture, video encoder 20 may generate a set of coding tree units (CTUs). Each of the CTUs may include a coding tree block of luma samples, two corresponding coding tree blocks of chroma samples, and syntax structures used to code the samples of the coding tree blocks. In monochrome pictures or pictures with three separate color planes, a CTU may include a single coding tree block and syntax structures used to code the samples of the coding tree block. A coding tree block may be an N×N block of samples. A CTU may also be referred to as a “tree block” or a “largest coding unit (LCU)”. A CTU of HEVC may be broadly similar to macroblocks of other standards, such as H.264/AVC. However, a CTU is not necessarily limited to a specific size and may include one or more coding units (CUs). A slice may include an integer number of CTUs ordered contiguously in raster scan order.
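The raster-scan ordering of CTUs within a slice, mentioned above, can be sketched as follows (a hypothetical Python illustration; picture dimensions are assumed to be exact multiples of the CTU size for simplicity):

```python
# Sketch: enumerate the top-left coordinates of the CTUs covering a
# picture, in raster scan order (left to right, then top to bottom).
def ctu_raster_order(pic_width, pic_height, ctu_size):
    coords = []
    for y in range(0, pic_height, ctu_size):
        for x in range(0, pic_width, ctu_size):
            coords.append((x, y))
    return coords
```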

本发明可使用术语“视频单元”或“视频块”或“块”来指代一或多个样本块及用于译码所述一或多个样本块的样本的语法结构。实例类型的视频单元可包含CTU、CU、PU、变换单元(TU)、宏块、宏块分区等等。在一些情形中,PU的论述可与宏块或宏块分区的论述互换。This disclosure may use the term "video unit," or "video block," or "block" to refer to one or more blocks of samples and syntax structures used to code samples of the one or more blocks of samples. Example types of video units may include CTUs, CUs, PUs, transform units (TUs), macroblocks, macroblock partitions, and the like. In some cases, discussion of PUs may be interchangeable with discussion of macroblocks or macroblock partitions.

为产生经译码CTU,视频编码器20可对CTU的译码树块以递归方式执行四叉树分割,以将译码树块划分成译码块,因此命名为“译码树单元”。译码块是样本的N×N块。CU可包括具有明度样本阵列、Cb样本阵列及Cr样本阵列的图片的明度样本的译码块及色度样本的两个对应译码块,及用以译码所述译码块的样本的语法结构。在单色图片或具有三个单独色彩平面的图片中,CU可包括单个译码块及用以译码所述译码块的样本的语法结构。To generate a coded CTU, video encoder 20 may recursively perform quadtree partitioning on the coding tree block of the CTU to divide the coding tree block into coding blocks, hence the name "coding tree unit". A coding block is an N×N block of samples. A CU may include a coding block of luma samples and two corresponding coding blocks of chroma samples of a picture having a luma sample array, a Cb sample array, and a Cr sample array, and syntax structures used to code the samples of the coding blocks. In a monochrome picture or a picture with three separate color planes, a CU may include a single coding block and syntax structures used to code the samples of the coding block.
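A minimal sketch of the recursive quadtree partitioning described above (Python; the split_decision callback stands in for the encoder's split flags and is an assumption of this illustration):

```python
# Sketch of recursive quadtree partitioning of a coding tree block into
# coding blocks. split_decision(x, y, size) -> bool decides per node
# whether to split into four quadrants, down to a minimum block size.
def quadtree_partition(x, y, size, min_size, split_decision):
    """Return the list of (x, y, size) leaf coding blocks in z-order."""
    if size > min_size and split_decision(x, y, size):
        half = size // 2
        leaves = []
        for dx, dy in [(0, 0), (half, 0), (0, half), (half, half)]:
            leaves += quadtree_partition(x + dx, y + dy, half,
                                         min_size, split_decision)
        return leaves
    return [(x, y, size)]
```

With a 64×64 coding tree block and a decision that always splits once, this yields four 32×32 coding blocks.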

视频编码器20可将CU的译码块分割成一或多个预测块。预测块为应用相同预测的样本的矩形(即,正方形或非正方形)块。CU的预测单元(PU)可包括明度样本的预测块、色度样本的两个对应预测块及用以预测预测块的语法结构。在单色图片或具有三个单独色彩平面的图片中,PU可包括单个预测块及用以预测预测块的语法结构。视频编码器20可产生用于CU的每一PU的明度预测块、Cb预测块及Cr预测块的预测性明度块、Cb块及Cr块。Video encoder 20 may partition the coding blocks of a CU into one or more prediction blocks. A prediction block is a rectangular (i.e., square or non-square) block of samples to which the same prediction applies. A prediction unit (PU) of a CU may include a prediction block of luma samples, two corresponding prediction blocks of chroma samples, and syntax structures used to predict the prediction blocks. In monochrome pictures or pictures with three separate color planes, a PU may include a single prediction block and syntax structures used to predict the prediction blocks. Video encoder 20 may generate predictive luma, Cb, and Cr blocks for the luma, Cb, and Cr prediction blocks of each PU of a CU.

视频编码器20可使用帧内预测或帧间预测来产生PU的预测性块。如果视频编码器20使用帧内预测产生PU的预测性块,则视频编码器20可基于与PU相关联的图片的经解码样本来产生PU的预测性块。在HEVC的一些版本中,对于每一PU的明度分量,使用33个角度预测模式(从2到34编制索引)、DC模式(以1编制索引)及平面模式(以0编制索引)利用帧内预测方法,如图2中所展示。图2为说明HEVC中的实例帧内预测模式的概念图。Video encoder 20 may use intra prediction or inter prediction to generate the predictive blocks for the PU. If video encoder 20 generates the predictive blocks for the PU using intra prediction, video encoder 20 may generate the predictive blocks for the PU based on decoded samples of the picture associated with the PU. In some versions of HEVC, for the luma component of each PU, an intra prediction method is utilized using 33 angular prediction modes (indexed from 2 to 34), a DC mode (indexed with 1), and a planar mode (indexed with 0), as shown in FIG2 . FIG2 is a conceptual diagram illustrating example intra prediction modes in HEVC.
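The intra mode indexing above (planar indexed with 0, DC with 1, and the 33 angular modes indexed from 2 to 34) can be captured in a small lookup sketch (a hypothetical helper, not part of the disclosure):

```python
# Sketch: classify an HEVC intra prediction mode index as described above.
def intra_mode_name(mode):
    if mode == 0:
        return "planar"    # planar mode, indexed with 0
    if mode == 1:
        return "DC"        # DC mode, indexed with 1
    if 2 <= mode <= 34:
        return "angular"   # 33 angular modes, indexed from 2 to 34
    raise ValueError("invalid HEVC intra mode: %d" % mode)
```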

如果视频编码器20使用帧间预测产生PU的预测性块,则视频编码器20可基于除与PU相关联的图片以外的一或多个图片的经解码样本产生PU的预测性块。帧间预测可为单向帧间预测(即,单向预测)或双向帧间预测(即,双向预测)。为执行帧间预测,视频编码器20可产生当前切片的第一参考图片列表(RefPicList0)且在一些情况下还可产生当前切片的第二参考图片列表(RefPicList1)。参考图片列表中的每一者可包含一或多个参考图片。当使用单向预测时,视频编码器20可搜索RefPicList0及RefPicList1中的任一者或两者中的参考图片以确定参考图片内的参考位置。此外,当使用单向预测时,视频编码器20可至少部分基于对应于参考位置的样本产生PU的预测性样本块。此外,当使用单向预测时,视频编码器20可产生指示PU的预测块与参考位置之间的空间移位的单一运动向量。为指示PU的预测块与参考位置之间的空间移位,运动向量可包含指定PU的预测块与参考位置之间的水平移位的水平分量且可包含指定PU的预测块与参考位置之间的垂直移位的垂直分量。If video encoder 20 uses inter-prediction to generate the predictive blocks of a PU, video encoder 20 may generate the predictive blocks of the PU based on decoded samples of one or more pictures other than the picture associated with the PU. Inter-prediction may be unidirectional inter-prediction (i.e., unidirectional prediction) or bidirectional inter-prediction (i.e., bidirectional prediction). To perform inter-prediction, video encoder 20 may generate a first reference picture list (RefPicList0) for the current slice and, in some cases, may also generate a second reference picture list (RefPicList1) for the current slice. Each of the reference picture lists may include one or more reference pictures. When using uni-directional prediction, video encoder 20 may search the reference pictures in either or both of RefPicList0 and RefPicList1 to determine a reference position within the reference pictures. Furthermore, when using uni-directional prediction, video encoder 20 may generate the predictive sample blocks of the PU based at least in part on samples corresponding to the reference position. Furthermore, when using uni-directional prediction, video encoder 20 may generate a single motion vector that indicates the spatial displacement between the prediction block of the PU and the reference position. 
To indicate the spatial displacement between the PU's prediction block and the reference location, the motion vector may include a horizontal component specifying the horizontal displacement between the PU's prediction block and the reference location and may include a vertical component specifying the vertical displacement between the PU's prediction block and the reference location.
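The horizontal and vertical motion vector components described above can be sketched as follows (Python; integer-sample precision is assumed here for simplicity, whereas HEVC motion vectors actually use quarter-sample precision):

```python
# Sketch: a motion vector as (horizontal, vertical) displacement from the
# prediction block's position to the reference position.
def reference_position(block_x, block_y, mv):
    mv_x, mv_y = mv  # horizontal and vertical components
    return (block_x + mv_x, block_y + mv_y)
```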

当使用双向预测来编码PU时,视频编码器20可确定RefPicList0中的参考图片中的第一参考位置及RefPicList1中的参考图片中的第二参考位置。视频编码器20接着可至少部分基于对应于第一及第二参考位置的样本来产生PU的预测性块。此外,当使用双向预测来编码PU时,视频编码器20可产生指示PU的样本块与第一参考位置之间的空间移位的第一运动向量,及指示PU的预测块与第二参考位置之间的空间移位的第二运动向量。When bi-prediction is used to encode a PU, video encoder 20 may determine a first reference location in a reference picture in RefPicList0 and a second reference location in a reference picture in RefPicList1. Video encoder 20 may then generate a predictive block for the PU based at least in part on samples corresponding to the first and second reference locations. Furthermore, when bi-prediction is used to encode a PU, video encoder 20 may generate a first motion vector indicating a spatial displacement between the sample block of the PU and the first reference location, and a second motion vector indicating a spatial displacement between the prediction block of the PU and the second reference location.
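A minimal sketch of forming a bi-predictive block from samples at the two reference locations (Python; the plain rounded average shown is an assumption for illustration; HEVC's weighted prediction is more general):

```python
# Sketch: combine co-located samples from the RefPicList0 and RefPicList1
# reference blocks with a rounded average to form the predictive block.
def bipred_block(ref0_samples, ref1_samples):
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(ref0_samples, ref1_samples)]
```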

通常,B图片的第一或第二参考图片列表(例如,RefPicList0或RefPicList1)的参考图片列表建构包含两个步骤:参考图片列表初始化及参考图片列表重排(修改)。参考图片列表初始化为基于POC(图片次序计数,其与图片的显示次序对齐)值的次序将参考图片存储器(也被称作经解码图片缓冲器(DPB))中的参考图片放置成列表的明确机制。参考图片列表重排机制可将在参考图片列表初始化期间放置于列表中的图片的位置修改为任何新位置,或将参考图片存储器中的任何参考图片放置于任何位置,即使图片并不属于初始化列表也如此。可将参考图片列表重排(修改)之后的一些图片放置在列表中的极远位置。然而,如果图片的位置超过列表的有效参考图片的数目,则不将所述图片视为最终参考图片列表的项目。可在切片标头中用信号通知每一列表的有效参考图片的数目。在建构参考图片列表(即RefPicList0及RefPicList1,如果可用)之后,可使用到参考图片列表的参考索引来识别参考图片列表中包含的任何参考图片。Typically, reference picture list construction for the first or second reference picture list (e.g., RefPicList0 or RefPicList1) of a B picture involves two steps: reference picture list initialization and reference picture list reordering (modification). Reference picture list initialization is a definitive mechanism that places reference pictures in the reference picture memory (also known as the decoded picture buffer (DPB)) into lists based on the order of their POC (Picture Order Count) values, which align with the display order of the pictures. The reference picture list reordering mechanism can modify the position of pictures placed in the lists during reference picture list initialization to any new position or place any reference picture in the reference picture memory at any position, even if the picture does not belong to the initialized list. Some pictures after reference picture list reordering (modification) may be placed at very far positions in the lists. However, if a picture's position exceeds the number of valid reference pictures for a list, it is not considered an entry in the final reference picture list. The number of valid reference pictures for each list can be signaled in the slice header. After the reference picture lists (ie, RefPicList0 and RefPicList1, if available) are constructed, reference indexes to the reference picture lists may be used to identify any reference pictures included in the reference picture lists.
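The two-step construction above — initialization by POC order followed by optional reordering, truncated to the signaled number of active reference pictures — can be sketched as follows (Python; the "past pictures first, nearest POC first" initialization shown is a simplified assumption for a RefPicList0-like list):

```python
# Sketch of reference picture list construction. dpb_pocs are the POC
# values of pictures in the decoded picture buffer; reorder, if given,
# is a list of indices into the initialized list.
def build_ref_pic_list(dpb_pocs, current_poc, num_active, reorder=None):
    # Initialization: past pictures first, closest POC first (a common
    # convention, shown as an assumption), then future pictures.
    init = sorted((p for p in dpb_pocs if p < current_poc),
                  key=lambda p: current_poc - p)
    init += sorted(p for p in dpb_pocs if p > current_poc)
    if reorder is not None:
        # Reordering (modification) may move any entry to any position.
        init = [init[i] for i in reorder]
    # Entries beyond the number of active references are not part of the
    # final list, as described above.
    return init[:num_active]
```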

在视频编码器20产生CU的一或多个PU的预测性块(例如,明度块、Cb块及Cr块)之后,视频编码器20可产生CU的一或多个残余块。举例来说,视频编码器20可产生CU的明度残余块。CU的明度残余块中的每一样本指示CU的预测性明度块中的一者中的明度样本与CU的原始明度译码块中的对应样本之间的差。另外,视频编码器20可产生CU的Cb残余块。CU的Cb残余块中的每一样本可指示CU的预测性Cb块中的一者中的Cb样本与CU的原始Cb译码块中的对应样本之间的差。视频编码器20还可产生CU的Cr残余块。CU的Cr残余块中的每一样本可指示CU的预测性Cr块中的一者中的Cr样本与CU的原始Cr译码块中的对应样本之间的差。After video encoder 20 generates predictive blocks (e.g., luma blocks, Cb blocks, and Cr blocks) for one or more PUs of a CU, video encoder 20 may generate one or more residual blocks for the CU. For example, video encoder 20 may generate a luma residual block for the CU. Each sample in the luma residual block of the CU indicates the difference between a luma sample in one of the CU's predictive luma blocks and a corresponding sample in the CU's original luma coding block. Additionally, video encoder 20 may generate a Cb residual block for the CU. Each sample in the Cb residual block of the CU may indicate the difference between a Cb sample in one of the CU's predictive Cb blocks and a corresponding sample in the CU's original Cb coding block. Video encoder 20 may also generate a Cr residual block for the CU. Each sample in the Cr residual block of the CU may indicate the difference between a Cr sample in one of the CU's predictive Cr blocks and a corresponding sample in the CU's original Cr coding block.
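Each residual sample described above is simply the difference between an original sample and the corresponding predictive sample, as in this sketch (Python; the same operation applies to the luma, Cb and Cr planes):

```python
# Sketch: residual block = original coding block minus predictive block,
# sample by sample.
def residual_block(original, predicted):
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]
```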

此外,视频编码器20可使用四叉树分割来将CU的残余块(例如,明度、Cb及Cr残余块)分解为一或多个变换块(例如,明度、Cb及Cr变换块)。变换块为应用相同变换的样本的矩形(例如,正方形或非正方形)块。CU的变换单元(TU)可包括明度样本的变换块、色度样本的两个对应变换块及用以变换所述变换块样本的语法结构。因此,CU的每一TU可与明度变换块、Cb变换块及Cr变换块相关联。与TU相关联的明度变换块可为CU的明度残余块的子块。Cb变换块可为CU的Cb残余块的子块。Cr变换块可为CU的Cr残余块的子块。在单色图片或具有三个单独色彩平面的图片中,TU可包括单个变换块及用以变换所述变换块的样本的语法结构。Furthermore, video encoder 20 may use quadtree partitioning to decompose a residual block (e.g., luma, Cb, and Cr residual blocks) of a CU into one or more transform blocks (e.g., luma, Cb, and Cr transform blocks). A transform block is a rectangular (e.g., square or non-square) block of samples to which the same transform is applied. A transform unit (TU) of a CU may include a transform block of luma samples, two corresponding transform blocks of chroma samples, and syntax structures used to transform the transform block samples. Thus, each TU of a CU may be associated with a luma transform block, a Cb transform block, and a Cr transform block. The luma transform block associated with a TU may be a sub-block of the CU's luma residual block. The Cb transform block may be a sub-block of the CU's Cb residual block. The Cr transform block may be a sub-block of the CU's Cr residual block. In monochrome pictures or pictures with three separate color planes, a TU may include a single transform block and syntax structures used to transform the samples of the transform block.

视频编码器20可将一或多个变换应用到TU的变换块以产生TU的系数块。举例来说,视频编码器20可将一或多个变换应用到TU的明度变换块以产生TU的明度系数块。系数块可为变换系数的二维阵列。变换系数可为标量。视频编码器20可将一或多个变换应用到TU的Cb变换块以产生TU的Cb系数块。视频编码器20可将一或多个变换应用到TU的Cr变换块以产生TU的Cr系数块。Video encoder 20 may apply one or more transforms to a transform block of a TU to generate a coefficient block for the TU. For example, video encoder 20 may apply one or more transforms to a luma transform block of a TU to generate a luma coefficient block for the TU. A coefficient block may be a two-dimensional array of transform coefficients. A transform coefficient may be a scalar. Video encoder 20 may apply one or more transforms to a Cb transform block of a TU to generate a Cb coefficient block for the TU. Video encoder 20 may apply one or more transforms to a Cr transform block of a TU to generate a Cr coefficient block for the TU.
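As an illustration of producing a coefficient block from a transform block, the sketch below applies a plain floating-point 2-D DCT-II (an assumption made for clarity; HEVC actually specifies integer core transforms):

```python
import math

# Sketch: apply a 2-D DCT-II to an n-by-n transform block, producing an
# n-by-n coefficient block of scalar transform coefficients.
def dct2d(block):
    n = len(block)

    def c(k):
        # Orthonormal scaling factors.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeff = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeff[u][v] = c(u) * c(v) * s
    return coeff
```

For a constant block, all energy lands in the single DC coefficient, which is what makes transform coding effective on smooth residuals.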

在产生系数块(例如,明度系数块、Cb系数块或Cr系数块)之后,视频编码器20可量化系数块。量化一般是指量化变换系数以可能减少用以表示变换系数的数据量从而提供进一步压缩的过程。在视频编码器20量化系数块之后,视频编码器20可对指示经量化变换系数的语法元素进行熵编码。举例来说,视频编码器20可对指示经量化变换系数的语法元素执行上下文自适应二进制算术译码(CABAC)。After generating a coefficient block (e.g., a luma coefficient block, a Cb coefficient block, or a Cr coefficient block), video encoder 20 may quantize the coefficient block. Quantization generally refers to the process of quantizing transform coefficients to potentially reduce the amount of data used to represent the transform coefficients, thereby providing further compression. After video encoder 20 quantizes the coefficient block, video encoder 20 may entropy encode syntax elements indicating the quantized transform coefficients. For example, video encoder 20 may perform context-adaptive binary arithmetic coding (CABAC) on the syntax elements indicating the quantized transform coefficients.
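Quantization as described above can be sketched as dividing each transform coefficient by a step size and rounding (Python; HEVC derives the step from a quantization parameter, so taking the step directly is an assumption of this illustration):

```python
# Sketch: uniform quantization of a row-major coefficient block, reducing
# the amount of data used to represent the transform coefficients.
def quantize(coeffs, step):
    return [[int(round(c / step)) for c in row] for row in coeffs]
```

Larger step sizes discard more precision and compress harder; the quantized values are what the entropy coder (e.g., CABAC) then encodes.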

视频编码器20可输出包含形成经译码图片及相关联数据的表示的位序列的位流。位流可包括网络抽象层(NAL)单元的序列。NAL单元为含有NAL单元中的数据类型的指示及含有所述数据的呈按需要穿插有模拟阻止位的原始字节序列有效负载(RBSP)形式的字节的语法结构。NAL单元中的每一者包含NAL单元标头且囊封RBSP。NAL单元标头可包含指示NAL单元类型码的语法元素。由NAL单元的NAL单元标头指定的NAL单元类型码指示NAL单元的类型。RBSP可为含有囊封在NAL单元内的整数数目个字节的语法结构。在一些情况下,RBSP包含零个位。Video encoder 20 may output a bitstream that includes a sequence of bits that form a representation of a coded picture and associated data. The bitstream may include a sequence of network abstraction layer (NAL) units. A NAL unit is a syntax structure that contains an indication of the type of data in the NAL unit and bytes containing the data in the form of a raw byte sequence payload (RBSP) interspersed with emulation prevention bits as needed. Each NAL unit includes a NAL unit header and encapsulates the RBSP. The NAL unit header may include a syntax element that indicates a NAL unit type code. The NAL unit type code, specified by the NAL unit header of a NAL unit, indicates the type of the NAL unit. The RBSP may be a syntax structure that contains an integer number of bytes encapsulated within the NAL unit. In some cases, the RBSP includes zero bits.
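The NAL unit header mentioned above is two bytes in HEVC, with a 1-bit forbidden_zero_bit, a 6-bit nal_unit_type, a 6-bit nuh_layer_id and a 3-bit nuh_temporal_id_plus1 (field layout per the HEVC specification); parsing it can be sketched as:

```python
# Sketch: decode the two bytes of an HEVC NAL unit header, including the
# syntax element that indicates the NAL unit type code.
def parse_nal_header(b0, b1):
    return {
        "forbidden_zero_bit": (b0 >> 7) & 0x1,
        "nal_unit_type": (b0 >> 1) & 0x3F,
        "nuh_layer_id": ((b0 & 0x1) << 5) | ((b1 >> 3) & 0x1F),
        "nuh_temporal_id_plus1": b1 & 0x7,
    }
```

For example, a first header byte of 0x40 yields nal_unit_type 32, which the HEVC specification assigns to the VPS.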

Different types of NAL units may encapsulate different types of RBSPs. For example, different types of NAL units may encapsulate different RBSPs for video parameter sets (VPSs), sequence parameter sets (SPSs), picture parameter sets (PPSs), coded slices, supplemental enhancement information (SEI), and so on. NAL units that encapsulate RBSPs for video coding data (as opposed to RBSPs for parameter sets and SEI messages) may be referred to as video coding layer (VCL) NAL units.

In HEVC, an SPS may contain information that applies to all slices of a coded video sequence (CVS). A CVS may comprise a sequence of pictures. In HEVC, a CVS may start with an instantaneous decoding refresh (IDR) picture, a broken link access (BLA) picture, or a clean random access (CRA) picture that is the first picture in the bitstream, and include all subsequent pictures that are not IDR or BLA pictures. That is, in HEVC, a CVS may comprise a sequence of access units that, in decoding order, consists of a CRA access unit that is the first access unit in the bitstream, an IDR access unit, or a BLA access unit, followed by zero or more non-IDR and non-BLA access units, including all subsequent access units up to but not including any subsequent IDR or BLA access unit. In HEVC, an access unit may be a set of NAL units that are consecutive in decoding order and contain exactly one coded picture. In addition to the coded slice NAL units of the coded picture, the access unit may also contain other NAL units that do not contain slices of the coded picture. In some examples, decoding an access unit always produces a decoded picture.

A VPS is a syntax structure comprising syntax elements that apply to zero or more entire CVSs. An SPS is also a syntax structure comprising syntax elements that apply to zero or more entire CVSs. An SPS may include a syntax element that identifies the VPS that is active when the SPS is active. Thus, the syntax elements of a VPS may be more generally applicable than the syntax elements of an SPS. A PPS is a syntax structure comprising syntax elements that apply to zero or more coded pictures. A PPS may include a syntax element that identifies the SPS that is active when the PPS is active. A slice header of a slice may include a syntax element that indicates the PPS that is active when the slice is being coded.

Video decoder 30 may receive a bitstream generated by video encoder 20. In addition, video decoder 30 may parse the bitstream to obtain syntax elements from the bitstream. Video decoder 30 may reconstruct the pictures of the video data based, at least in part, on the syntax elements obtained from the bitstream. The process of reconstructing the video data may generally be reciprocal to the process performed by video encoder 20. For example, video decoder 30 may use the motion vectors of PUs to determine predictive blocks for the PUs of a current CU. In addition, video decoder 30 may inverse quantize coefficient blocks associated with TUs of the current CU. Video decoder 30 may perform inverse transforms on the coefficient blocks to reconstruct transform blocks associated with the TUs of the current CU. Video decoder 30 may reconstruct the coding blocks of the current CU by adding the samples of the predictive blocks for the PUs of the current CU to corresponding samples of the transform blocks of the TUs of the current CU. By reconstructing the coding blocks of each CU of a picture, video decoder 30 may reconstruct the picture.

In some examples, video encoder 20 may signal the motion information of a PU using merge mode or advanced motion vector prediction (AMVP) mode. For example, in HEVC there are two modes for the prediction of motion parameters, one being merge mode and the other being AMVP. Motion prediction may comprise determining the motion information of a block (e.g., a PU) based on the motion information of one or more other blocks. The motion information of a PU (also referred to herein as the motion parameters) may include a motion vector of the PU and a reference index of the PU.

When video encoder 20 signals the motion information of a current PU using merge mode, video encoder 20 generates a merge candidate list. In other words, video encoder 20 may perform a motion vector predictor list construction process. The merge candidate list includes a set of merge candidates that indicate the motion information of PUs that spatially or temporally neighbor the current PU. That is, in merge mode, a candidate list of motion parameters (e.g., reference indexes, motion vectors, etc.) is constructed, where the candidates may come from spatially and temporally neighboring blocks.

Furthermore, in merge mode, video encoder 20 may select a merge candidate from the merge candidate list and may use the motion information indicated by the selected merge candidate as the motion information of the current PU. Video encoder 20 may signal the position in the merge candidate list of the selected merge candidate. For example, video encoder 20 may signal the selected motion vector parameters by transmitting an index into the candidate list. Video decoder 30 may obtain the index into the candidate list (i.e., a candidate list index) from the bitstream. In addition, video decoder 30 may generate the same merge candidate list and may determine, based on the indication of the position of the selected merge candidate, the selected merge candidate. Video decoder 30 may then use the motion information of the selected merge candidate to generate predictive blocks for the current PU. That is, video decoder 30 may determine, based at least in part on the candidate list index, a selected candidate in the candidate list, where the selected candidate specifies the motion vector for the current PU. In this way, at the decoder side, once the index is decoded, all motion parameters of the corresponding block to which the index points may be inherited by the current PU.
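The decoder-side inheritance described above amounts to a list lookup. The following sketch uses a hypothetical, simplified candidate representation (dicts with 'mv' and 'ref_idx' keys); in HEVC a merge candidate carries the full motion parameters, potentially for both reference picture lists.

```python
def merge_mode_motion(merge_candidates, merge_idx):
    """Select the motion parameters a PU inherits in merge mode.

    merge_candidates is built identically at encoder and decoder;
    merge_idx is the candidate list index signaled in the bitstream.
    The current PU inherits ALL motion parameters of the candidate.
    """
    chosen = merge_candidates[merge_idx]
    return {"mv": chosen["mv"], "ref_idx": chosen["ref_idx"]}
```

Because both sides construct the same list, signaling the index alone is enough for the decoder to recover the full motion information.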

Skip mode is similar to merge mode. In skip mode, video encoder 20 and video decoder 30 generate and use a merge candidate list in the same way that video encoder 20 and video decoder 30 use the merge candidate list in merge mode. However, when video encoder 20 signals the motion information of a current PU using skip mode, video encoder 20 does not signal any residual data for the current PU. Accordingly, video decoder 30 may determine, without the use of residual data, a predictive block for the PU based on a reference block indicated by the motion information of a selected candidate in the merge candidate list.

AMVP mode is similar to merge mode in that video encoder 20 may generate a candidate list and may select a candidate from the candidate list. However, when video encoder 20 signals the RefPicListX motion information of a current PU using AMVP mode, video encoder 20 may signal a RefPicListX motion vector difference (MVD) for the current PU and a RefPicListX reference index for the current PU, in addition to signaling a RefPicListX MVP flag for the current PU. The RefPicListX MVP flag for the current PU may indicate the position of a selected AMVP candidate in the AMVP candidate list. The RefPicListX MVD for the current PU may indicate the difference between a RefPicListX motion vector of the current PU and the motion vector of the selected AMVP candidate. In this way, video encoder 20 may signal the RefPicListX motion information of the current PU by signaling the RefPicListX motion vector predictor (MVP) flag, the RefPicListX reference index value, and the RefPicListX MVD. In other words, the data in the bitstream representing the motion vector for the current PU may include data representing a reference index, an index into a candidate list, and an MVD.

Furthermore, when the motion information of a current PU is signaled using AMVP mode, video decoder 30 may obtain the MVD and the MVP flag for the current PU from the bitstream. Video decoder 30 may generate the same AMVP candidate list and may determine, based on the MVP flag, the selected AMVP candidate. Video decoder 30 may recover the motion vector of the current PU by adding the MVD to the motion vector indicated by the selected AMVP candidate. That is, video decoder 30 may determine, based on the motion vector indicated by the selected AMVP candidate and the MVD, the motion vector of the current PU. Video decoder 30 may then use the recovered motion vector or motion vectors of the current PU to generate predictive blocks for the current PU.
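The recovery step described above (motion vector = predictor + difference) can be sketched as follows; the function name and (x, y) tuple representation are ours.

```python
def recover_amvp_motion_vector(amvp_candidates, mvp_flag, mvd):
    """Recover a PU's motion vector in AMVP mode.

    amvp_candidates: predictor list built identically at encoder and
    decoder; mvp_flag selects the predictor; mvd is the signaled
    motion vector difference. All vectors are (x, y) tuples.
    """
    mvp = amvp_candidates[mvp_flag]
    # mv = mvp + mvd, component-wise
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

Unlike merge mode, the decoder here does not inherit the candidate's motion information verbatim; it corrects the predictor with the explicitly signaled MVD.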

When video decoder 30 generates an AMVP candidate list for a current PU, video decoder 30 may derive one or more AMVP candidates based on the motion information of PUs that cover locations spatially neighboring the current PU (i.e., spatially neighboring PUs). FIG. 3 is a conceptual diagram illustrating example spatially neighboring PUs relative to a current block 40. In the example of FIG. 3, the spatially neighboring PUs may be the PUs that cover the locations indicated as A0, A1, B0, B1, and B2. A PU may cover a location when a prediction block of the PU includes the location.
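Assuming block 40 follows the usual HEVC neighbor layout, the luma sample locations of A0, A1, B0, B1, and B2 can be computed from the prediction block's position and size, as in this sketch (the coordinates follow the HEVC specification's derivation of spatial candidates; the function name is ours).

```python
def spatial_neighbor_positions(x_pb, y_pb, n_pb_w, n_pb_h):
    """Luma sample locations checked for spatial candidates in HEVC.

    (x_pb, y_pb) is the top-left luma sample of the current prediction
    block; n_pb_w and n_pb_h are its width and height in luma samples.
    """
    return {
        "A0": (x_pb - 1, y_pb + n_pb_h),      # below-left
        "A1": (x_pb - 1, y_pb + n_pb_h - 1),  # left
        "B0": (x_pb + n_pb_w, y_pb - 1),      # above-right
        "B1": (x_pb + n_pb_w - 1, y_pb - 1),  # above
        "B2": (x_pb - 1, y_pb - 1),           # above-left
    }
```

A video coder checks which PUs cover these locations and harvests their motion information as candidates.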

A candidate in a merge candidate list or an AMVP candidate list that is based on the motion information of a PU that temporally neighbors a current PU (i.e., a PU in a different time instance than the current PU) may be referred to as a temporal motion vector predictor. The use of temporal motion vector predictors may be referred to as temporal motion vector prediction (TMVP). TMVP may be used to improve the coding efficiency of HEVC and, unlike other coding tools, TMVP may need to access the motion vectors of frames in a decoded picture buffer (more specifically, in a reference picture list).

TMVP may be enabled or disabled on a CVS-by-CVS basis, a slice-by-slice basis, or on another basis. A syntax element (e.g., sps_temporal_mvp_enable_flag) in an SPS may indicate whether the use of TMVP is enabled for a CVS. Furthermore, when TMVP is enabled for a CVS, TMVP may be enabled or disabled for particular slices within the CVS. For instance, a syntax element (e.g., slice_temporal_mvp_enable_flag) in a slice header may indicate whether TMVP is enabled for a slice. Thus, in an inter-predicted slice, when TMVP is enabled for the whole CVS (e.g., sps_temporal_mvp_enable_flag in the SPS is set to 1), slice_temporal_mvp_enable_flag is signaled in the slice header to indicate whether TMVP is enabled for the current slice.

To determine a temporal motion vector predictor, a video coder may first identify a reference picture that includes a PU co-located with the current PU. In other words, the video coder may identify a so-called "co-located picture." If the current slice of the current picture is a B slice (i.e., a slice that is allowed to include bi-directionally inter-predicted PUs), video encoder 20 may signal, in a slice header, a syntax element (e.g., collocated_from_l0_flag) indicating whether the co-located picture is from RefPicList0 or RefPicList1. In other words, when TMVP is enabled for the current slice and the current slice is a B slice (e.g., a slice allowed to include bi-directionally inter-predicted PUs), video encoder 20 may signal, in the slice header, a syntax element (e.g., collocated_from_l0_flag) indicating whether the co-located picture is in RefPicList0 or RefPicList1.

A syntax element (e.g., collocated_ref_idx) in the slice header may indicate the co-located picture in the identified reference picture list. Thus, after video decoder 30 identifies the reference picture list that includes the co-located picture, video decoder 30 may use collocated_ref_idx, which may be signaled in the slice header, to identify the co-located picture in the identified reference picture list. The video coder may identify a co-located PU by checking the co-located picture. The temporal motion vector predictor may indicate the motion information of a bottom-right PU of the co-located PU, or the motion information of a center PU of the co-located PU.

When a motion vector identified by the above process (i.e., the motion vector of the temporal motion vector predictor) is used to generate a motion candidate for merge mode or AMVP mode, the video coder may scale the motion vector based on the temporal location (reflected by the POC values). For example, when the difference between the POC values of the current picture and its reference picture is larger, the video coder may increase the magnitude of the motion vector by a correspondingly larger amount than when that difference is smaller.
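The POC-distance scaling mentioned above can be sketched with the integer arithmetic HEVC uses for motion vector scaling (clause 8.5.3.2.8 of the specification). This is a simplified standalone version and the function names are ours; `curr_poc_diff` and `col_poc_diff` are the POC distances associated with the current picture and the co-located motion vector, respectively.

```python
def clip3(lo, hi, v):
    """Clamp v to the inclusive range [lo, hi], as in the HEVC spec."""
    return lo if v < lo else hi if v > hi else v


def scale_temporal_mv(mv, curr_poc_diff, col_poc_diff):
    """Distance-based scaling of a temporal motion vector (x, y).

    Both POC differences are assumed nonzero; division truncates
    toward zero as in the spec's integer arithmetic.
    """
    tb = clip3(-128, 127, curr_poc_diff)
    td = clip3(-128, 127, col_poc_diff)
    sign_td = 1 if td >= 0 else -1
    tx = sign_td * ((16384 + (abs(td) >> 1)) // abs(td))
    dist_scale = clip3(-4096, 4095, (tb * tx + 32) >> 6)

    def scale_component(c):
        s = dist_scale * c
        sign = -1 if s < 0 else 1
        return clip3(-32768, 32767, sign * ((abs(s) + 127) >> 8))

    return (scale_component(mv[0]), scale_component(mv[1]))
```

Doubling the POC distance roughly doubles the motion vector; equal distances leave it unchanged.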

The target reference index of all possible reference picture lists for a temporal merging candidate derived from the temporal motion vector predictor may always be set to 0. However, for AMVP, the target reference index of all possible reference pictures may be set equal to the decoded reference index. In other words, the target reference index of all possible reference picture lists for a temporal merging candidate derived from TMVP is always set to 0, while for AMVP it may be set equal to the decoded reference index. In HEVC, an SPS may include a flag (e.g., sps_temporal_mvp_enable_flag) and, when sps_temporal_mvp_enable_flag is equal to 1, the slice header may include a flag (e.g., pic_temporal_mvp_enable_flag). When both pic_temporal_mvp_enable_flag and temporal_id are equal to 0 for a particular picture, no motion vector from pictures that precede the particular picture in decoding order is used as a temporal motion vector predictor in decoding the particular picture or pictures that follow the particular picture in decoding order.

The techniques of this disclosure are potentially applicable to multi-view coding and/or 3DV standards and specifications, including MV-HEVC and 3D-HEVC. In multi-view coding, such as that defined in MV-HEVC and 3D-HEVC, there may be multiple views of the same scene from different viewpoints. In the context of multi-view coding, the term "access unit" may be used to refer to a set of pictures that correspond to the same time instance. In some instances, in the context of multi-view coding, an access unit may be a set of NAL units that are associated with each other according to a specified classification rule, are consecutive in decoding order, and contain the VCL NAL units of all coded pictures associated with the same output time, and their associated non-VCL NAL units. Thus, video data may be conceptualized as a series of access units occurring over time.

In 3DV coding, such as that defined in 3D-HEVC, a "view component" may be a coded representation of a view in a single access unit. A view component may contain a depth view component and a texture view component. A depth view component may be a coded representation of the depth of a view in a single access unit. A texture view component may be a coded representation of the texture of a view in a single access unit. In this disclosure, a "view" may refer to a sequence of view components associated with the same view identifier.

The texture view components and depth view components within a set of pictures of a view may be considered as corresponding to one another. For example, a texture view component within a set of pictures of a view is considered as corresponding to a depth view component within the set of pictures of the view, and vice versa (i.e., the depth view component corresponds to its texture view component in the set, and vice versa). As used in this disclosure, a texture view component and the depth view component that corresponds to it may be considered part of the same view of a single access unit.

The texture view component includes the actual image content that is displayed. For example, a texture view component may include luma (Y) and chroma (Cb and Cr) components. A depth view component may indicate the relative depths of the pixels in its corresponding texture view component. As one example, a depth view component is a gray scale image that includes only luma values. In other words, the depth view component may not convey any image content, but rather provides a measure of the relative depths of the pixels in the texture view component.

For example, a pure white pixel in a depth view component indicates that its corresponding pixel in the corresponding texture view component is closer to the viewer's perspective, and a pure black pixel in a depth view component indicates that its corresponding pixel in the corresponding texture view component is farther from the viewer's perspective. The various shades of gray between black and white indicate different depth levels. For instance, a dark gray pixel in a depth view component indicates that its corresponding pixel in the texture view component is farther away than a light gray pixel in the depth view component. Because only gray scale is needed to identify the depth of pixels, the depth view component need not include chroma components, as color values for the depth view component may not serve any purpose.

Depth view components that use only luma values (e.g., intensity values) to identify depth are provided for illustration purposes and should not be considered limiting. In other examples, any technique may be utilized to indicate the relative depths of the pixels in a texture view component.

In multi-view coding, a view may be referred to as a "base view" if a video decoder (e.g., video decoder 30) can decode the pictures in the view without reference to pictures in any other view. When coding a picture in one of the non-base views, a video coder (e.g., video encoder 20 or video decoder 30) may add a picture to a reference picture list (e.g., RefPicList0 or RefPicList1) if the picture is in a different view but within the same time instance (i.e., access unit) as the picture the video coder is currently coding. Like other inter-prediction reference pictures, the video coder may insert an inter-view prediction reference picture at any position of the reference picture list.

Multi-view coding supports inter-view prediction. Inter-view prediction is similar to the inter prediction used in H.264/AVC, HEVC, or other video coding specifications, and may use the same syntax elements. However, when a video coder performs inter-view prediction on a current block (e.g., a macroblock, CU, or PU), video encoder 20 may use, as a reference picture, a picture that is in the same access unit as the current block but in a different view. In other words, in multi-view coding, inter-view prediction is performed among pictures captured in the different views of the same access unit (i.e., within the same time instance) to remove correlation between views. In contrast, conventional inter prediction only uses pictures in different access units as reference pictures.

FIG. 4 is a conceptual diagram illustrating an example multi-view decoding order. The multi-view decoding order may be a bitstream order. In the example of FIG. 4, each square corresponds to a view component. Columns of squares correspond to access units. Each access unit may be defined to contain the coded pictures of all the views of a time instance. Rows of squares correspond to views. In the example of FIG. 4, the access units are labeled T0 through T8 and the views are labeled S0 through S7. Because each view component of an access unit is decoded before any view component of the next access unit, the decoding order of FIG. 4 may be referred to as time-first coding. The decoding order of access units may not be identical to the output or display order of the views.

FIG. 5 is a conceptual diagram illustrating an example prediction structure for multi-view coding. The multi-view prediction structure of FIG. 5 includes temporal and inter-view prediction. In the example of FIG. 5, each square corresponds to a view component. In the example of FIG. 5, the access units are labeled T0 through T11 and the views are labeled S0 through S7. Squares labeled "I" are intra-predicted view components. Squares labeled "P" are uni-directionally inter-predicted view components. Squares labeled "B" and "b" are bi-directionally inter-predicted view components. Squares labeled "b" may use squares labeled "B" as reference pictures. An arrow that points from a first square to a second square indicates that the first square is available in inter prediction as a reference picture for the second square. As indicated by the vertical arrows in FIG. 5, view components in different views of the same access unit may be used as reference pictures. The use of one view component of an access unit as a reference picture for another view component of the same access unit may be referred to as inter-view prediction.

In multi-view coding, such as the MVC extension of H.264/AVC, inter-view prediction is supported by disparity motion compensation, which uses the syntax of H.264/AVC motion compensation but allows a picture in a different view to be used as a reference picture. Coding of two views may also be supported by the MVC extension of H.264/AVC. One of the advantages of the MVC extension of H.264/AVC is that an MVC encoder may take more than two views as a 3D video input and an MVC decoder may decode such a multi-view representation. Hence, any renderer with an MVC decoder may expect 3D video content with more than two views.

In the context of multi-view video coding, such as that defined in MV-HEVC and 3D-HEVC, there are two kinds of motion vectors. One kind of motion vector is a normal motion vector that points to a temporal reference picture. The type of inter prediction corresponding to a normal, temporal motion vector may be referred to as "motion-compensated prediction" or "MCP." When an inter-view prediction reference picture is used for motion compensation, the corresponding motion vector is referred to as a "disparity motion vector." In other words, a disparity motion vector points to a picture in a different view (i.e., an inter-view reference picture). The type of inter prediction corresponding to a disparity motion vector may be referred to as "disparity-compensated prediction" or "DCP."

3D-HEVC may improve coding efficiency using inter-view motion prediction and inter-view residual prediction. In other words, to further improve coding efficiency, two new technologies, namely "inter-view motion prediction" and "inter-view residual prediction," have been adopted in the reference software. In inter-view motion prediction, a video coder may determine (i.e., predict) the motion information of a current PU based on the motion information of a PU in a different view than the current PU. In inter-view residual prediction, a video coder may determine a residual block of a current CU based on residual data in a different view than the current CU.

To enable inter-view motion prediction and inter-view residual prediction, a video coder may determine a disparity vector for a block (e.g., a PU, a CU, etc.). In other words, to enable these two coding tools, the first step is to derive a disparity vector. In general, a disparity vector is used as an estimator of the displacement between two views. The video coder may use the disparity vector of a block to locate a reference block in another view for inter-view motion or residual prediction, or the video coder may convert the disparity vector to a disparity motion vector for inter-view motion prediction. That is, the disparity vector may either be used to locate the corresponding block in another view for inter-view motion/residual prediction, or be converted to a disparity motion vector for inter-view motion prediction.

In some examples, a video coder may use a neighboring blocks based disparity vector (NBDV) derivation method to derive a disparity vector for a PU (i.e., a current PU). For example, to derive a disparity vector for the current PU, a process called NBDV derivation may be used in the test model for 3D-HEVC (i.e., 3D-HTM).

The NBDV derivation process uses disparity motion vectors from spatial and temporal neighboring blocks to derive the disparity vector for a current block. Because neighboring blocks (e.g., blocks that spatially or temporally neighbor the current block) are likely to share almost the same motion and disparity information in video coding, the current block can use the motion vector information of the neighboring blocks as a predictor of the disparity vector of the current block. Thus, the NBDV derivation process uses the neighboring disparity information for estimating disparity vectors in different views.

在NBDV导出过程中,视频译码器可以固定检查次序检查空间相邻及时间相邻PU的运动向量。当视频译码器检查空间相邻或时间相邻PU的运动向量时,视频译码器可确定运动向量是否为视差运动向量。图片的PU的视差运动向量为指向图片的视图间参考图片内的位置的运动向量。图片的视图间参考图片可为在相同于图片的接入单元中但在不同视图中的图片。当视频译码器识别出视差运动向量或隐式视差向量(IDV)时,视频译码器可终止检查过程。IDV可为使用视图间预测译码的空间或时间相邻PU的视差向量。当PU利用视图间运动向量预测(即,借助于视差向量,从另一视图中的参考块导出AMVP或合并模式的候选者)时,可产生IDV。出于视差向量导出的目的,IDV可存储到PU。此外,当视频译码器识别出视差运动向量或IDV时,视频译码器可传回所识别视差运动向量或IDV。During the NBDV derivation process, the video coder may check the motion vectors of spatially and temporally neighboring PUs in a fixed order. When the video coder checks the motion vectors of spatially or temporally neighboring PUs, it may determine whether the motion vector is a disparity motion vector. The disparity motion vector of a PU of a picture is a motion vector that points to a location within an inter-view reference picture of the picture. An inter-view reference picture of a picture may be a picture in the same access unit as the picture but in a different view. When the video coder identifies a disparity motion vector or an implicit disparity vector (IDV), the video coder may terminate the checking process. The IDV may be the disparity vector of a spatially or temporally neighboring PU coded using inter-view prediction. The IDV may be generated when the PU utilizes inter-view motion vector prediction (i.e., a candidate for AMVP or merge mode is derived from a reference block in another view using the disparity vector). The IDV may be stored in the PU for disparity vector derivation purposes. Furthermore, when the video coder identifies a disparity motion vector or IDV, it may return the identified disparity motion vector or IDV.

IDV与NBDV导出过程的简化版本一起包含于宋(Sung)等人的“3D-CE5.h:用于基于HEVC的3D视频译码的简化视差向量导出,文件JCT3V-A0126”中。通过移除存储于经解码图片缓冲器中的IDV且还通过随机接入点(RAP)图片选择提供经改善译码增益,IDV在NBDV导出过程中的使用进一步简化于康等人的“3D-CE5.h相关:视差向量导出改善”,文件JCT3V-B0047中。视频译码器可将所传回视差运动向量或IDV转换为视差向量且可使用所述视差向量以用于视图间运动预测及视图间残余预测。IDVs were included together with a simplified version of the NBDV derivation process in Sung et al., "3D-CE5.h: Simplified Disparity Vector Derivation for HEVC-Based 3D Video Coding," document JCT3V-A0126. The use of IDVs in the NBDV derivation process was further simplified in Kang et al., "3D-CE5.h Related: Disparity Vector Derivation Improvements," document JCT3V-B0047, by removing the IDVs stored in the decoded picture buffer and by selecting random access point (RAP) pictures to provide an improved coding gain. The video coder may convert the returned disparity motion vector or IDV into a disparity vector and may use the disparity vector for inter-view motion prediction and inter-view residual prediction.

在3D-HEVC的一些设计中,当视频译码器执行NBDV导出过程时,视频译码器按次序检查时间相邻块中的视差运动向量、空间相邻块中的视差运动向量且接着检查IDV。一旦视频译码器找到当前块的视差运动向量,视频译码器可终止NBDV导出过程。因此,一旦识别出视差运动向量或IDV,终止检查过程且传回所识别视差运动向量并将其转换成将用于视图间运动预测及视图间残余预测的视差向量。当视频译码器不能够通过执行NBDV导出过程确定当前块的视差向量(即,当在NBDV导出过程期间未找到视差运动向量或IDV)时,视频译码器可将NBDV标记为不可用。In some designs of 3D-HEVC, when the video coder performs the NBDV derivation process, the video coder checks the disparity motion vectors in the temporally neighboring blocks, the disparity motion vectors in the spatially neighboring blocks, and then the IDV in order. Once the video coder finds the disparity motion vector for the current block, the video coder may terminate the NBDV derivation process. Thus, once the disparity motion vector or IDV is identified, the checking process is terminated and the identified disparity motion vector is returned and converted into a disparity vector to be used for inter-view motion prediction and inter-view residual prediction. When the video coder is unable to determine the disparity vector for the current block by performing the NBDV derivation process (i.e., when no disparity motion vector or IDV is found during the NBDV derivation process), the video coder may mark the NBDV as unavailable.
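The checking order described above can be sketched as follows. This is an illustrative Python sketch, not the 3D-HTM implementation: neighboring blocks are modeled as dictionaries whose optional "dmv" (disparity motion vector) and "idv" entries are assumptions for illustration, and the exact checking order differs across 3D-HEVC versions.

```python
def nbdv_derive(temporal_neighbors, spatial_neighbors):
    """Sketch of the NBDV checking order: disparity motion vectors of
    temporal neighboring blocks, then disparity motion vectors of
    spatial neighboring blocks, then implicit disparity vectors (IDVs).
    The process terminates at the first hit."""
    # 1) Disparity motion vectors of temporal neighboring blocks.
    for block in temporal_neighbors:
        if block.get("dmv") is not None:
            return block["dmv"], True
    # 2) Disparity motion vectors of spatial neighboring blocks.
    for block in spatial_neighbors:
        if block.get("dmv") is not None:
            return block["dmv"], True
    # 3) IDVs of spatial neighboring blocks.
    for block in spatial_neighbors:
        if block.get("idv") is not None:
            return block["idv"], True
    # Nothing found: NBDV is marked unavailable; callers may fall back
    # to a zero disparity vector.
    return (0, 0), False
```

The early returns model the termination of the checking process once a disparity motion vector or IDV is identified; the final branch models the zero-disparity-vector fallback described below.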

在一些实例中,如果视频译码器不能够通过执行NBDV导出过程导出当前PU的视差向量(即,如果未找到视差向量),则视频译码器可使用零视差向量作为当前PU的视差向量。零视差向量是水平分量及垂直分量均等于0的视差向量。因此,甚至当NBDV导出过程传回不可用结果时,视频译码器的要求视差向量的其它译码过程也可使用零视差向量用于当前块。In some examples, if the video coder is unable to derive the disparity vector for the current PU by performing the NBDV derivation process (i.e., if no disparity vector is found), the video coder may use a zero disparity vector as the disparity vector for the current PU. A zero disparity vector is a disparity vector in which both the horizontal and vertical components are equal to 0. Therefore, even when the NBDV derivation process returns an unusable result, other coding processes of the video coder that require a disparity vector may use a zero disparity vector for the current block.

在一些实例中,如果视频译码器不能够通过执行NBDV导出过程而导出当前PU的视差向量,则视频译码器可停用对当前PU的视图间残余预测。然而,不管视频译码器是否能够通过执行NBDV导出过程而导出当前PU的视差向量,视频译码器都可使用视图间运动预测用于当前PU。也就是说,如果在检查所有预定义相邻块之后未找到视差向量,则零视差向量可用于视图间运动预测,同时可停用针对对应PU的视图间残余预测。In some examples, if the video coder is unable to derive the disparity vector of the current PU by performing the NBDV derivation process, the video coder may disable inter-view residual prediction for the current PU. However, regardless of whether the video coder is able to derive the disparity vector of the current PU by performing the NBDV derivation process, the video coder may use inter-view motion prediction for the current PU. That is, if no disparity vector is found after checking all predefined neighboring blocks, a zero disparity vector may be used for inter-view motion prediction, while inter-view residual prediction for the corresponding PU may be disabled.

如上文所提及,作为确定当前PU的视差向量的过程的部分,视频译码器可检查空间相邻PU。在一些实例中,视频译码器检查以下空间相邻块:左下方空间相邻块,左边空间相邻块,右上方空间相邻块,上方空间相邻块,及左上方空间相邻块。举例来说,在NBDV导出过程的一些版本中,使用五个空间相邻块以用于视差向量导出。所述五个空间相邻块可分别覆盖位置A0、A1、B0、B1及B2,如图3中所指示。视频译码器可按A1、B1、B0、A0及B2的次序检查五个空间相邻块。相同五个空间相邻块可用于HEVC的合并模式。因此,在一些实例中,不需要额外的存储器接入。如果空间相邻块中的一者具有视差运动向量,则视频译码器可终止检查过程,且视频译码器可使用视差运动向量作为当前PU的最终视差向量。换句话说,如果其中的一者使用视差运动向量,则终止检查过程且对应视差运动向量将被用作最终视差向量。As mentioned above, as part of the process of determining the disparity vector for the current PU, the video coder may check spatially neighboring PUs. In some examples, the video coder checks the following spatially neighboring blocks: a lower left spatially neighboring block, a left spatially neighboring block, a top right spatially neighboring block, an upper spatially neighboring block, and a top left spatially neighboring block. For example, in some versions of the NBDV derivation process, five spatially neighboring blocks are used for disparity vector derivation. The five spatially neighboring blocks may cover positions A 0 , A 1 , B 0 , B 1 , and B 2 , respectively, as indicated in FIG3 . The video coder may check the five spatially neighboring blocks in the order of A 1 , B 1 , B 0 , A 0 , and B 2 . The same five spatially neighboring blocks may be used for the merge mode of HEVC. Therefore, in some examples, no additional memory access is required. If one of the spatially neighboring blocks has a disparity motion vector, the video coder may terminate the checking process, and the video coder may use the disparity motion vector as the final disparity vector for the current PU. In other words, if one of them uses a disparity motion vector, the checking process is terminated and the corresponding disparity motion vector will be used as the final disparity vector.
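Since the same five spatial neighboring blocks are used as in the HEVC merge mode, their luma positions can be sketched from the standard HEVC neighbor definitions. In this sketch, (xP, yP) is the top-left luma sample of the current PU and nPSW×nPSH is its size:

```python
def spatial_neighbor_positions(xP, yP, nPSW, nPSH):
    """Luma sample positions covered by the five spatial neighboring
    blocks, listed in the NBDV checking order A1, B1, B0, A0, B2
    (the standard HEVC merge-mode neighbor positions)."""
    return [
        ("A1", (xP - 1, yP + nPSH - 1)),  # left
        ("B1", (xP + nPSW - 1, yP - 1)),  # above
        ("B0", (xP + nPSW, yP - 1)),      # above-right
        ("A0", (xP - 1, yP + nPSH)),      # below-left
        ("B2", (xP - 1, yP - 1)),         # above-left
    ]
```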

此外,如上文所提及,作为确定当前PU的视差向量的过程的部分,视频译码器可检查时间相邻PU。为检查时间相邻块(例如,PU),可首先执行候选者图片列表的建构过程。在一些实例中,视频译码器可检查来自当前视图的至多两个参考图片以找到视差运动向量。第一参考图片可为共置图片。因此,共置图片(即,共置参考图片)可首先插入候选者图片列表中。第二参考图片可为随机接入图片,或具有最小POC值差及最小时间识别符的参考图片。换句话说,来自当前视图的至多两个参考图片、共置图片及随机接入图片或具有最小POC差及最小时间ID的参考图片被视为用于时间块检查。视频译码器可首先检查随机接入图片接着检查共置图片。Furthermore, as mentioned above, as part of the process of determining the disparity vector for the current PU, the video coder may check temporally neighboring PUs. To check temporally neighboring blocks (e.g., PUs), the candidate picture list construction process may be performed first. In some examples, the video coder may check up to two reference pictures from the current view to find the disparity motion vector. The first reference picture may be a co-located picture. Therefore, the co-located picture (i.e., the co-located reference picture) may be inserted first into the candidate picture list. The second reference picture may be a random access picture, or the reference picture with the smallest POC value difference and the smallest temporal identifier. In other words, up to two reference pictures from the current view, the co-located picture and the random access picture, or the reference picture with the smallest POC difference and the smallest temporal identifier, are considered for temporal block checking. The video coder may first check the random access picture and then the co-located picture.
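The candidate picture list construction described above can be sketched as follows; the dictionary representation of pictures ("poc", "temporal_id", "is_rap") is an assumption for illustration:

```python
def build_candidate_picture_list(colocated, references, current_poc):
    """Sketch of the candidate-picture-list construction for the
    temporal-neighbor check: up to two pictures from the current view.
    A random access picture is preferred as the second candidate;
    otherwise the reference with the smallest POC difference (ties
    broken by smallest temporal ID) is used. The random access (or
    min-POC-difference) picture is checked before the co-located
    picture."""
    rap = [p for p in references if p["is_rap"]]
    if rap:
        second = rap[0]
    else:
        second = min(references,
                     key=lambda p: (abs(current_poc - p["poc"]),
                                    p["temporal_id"]))
    return [second, colocated]
```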

对于每一候选者图片(即,随机接入图片及共置图片),视频译码器可检查两个块。具体来说,视频译码器可检查中心块(CR)及右下块(BR)。图6为说明NBDV导出过程中的实例时间相邻块的概念图。中心块可为当前PU的共置区的中心4×4块。右下块可为当前PU的共置区的右下4×4块。因此,对于每一候选者图片,按次序检查两个块,对于第一非基础视图为CR及BR,或对于第二非基础视图为BR、CR。如果覆盖CR或BR的PU中的一者具有视差运动向量,则视频译码器可终止检查过程,且可使用视差运动向量作为当前PU的最终视差向量。在此实例中,与第一非基础视图相关联的图片的解码可取决于与基础视图相关联的图片而非与其它视图相关联的图片的解码。此外,在此实例中,与第二非基础视图相关联的图片的解码可取决于与基础视图相关联且在一些情况下与第一非基础视图相关联的图片而非与其它视图(如果存在)相关联的图片的解码。For each candidate picture (i.e., random access pictures and co-located pictures), the video coder may check two blocks. Specifically, the video coder may check a center block (CR) and a bottom right block (BR). FIG6 is a conceptual diagram illustrating an example temporal neighboring block in the NBDV derivation process. The center block may be the center 4×4 block of the co-located region of the current PU. The bottom right block may be the bottom right 4×4 block of the co-located region of the current PU. Thus, for each candidate picture, two blocks are checked in order, CR and BR for the first non-base view, or BR, CR for the second non-base view. If one of the PUs covering the CR or BR has a disparity motion vector, the video coder may terminate the checking process and may use the disparity motion vector as the final disparity vector for the current PU. In this example, the decoding of the picture associated with the first non-base view may depend on the decoding of the picture associated with the base view, but not on the decoding of the pictures associated with the other views. Furthermore, in this example, decoding of pictures associated with the second non-base view may depend on decoding of pictures associated with the base view and, in some cases, the first non-base view, rather than pictures associated with other views, if any.

在图6的实例中,块42指示当前PU的共置区。此外,在图6的实例中,标记为“Pos.A”的块对应于中心块。标记为“Pos.B”的块对应于右下块。如图6的实例中所指示,中心块可直接位于共置区的中心点的右下方。In the example of FIG. 6, block 42 indicates the co-located region of the current PU. Furthermore, in the example of FIG. 6, the block labeled "Pos. A" corresponds to the center block, and the block labeled "Pos. B" corresponds to the bottom-right block. As indicated in the example of FIG. 6, the center block may be located directly below and to the right of the center point of the co-located region.
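Following the description above, the top-left positions of the two checked 4×4 temporal blocks can be sketched as follows; the exact rounding and block positions used by the 3D-HTM software may differ:

```python
def temporal_block_positions(xP, yP, nPSW, nPSH):
    """Sketch of the top-left luma positions of the two 4x4 temporal
    blocks checked in the co-located region of the current PU: the
    center block (CR), just below-right of the center point, and the
    bottom-right block (BR) of the co-located region."""
    cr = (xP + (nPSW >> 1), yP + (nPSH >> 1))      # center block (CR)
    br = (xP + nPSW - 4, yP + nPSH - 4)            # bottom-right 4x4 block (BR)
    return {"CR": cr, "BR": br}
```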

当视频译码器检查相邻PU(即,空间或时间相邻PU)时,视频译码器可首先检查相邻PU是否具有视差运动向量。如果相邻PU均不具有视差运动向量,则视频译码器可确定空间相邻PU中的任一者是否具有IDV。换句话说,首先针对所有空间/时间相邻块检查是否使用视差运动向量,接着检查IDV。首先检查空间相邻块,接着检查时间相邻块。当检查相邻块以找到IDV时,视频译码器可按A0、A1、B0、B1及B2的次序检查空间相邻PU。如果空间相邻PU中的一者具有IDV且所述IDV是以合并/跳过模式经译码,则视频译码器可终止检查过程且可使用所述IDV作为当前PU的最终视差向量。换句话说,按A0、A1、B0、B1及B2的次序检查五个空间相邻块。如果相邻块中的一者使用IDV且其是以跳过/合并模式经译码,则终止检查过程且可将对应IDV用作最终视差向量。When the video coder checks a neighboring PU (i.e., a spatially or temporally neighboring PU), the video coder may first check whether the neighboring PU has a disparity motion vector. If none of the neighboring PUs has a disparity motion vector, the video coder may determine whether any of the spatially neighboring PUs has an IDV. In other words, all spatial/temporal neighboring blocks are first checked for disparity motion vectors, followed by a check for IDVs. Spatially neighboring blocks are checked first, followed by temporally neighboring blocks. When checking the neighboring blocks for an IDV, the video coder may check the spatially neighboring PUs in the order of A0, A1, B0, B1, and B2. If one of the spatially neighboring PUs has an IDV and the IDV was coded with merge/skip mode, the video coder may terminate the checking process and may use the IDV as the final disparity vector for the current PU. In other words, the five spatial neighboring blocks are checked in the order of A0, A1, B0, B1, and B2. If one of the neighboring blocks uses an IDV and it was coded with skip/merge mode, the checking process is terminated and the corresponding IDV may be used as the final disparity vector.

如上文所指示,当前块的视差向量可指示参考视图中的参考图片(即,参考视图分量)中的位置。在一些3D-HEVC设计中,允许视频译码器接入参考视图的深度信息。在一些此类3D-HEVC设计中,当视频译码器使用NBDV导出过程导出当前块的视差向量时,视频译码器可应用优化过程以进一步优化当前块的视差向量。视频译码器可基于参考图片的深度图优化当前块的视差向量。换句话说,可使用经译码深度图中的信息进一步优化从NBDV方案产生的视差向量。也就是说,可通过利用基础视图深度图中所译码的信息增强视差向量的准确性。此优化过程可在本文中被称作NBDV优化(“NBDV-R”)、NBDV优化过程或深度定向的NBDV(Do-NBDV)。As indicated above, the disparity vector of the current block may indicate a position in a reference picture (i.e., a reference view component) in a reference view. In some 3D-HEVC designs, the video coder is allowed to access depth information of the reference view. In some such 3D-HEVC designs, when the video coder derives the disparity vector of the current block using the NBDV derivation process, the video coder may apply an optimization process to further optimize the disparity vector of the current block. The video coder may optimize the disparity vector of the current block based on the depth map of the reference picture. In other words, the disparity vector generated from the NBDV scheme may be further optimized using information in the coded depth map. That is, the accuracy of the disparity vector may be enhanced by utilizing the information coded in the base view depth map. This optimization process may be referred to herein as NBDV optimization ("NBDV-R"), an NBDV optimization process, or depth-oriented NBDV (Do-NBDV).

当NBDV导出过程传回可用视差向量时(例如,当NBDV导出过程传回指示NBDV导出过程能够基于相邻块的视差运动向量或IDV导出当前块的视差向量的变量时),视频译码器可通过从参考视图的深度图检索深度数据进一步优化视差向量。在一些实例中,优化过程包含以下步骤:When the NBDV derivation process returns a usable disparity vector (e.g., when the NBDV derivation process returns a variable indicating that the NBDV derivation process is able to derive the disparity vector of the current block based on the disparity motion vector or IDV of the neighboring block), the video coder can further optimize the disparity vector by retrieving depth data from the depth map of the reference view. In some examples, the optimization process includes the following steps:

1.使用当前块的视差向量以定位参考视图的深度图中的块。换句话说,通过所导出视差向量定位先前经译码参考深度视图(例如,基础视图)中的对应深度块。在此实例中,深度中的对应块的大小可与当前PU的大小(即,当前PU的预测块的大小)相同。1. Use the disparity vector of the current block to locate the block in the depth map of the reference view. In other words, locate the corresponding depth block in the previously coded reference depth view (e.g., the base view) through the derived disparity vector. In this example, the size of the corresponding block in depth can be the same as the size of the current PU (i.e., the size of the prediction block of the current PU).

2.从共置深度块的四个拐角深度值的最大值计算视差值。所计算的值设定为视差向量的水平分量,而视差向量的垂直分量设定为0。2. A disparity value is calculated from the maximum of the four corner depth values of the co-located depth block. The calculated value is set as the horizontal component of the disparity vector, while the vertical component of the disparity vector is set to 0.
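The two refinement steps can be sketched as follows. The depth-to-disparity conversion in 3D-HEVC is derived from camera parameters; here it is abstracted as a caller-supplied `depth_to_disparity` function, which is an assumption for illustration:

```python
def refine_disparity(depth_block, depth_to_disparity):
    """NBDV refinement sketch: take the maximum of the four corner
    samples of the co-located depth block, convert it to a disparity
    value, and return a refined vector whose horizontal component is
    the converted value and whose vertical component is 0.
    `depth_block` is a 2-D list of depth samples."""
    top, bottom = depth_block[0], depth_block[-1]
    max_corner = max(top[0], top[-1], bottom[0], bottom[-1])
    return (depth_to_disparity(max_corner), 0)
```

Note that only the four corner samples are inspected, so a larger depth value in the interior of the block (255 in the test below) does not affect the result.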

在一些实例中,当NBDV导出过程并不传回可用视差向量时(例如,当NBDV导出过程传回指示NBDV导出过程不能够基于相邻块的视差运动向量或IDV导出当前块的视差向量的变量时),视频译码器并不执行NBDV优化过程且视频译码器可使用零视差向量作为当前块的视差向量。换句话说,当NBDV导出过程并不提供可用视差向量且因此NBDV导出过程的结果不可用时,跳过上文的NBDV-R过程且直接传回零视差向量。In some examples, when the NBDV derivation process does not return a usable disparity vector (e.g., when the NBDV derivation process returns a variable indicating that the NBDV derivation process is unable to derive the disparity vector of the current block based on the disparity motion vector or IDV of the neighboring block), the video coder does not perform the NBDV optimization process and the video coder may use a zero disparity vector as the disparity vector of the current block. In other words, when the NBDV derivation process does not provide a usable disparity vector and thus the result of the NBDV derivation process is unusable, the NBDV-R process described above is skipped and a zero disparity vector is directly returned.

在3D-HEVC的一些提议中,视频译码器使用当前块的经优化视差向量以用于视图间运动预测同时视频译码器使用当前块的未经优化视差向量以用于视图间残余预测。举例来说,视频译码器可使用NBDV导出过程以导出当前块的未经优化视差向量。视频译码器可接着应用NBDV优化过程以导出当前块的经优化视差向量。视频译码器可使用当前块的经优化视差向量以用于确定当前块的运动信息。此外,视频译码器可使用当前块的未经优化视差向量以用于确定当前块的残余块。In some proposals for 3D-HEVC, the video coder uses the optimized disparity vector of the current block for inter-view motion prediction while the video coder uses the unoptimized disparity vector of the current block for inter-view residual prediction. For example, the video coder may use an NBDV derivation process to derive the unoptimized disparity vector of the current block. The video coder may then apply the NBDV optimization process to derive the optimized disparity vector of the current block. The video coder may use the optimized disparity vector of the current block to determine motion information for the current block. Furthermore, the video coder may use the unoptimized disparity vector of the current block to determine the residual block of the current block.

以此方式,此新视差向量被称为“深度定向的基于相邻块的视差向量(DoNBDV)”。接着用来自DoNBDV方案的此新近导出视差向量替换来自NBDV方案的视差向量以用于AMVP及合并模式的视图间候选者导出。视频译码器可使用未经优化视差向量以用于视图间残余预测。In this way, this new disparity vector is termed a "depth-oriented neighboring block based disparity vector (DoNBDV)". The disparity vector from the NBDV scheme is then replaced with this newly derived disparity vector from the DoNBDV scheme for the inter-view candidate derivation of the AMVP and merge modes. The video coder may use the unoptimized disparity vector for inter-view residual prediction.

视频译码器可使用类似优化过程以优化视差运动向量以用于反向视图合成预测(BVSP)。以此方式,深度可用于优化视差向量或待用于BVSP的视差运动向量。如果一个PU以BVSP模式译码,则经优化视差向量可存储为所述PU的运动向量。The video coder may use a similar optimization process to optimize a disparity motion vector for use in backward view synthesis prediction (BVSP). In this way, depth may be used to optimize the disparity vector or the disparity motion vector to be used for BVSP. If a PU is coded with the BVSP mode, the optimized disparity vector may be stored as the motion vector of that PU.

视频译码器可执行BVSP以合成视图分量。BVSP方法提出于田等人的“CE1.h:使用相邻块的反向视图合成预测”,文件JCT3V-C0152(在下文中称为“JCT3V-C0152”)中且在第三次JCT-3V会议中采用。BVSP概念上类似于3D-AVC中的基于块的VSP。换句话说,反向变形VSP的基本想法与3D-AVC中的基于块的VSP相同。BVSP与3D-AVC中的基于块的VSP两者都使用反向变形及基于块的VSP以避免发射运动向量差及使用较精确运动向量。然而,归因于不同平台,实施细节可不同。The video coder may perform BVSP to synthesize view components. The BVSP method was proposed in Tian et al., "CE1.h: Backward View Synthesis Prediction Using Neighboring Blocks," document JCT3V-C0152 (hereinafter "JCT3V-C0152"), and was adopted at the third JCT-3V meeting. BVSP is conceptually similar to block-based VSP in 3D-AVC. In other words, the basic idea of backward-warping VSP is the same as that of block-based VSP in 3D-AVC. Both BVSP and block-based VSP in 3D-AVC use backward warping and block-based VSP to avoid transmitting motion vector differences and to use more precise motion vectors. However, implementation details may differ due to the different platforms.

在3D-HEVC的一些版本中,应用纹理优先译码。在纹理优先译码中,视频译码器在译码对应深度视图分量(即,具有相同于纹理视图分量的POC值及视图识别符的深度视图分量)之前译码(例如,编码或解码)纹理视图分量。因此,非基础视图深度视图分量不可用于译码对应的非基础视图纹理视图分量。换句话说,当视频译码器译码非基础纹理视图分量时,对应的非基础深度视图分量不可用。因此,可估计深度信息且将其用以执行BVSP。In some versions of 3D-HEVC, texture-first coding is applied. In texture-first coding, the video coder codes (e.g., encodes or decodes) a texture view component before coding the corresponding depth view component (i.e., a depth view component having the same POC value and view identifier as the texture view component). As a result, the non-base view depth view component is not available for coding the corresponding non-base view texture view component. In other words, when the video coder codes the non-base texture view component, the corresponding non-base depth view component is not available. Therefore, depth information can be estimated and used to perform BVSP.

为了估计块的深度信息,提出首先从相邻块导出视差向量,且接着使用所导出视差向量以从参考视图获得深度块。在3D-HEVC测试模型5.1(即,HTM 5.1测试模型)中,存在称为NBDV导出过程的导出视差向量预测符的过程。使(dvx,dvy)表示从NBDV导出过程识别的视差向量,且当前块位置为(blockx,blocky)。视频译码器可在参考视图的深度图像中的(blockx+dvx,blocky+dvy)处提取深度块。所提取深度块可具有相同于当前PU的大小。视频译码器可接着使用所提取深度块以进行当前PU的反向变形。图7为说明从参考视图的深度块导出以执行BVSP的概念图。图7说明来自参考视图的深度块如何定位且接着用于BVSP预测的三个步骤。To estimate the depth information of a block, it is proposed to first derive a disparity vector from neighboring blocks and then use the derived disparity vector to obtain a depth block from a reference view. In 3D-HEVC Test Model 5.1 (i.e., HTM 5.1 Test Model), there is a process for deriving a disparity vector predictor called the NBDV derivation process. Let (dv x , dv y ) denote the disparity vector identified from the NBDV derivation process, and the current block position be (block x , block y ). The video coder may extract a depth block at (block x + dv x , block y + dv y ) in the depth image of the reference view. The extracted depth block may have the same size as the current PU. The video coder may then use the extracted depth block to perform inverse warping of the current PU. FIG7 is a conceptual diagram illustrating the derivation of a depth block from a reference view to perform BVSP. FIG7 illustrates the three steps of how a depth block from a reference view is located and then used for BVSP prediction.
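The depth-block extraction at (blockx + dvx, blocky + dvy) described above can be sketched as follows (an illustrative sketch; clipping to the picture boundary is omitted for brevity):

```python
def fetch_depth_block(depth_image, block_x, block_y, dv_x, dv_y, w, h):
    """BVSP step sketch: locate a w x h depth block at
    (block_x + dv_x, block_y + dv_y) in the reference view's depth
    image. `depth_image` is a 2-D list of depth samples; the extracted
    block has the same size as the current PU."""
    x0, y0 = block_x + dv_x, block_y + dv_y
    return [row[x0:x0 + w] for row in depth_image[y0:y0 + h]]
```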

如果在序列中启用BVSP,则可改变用于视图间运动预测的NBDV导出过程且在以下段落中以粗体展示差异:If BVSP is enabled in a sequence, the NBDV derivation process for inter-view motion prediction may be changed and the differences are shown in bold in the following paragraphs:

●对于时间相邻块中的每一者,如果时间相邻块使用视差运动向量,则传回视差运动向量作为视差向量且用本发明中其它地方所描述的方法进一步优化所述视差向量。• For each of the temporal neighboring blocks, if the temporal neighboring block uses a disparity motion vector, the disparity motion vector is returned as the disparity vector and further optimized with the methods described elsewhere in this disclosure.

●对于空间相邻块中的每一者,以下适用:• For each of the spatially neighboring blocks, the following applies:

○对于每一参考图片列表0或参考图片列表1,以下适用:o For each reference picture list 0 or reference picture list 1, the following applies:

■如果空间相邻块使用视差运动向量,则传回视差运动向量作为视差向量且用本发明中其它地方所描述的方法进一步优化所述视差向量。■ If the spatial neighboring block uses a disparity motion vector, the disparity motion vector is returned as the disparity vector and further optimized using the methods described elsewhere in this disclosure.

■否则,如果空间相邻块使用BVSP模式,则可传回相关联运动向量作为视差向量。可进一步以类似于如本发明中其它地方所描述的方式优化视差向量。■ Otherwise, if the spatial neighboring block uses BVSP mode, the associated motion vector may be returned as the disparity vector.The disparity vector may be further optimized in a manner similar to that described elsewhere in this disclosure.

然而,最大深度值可选自对应深度块的所有像素而非四个拐角像素。However, the maximum depth value may be selected from all pixels of the corresponding depth block instead of the four corner pixels.

对于空间相邻块中的每一者,如果空间相邻块使用IDV,则传回IDV作为视差向量。视频译码器可进一步使用本发明中其它地方所描述的方法中的一或多者优化视差向量。For each of the spatially neighboring blocks, if the spatially neighboring block uses IDV, then the IDV is returned as the disparity vector.The video coder may further optimize the disparity vector using one or more of the methods described elsewhere in this disclosure.

视频译码器可将上文所描述的BVSP模式处理为特殊经帧间译码模式且视频译码器可维持指示BVSP模式针对每一PU的使用的旗标。视频译码器可将新合并候选者(BVSP合并候选者)添加到合并候选者列表,而非在位流中用信号通知所述旗标,且所述旗标取决于经解码合并候选者索引是否对应于BVSP合并候选者。在一些实例中,BVSP合并候选者定义为如下:The video coder may treat the BVSP mode described above as a special inter-coded mode, and the video coder may maintain a flag indicating the use of the BVSP mode for each PU. Rather than signaling the flag in the bitstream, the video coder may add a new merge candidate (a BVSP merge candidate) to the merge candidate list, and the flag depends on whether the decoded merge candidate index corresponds to the BVSP merge candidate. In some examples, the BVSP merge candidate is defined as follows:

●每一参考图片列表的参考图片索引:-1● Reference picture index for each reference picture list: -1

●每一参考图片列表的运动向量:经优化视差向量● Motion vector for each reference picture list: optimized disparity vector

在一些实例中,BVSP合并候选者的插入位置取决于空间相邻块。举例来说,如果五个空间相邻块(A0、A1、B0、B1或B2)中的任一者以BVSP模式译码,即,相邻块的所维持旗标等于1,则视频译码器可将BVSP合并候选者处理为对应空间合并候选者且可将BVSP合并候选者插入到合并候选者列表。视频译码器可仅将BVSP合并候选者插入到合并候选者列表中一次。否则,在此实例中(例如,当五个空间相邻块均未以BVSP模式译码时),视频译码器可在紧接着任何时间合并候选者之前将BVSP合并候选者插入到合并候选者列表。在组合双向预测性合并候选者导出过程期间,视频译码器可检查额外条件以避免包含BVSP合并候选者。In some examples, the insertion position of the BVSP merge candidate depends on the spatial neighboring blocks. For example, if any of the five spatial neighboring blocks (A0, A1, B0, B1, or B2) is coded with the BVSP mode, i.e., the maintained flag of the neighboring block is equal to 1, the video coder may treat the BVSP merge candidate as the corresponding spatial merge candidate and may insert the BVSP merge candidate into the merge candidate list. The video coder may insert the BVSP merge candidate into the merge candidate list only once. Otherwise, in this example (e.g., when none of the five spatial neighboring blocks is coded with the BVSP mode), the video coder may insert the BVSP merge candidate into the merge candidate list immediately before any temporal merge candidates. During the combined bi-predictive merge candidate derivation process, the video coder may check additional conditions to avoid including the BVSP merge candidate.
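The insertion logic above can be sketched as follows. The list layout, the per-neighbor BVSP flags, and the `temporal_index` parameter are assumptions for illustration, not the 3D-HTM data structures:

```python
def insert_bvsp_candidate(merge_list, spatial_bvsp_flags, bvsp_cand,
                          temporal_index):
    """Sketch of where the BVSP merge candidate is placed: if any of
    the five spatial neighbors was coded with BVSP mode (flag == 1),
    the BVSP candidate takes the position of the corresponding spatial
    candidate; otherwise it is inserted immediately before the
    temporal merge candidate. It is inserted at most once."""
    for i, flag in enumerate(spatial_bvsp_flags):
        if flag == 1:
            merge_list.insert(i, bvsp_cand)
            return merge_list
    merge_list.insert(temporal_index, bvsp_cand)
    return merge_list
```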

对于每一经BVSP译码PU,视频译码器可进一步将所述PU分割成具有等于K×K的大小的若干子区(其中K可为4或2)。经BVSP译码PU的大小可由N×M表示。对于每一子区,视频译码器可导出单独视差运动向量。此外,视频译码器可从由视图间参考图片中的所导出视差运动向量定位的一个块预测每一子区。换句话说,用于经BVSP译码PU的运动补偿单元的大小设定成K×K。在一些测试条件中,K设定成4。For each BVSP-coded PU, the video coder may further partition the PU into several sub-regions with a size equal to K×K (where K may be 4 or 2). The size of a BVSP-coded PU may be denoted by N×M. For each sub-region, the video coder may derive a separate disparity motion vector. Furthermore, the video coder may predict each sub-region from one block located by the derived disparity motion vector in an inter-view reference picture. In other words, the size of the motion compensation unit for BVSP-coded PUs is set to K×K. Under some test conditions, K is set to 4.

关于BVSP,视频译码器可执行以下视差运动向量导出过程。对于以BVSP模式译码的一个PU内的每一子区(4×4块),视频译码器可首先用上文所提及的经优化视差向量定位参考深度视图中的对应4×4深度块。其次,视频译码器可选择对应深度块中的十六个深度像素的最大值。再次,视频译码器可将最大值转换为视差运动向量的水平分量。视频译码器可将视差运动向量的垂直分量设定为0。Regarding BVSP, the video coder may perform the following disparity motion vector derivation process. For each sub-region (4×4 block) within a PU coded in BVSP mode, the video coder may first locate the corresponding 4×4 depth block in the reference depth view using the optimized disparity vector mentioned above. Second, the video coder may select the maximum value of the sixteen depth pixels in the corresponding depth block. Third, the video coder may convert the maximum value into the horizontal component of the disparity motion vector. The video coder may set the vertical component of the disparity motion vector to 0.
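The per-sub-region derivation can be sketched as follows, again abstracting the camera-parameter-based depth-to-disparity conversion as a caller-supplied function (an assumption for illustration):

```python
def bvsp_subregion_dmvs(depth_image, x0, y0, width, height, K,
                        depth_to_disparity):
    """Sketch of the BVSP per-sub-region derivation: for each K x K
    sub-region of the PU's corresponding depth block (located at
    (x0, y0) by the refined disparity vector), take the maximum of its
    depth samples and convert it to the horizontal component of a
    disparity motion vector; the vertical component is 0."""
    dmvs = {}
    for sy in range(0, height, K):
        for sx in range(0, width, K):
            samples = [depth_image[y0 + sy + j][x0 + sx + i]
                       for j in range(K) for i in range(K)]
            dmvs[(sx, sy)] = (depth_to_disparity(max(samples)), 0)
    return dmvs
```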

基于从DoNBDV技术导出的视差向量,视频译码器可将新运动向量候选者(即,经视图间预测的运动向量候选者(IPMVC))(如果可用)添加到AMVP及跳过/合并模式。IPMVC(如果可用)为时间运动向量。由于跳过模式具有相同于合并模式的运动向量导出过程,因此本文件中所描述的技术可适用于合并及跳过模式两者。Based on the disparity vector derived from the DoNBDV technique, the video coder can add new motion vector candidates, namely, inter-view predicted motion vector candidates (IPMVC), if available, to AMVP and skip/merge modes. IPMVC, if available, is a temporal motion vector. Since skip mode has the same motion vector derivation process as merge mode, the techniques described in this document can be applied to both merge and skip modes.

对于合并/跳过模式,可通过以下步骤导出IPMVC。首先,视频译码器可使用视差向量定位相同接入单元的参考视图中的当前块(例如,PU、CU等)的对应块。其次,如果对应块未经帧内译码且未经视图间预测,且对应块的参考图片具有等于当前块的相同参考图片列表中的一个项目的POC值,则视频译码器可基于POC值转换对应块的参考索引。此外,视频译码器可导出IPMVC以指定对应块的预测方向、对应块的运动向量及经转换参考索引。For the merge/skip modes, the IPMVC may be derived by the following steps. First, the video coder may use the disparity vector to locate the corresponding block of the current block (e.g., a PU, CU, etc.) in the reference view of the same access unit. Second, if the corresponding block is not intra-coded and not inter-view predicted, and the reference picture of the corresponding block has a POC value equal to that of one entry in the same reference picture list of the current block, the video coder may convert the reference index of the corresponding block based on the POC value. Furthermore, the video coder may derive the IPMVC to specify the prediction direction of the corresponding block, the motion vector of the corresponding block, and the converted reference index.
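These steps can be sketched as follows; the dictionary fields used to model the corresponding block are assumptions for illustration:

```python
def derive_ipmvc(corresponding_block, current_ref_pocs):
    """Sketch of the merge/skip IPMVC derivation: if the corresponding
    block (located in the reference view by the disparity vector) is
    not intra-coded, not inter-view predicted, and its reference POC
    appears in the current block's reference picture list, the
    reference index is converted via the POC and the IPMVC inherits
    the motion vector. Returns None when the IPMVC is unavailable."""
    if corresponding_block.get("intra") or corresponding_block.get("inter_view"):
        return None  # IPMVC unavailable
    ref_poc = corresponding_block["ref_poc"]
    if ref_poc not in current_ref_pocs:
        return None
    return {
        "mv": corresponding_block["mv"],
        "ref_idx": current_ref_pocs.index(ref_poc),  # POC-based conversion
    }
```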

3D-HEVC测试模型4的章节H.8.5.2.1.10描述时间视图间运动向量候选者的导出过程。IPMVC可被称为时间视图间运动向量候选者,这是因为其指示时间参考图片中的位置。如描述于3D-HEVC测试模型4的章节H.8.5.2.1.10中,如下导出参考层明度位置(xRef,yRef):Section H.8.5.2.1.10 of 3D-HEVC Test Model 4 describes the derivation process of temporal inter-view motion vector candidates. IPMVC may be referred to as a temporal inter-view motion vector candidate because it indicates a position in a temporal reference picture. As described in section H.8.5.2.1.10 of 3D-HEVC Test Model 4, the reference layer luma position (xRef, yRef) is derived as follows:

xRef = Clip3( 0, PicWidthInSamplesL - 1, xP + ( ( nPSW - 1 ) >> 1 ) + ( ( mvDisp[ 0 ] + 2 ) >> 2 ) )    (H-124)

yRef = Clip3( 0, PicHeightInSamplesL - 1, yP + ( ( nPSH - 1 ) >> 1 ) + ( ( mvDisp[ 1 ] + 2 ) >> 2 ) )    (H-125)

在上文的方程式H-124及H-125中,(xP,yP)表示当前PU的左上方明度样本相对于当前图片的左上方明度样本的坐标,nPSW及nPSH分别表示当前预测单元的宽度及高度,refViewIdx表示参考视图次序索引且mvDisp表示视差向量。对应块被设定成覆盖具有等于refViewIdx的ViewIdx的视图分量中的明度位置(xRef,yRef)的PU。在上文的方程式H-124及H-125及本发明中的其它方程式中,Clip3函数可定义为:当z<x时,Clip3(x,y,z)=x;当z>y时,Clip3(x,y,z)=y;否则,Clip3(x,y,z)=z。In equations H-124 and H-125 above, (xP, yP) denotes the coordinates of the top-left luma sample of the current PU relative to the top-left luma sample of the current picture, nPSW and nPSH denote the width and height of the current prediction unit, respectively, refViewIdx denotes a reference view order index, and mvDisp denotes the disparity vector. The corresponding block is set to the PU that covers the luma position (xRef, yRef) in the view component with ViewIdx equal to refViewIdx. In equations H-124 and H-125 above and in other equations of this disclosure, the Clip3 function may be defined as: Clip3( x, y, z ) = x when z < x; y when z > y; and z otherwise.
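A Python sketch of equations H-124 and H-125, together with the Clip3 function (which, per the HEVC specification, clamps z to the range [x, y]):

```python
def clip3(x, y, z):
    """Clip3 as defined in the HEVC specification: clamp z to [x, y]."""
    return x if z < x else (y if z > y else z)

def reference_luma_position(xP, yP, nPSW, nPSH, mv_disp, pic_w, pic_h):
    """Equations H-124 and H-125: the reference-layer luma position
    pointed to by the disparity vector mv_disp (in quarter-sample
    units), measured from the center of the current PU and clipped to
    the picture boundary."""
    x_ref = clip3(0, pic_w - 1, xP + ((nPSW - 1) >> 1) + ((mv_disp[0] + 2) >> 2))
    y_ref = clip3(0, pic_h - 1, yP + ((nPSH - 1) >> 1) + ((mv_disp[1] + 2) >> 2))
    return x_ref, y_ref
```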

图8为说明用于合并/跳过模式的IPMVC的实例导出的概念图。换句话说,图8展示经视图间预测的运动向量候选者的导出过程的实例。在图8的实例中,当前PU 50出现于时间实例T1处的视图V1中。当前PU 50的参考PU 52出现于不同于当前PU 50的视图(即,视图V0)中,但在相同于当前PU的时间实例(即,时间实例T1)处。在图8的实例中,参考PU 52经双向帧间预测。之后,参考PU 52具有第一运动向量54及第二运动向量56。运动向量54指示参考图片58中的位置。参考图片58出现于视图V0及时间实例T0中。运动向量56指示参考图片60中的位置。参考图片60出现于视图V0及时间实例T3中。FIG8 is a conceptual diagram illustrating an example derivation of an IPMVC for merge/skip mode. In other words, FIG8 shows an example of the derivation process for inter-view predicted motion vector candidates. In the example of FIG8 , current PU 50 occurs in view V1 at time instance T1. Reference PU 52 for current PU 50 occurs in a different view than current PU 50 (i.e., view V0), but at the same time instance as current PU (i.e., time instance T1). In the example of FIG8 , reference PU 52 is bidirectionally inter-predicted. Reference PU 52 then has a first motion vector 54 and a second motion vector 56. Motion vector 54 indicates a location in reference picture 58. Reference picture 58 occurs in view V0 and time instance T0. Motion vector 56 indicates a location in reference picture 60. Reference picture 60 occurs in view V0 and time instance T3.

视频译码器可基于参考PU 52的运动信息产生用于包含在当前PU 50的合并候选者列表中的IPMVC。IPMVC可具有第一运动向量62及第二运动向量64。运动向量62匹配运动向量54且运动向量64匹配运动向量56。视频译码器产生IPMVC,使得IPMVC的第一参考索引指示出现于相同于参考图片58的时间实例(即,时间实例T0)中的参考图片(即,参考图片66)的当前PU 50的RefPicList0中的位置。在图8的实例中,参考图片66出现于当前PU 50的RefPicList0中的第一位置(即,Ref0)中。此外,视频译码器产生IPMVC,使得IPMVC的第二参考索引指示出现于相同于参考图片60的时间实例中的参考图片(即,参考图片68)的当前PU50的RefPicList1中的位置。因此,在图8的实例中,IPMVC的RefPicList0参考索引可等于0。在图8的实例中,参考图片70出现于当前PU 50的RefPicList1中的第一位置(即,Ref0)中,且参考图片68出现于当前PU 50的RefPicList1中的第二位置(即,Ref1)中。因此,IPMVC的RefPicList1参考索引可等于1。The video coder may generate an IPMVC for inclusion in the merge candidate list of current PU 50 based on the motion information of reference PU 52. The IPMVC may have first motion vector 62 and second motion vector 64. Motion vector 62 matches motion vector 54, and motion vector 64 matches motion vector 56. The video coder generates the IPMVC such that a first reference index of the IPMVC indicates a position in RefPicList0 of current PU 50 of a reference picture (i.e., reference picture 66) that appears in the same time instance as reference picture 58 (i.e., time instance T0). In the example of FIG. 8 , reference picture 66 appears in the first position (i.e., Ref0) in RefPicList0 of current PU 50. Furthermore, the video coder generates the IPMVC such that a second reference index of the IPMVC indicates a position in RefPicList1 of current PU 50 of a reference picture (i.e., reference picture 68) that appears in the same time instance as reference picture 60. Therefore, in the example of FIG. 8 , the RefPicList0 reference index of the IPMVC may be equal to 0. 8 , reference picture 70 appears in the first position (i.e., Ref0) in RefPicList1 of current PU 50, and reference picture 68 appears in the second position (i.e., Ref1) in RefPicList1 of current PU 50. Therefore, the RefPicList1 reference index of the IPMVC may be equal to 1.

除产生IPMVC并将IPMVC包含于合并候选者列表中之外,视频译码器可将当前PU的视差向量转换成视图间视差运动向量(IDMVC)且可将IDMVC包含于用于当前PU的合并候选者列表中。换句话说,视差向量可转换成IDMVC,其被添加到合并候选者列表中不同于IPMVC的位置或在IDMVC可用时被添加到AMVP候选者列表中相同于IPMVC的位置。IPMVC或IDMVC在此上下文中被称为‘视图间候选者’。换句话说,术语“视图间候选者”可用于指IPMVC或IDMVC。在一些实例中,在合并/跳过模式中,视频译码器始终在所有空间及时间合并候选者之前将IPMVC(如果可用)插入到合并候选者列表。此外,视频译码器可在从A0导出的空间合并候选者之前插入IDMVC。In addition to generating an IPMVC and including the IPMVC in the merge candidate list, the video coder may convert the disparity vector of the current PU into an inter-view disparity motion vector (IDMVC) and may include the IDMVC in the merge candidate list for the current PU. In other words, the disparity vector may be converted into an IDMVC, which is added to the merge candidate list at a position different from the IPMVC or to the AMVP candidate list at the same position as the IPMVC when the IDMVC is available. The IPMVC or IDMVC is referred to as an 'inter-view candidate' in this context. In other words, the term "inter-view candidate" may be used to refer to the IPMVC or IDMVC. In some examples, in merge/skip mode, the video coder always inserts the IPMVC (if available) into the merge candidate list before all spatial and temporal merge candidates. In addition, the video coder may insert the IDMVC before the spatial merge candidate derived from A0 .

如上文所指示，视频译码器可用DoNBDV的方法导出视差向量。以所述视差向量，3D-HEVC中的合并候选者列表建构过程可如下定义：As indicated above, the video coder may derive the disparity vector using the DoNBDV method. With the disparity vector, the merge candidate list construction process in 3D-HEVC can be defined as follows:

1.IPMVC插入1.IPMVC Insertion

通过上文所描述的程序导出IPMVC。如果IPMVC可用,则将IPMVC插入到合并列表。The IPMVC is derived by the procedure described above. If the IPMVC is available, it is inserted into the merge list.

2.3D-HEVC中用于空间合并候选者的导出过程及IDMVC插入2.3D-HEVC Derivation Process for Spatial Merging Candidates and IDMVC Insertion

按以下次序检查空间相邻PU的运动信息：A1、B1、B0、A0或B2。通过以下程序执行受约束的精简：The motion information of spatially neighboring PUs is checked in the following order: A1, B1, B0, A0, or B2. Constrained pruning is performed by the following procedure:

-如果A1及IPMVC具有相同运动向量及相同参考索引,则不将A1插入到候选者列表中;否则将A1插入到列表中。If A1 and IPMVC have the same motion vector and the same reference index, do not insert A1 into the candidate list; otherwise insert A1 into the list.

-如果B1及A1/IPMVC具有相同运动向量及相同参考索引,则不将B1插入到候选者列表中;否则将B1插入到列表中。If B1 and A1 /IPMVC have the same motion vector and the same reference index, do not insert B1 into the candidate list; otherwise insert B1 into the list.

-如果B0可用,则将B0添加到候选者列表。通过上文所描述的程序导出IDMVC。如果IDMVC可用且IDMVC不同于从A1及B1导出的候选者,则将IDMVC插入到候选者列表中。If B0 is available, add B0 to the candidate list. Derive IDMVC by the procedure described above. If IDMVC is available and IDMVC is different from the candidates derived from A1 and B1 , insert IDMVC into the candidate list.

-如果针对整个图片或当前切片启用BVSP,则将BVSP合并候选者插入到合并候选者列表。If BVSP is enabled for the entire picture or the current slice, insert the BVSP merge candidate into the merge candidate list.

-如果A0可用,则将A0添加到候选者列表。- If A 0 is available, add A 0 to the candidate list.

-如果B2可用,则将B2添加到候选者列表。- If B 2 is available, add B 2 to the candidate list.
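The pruning and insertion steps above can be sketched in simplified form as follows (an illustrative sketch only; the candidate dictionaries, the `None`-means-unavailable convention, and the `same_motion` helper are assumptions for exposition, not 3D-HEVC syntax):

```python
def same_motion(a, b):
    # Candidates are redundant when their motion vectors and reference
    # indices are identical; None means "not available".
    return a is not None and b is not None and \
        a["mv"] == b["mv"] and a["ref_idx"] == b["ref_idx"]

def build_merge_list(ipmvc, a1, b1, b0, idmvc, bvsp, a0, b2):
    """Steps 1-2 of the 3D-HEVC merge list construction (sketch)."""
    merge_list = []
    if ipmvc is not None:                              # step 1: IPMVC insertion
        merge_list.append(ipmvc)
    if a1 is not None and not same_motion(a1, ipmvc):  # A1 pruned against IPMVC
        merge_list.append(a1)
    if b1 is not None and not (same_motion(b1, a1) or same_motion(b1, ipmvc)):
        merge_list.append(b1)                          # B1 pruned against A1/IPMVC
    if b0 is not None:
        merge_list.append(b0)                          # B0: no pruning
    if idmvc is not None and not (same_motion(idmvc, a1) or same_motion(idmvc, b1)):
        merge_list.append(idmvc)                       # IDMVC pruned against A1, B1
    if bvsp is not None:
        merge_list.append(bvsp)                        # BVSP candidate, if enabled
    if a0 is not None:
        merge_list.append(a0)                          # A0: no pruning
    if b2 is not None:
        merge_list.append(b2)                          # B2: no pruning
    return merge_list
```

Note that, as in the text, A1 and B1 are checked against earlier candidates, while B0, A0, and B2 are inserted without pruning.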

3.用于时间合并候选者的导出过程3. Derivation process for temporal merging candidates

类似于其中利用共置PU的运动信息的HEVC中的时间合并候选者导出过程，然而，时间合并候选者的目标参考图片索引可实际上改变为固定为0。当等于0的目标参考索引对应于时间参考图片(相同视图中)同时共置PU的运动向量指向视图间参考图片时，目标参考索引可改变为对应于参考图片列表中的视图间参考图片的第一项目的另一索引。相反地，当等于0的目标参考索引对应于视图间参考图片同时共置PU的运动向量指向时间参考图片时，目标参考索引可改变为对应于参考图片列表中的时间参考图片的第一项目的另一索引。The derivation process is similar to the temporal merge candidate derivation process in HEVC, in which the motion information of the co-located PU is utilized; however, the target reference picture index of the temporal merge candidate, initially fixed to 0, may be changed. When the target reference index equal to 0 corresponds to a temporal reference picture (in the same view) while the motion vector of the co-located PU points to an inter-view reference picture, the target reference index may be changed to another index corresponding to the first entry of an inter-view reference picture in the reference picture list. Conversely, when the target reference index equal to 0 corresponds to an inter-view reference picture while the motion vector of the co-located PU points to a temporal reference picture, the target reference index may be changed to another index corresponding to the first entry of a temporal reference picture in the reference picture list.
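The target reference index adjustment described above can be sketched as follows (a simplified illustration; the reference picture list representation and the function name are assumptions, not 3D-HEVC syntax):

```python
def temporal_target_ref_idx(ref_pic_list, coloc_mv_is_inter_view):
    """Pick the target reference index of the temporal merge candidate.

    ref_pic_list: entries with an 'inter_view' flag, index 0 first.
    The index starts at 0; on a temporal/inter-view mismatch with the
    co-located PU's motion vector it is moved to the first entry of the
    matching type (or left at 0 if no such entry exists).
    """
    if ref_pic_list[0]["inter_view"] != coloc_mv_is_inter_view:
        for idx, pic in enumerate(ref_pic_list):
            if pic["inter_view"] == coloc_mv_is_inter_view:
                return idx
    return 0
```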

4.3D-HEVC中用于组合双向预测性合并候选者的导出过程4. Derivation Process for Combining Bidirectional Predictive Merging Candidates in 3D-HEVC

如果从上文两个步骤导出的候选者的总数目小于候选者的最大数目,则除l0CandIdx及l1CandIdx的指定外执行如HEVC中定义的相同过程。图9为指示3D-HEVC中的l0CandIdx及l1CandIdx的实例指定的表格。图9的表格中定义combIdx、l0CandIdx及l1CandIdx当中的关系。HEVC工作草案10的章节8.5.3.2.3定义l0CandIdx及l1CandIdx在导出组合双向预测性合并候选者时的实例使用。If the total number of candidates derived from the two steps above is less than the maximum number of candidates, the same process as defined in HEVC is performed, except for the designation of l0CandIdx and l1CandIdx. FIG9 is a table indicating an example designation of l0CandIdx and l1CandIdx in 3D-HEVC. The table in FIG9 defines the relationship between combIdx, l0CandIdx, and l1CandIdx. Section 8.5.3.2.3 of HEVC Working Draft 10 defines the example use of l0CandIdx and l1CandIdx when deriving combined bi-predictive merging candidates.

5.用于零运动向量合并候选者的导出过程5. Derivation process for zero motion vector merge candidates

-执行如HEVC中定义的相同程序。- Perform the same procedures as defined in HEVC.

在3D-HEVC的参考软件的一些版本中,合并(例如,MRG)列表中的候选者的总数目至多为六个且在切片标头中用信号通知five_minus_max_num_merge_cand以指定从6减去的合并候选者的最大数目。five_minus_max_num_merge_cand在0到5的范围中(包含0与5)。five_minus_max_num_merge_cand语法元素可指定从5减去的切片中所支持的合并MVP候选者的最大数目。合并运动向量预测(MVP)候选者的最大数目MaxNumMergeCand可计算为MaxNumMergeCand=5-five_minus_max_num_merge_cand+iv_mv_pred_flag[nuh_layer_id]。five_minus_max_num_merge_cand的值可受限,使得MaxNumMergeCand处于0到(5+iv_mv_pred_flag[nuh_layer_id])的范围中(包含0与(5+iv_mv_pred_flag[nuh_layer_id]))。In some versions of the reference software for 3D-HEVC, the total number of candidates in a merge (e.g., MRG) list is at most six, and five_minus_max_num_merge_cand is signaled in the slice header to specify the maximum number of merge candidates subtracted from 6. five_minus_max_num_merge_cand is in the range of 0 to 5, inclusive. The five_minus_max_num_merge_cand syntax element may specify the maximum number of merge MVP candidates supported in a slice subtracted from 5. The maximum number of merge motion vector prediction (MVP) candidates, MaxNumMergeCand, may be calculated as MaxNumMergeCand=5−five_minus_max_num_merge_cand+iv_mv_pred_flag[nuh_layer_id]. The value of five_minus_max_num_merge_cand may be limited such that MaxNumMergeCand is in the range of 0 to (5+iv_mv_pred_flag[nuh_layer_id]), inclusive.
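The MaxNumMergeCand computation above can be checked with a small sketch (illustrative only; the function name is an assumption):

```python
def max_num_merge_cand(five_minus_max_num_merge_cand, iv_mv_pred_flag):
    """MaxNumMergeCand = 5 - five_minus_max_num_merge_cand + iv_mv_pred_flag[nuh_layer_id]."""
    assert 0 <= five_minus_max_num_merge_cand <= 5
    assert iv_mv_pred_flag in (0, 1)
    n = 5 - five_minus_max_num_merge_cand + iv_mv_pred_flag
    # MaxNumMergeCand must lie in [0, 5 + iv_mv_pred_flag], as stated above.
    assert 0 <= n <= 5 + iv_mv_pred_flag
    return n
```

For example, with five_minus_max_num_merge_cand equal to 0 and inter-view motion prediction enabled, the list can hold six candidates.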

如上文所指示,HEVC工作草案10的章节8.5.3.2.3定义l0CandIdx及l1CandIdx在导出组合双向预测性合并候选者时的实例使用。下文再现HEVC工作草案10的章节8.5.3.2.3。As indicated above, section 8.5.3.2.3 of HEVC Working Draft 10 defines example uses of l0CandIdx and l1CandIdx in deriving combined bi-predictive merge candidates. Section 8.5.3.2.3 of HEVC Working Draft 10 is reproduced below.

用于组合双向预测性合并候选者的导出过程Derivation process for combining bidirectional predictive merge candidates

此过程的输入为:The inputs to this process are:

-合并候选者列表mergeCandList,-Merge candidate list mergeCandList,

-mergeCandList中的每一候选者N的参考索引refIdxL0N及refIdxL1N,- the reference indices refIdxL0N and refIdxL1N of each candidate N in mergeCandList,

-mergeCandList中的每一候选者N的预测列表利用旗标predFlagL0N及predFlagL1N,- the prediction list utilization flags predFlagL0N and predFlagL1N of each candidate N in mergeCandList,

-mergeCandList中的每一候选者N的运动向量mvL0N及mvL1N,- the motion vectors mvL0N and mvL1N of each candidate N in mergeCandList,

-mergeCandList内的元素numCurrMergeCand的数目,-The number of elements numCurrMergeCand in mergeCandList,

-在空间及时间合并候选者导出过程之后mergeCandList内的元素numOrigMergeCand的数目。- The number of elements numOrigMergeCand in mergeCandList after the spatial and temporal merge candidate derivation process.

此过程的输出为:The output of this process is:

-合并候选者列表mergeCandList,-Merge candidate list mergeCandList,

-mergeCandList内的元素numCurrMergeCand的数目,-The number of elements numCurrMergeCand in mergeCandList,

-在此过程的调用期间添加到mergeCandList中的每一新候选者combCandk的参考索引refIdxL0combCandk及refIdxL1combCandk- the reference index refIdxL0combCand k and refIdxL1combCand k of each new candidate combCand k added to mergeCandList during this call of this process,

-在此过程的调用期间添加到mergeCandList中的每一新候选者combCandk的预测列表利用旗标predFlagL0combCandk及predFlagL1combCandk- The prediction list of each new candidate combCand k added to mergeCandList during the call of this process uses the flags predFlagL0combCand k and predFlagL1combCand k ,

-在此过程的调用期间添加到mergeCandList中的每一新候选者combCandk的运动向量mvL0combCandk及mvL1combCandk- The motion vectors mvL0combCand k and mvL1combCand k for each new candidate combCand k added to mergeCandList during the call of this process.

当numOrigMergeCand大于1且小于MaxNumMergeCand时,变量numInputMergeCand被设定成等于numCurrMergeCand,变量combIdx被设定成等于0,变量combStop被设定成等于FALSE,且重复以下步骤直到combStop等于TRUE为止:When numOrigMergeCand is greater than 1 and less than MaxNumMergeCand, the variable numInputMergeCand is set equal to numCurrMergeCand, the variable combIdx is set equal to 0, the variable combStop is set equal to FALSE, and the following steps are repeated until combStop is equal to TRUE:

1.使用如表8-6中所指定的combIdx导出变量l0CandIdx及l1CandIdx。1. The variables l0CandIdx and l1CandIdx are derived using combIdx as specified in Table 8-6.

2.进行以下指派,其中在合并候选者列表mergeCandList中,l0Cand是在位置l0CandIdx处的候选者,且l1Cand是在位置l1CandIdx处的候选者:2. The following assignment is made, where in the merge candidate list mergeCandList, l0Cand is the candidate at position l0CandIdx, and l1Cand is the candidate at position l1CandIdx:

-l0Cand=mergeCandList[l0CandIdx]

-l1Cand=mergeCandList[l1CandIdx]

3.当所有以下条件都为真时:3. When all of the following conditions are true:

-predFlagL0l0Cand==1

-predFlagL1l1Cand==1

-(DiffPicOrderCnt(RefPicList0[refIdxL0l0Cand], RefPicList1[refIdxL1l1Cand])!=0)||(mvL0l0Cand!=mvL1l1Cand)

将候选者combCandk(其中k等于(numCurrMergeCand-numInputMergeCand))添加到mergeCandList的末端,即将mergeCandList[numCurrMergeCand]设定成等于combCandk并如下导出combCandk的参考索引、预测列表利用旗标及运动向量且将numCurrMergeCand递增1:Add candidate combCand k (where k is equal to (numCurrMergeCand - numInputMergeCand)) to the end of mergeCandList, i.e., set mergeCandList[numCurrMergeCand] equal to combCand k and derive the reference index, prediction list utilization flag and motion vector of combCand k as follows and increment numCurrMergeCand by 1:

refIdxL0combCandk=refIdxL0l0Cand (8-113)

refIdxL1combCandk=refIdxL1l1Cand (8-114)

predFlagL0combCandk=1 (8-115)

predFlagL1combCandk=1 (8-116)

mvL0combCandk[0]=mvL0l0Cand[0] (8-117)

mvL0combCandk[1]=mvL0l0Cand[1] (8-118)

mvL1combCandk[0]=mvL1l1Cand[0] (8-119)

mvL1combCandk[1]=mvL1l1Cand[1] (8-120)

numCurrMergeCand=numCurrMergeCand+1 (8-121)

4.变量combIdx递增1。4. The variable combIdx is incremented by 1.

5.当combIdx等于(numOrigMergeCand*(numOrigMergeCand-1))或numCurrMergeCand等于MaxNumMergeCand时,将combStop设定成等于TRUE。5. When combIdx is equal to (numOrigMergeCand*(numOrigMergeCand-1)) or numCurrMergeCand is equal to MaxNumMergeCand, set combStop equal to TRUE.
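The derivation process above can be sketched as follows (a simplified model; the candidate dictionaries and per-list POC fields are assumptions for exposition, and the combIdx table reproduces the l0CandIdx/l1CandIdx assignment of HEVC Table 8-6):

```python
# combIdx -> (l0CandIdx, l1CandIdx), as in HEVC Table 8-6.
COMB_IDX_TABLE = [
    (0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1),
    (0, 3), (3, 0), (1, 3), (3, 1), (2, 3), (3, 2),
]

def add_combined_bi_predictive_candidates(cands, max_num_merge_cand):
    """Append combined bi-predictive candidates (sketch of section 8.5.3.2.3).

    Each candidate is a dict with 'predFlagL0', 'predFlagL1', 'mvL0',
    'mvL1', 'refIdxL0', 'refIdxL1', plus the POC of each referenced
    picture ('pocL0', 'pocL1') so the DiffPicOrderCnt test can be
    evaluated directly. Simplified data model, not HEVC syntax.
    """
    num_orig = len(cands)
    if not (1 < num_orig < max_num_merge_cand):
        return cands
    for comb_idx in range(num_orig * (num_orig - 1)):
        # combStop: the list is full, or the table is exhausted.
        if len(cands) == max_num_merge_cand or comb_idx >= len(COMB_IDX_TABLE):
            break
        l0_idx, l1_idx = COMB_IDX_TABLE[comb_idx]
        l0, l1 = cands[l0_idx], cands[l1_idx]
        # Combine only when L0 of l0Cand and L1 of l1Cand are used and
        # the pair is not degenerate (same picture and same vector).
        if l0["predFlagL0"] == 1 and l1["predFlagL1"] == 1 and (
                l0["pocL0"] != l1["pocL1"] or l0["mvL0"] != l1["mvL1"]):
            cands.append({
                "predFlagL0": 1, "predFlagL1": 1,
                "mvL0": l0["mvL0"], "mvL1": l1["mvL1"],
                "refIdxL0": l0["refIdxL0"], "refIdxL1": l1["refIdxL1"],
                "pocL0": l0["pocL0"], "pocL1": l1["pocL1"],
            })
    return cands
```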

运动向量继承(MVI)利用纹理图像与其相关联深度图像之间的运动特性相似性。具体来说,视频译码器可将MVI候选者包含于合并候选者列表中。对于深度图像中的给定PU,MVI候选者重新使用已经译码的对应纹理块(如果其可用)的运动向量及参考索引。图10为说明用于深度译码的运动向量继承候选者的实例导出的概念图。图10展示MVI候选者的导出过程的实例,其中对应纹理块选择为位于当前PU的中心的右下方的4×4块。Motion vector inheritance (MVI) exploits the similarity in motion characteristics between a texture image and its associated depth image. Specifically, a video coder may include MVI candidates in a merge candidate list. For a given PU in a depth image, an MVI candidate reuses the motion vector and reference index of the already coded corresponding texture block (if available). FIG10 is a conceptual diagram illustrating an example derivation of motion vector inheritance candidates for depth coding. FIG10 shows an example of the derivation process for MVI candidates, where the corresponding texture block is selected as a 4×4 block located to the lower right of the center of the current PU.

在一些实例中,具有整数精度的运动向量用于深度译码而运动向量的四分之一精度用于纹理译码。因此,可在用作MVI候选者之前缩放对应纹理块的运动向量。In some examples, motion vectors with integer precision are used for depth coding while motion vectors with quarter precision are used for texture coding.Thus, the motion vectors of the corresponding texture blocks may be scaled before being used as MVI candidates.
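The precision conversion can be sketched as follows (the (v+2)>>2 rounding mirrors the disparity-vector rounding used elsewhere in this description and is an assumption here, since the exact rounding rule is not specified in the text):

```python
def texture_mv_to_depth_mvi(mv_quarter):
    """Round a quarter-sample texture motion vector to the integer
    precision used for depth coding before it serves as an MVI
    candidate. The (v + 2) >> 2 rounding is an assumption."""
    return tuple((v + 2) >> 2 for v in mv_quarter)
```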

在MVI候选者产生的情况下,可如下建构用于深度视图的合并候选者列表:In case of MVI candidate generation, the merge candidate list for the depth view can be constructed as follows:

1.MVI插入1.MVI Insertion

通过上文所描述的程序导出MVI。如果MVI可用,则视频译码器可将MVI插入到合并列表中。The MVI is derived through the procedure described above. If the MVI is available, the video coder may insert the MVI into the merge list.

2.3D-HEVC中用于空间合并候选者的导出过程及IDMVC插入2.3D-HEVC Derivation Process for Spatial Merging Candidates and IDMVC Insertion

按以下次序检查空间相邻PU的运动信息：A1、B1、B0、A0或B2。视频译码器可通过以下程序执行受约束的精简：The motion information of spatially neighboring PUs is checked in the following order: A1, B1, B0, A0, or B2. The video coder may perform constrained pruning by the following procedure:

-如果A1及MVI具有相同运动向量及相同参考索引,则视频译码器不将A1插入到候选者列表中。If A1 and MVI have the same motion vector and the same reference index, the video coder does not insert A1 into the candidate list.

-如果B1及A1/MVI具有相同运动向量及相同参考索引,则视频译码器不将B1插入到候选者列表中。If B 1 and A 1 /MVI have the same motion vector and the same reference index, the video coder does not insert B 1 into the candidate list.

-如果B0可用,则视频译码器将B0添加到候选者列表。If B0 is available, the video coder adds B0 to the candidate list.

-如果A0可用,则视频译码器将A0添加到候选者列表。- If A 0 is available, the video coder adds A 0 to the candidate list.

-如果B2可用,则视频译码器将B2添加到候选者列表。If B 2 is available, the video coder adds B 2 to the candidate list.

3.用于时间合并候选者的导出过程3. Derivation process for temporal merging candidates

类似于其中利用共置PU的运动信息的HEVC中的时间合并候选者导出过程,然而,可如本发明中在其它地方关于用于3D-HEVC中的纹理译码的合并候选者列表建构所解释地改变时间合并候选者的目标参考图片索引,而不是将目标参考图片索引固定为0。Similar to the temporal merge candidate derivation process in HEVC where the motion information of a co-located PU is utilized, however, instead of fixing the target reference picture index to 0, the target reference picture index of the temporal merge candidate may be changed as explained elsewhere in this disclosure with respect to the merge candidate list construction for texture coding in 3D-HEVC.

4.3D-HEVC中用于组合双向预测性合并候选者的导出过程4. Derivation Process for Combining Bidirectional Predictive Merging Candidates in 3D-HEVC

如果从上文两个步骤导出的候选者的总数目小于候选者的最大数目,则除l0CandIdx及l1CandIdx的指定之外,视频译码器可执行如HEVC中定义的相同过程。图9的表格中定义combIdx、l0CandIdx及l1CandIdx当中的关系。If the total number of candidates derived from the above two steps is less than the maximum number of candidates, the video coder may perform the same process as defined in HEVC except for the specification of l0CandIdx and l1CandIdx. The relationship among combIdx, l0CandIdx, and l1CandIdx is defined in the table of FIG.

6.用于零运动向量合并候选者的导出过程6. Derivation process for zero motion vector merge candidates

-执行如HEVC中定义的相同程序。- Perform the same procedures as defined in HEVC.

如上文所指示,3D-HEVC提供视图间残余预测。高级残余预测(ARP)为视图间残余预测的一个形式。第4次JCT3V会议中采用将ARP应用于具有等于Part_2Nx2N的分割模式的CU,如张等人的“CE4:用于多视图译码的高级残余预测”(ITU-T SG 16 WP 3及ISO/IEC JTC1/SC 29/WG 11的3D视频译码扩展联合合作小组,第4次会议:韩国仁川,2013年4月20日到26日,文件JCT3V-D0177,到2013年12月17日为止可从http://phenix.it-sudparis.eu/jct3v/doc_end_user/documents/4_Incheon/wg11/JCT3V-D0177-v2.zip获得)中所提出(在下文中称为JCT3V-D0177)。As indicated above, 3D-HEVC provides inter-view residual prediction. Advanced residual prediction (ARP) is a form of inter-view residual prediction. The application of ARP to CUs with a partitioning mode equal to Part_2Nx2N was adopted in the 4th JCT3V meeting as proposed in Zhang et al., "CE4: Advanced residual prediction for multi-view coding" (Joint Collaboration Group on 3D Video Coding Extensions of ITU-T SG 16 WP 3 and ISO/IEC JTC1/SC 29/WG 11, 4th meeting: Incheon, South Korea, April 20-26, 2013, document JCT3V-D0177, available from http://phenix.it-sudparis.eu/jct3v/doc_end_user/documents/4_Incheon/wg11/JCT3V-D0177-v2.zip as of December 17, 2013) (hereinafter referred to as JCT3V-D0177).

图11说明多视图视频译码中的ARP的实例预测结构。如图11中所展示,视频译码器可在当前块预测中调用以下块。FIG11 illustrates an example prediction structure of ARP in multi-view video coding.As shown in FIG11 , a video coder may call the following blocks in current block prediction.

1.当前块:Curr1. Current Block: Curr

2.由视差向量(DV)导出的参考/基础视图中的参考块:Base。2. Reference block in the reference/base view derived from the disparity vector (DV): Base.

3.与由当前块的(时间)运动向量(表示为TMV)导出的块Curr在相同视图中的块:CurrTRef。3. A block in the same view as the block Curr derived from the (temporal) motion vector (denoted as TMV) of the current block: CurrTRef.

4.与由当前块的时间运动向量(TMV)导出的块Base在相同视图中的块:BaseTRef。相比于当前块通过TMV+DV的向量识别此块。4. A block in the same view as the block Base derived from the temporal motion vector (TMV) of the current block: BaseTRef. This block is identified by the vector of TMV+DV compared to the current block.

残余预测符表示为BaseTRef-Base，其中将减法运算应用于所表示像素阵列的每一像素。视频译码器可将加权因子w乘以残余预测符。因此，当前块的最终预测符可表示为：CurrTRef+w*(BaseTRef-Base)。The residual predictor is denoted as BaseTRef-Base, where the subtraction operation is applied to each pixel of the denoted pixel arrays. The video coder may multiply the residual predictor by a weighting factor w. Thus, the final predictor of the current block may be expressed as: CurrTRef+w*(BaseTRef-Base).
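The final predictor computation CurrTRef+w*(BaseTRef-Base) can be sketched per sample as follows (illustrative only; real implementations operate on clipped integer sample buffers, and the function name is an assumption):

```python
def arp_final_predictor(curr_t_ref, base_t_ref, base, w):
    """Final ARP predictor: CurrTRef + w * (BaseTRef - Base), per sample.

    Inputs are equally sized 2-D lists of samples; w is one of the ARP
    weighting factors 0, 0.5, or 1 described below.
    """
    assert w in (0, 0.5, 1)
    return [[int(c + w * (bt - b))
             for c, bt, b in zip(row_c, row_bt, row_b)]
            for row_c, row_bt, row_b in zip(curr_t_ref, base_t_ref, base)]
```

With w equal to 0 the predictor reduces to CurrTRef, which matches the statement that ARP is effectively disabled for a zero weighting factor.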

上文描述及图11两者都基于应用单向预测的假定。当扩展到双向预测的状况时,将上文步骤应用于每一参考图片列表。当针对一个参考图片列表当前块使用视图间参考图片(不同视图中)时,停用残余预测过程。Both the above description and FIG11 are based on the assumption that unidirectional prediction is used. When extended to the case of bidirectional prediction, the above steps are applied to each reference picture list. When an inter-view reference picture (in a different view) is used for the current block in a reference picture list, the residual prediction process is disabled.

所提出ARP在解码器侧处的主要程序可如下描述。首先,视频译码器可获得如3D-HEVC工作草案4中所指定的指向目标参考视图的视差向量。接着,在相同于当前图片的接入单元内的参考视图的图片中,视频译码器可使用视差向量以定位对应块。接下来,视频译码器可重新使用当前块的运动信息以导出参考块的运动信息。视频译码器可接着基于当前块的相同运动向量及参考块的参考视图中的所导出参考图片将运动补偿应用于对应块以导出残余块。图12展示当前块、对应块及运动补偿块当中的关系。换句话说,图12为说明当前块、参考块及运动补偿块当中的实例关系的概念图。将参考视图(V0)中具有与当前视图(Vm)的参考图片相同的POC(图片次序计数)值的参考图片选择为对应块的参考图片。接下来,视频译码器可将加权因子应用于残余块以确定加权残余块。视频译码器可将加权残余块的值添加到预测样本。The main procedures of the proposed ARP at the decoder side can be described as follows. First, the video decoder can obtain the disparity vector pointing to the target reference view as specified in 3D-HEVC Working Draft 4. Then, in the picture of the reference view within the same access unit as the current picture, the video decoder can use the disparity vector to locate the corresponding block. Next, the video decoder can reuse the motion information of the current block to derive the motion information of the reference block. The video decoder can then apply motion compensation to the corresponding block based on the same motion vector of the current block and the derived reference picture in the reference view of the reference block to derive the residual block. Figure 12 shows the relationship among the current block, the corresponding block, and the motion compensated block. In other words, Figure 12 is a conceptual diagram illustrating an example relationship among the current block, the reference block, and the motion compensated block. A reference picture in the reference view ( V0 ) with the same POC (Picture Order Count) value as the reference picture of the current view ( Vm ) is selected as the reference picture of the corresponding block. Next, the video decoder can apply a weighting factor to the residual block to determine a weighted residual block. The video coder may add the values of the weighted residual block to the prediction samples.

三个加权因子用于ARP中,即,0、0.5及1。视频编码器20可将带来当前CU的最小速率失真成本的加权因子选择为最终加权因子。视频编码器20可在CU层级处在位流中用信号通知对应加权因子索引(分别对应于加权因子0、1及0.5的0、1及2)。CU中的所有PU预测可共用相同加权因子。当加权因子等于0时,视频译码器并不使用ARP用于当前CU。Three weighting factors are used in ARP: 0, 0.5, and 1. Video encoder 20 may select the weighting factor that results in the minimum rate-distortion cost for the current CU as the final weighting factor. Video encoder 20 may signal the corresponding weighting factor index (0, 1, and 2 corresponding to weighting factors of 0, 1, and 0.5, respectively) in the bitstream at the CU level. All PU predictions in a CU may share the same weighting factor. When the weighting factor is equal to 0, the video coder does not use ARP for the current CU.

在张等人的“3D-CE4:用于多视图译码的高级残余预测”(ITU-T SG 16 WP 3及ISO/IEC JTC 1/SC 29/WG 11的3D视频译码扩展开发联合合作小组,第3次会议:瑞士日内瓦,2013年1月17日到23日,文件JCT3V-C0049,到2013年8月30日为止可从http://phenix.int-evry.fr/jct3v/doc_end_user/documents/3_Geneva/wg11/JCT3V-C0049-v2.zip获得)(在下文中称为JCT3V-C0049)中,以非零加权因子译码的PU的参考图片可在块间不同。因此,视频译码器可需要从参考视图接入不同图片来产生对应块的运动补偿块(即,在图11的实例中为BaseTRef)。当加权因子不等于0时,视频译码器可在执行用于残余产生过程的运动补偿之前朝向固定图片缩放当前PU的经解码运动向量。在JCT3V-D0177中,如果每一参考图片列表来自相同视图,则固定图片定义为每一参考图片列表的第一参考图片。当经解码运动向量并不指向固定图片时,视频译码器可首先缩放经解码运动向量且接着使用经缩放运动向量以识别CurrTRef及BaseTRef。用于ARP的此参考图片可被称为目标ARP参考图片。In Zhang et al., "3D-CE4: Advanced residual prediction for multiview coding" (Joint Collaborative Group on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 3rd Meeting: Geneva, Switzerland, January 17-23, 2013, document JCT3V-C0049, available from http://phenix.int-evry.fr/jct3v/doc_end_user/documents/3_Geneva/wg11/JCT3V-C0049-v2.zip as of August 30, 2013) (hereinafter referred to as JCT3V-C0049), the reference pictures of a PU coded with a non-zero weighting factor may differ between blocks. Therefore, the video coder may need to access different pictures from the reference view to generate the motion compensation block (i.e., BaseTRef in the example of FIG. 11 ) for the corresponding block. When the weighting factor is not equal to 0, the video coder may scale the decoded motion vector of the current PU toward a fixed picture before performing motion compensation for the residual generation process. In JCT3V-D0177, if each reference picture list is from the same view, a fixed picture is defined as the first reference picture in each reference picture list. When the decoded motion vector does not point to a fixed picture, the video coder may first scale the decoded motion vector and then use the scaled motion vector to identify CurrTRef and BaseTRef. This reference picture used for ARP may be referred to as a target ARP reference picture.

在JCT3V-C0049中，视频译码器可在对应块及对应块的预测块的内插过程期间应用双线性滤波器。然而对于非基础视图中的当前PU的预测块，视频译码器可应用常规8/4抽头滤波器。当应用ARP时，JCT3V-D0177始终提出利用双线性而不论块是在基础视图还是非基础视图中。In JCT3V-C0049, the video coder may apply a bilinear filter during the interpolation process of the corresponding block and the prediction block of the corresponding block. However, for the prediction block of the current PU in a non-base view, the video coder may apply a conventional 8/4-tap filter. JCT3V-D0177 proposes to always use bilinear filtering when ARP is applied, regardless of whether the block is in a base view or a non-base view.

在ARP中,视频译码器可通过从NBDV导出过程传回的视图次序索引识别参考视图。在ARP的一些设计中,当一个PU在一个参考图片列表中的参考图片来自当前视图的不同视图时,对于此参考图片列表停用ARP。In ARP, the video coder can identify the reference view by the view order index returned from the NBDV derivation process. In some designs of ARP, when a PU's reference pictures in a reference picture list are from a different view than the current view, ARP is disabled for this reference picture list.

在2013年6月28日申请的美国临时专利申请案61/840,400及2013年7月18日申请的美国临时专利申请案61/847,942(其中的每一者的全部内容以引用方式并入)中,当译码深度图片时,由来自当前块的相邻样本的所估计深度值转换视差向量。此外,可(例如)通过接入由视差向量所识别的基础视图的参考块导出较多合并候选者。In U.S. Provisional Patent Application No. 61/840,400, filed on June 28, 2013, and U.S. Provisional Patent Application No. 61/847,942, filed on July 18, 2013 (each of which is incorporated by reference in its entirety), when coding a depth picture, a disparity vector is converted from an estimated depth value of neighboring samples of the current block. Furthermore, more merge candidates can be derived, for example, by accessing a reference block of a base view identified by the disparity vector.

在3D-HEVC中,视频译码器可通过两个步骤识别参考4×4块。第一步骤为通过视差运动向量识别像素。第二步骤为获得4×4块(通过分别对应于RefPicList0或RefPicList1的独特运动信息集合)并利用运动信息建立合并候选者。In 3D-HEVC, the video coder can identify the reference 4×4 block in two steps. The first step is to identify the pixel using the disparity motion vector. The second step is to obtain the 4×4 block (using the unique motion information set corresponding to RefPicList0 or RefPicList1, respectively) and use the motion information to create a merge candidate.

可如下识别参考视图中的像素(xRef,yRef):A pixel (xRef, yRef) in the reference view can be identified as follows:

xRef=Clip3(0,PicWidthInSamplesL-1,xP+((nPSW-1)>>1)+((mvDisp[0]+2)>>2)) (H-124)

yRef=Clip3(0,PicHeightInSamplesL-1,yP+((nPSH-1)>>1)+((mvDisp[1]+2)>>2)) (H-125)

其中(xP,yP)为当前PU的左上方样本的坐标,mvDisp为视差向量且nPSWxnPSH为当前PU的大小,且PicWidthInSamplesL及PicHeightInSamplesL定义参考视图中的图片的分辨率(相同于当前视图)。Where (xP, yP) are the coordinates of the top left sample of the current PU, mvDisp is the disparity vector and nPSWxnPSH is the size of the current PU, and PicWidthInSamplesL and PicHeightInSamplesL define the resolution of the picture in the reference view (same as the current view).
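Equations (H-124) and (H-125) can be sketched as follows (illustrative; the function names are assumptions, and (mvDisp + 2) >> 2 rounds the quarter-sample disparity vector to integer-sample precision):

```python
def clip3(lo, hi, v):
    # Clip3(x, y, z): clamp z to the range [x, y].
    return lo if v < lo else hi if v > hi else v

def reference_pixel(xP, yP, nPSW, nPSH, mvDisp, pic_w, pic_h):
    """Locate the reference-view pixel (xRef, yRef) per (H-124)/(H-125).

    (xP, yP): top-left sample of the current PU; nPSW x nPSH: PU size;
    mvDisp: quarter-sample disparity vector; pic_w x pic_h: resolution
    of the reference-view picture.
    """
    xRef = clip3(0, pic_w - 1, xP + ((nPSW - 1) >> 1) + ((mvDisp[0] + 2) >> 2))
    yRef = clip3(0, pic_h - 1, yP + ((nPSH - 1) >> 1) + ((mvDisp[1] + 2) >> 2))
    return xRef, yRef
```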

安(An)等人的“3D-CE3.h相关:子PU层级的视图间运动预测”(ITU-T SG 16 WP 3及ISO/IEC JTC 1/SC 29/WG 11的3D视频译码扩展联合合作小组,第5次会议:奥地利维也纳,2013年7月27日到8月2日,文件JCT3V-E0184(在下文中称为“JCT3V-E0184”),到2013年12月17日为止可从http://phenix.it-sudparis.eu/jct2/doc_end_user/documents/5_Vienna/wg11/JCT3V-E0184-v2.zip获得)提出用于时间视图间合并候选者(即,从参考视图中的参考块导出的候选者)的子PU层级视图间运动预测方法。本发明中在其它地方描述视图间运动预测的基本概念。在视图间运动预测的基本概念中,仅参考块的运动信息用于相依视图中的当前PU。然而,当前PU可对应于参考视图中的参考区域(具有相同于由当前PU的视差向量所识别的当前PU的大小)且参考区域可具有充裕运动信息。因此,如图13中所展示地提出子PU层级视图间运动预测(SPIVMP)方法。换句话说,图13为说明子PU视图间运动预测的实例的概念图。An et al., “3D-CE3.h related: Inter-view motion prediction at the sub-PU level” (Joint Collaboration Group on 3D Video Coding Extensions of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 5th Meeting: Vienna, Austria, July 27–August 2, 2013, document JCT3V-E0184 (hereinafter “JCT3V-E0184”), available as of December 17, 2013 at http://phenix.it-sudparis.eu/jct2/doc_end_user/documents/5_Vienna/wg11/JCT3V-E0184-v2.zip) proposes a sub-PU level inter-view motion prediction method for temporal inter-view merging candidates (i.e., candidates derived from reference blocks in a reference view). The basic concepts of inter-view motion prediction are described elsewhere in this disclosure. In the basic concept of inter-view motion prediction, only the motion information of the reference block is used for the current PU in the dependent view. However, the current PU may correspond to a reference area in the reference view (having the same size as the current PU identified by the disparity vector of the current PU), and the reference area may have sufficient motion information. Therefore, as shown in FIG13 , a sub-PU-level inter-view motion prediction (SPIVMP) method is proposed. In other words, FIG13 is a conceptual diagram illustrating an example of sub-PU inter-view motion prediction.

时间视图间合并候选者可如下导出。在时间视图间合并候选者的导出过程中,所指派子PU大小可表示为N×N。可应用不同子PU块大小,例如4×4、8×8及16×16。Temporal inter-view merge candidates may be derived as follows: In the derivation process of temporal inter-view merge candidates, the assigned sub-PU size may be expressed as N×N. Different sub-PU block sizes may be applied, such as 4×4, 8×8, and 16×16.

在时间视图间合并候选者的导出过程中,视频译码器可首先将当前PU划分成多个子PU,其中的每一者具有比当前PU小的大小。当前PU的大小可由nPSW×nPSH表示。子PU的大小可由nPSWsub×nPSHSub表示。nPSWsub及nPSHsub可如以下方程式中所展示地相关于nPSW及nSPH。In the derivation process of temporal inter-view merge candidates, the video coder may first divide the current PU into multiple sub-PUs, each of which has a smaller size than the current PU. The size of the current PU may be represented by nPSW×nPSH. The size of the sub-PU may be represented by nPSWsub×nPSHSub. nPSWsub and nPSHsub may be related to nPSW and nSPH as shown in the following equations.

nPSWsub=min(N,nPSW)

nPSHSub=min(N,nPSH)
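The sub-PU partitioning above can be sketched as follows (illustrative; the function name and the raster-order enumeration of sub-PU origins are assumptions consistent with the description):

```python
def sub_pu_partition(N, nPSW, nPSH):
    """Split an nPSW x nPSH PU into sub-PUs of assigned size N.

    Returns (nPSWsub, nPSHSub) and the top-left corner of every sub-PU
    in raster scan order.
    """
    nPSWsub, nPSHSub = min(N, nPSW), min(N, nPSH)
    origins = [(x, y)
               for y in range(0, nPSH, nPSHSub)
               for x in range(0, nPSW, nPSWsub)]
    return nPSWsub, nPSHSub, origins
```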

另外,对于每一参考图片列表,视频译码器可将默认运动向量tmvLX设定为(0,0)且可将参考索引refLX设定为-1(其中X为0或1)。Additionally, for each reference picture list, the video coder may set the default motion vector tmvLX to (0, 0) and may set the reference index refLX to -1 (where X is 0 or 1).

此外,当确定时间视图间合并候选者时,视频译码器可按光栅扫描次序针对每一子PU应用以下动作。首先,视频译码器可将视差向量添加到当前子PU(其左上方样本位置为(xPSub,yPSub))的中部位置以获得参考样本位置(xRefSub,yRefSub)。视频译码器可使用以下方程式确定(xRefSub,yRefSub):Furthermore, when determining a temporal inter-view merge candidate, the video coder may apply the following actions for each sub-PU in raster scan order. First, the video coder may add the disparity vector to the middle position of the current sub-PU (whose top-left sample position is (xPSub, yPSub)) to obtain a reference sample position (xRefSub, yRefSub). The video coder may determine (xRefSub, yRefSub) using the following equation:

xRefSub=Clip3(0,PicWidthInSamplesL-1,xPSub+nPSWsub/2+((mvDisp[0]+2)>>2))

yRefSub=Clip3(0,PicHeightInSamplesL-1,yPSub+nPSHSub/2+((mvDisp[1]+2)>>2))

视频译码器可使用覆盖(xRefSub,yRefSub)的参考视图中的块作为当前子PU的参考块。The video coder may use the block in the reference view that overlays (xRefSub, yRefSub) as a reference block for the current sub-PU.
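The reference sample position computation for a sub-PU can be sketched as follows (illustrative; the function names are assumptions):

```python
def clip3(lo, hi, v):
    # Clip3(x, y, z): clamp z to the range [x, y].
    return lo if v < lo else hi if v > hi else v

def sub_pu_ref_sample(xPSub, yPSub, nPSWsub, nPSHSub, mvDisp, pic_w, pic_h):
    """Reference sample position (xRefSub, yRefSub) for one sub-PU.

    The quarter-sample disparity vector, rounded to integer precision,
    is added to the centre of the sub-PU, and the result is clipped to
    the reference picture.
    """
    xRefSub = clip3(0, pic_w - 1, xPSub + nPSWsub // 2 + ((mvDisp[0] + 2) >> 2))
    yRefSub = clip3(0, pic_h - 1, yPSub + nPSHSub // 2 + ((mvDisp[1] + 2) >> 2))
    return xRefSub, yRefSub
```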

如果使用时间运动向量译码所识别参考块且如果refL0及refL1两者都等于-1且当前子PU并非光栅扫描次序中的第一者,则参考块的运动信息由所有先前子PU继承。此外,如果使用时间运动向量译码所识别参考块,则相关联运动参数可用作当前子PU的运动参数。另外,如果使用时间运动向量译码所识别参考块,则视频译码器可将tmvLX及refLX更新为当前子PU的运动信息。否则,如果参考块经帧内译码,则视频译码器可将当前子PU的运动信息设定为tmvLX及refLX。If the identified reference block is coded using temporal motion vectors and if both refL0 and refL1 are equal to -1 and the current sub-PU is not the first in raster scan order, the motion information of the reference block is inherited by all previous sub-PUs. Furthermore, if the identified reference block is coded using temporal motion vectors, the associated motion parameters may be used as the motion parameters of the current sub-PU. Additionally, if the identified reference block is coded using temporal motion vectors, the video coder may update tmvLX and refLX to the motion information of the current sub-PU. Otherwise, if the reference block is intra-coded, the video coder may set the motion information of the current sub-PU to tmvLX and refLX.

如JCT3V-E0184中所提出的子PU运动预测方法具有一或多个问题。举例来说,当参考视图中的一个子PU的对应块经帧内译码(即,其运动信息不可用)时,将光栅扫描次序中的最近子PU的运动信息复制到当前子PU。因此,如果光栅扫描次序中的前N个子PU的对应块经帧内译码且第(N+1)子PU的对应块经帧间译码,则将设定到第(N+1)子PU的相关运动信息复制到前N个子PU,此情况导致额外复杂性及译码延迟。The sub-PU motion prediction method proposed in JCT3V-E0184 has one or more problems. For example, when the corresponding block of a sub-PU in the reference view is intra-coded (i.e., its motion information is not available), the motion information of the most recent sub-PU in raster scan order is copied to the current sub-PU. Therefore, if the corresponding blocks of the first N sub-PUs in raster scan order are intra-coded and the corresponding block of the (N+1)th sub-PU is inter-coded, the relevant motion information set to the (N+1)th sub-PU is copied to the first N sub-PUs, which causes additional complexity and decoding delay.

本发明的一或多个实例涉及视图间运动预测。举例来说,本发明的一或多个实例可适用于当合并索引指示视图间运动预测的上下文中。One or more embodiments of the present disclosure relate to inter-view motion prediction. For example, one or more embodiments of the present disclosure may be applicable in the context of when a merge index indicates inter-view motion prediction.

举例来说,在一个实例中,当视频译码器以子PU方式使用视图间运动预测时,如果当前子PU的运动信息不可用,则视频译码器可从默认运动向量及参考索引复制运动信息。举例来说,如果当前子PU的运动信息不可用,则视频译码器可从默认运动向量及默认参考索引复制当前子PU的运动信息。在此实例中,默认运动参数对于多个子PU中的每一子PU相同,不论随后是否存在具有使用运动补偿预测译码的参考块的子PU都是如此。For example, in one example, when the video coder uses inter-view motion prediction in a sub-PU manner, if the motion information of the current sub-PU is not available, the video coder may copy the motion information from a default motion vector and a reference index. For example, if the motion information of the current sub-PU is not available, the video coder may copy the motion information of the current sub-PU from a default motion vector and a default reference index. In this example, the default motion parameters are the same for each sub-PU in the plurality of sub-PUs, regardless of whether there is a subsequent sub-PU with a reference block coded using motion compensated prediction.

在一些实例中,视频译码器(例如,视频编码器20或视频解码器30)可将当前PU划分成多个子PU。当前PU在当前图片中。另外,视频译码器可确定默认运动参数。默认运动参数可包含一或多个默认运动向量及一或多个默认参考索引。另外,视频译码器可以特定次序处理来自多个子PU的子PU。对于来自所述多个子PU的每一相应子PU,视频译码器可确定相应子PU的参考块。In some examples, a video coder (e.g., video encoder 20 or video decoder 30) may divide a current PU into a plurality of sub-PUs. The current PU is in a current picture. In addition, the video coder may determine default motion parameters. The default motion parameters may include one or more default motion vectors and one or more default reference indexes. In addition, the video coder may process the sub-PUs from the plurality of sub-PUs in a particular order. For each respective sub-PU from the plurality of sub-PUs, the video coder may determine a reference block for the respective sub-PU.

在一些实例中,参考图片可在不同于当前图片的视图中,且视频译码器可基于当前PU的视差向量确定参考图片中的参考样本位置。在此类实例中,相应子PU的参考块可覆盖参考样本位置。在其它实例中,当前图片为深度视图分量且参考图片为在相同于当前图片的视图及接入单元中的纹理视图分量。在此类实例中,视频译码器可确定相应子PU的参考块为与相应子PU共置的参考图片的PU。In some examples, the reference picture may be in a different view than the current picture, and the video coder may determine the reference sample location in the reference picture based on the disparity vector of the current PU. In such examples, the reference block of the corresponding sub-PU may cover the reference sample location. In other examples, the current picture is a depth view component and the reference picture is a texture view component in the same view and access unit as the current picture. In such examples, the video coder may determine that the reference block of the corresponding sub-PU is a PU of the reference picture that is co-located with the corresponding sub-PU.

此外,对于来自多个子PU的每一相应子PU(或多个子PU的子集),如果使用运动补偿预测译码相应子PU的参考块,则视频译码器可基于相应子PU的参考块的运动参数设定相应子PU的运动参数。另一方面,如果不使用运动补偿预测译码相应子PU的参考块,则视频译码器可将相应子PU的运动参数设定为默认运动参数。Furthermore, for each corresponding sub-PU from the plurality of sub-PUs (or a subset of the plurality of sub-PUs), if the reference block of the corresponding sub-PU is coded using motion compensated prediction, the video coder may set the motion parameters of the corresponding sub-PU based on the motion parameters of the reference block of the corresponding sub-PU. On the other hand, if the reference block of the corresponding sub-PU is not coded using motion compensated prediction, the video coder may set the motion parameters of the corresponding sub-PU to default motion parameters.

根据本发明的一或多个实例,如果不使用运动补偿预测译码相应子PU的参考块,则响应于后续确定使用运动补偿预测译码次序中的任何稍后子PU的参考块,不设定相应子PU的运动参数。因此,在不使用运动补偿预测译码子PU中的至少一者的参考块的情况中,视频译码器可不需要前向扫描以找到其对应参考块经使用运动补偿预测译码的子PU。同样地,视频译码器可不需要延迟确定相应子PU的运动参数,直到视频译码器在子PU的处理期间遇到其对应参考块经使用运动补偿预测译码的PU为止。有利地,此情况可降低复杂性及译码延迟。According to one or more examples of the present disclosure, if the reference block of a corresponding sub-PU is not coded using motion compensated prediction, then in response to a subsequent determination that the reference block of any later sub-PU in the coding order is coded using motion compensated prediction, the motion parameters of the corresponding sub-PU are not set. Therefore, in the event that the reference block of at least one of the sub-PUs is not coded using motion compensated prediction, the video coder may not need to scan forward to find a sub-PU whose corresponding reference block is coded using motion compensated prediction. Similarly, the video coder may not need to delay determining the motion parameters of the corresponding sub-PU until the video coder encounters a PU whose corresponding reference block is coded using motion compensated prediction during processing of the sub-PU. Advantageously, this can reduce complexity and coding latency.
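The per-sub-PU rule above can be sketched in simplified Python. This is a hypothetical model for illustration only; the sub-PU dictionaries and the `is_mcp_coded`/`get_motion_params` helpers are assumptions, not part of any codec implementation:

```python
def assign_sub_pu_motion(sub_pus, default_params, is_mcp_coded, get_motion_params):
    """Assign motion parameters to each sub-PU in a single pass.

    Sub-PUs whose reference block is not coded with motion compensated
    prediction (MCP) immediately fall back to the shared default
    parameters, so no forward scan or deferred assignment is needed.
    """
    assigned = {}
    for sub_pu in sub_pus:  # processed in the particular (e.g., raster scan) order
        ref_block = sub_pu["ref_block"]
        if is_mcp_coded(ref_block):
            # Inherit the motion parameters of the sub-PU's reference block.
            assigned[sub_pu["idx"]] = get_motion_params(ref_block)
        else:
            # Fall back to the defaults at once; this decision is never
            # revisited when a later sub-PU turns out to be MCP-coded.
            assigned[sub_pu["idx"]] = default_params
    return assigned

# Toy example: the first sub-PU's reference block is intra coded.
sub_pus = [
    {"idx": 0, "ref_block": {"mcp": False}},
    {"idx": 1, "ref_block": {"mcp": True, "mv": (3, -1), "ref_idx": 0}},
]
default = {"mv": (0, 0), "ref_idx": 0}
result = assign_sub_pu_motion(
    sub_pus, default,
    is_mcp_coded=lambda b: b["mcp"],
    get_motion_params=lambda b: {"mv": b["mv"], "ref_idx": b["ref_idx"]},
)
```

Because every non-MCP sub-PU receives the same default parameters, the loop runs in one pass regardless of where the MCP-coded reference blocks fall in the order.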

视频译码器可将候选者包含于当前PU的候选者列表中,其中候选者基于多个子PU的运动参数。在一些实例中,候选者列表为合并候选者列表。此外,如果视频译码器为视频编码器(例如,视频编码器20),则视频编码器可在位流中用信号通知指示候选者列表中的所选择候选者的语法元素(例如,merge_idx)。如果视频译码器为视频解码器(例如,视频解码器30),则视频解码器可从位流获得指示候选者列表中的所选择候选者的语法元素(例如,merge_idx)。视频解码器可使用所选择候选者的运动参数以重建构当前PU的预测性块。The video coder may include a candidate in a candidate list for the current PU, where the candidate is based on motion parameters of multiple sub-PUs. In some examples, the candidate list is a merge candidate list. Furthermore, if the video coder is a video encoder (e.g., video encoder 20), the video encoder may signal a syntax element (e.g., merge_idx) in the bitstream indicating the selected candidate in the candidate list. If the video coder is a video decoder (e.g., video decoder 30), the video decoder may obtain the syntax element (e.g., merge_idx) indicating the selected candidate in the candidate list from the bitstream. The video decoder may use the motion parameters of the selected candidate to reconstruct the predictive block of the current PU.

本发明中所描述的技术中的至少一些可单独地或彼此结合实施。At least some of the techniques described in this disclosure may be implemented alone or in combination with each other.

图14为说明可实施本发明的技术的实例视频编码器20的框图。图14是出于解释目的而提供,且不应被视为将技术限制为本发明中所大致例示及描述者。出于解释目的,本发明描述在HEVC译码的上下文中的视频编码器20。然而,本发明的技术可适用于其它译码标准或方法。FIG14 is a block diagram illustrating an example video encoder 20 that may implement the techniques of this disclosure. FIG14 is provided for explanation purposes and should not be construed as limiting the techniques to those generally illustrated and described in this disclosure. For explanation purposes, this disclosure describes video encoder 20 in the context of HEVC coding. However, the techniques of this disclosure may be applicable to other coding standards or methods.

在图14的实例中,视频编码器20包含预测处理单元100、残余产生单元102、变换处理单元104、量化单元106、反量化单元108、反变换处理单元110、重建构单元112、滤波器单元114、经解码图片缓冲器116及熵编码单元118。预测处理单元100包含帧间预测处理单元120及帧内预测处理单元126。帧间预测处理单元120包含运动估计单元122及运动补偿单元124。在其它实例中,视频编码器20可包含较多、较少或不同功能组件。14 , video encoder 20 includes a prediction processing unit 100, a residual generation unit 102, a transform processing unit 104, a quantization unit 106, an inverse quantization unit 108, an inverse transform processing unit 110, a reconstruction unit 112, a filter unit 114, a decoded picture buffer 116, and an entropy encoding unit 118. Prediction processing unit 100 includes an inter-prediction processing unit 120 and an intra-prediction processing unit 126. Inter-prediction processing unit 120 includes a motion estimation unit 122 and a motion compensation unit 124. In other examples, video encoder 20 may include more, fewer, or different functional components.

视频编码器20可接收视频数据。视频编码器20可编码视频数据的图片的切片中的每一CTU。CTU中的每一者可与图片的大小相等的明度译码树块(CTB)及对应CTB相关联。作为编码CTU的部分,预测处理单元100可执行四叉树分割以将CTU的CTB划分成逐渐更小的块。较小块可为CU的译码块。举例来说,预测处理单元100可将与CTU相关联的CTB分割成四个大小相等的子块,将子块中的一或多者分割成四个大小相等的子子块等等。Video encoder 20 may receive video data. Video encoder 20 may encode each CTU in a slice of a picture of the video data. Each of the CTUs may be associated with equally sized luma coding tree blocks (CTBs) and corresponding CTBs of the picture. As part of encoding the CTU, prediction processing unit 100 may perform quadtree partitioning to divide the CTBs of the CTU into progressively smaller blocks. The smaller blocks may be coding blocks of a CU. For example, prediction processing unit 100 may partition the CTB associated with the CTU into four equally sized sub-blocks, partition one or more of the sub-blocks into four equally sized sub-sub-blocks, and so on.
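The recursive quadtree splitting described above, where a CTB is repeatedly divided into four equally sized sub-blocks, can be sketched as follows. The `should_split` callback is a stand-in for whatever rate-distortion decision a real encoder would make:

```python
def split_quadtree(x, y, size, should_split, min_size=8):
    """Recursively split the square block at (x, y) into four equal
    sub-blocks while should_split() approves and min_size allows.
    Returns the resulting leaf blocks as (x, y, size) tuples."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    split_quadtree(x + dx, y + dy, half, should_split, min_size)
                )
        return leaves
    return [(x, y, size)]

# Split a 64x64 CTB once, yielding four 32x32 coding blocks.
leaves = split_quadtree(0, 0, 64, should_split=lambda x, y, s: s == 64)
```

With a richer `should_split`, the same function produces mixed-size leaves, mirroring how a CTB decomposes into CUs of different depths.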

视频编码器20可编码CTU的CU以产生CU的经编码表示(即,经译码CU)。作为编码CU的部分,预测处理单元100可在CU的一或多个PU当中分割与CU相关联的译码块。因此,每一PU可与明度预测块及对应色度预测块相关联。视频编码器20及视频解码器30可支持具有各种大小的PU。如上文所指示,CU的大小可指CU的明度译码块的大小且PU的大小可指PU的明度预测块的大小。假定特定CU的大小是2N×2N,则视频编码器20及视频解码器30可支持用于帧内预测的2N×2N或N×N的PU大小,及用于帧间预测的2N×2N、2N×N、N×2N、N×N或类似大小的对称PU大小。视频编码器20及视频解码器30还可支持用于帧间预测的2N×nU、2N×nD、nL×2N及nR×2N的PU大小的不对称分割。Video encoder 20 may encode a CU of a CTU to generate an encoded representation of the CU (i.e., a coded CU). As part of encoding a CU, prediction processing unit 100 may partition the coding blocks associated with the CU among one or more PUs of the CU. Thus, each PU may be associated with a luma prediction block and a corresponding chroma prediction block. Video encoder 20 and video decoder 30 may support PUs of various sizes. As indicated above, the size of a CU may refer to the size of the luma coding block of the CU and the size of a PU may refer to the size of the luma prediction block of the PU. Assuming the size of a particular CU is 2N×2N, video encoder 20 and video decoder 30 may support PU sizes of 2N×2N or N×N for intra-prediction, and symmetric PU sizes of 2N×2N, 2N×N, N×2N, N×N, or similar sizes for inter-prediction. Video encoder 20 and video decoder 30 may also support asymmetric partitioning of PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N for inter-prediction.
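For a 2N×2N CU, the symmetric and asymmetric (AMP) partitionings listed above can be enumerated directly; in every mode the partition areas must add up to the CU area. Sizes below are width×height, with the AMP split at one quarter of the CU dimension as in HEVC:

```python
def pu_partitions(n):
    """Enumerate PU partitionings of a 2Nx2N CU as lists of (width, height)."""
    two_n = 2 * n
    symmetric = {
        "2Nx2N": [(two_n, two_n)],
        "2NxN": [(two_n, n)] * 2,
        "Nx2N": [(n, two_n)] * 2,
        "NxN": [(n, n)] * 4,
    }
    quarter = two_n // 4  # AMP splits at one quarter of the CU dimension
    asymmetric = {
        "2NxnU": [(two_n, quarter), (two_n, two_n - quarter)],
        "2NxnD": [(two_n, two_n - quarter), (two_n, quarter)],
        "nLx2N": [(quarter, two_n), (two_n - quarter, two_n)],
        "nRx2N": [(two_n - quarter, two_n), (quarter, two_n)],
    }
    return symmetric, asymmetric

sym, asym = pu_partitions(16)  # a 32x32 CU
```

A quick sanity check is that each mode's partitions tile the full 32×32 CU, i.e. their areas sum to 1024 samples.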

帧间预测处理单元120可通过对CU的每一PU执行帧间预测来产生用于PU的预测性数据。PU的预测性数据可包含PU的预测性块及PU的运动信息。帧间预测处理单元120可取决于PU是在I切片、P切片或B切片中而对CU的PU执行不同操作。在I切片中,所有PU都经帧内预测。因此,如果PU在I切片中,则帧间预测处理单元120并不对PU执行帧间预测。Inter-prediction processing unit 120 may generate predictive data for each PU of a CU by performing inter prediction on the PU. The predictive data for the PU may include the predictive block of the PU and the motion information of the PU. Inter-prediction processing unit 120 may perform different operations on a PU of a CU depending on whether the PU is in an I slice, a P slice, or a B slice. In an I slice, all PUs are intra predicted. Therefore, if the PU is in an I slice, inter-prediction processing unit 120 does not perform inter prediction on the PU.

如果PU在P切片中,则运动估计单元122可针对PU的参考区搜索参考图片列表(例如,“RefPicList0”)中的参考图片。PU的参考区可为参考图片内含有最接近地对应于PU的预测块的样本的区。运动估计单元122可产生指示含有PU的参考区的参考图片在RefPicList0中的位置的参考索引。另外,运动估计单元122可产生指示PU的译码块与相关联于参考区的参考位置之间的空间移位的运动向量。举例来说,运动向量可为提供从当前图片中的坐标到参考图片中的坐标的偏移的二维向量。运动估计单元122可将参考索引及运动向量输出为PU的运动信息。运动补偿单元124可基于由PU的运动向量所指示的参考位置处的实际或内插样本产生PU的预测性块。If the PU is in a P slice, motion estimation unit 122 may search a reference picture list (e.g., "RefPicList0") for the PU's reference region. The PU's reference region may be a region within a reference picture that contains samples that most closely correspond to the PU's prediction block. Motion estimation unit 122 may generate a reference index that indicates the position of the reference picture in RefPicList0 that contains the PU's reference region. In addition, motion estimation unit 122 may generate a motion vector that indicates the spatial displacement between the PU's coding block and a reference position associated with the reference region. For example, a motion vector may be a two-dimensional vector that provides an offset from coordinates in the current picture to coordinates in a reference picture. Motion estimation unit 122 may output the reference index and motion vector as the motion information for the PU. Motion compensation unit 124 may generate the PU's predictive block based on actual or interpolated samples at the reference position indicated by the PU's motion vector.
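Motion estimation as described, searching a reference picture for the region that most closely matches the prediction block and reporting the offset as a two-dimensional motion vector, can be illustrated with a toy exhaustive sum-of-absolute-differences (SAD) search. Real encoders use fast search patterns and sub-pel interpolation; this is only a sketch:

```python
def full_search(cur, ref, bx, by, bsize, search_range):
    """Return the motion vector (dx, dy) minimizing the SAD between the
    current block at (bx, by) and candidate blocks in the reference frame."""
    h, w = len(ref), len(ref[0])
    best, best_sad = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + bsize > w or ry + bsize > h:
                continue  # candidate block falls outside the reference frame
            sad = sum(
                abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
                for j in range(bsize) for i in range(bsize)
            )
            if sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best, best_sad

# A 2x2 bright patch shifted left by one sample between frames.
ref = [[0, 9, 9, 0],
       [0, 9, 9, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
cur = [[9, 9, 0, 0],
       [9, 9, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
mv, sad = full_search(cur, ref, 0, 0, 2, 1)
```

The returned offset is the motion vector the motion estimation unit would output alongside the reference index.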

如果PU在B切片中,则运动估计单元122可对PU执行单向预测或双向预测。为对PU执行单向预测,运动估计单元122可搜索RefPicList0的参考图片,或用于PU的参考区的第二参考图片列表(“RefPicList1”)。运动估计单元122可将指示含有参考区的参考图片的RefPicList0或RefPicList1中的位置的参考索引、指示PU的预测块与相关联于参考区的参考位置之间的空间移位的运动向量及指示参考图片是在RefPicList0中还是在RefPicList1中的一或多个预测方向指示符输出为PU的运动信息。运动补偿单元124可至少部分基于由PU的运动向量所指示的参考位置处的实际或内插样本来产生PU的预测性块。If the PU is in a B slice, motion estimation unit 122 may perform uni-prediction or bi-prediction on the PU. To perform uni-prediction on the PU, motion estimation unit 122 may search the reference picture of RefPicList0 or a second reference picture list ("RefPicList1") for the reference region of the PU. Motion estimation unit 122 may output, as the motion information of the PU, a reference index indicating a position in RefPicList0 or RefPicList1 of the reference picture containing the reference region, a motion vector indicating a spatial displacement between the prediction block of the PU and the reference position associated with the reference region, and one or more prediction direction indicators indicating whether the reference picture is in RefPicList0 or RefPicList1. Motion compensation unit 124 may generate a predictive block for the PU based at least in part on actual or interpolated samples at the reference position indicated by the motion vector of the PU.

为对PU执行双向帧间预测,运动估计单元122可针对PU的参考区搜索RefPicList0中的参考图片,且还可针对PU的另一参考区搜索RefPicList1中的参考图片。运动估计单元122可产生指示含有参考区的参考图片在RefPicList0及RefPicList1中的位置的参考索引。另外,运动估计单元122可产生指示与参考区相关联的参考位置与PU的预测块之间的空间移位的运动向量。PU的运动信息可包含PU的参考索引及运动向量。运动补偿单元124可至少部分基于由PU的运动向量所指示的参考位置处的实际或内插样本产生PU的预测性块。To perform bidirectional inter prediction for a PU, motion estimation unit 122 may search a reference picture in RefPicList0 for the PU's reference region and may also search a reference picture in RefPicList1 for another reference region of the PU. Motion estimation unit 122 may generate a reference index that indicates the position of the reference picture containing the reference region in RefPicList0 and RefPicList1. In addition, motion estimation unit 122 may generate a motion vector that indicates the spatial displacement between the reference location associated with the reference region and the prediction block of the PU. The motion information of the PU may include the reference index and motion vector of the PU. Motion compensation unit 124 may generate a predictive block for the PU based at least in part on actual or interpolated samples at the reference location indicated by the motion vector of the PU.

在一些实例中,运动估计单元122可产生PU的合并候选者列表。作为产生合并候选者列表的部分,运动估计单元122可确定IPMVC及/或纹理合并候选者。当确定IPMVC及/或纹理合并候选者时,运动估计单元122可将PU分割成子PU并根据特定次序处理子PU以确定子PU的运动参数。根据本发明的一或多种技术,如果不使用运动补偿预测译码相应子PU的参考块,则运动估计单元122响应于后续确定使用运动补偿预测译码特定次序中的任何稍后子PU的参考块,并不设定相应子PU的运动参数。实际上,如果不使用运动补偿预测译码相应子PU的参考块,则运动估计单元122可将相应子PU的运动参数设定成默认运动参数。如果IPMVC或纹理合并候选者为合并候选者列表中的所选择合并候选者,则运动补偿单元124可基于由IPMVC或纹理合并候选者所指定的运动参数确定相应PU的预测性块。In some examples, motion estimation unit 122 may generate a merge candidate list for a PU. As part of generating the merge candidate list, motion estimation unit 122 may determine IPMVC and/or texture merge candidates. When determining the IPMVC and/or texture merge candidates, motion estimation unit 122 may partition the PU into sub-PUs and process the sub-PUs according to a specific order to determine motion parameters for the sub-PUs. According to one or more techniques of this disclosure, if a reference block of a corresponding sub-PU is not coded using motion compensated prediction, motion estimation unit 122 does not set the motion parameters for the corresponding sub-PU in response to a subsequent determination to use motion compensated prediction to code the reference blocks of any later sub-PU in the specific order. In fact, if the reference block of the corresponding sub-PU is not coded using motion compensated prediction, motion estimation unit 122 may set the motion parameters of the corresponding sub-PU to default motion parameters. If an IPMVC or texture merge candidate is the selected merge candidate in the merge candidate list, motion compensation unit 124 may determine the predictive blocks for the corresponding PU based on the motion parameters specified by the IPMVC or texture merge candidate.

帧内预测处理单元126可通过对PU执行帧内预测产生用于PU的预测性数据。用于PU的预测性数据可包含PU的预测性块及各种语法元素。帧内预测处理单元126可对I切片、P切片及B切片中的PU执行帧内预测。Intra-prediction processing unit 126 may generate predictive data for a PU by performing intra prediction on the PU. The predictive data for the PU may include the predictive blocks and various syntax elements for the PU. Intra-prediction processing unit 126 may perform intra prediction on PUs in I slices, P slices, and B slices.

为对PU执行帧内预测,帧内预测处理单元126可使用多个帧内预测模式来产生PU的预测性块的多个集合。当使用特定帧内预测模式执行帧内预测时,帧内预测处理单元126可使用来自相邻块的特定样本集合产生PU的预测性块。假定对于PU、CU及CTU采用从左到右、从上到下的编码次序,相邻块可在PU的预测块的上方、右上方、左上方或左方。帧内预测处理单元126可使用各种数目个帧内预测模式,例如33个定向帧内预测模式。在一些实例中,帧内预测模式的数目可取决于PU的预测块的大小。To perform intra prediction on a PU, intra-prediction processing unit 126 may use multiple intra-prediction modes to generate multiple sets of predictive blocks for the PU. When performing intra prediction using a particular intra-prediction mode, intra-prediction processing unit 126 may use a particular set of samples from neighboring blocks to generate the predictive blocks for the PU. Assuming a left-to-right, top-to-bottom coding order for the PU, CU, and CTU, the neighboring blocks may be above, above and to the right, above and to the left, or to the left of the prediction block of the PU. Intra-prediction processing unit 126 may use various numbers of intra-prediction modes, such as 33 directional intra-prediction modes. In some examples, the number of intra-prediction modes may depend on the size of the prediction block of the PU.

预测处理单元100可从PU的由帧间预测处理单元120产生的预测性数据或PU的由帧内预测处理单元126产生的预测性数据当中选择用于CU的PU的预测性数据。在一些实例中,预测处理单元100基于预测性数据集合的速率/失真量度选择用于CU的PU的预测性数据。所选择预测性数据的预测性块在本文中可被称作所选择预测性块。Prediction processing unit 100 may select predictive data for PUs of a CU from among the predictive data for the PUs generated by inter-prediction processing unit 120 or the predictive data for the PUs generated by intra-prediction processing unit 126. In some examples, prediction processing unit 100 selects predictive data for the PUs of the CU based on rate/distortion metrics of the predictive data sets. The predictive block of the selected predictive data may be referred to herein as a selected predictive block.

残余产生单元102可基于CU的明度、Cb及Cr译码块以及CU的PU的所选择预测性明度、Cb及Cr块产生CU的明度、Cb及Cr残余块。举例来说,残余产生单元102可产生CU的残余块,使得残余块中的每一样本具有等于CU的译码块中的样本与CU的PU的对应所选择预测性样本块中的对应样本之间的差的值。Residual generation unit 102 may generate a luma, Cb, and Cr residual block for a CU based on the luma, Cb, and Cr coding blocks of the CU and the selected predictive luma, Cb, and Cr blocks of the PUs of the CU. For example, residual generation unit 102 may generate the residual block for the CU such that each sample in the residual block has a value equal to the difference between a sample in the coding block of the CU and a corresponding sample in the corresponding selected predictive sample block of the PUs of the CU.
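Residual generation as described, where each residual sample equals the coding-block sample minus the corresponding predictive sample, reduces to a per-sample subtraction:

```python
def residual_block(coding_block, predictive_block):
    """Per-sample difference between a coding block and the selected
    predictive block, as performed by the residual generation unit."""
    return [
        [c - p for c, p in zip(c_row, p_row)]
        for c_row, p_row in zip(coding_block, predictive_block)
    ]

res = residual_block([[10, 12], [8, 7]], [[9, 12], [10, 7]])
```

The same subtraction applies independently to the luma, Cb, and Cr planes.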

变换处理单元104可执行四叉树分割以将CU的残余块分割成与CU的TU相关联的变换块。因此,TU可与明度变换块及两个对应色度变换块相关联。CU的TU的明度变换块及色度变换块的大小及位置可或可不基于CU的PU的预测块的大小及位置。Transform processing unit 104 may perform quadtree partitioning to partition the residual blocks of a CU into transform blocks associated with the TUs of the CU. Thus, a TU may be associated with a luma transform block and two corresponding chroma transform blocks. The sizes and positions of the luma transform blocks and chroma transform blocks of a TU of a CU may or may not be based on the sizes and positions of the prediction blocks of the PUs of the CU.

变换处理单元104可通过将一或多个变换应用到TU的变换块而产生用于CU的每一TU的变换系数块。变换处理单元104可将各种变换应用到与TU相关联的变换块。举例来说,变换处理单元104可应用离散余弦变换(DCT)、定向变换或概念上类似变换到变换块。在一些实例中,变换处理单元104并不将变换应用于变换块。在此类实例中,变换块可处理为变换系数块。Transform processing unit 104 may generate a transform coefficient block for each TU of a CU by applying one or more transforms to the transform blocks of the TU. Transform processing unit 104 may apply various transforms to the transform blocks associated with the TU. For example, transform processing unit 104 may apply a discrete cosine transform (DCT), a directional transform, or a conceptually similar transform to the transform blocks. In some examples, transform processing unit 104 does not apply a transform to the transform blocks. In such examples, the transform blocks may be processed as transform coefficient blocks.

量化单元106可量化系数块中的变换系数。量化过程可减少与变换系数中的一些或全部相关联的位深度。举例来说,n位变换系数可在量化期间舍入到m位变换系数,其中n大于m。量化单元106可基于与CU相关联的量化参数(QP)值量化与CU的TU相关联的系数块。视频编码器20可通过调整与CU相关联的QP值来调整应用于与CU相关联的系数块的量化程度。量化可引入信息丢失,因此经量化变换系数可具有比原始变换系数更低的精度。Quantization unit 106 may quantize the transform coefficients in a coefficient block. The quantization process may reduce the bit depth associated with some or all of the transform coefficients. For example, an n-bit transform coefficient may be rounded to an m-bit transform coefficient during quantization, where n is greater than m. Quantization unit 106 may quantize a coefficient block associated with a TU of a CU based on a quantization parameter (QP) value associated with the CU. Video encoder 20 may adjust the degree of quantization applied to the coefficient block associated with the CU by adjusting the QP value associated with the CU. Quantization may introduce information loss, and thus, the quantized transform coefficients may have lower precision than the original transform coefficients.
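The QP-controlled rounding can be modeled with a simplified scalar quantizer. In HEVC the quantization step size roughly doubles for every increase of 6 in QP; the exact integer scaling, shifts, and rounding offsets in the standard differ, so the constants below are illustrative only:

```python
def quant_step(qp):
    """Approximate HEVC-style step size: doubles for every +6 in QP.
    The offset of 4 is an illustrative constant, not the normative scaling."""
    return 2 ** ((qp - 4) / 6.0)

def quantize(coeffs, qp):
    """Map transform coefficients to quantized levels (lossy rounding)."""
    step = quant_step(qp)
    return [int(round(c / step)) for c in coeffs]

def dequantize(levels, qp):
    """Reconstruct coefficients from levels; rounding loss is not recovered."""
    step = quant_step(qp)
    return [l * step for l in levels]

levels = quantize([100, -37, 5, 0], 28)   # QP 28 -> step size 16 in this model
recon = dequantize(levels, 28)
```

The round trip shows the information loss the text mentions: small coefficients collapse to zero and the others come back only to the nearest multiple of the step size.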

反量化单元108及反变换处理单元110可分别将反量化及反变换应用到系数块以从系数块重建构残余块。重建构单元112可将经重建构残余块添加到来自由预测处理单元100所产生的一或多个预测性块的对应样本,以产生与TU相关联的经重建构变换块。通过以此方式重建构CU的每一TU的变换块,视频编码器20可重建构CU的译码块。Inverse quantization unit 108 and inverse transform processing unit 110 may apply inverse quantization and inverse transform, respectively, to the coefficient block to reconstruct a residual block from the coefficient block. Reconstruction unit 112 may add the reconstructed residual block to corresponding samples from one or more predictive blocks generated by prediction processing unit 100 to produce a reconstructed transform block associated with the TU. By reconstructing the transform blocks for each TU of a CU in this manner, video encoder 20 can reconstruct the coding blocks of the CU.

滤波器单元114可执行一或多个解块操作以减少与CU相关联的译码块中的成块假象。经解码图片缓冲器116可在滤波器单元114对经重建构译码块执行一或多个解块操作之后存储经重建构译码块。帧间预测处理单元120可使用含有经重建构译码块的参考图片来对其它图片的PU执行帧间预测。另外,帧内预测处理单元126可使用经解码图片缓冲器116中的经重建构译码块以对在相同于CU的图片中的其它PU执行帧内预测。Filter unit 114 may perform one or more deblocking operations to reduce blocking artifacts in the coding blocks associated with the CU. Decoded picture buffer 116 may store the reconstructed coding blocks after filter unit 114 performs the one or more deblocking operations on the reconstructed coding blocks. Inter-prediction processing unit 120 may use a reference picture containing the reconstructed coding blocks to perform inter prediction on PUs of other pictures. Additionally, intra-prediction processing unit 126 may use the reconstructed coding blocks in decoded picture buffer 116 to perform intra prediction on other PUs in the same picture as the CU.

熵编码单元118可从视频编码器20的其它功能组件接收数据。举例来说,熵编码单元118可从量化单元106接收系数块,且可从预测处理单元100接收语法元素。熵编码单元118可对数据执行一或多个熵编码操作以产生经熵编码数据。举例来说,熵编码单元118可对数据执行上下文自适应可变长度译码(CAVLC)操作、CABAC操作、可变到可变(V2V)长度译码操作、基于语法的上下文自适应二进制算术译码(SBAC)操作、概率区间分割熵(PIPE)译码操作、指数哥伦布编码操作或另一类型的熵编码操作。视频编码器20可输出包含由熵编码单元118所产生的经熵编码数据的位流。Entropy encoding unit 118 may receive data from other functional components of video encoder 20. For example, entropy encoding unit 118 may receive coefficient blocks from quantization unit 106 and may receive syntax elements from prediction processing unit 100. Entropy encoding unit 118 may perform one or more entropy encoding operations on the data to generate entropy-encoded data. For example, entropy encoding unit 118 may perform a context-adaptive variable length coding (CAVLC) operation, a CABAC operation, a variable-to-variable (V2V) length coding operation, a syntax-based context-adaptive binary arithmetic coding (SBAC) operation, a probability interval partitioning entropy (PIPE) coding operation, an exponential Golomb coding operation, or another type of entropy encoding operation on the data. Video encoder 20 may output a bitstream including the entropy-encoded data generated by entropy encoding unit 118.
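Of the entropy coding schemes listed, order-0 unsigned exponential Golomb coding is compact enough to show directly: the codeword for a value n is a run of leading zeros followed by the binary representation of n+1:

```python
def exp_golomb_encode(n):
    """Order-0 unsigned exponential Golomb codeword for n >= 0, as a bit string."""
    bits = bin(n + 1)[2:]              # binary representation of n + 1
    return "0" * (len(bits) - 1) + bits  # as many leading zeros as bits - 1

codes = [exp_golomb_encode(i) for i in range(5)]
```

Small values get short codewords, which suits syntax elements whose distribution is concentrated near zero.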

图15为说明可实施本发明的技术的实例视频解码器30的框图。图15是出于解释目的而提供,且并不将技术限制为本发明中所大致例示及描述者。出于解释目的,本发明描述在HEVC译码的上下文中的视频解码器30。然而,本发明的技术可适用于其它译码标准或方法。FIG15 is a block diagram illustrating an example video decoder 30 that may implement the techniques of this disclosure. FIG15 is provided for explanation purposes and does not limit the techniques to those generally illustrated and described in this disclosure. For explanation purposes, this disclosure describes video decoder 30 in the context of HEVC coding. However, the techniques of this disclosure may be applicable to other coding standards or methods.

在图15的实例中,视频解码器30包含熵解码单元150、预测处理单元152、反量化单元154、反变换处理单元156、重建构单元158、滤波器单元160及经解码图片缓冲器162。预测处理单元152包含运动补偿单元164及帧内预测处理单元166。在其它实例中,视频解码器30可包含较多、较少或不同功能组件。15 , video decoder 30 includes entropy decoding unit 150, prediction processing unit 152, inverse quantization unit 154, inverse transform processing unit 156, reconstruction unit 158, filter unit 160, and decoded picture buffer 162. Prediction processing unit 152 includes motion compensation unit 164 and intra-prediction processing unit 166. In other examples, video decoder 30 may include more, fewer, or different functional components.

经译码图片缓冲器(CPB)151可接收且存储位流的经编码视频数据(例如,NAL单元)。熵解码单元150可从CPB 151接收NAL单元,并剖析NAL单元以从位流获得语法元素。熵解码单元150可对NAL单元中的经熵编码语法元素进行熵解码。预测处理单元152、反量化单元154、反变换处理单元156、重建构单元158及滤波器单元160可基于从位流提取的语法元素产生经解码视频数据。Coded picture buffer (CPB) 151 may receive and store encoded video data (e.g., NAL units) of a bitstream. Entropy decoding unit 150 may receive NAL units from CPB 151 and parse the NAL units to obtain syntax elements from the bitstream. Entropy decoding unit 150 may entropy decode the entropy-encoded syntax elements in the NAL units. Prediction processing unit 152, inverse quantization unit 154, inverse transform processing unit 156, reconstruction unit 158, and filter unit 160 may generate decoded video data based on the syntax elements extracted from the bitstream.

位流的NAL单元可包含经译码切片NAL单元。作为解码位流的部分,熵解码单元150可从经译码切片NAL单元提取语法元素并对语法元素进行熵解码。经译码切片中的每一者可包含切片标头及切片数据。切片标头可含有关于切片的语法元素。The NAL units of the bitstream may include coded slice NAL units. As part of decoding the bitstream, entropy decoding unit 150 may extract syntax elements from the coded slice NAL units and entropy decode the syntax elements. Each of the coded slices may include a slice header and slice data. The slice header may contain syntax elements related to the slice.

除了从位流获得语法元素之外,视频解码器30可对CU执行解码操作。通过对CU执行解码操作,视频解码器30可重建构CU的译码块。In addition to obtaining syntax elements from the bitstream, video decoder 30 may perform a decoding operation on a CU. By performing a decoding operation on a CU, video decoder 30 may reconstruct the coding blocks of the CU.

作为对CU执行解码操作的部分,反量化单元154可反量化(即,解量化)与CU的TU相关联的系数块。反量化单元154可使用与TU的CU相关联的QP值来确定量化程度及(同样地)反量化单元154将应用的反量化程度。也就是说,可通过调整当量化变换系数时所使用的QP的值来控制压缩比,即用以表示原始序列及经压缩序列的位数目的比率。压缩比还可取决于所利用的熵译码方法。As part of performing a decoding operation on a CU, inverse quantization unit 154 may inverse quantize (i.e., dequantize) coefficient blocks associated with the TUs of the CU. Inverse quantization unit 154 may use the QP value associated with the CU of the TU to determine the degree of quantization and, similarly, the degree of inverse quantization that inverse quantization unit 154 will apply. That is, the compression ratio, i.e., the ratio of the number of bits used to represent the original sequence and the compressed sequence, may be controlled by adjusting the value of the QP used when quantizing transform coefficients. The compression ratio may also depend on the entropy coding method utilized.

在反量化单元154反量化系数块之后,反变换处理单元156可将一或多个反变换应用于系数块以便产生与TU相关联的残余块。举例来说,反变换处理单元156可将反DCT、反整数变换、反卡忽南-拉维(Karhunen-Loeve)变换(KLT)、反旋转变换、反定向变换或另一反变换应用于系数块。After inverse quantization unit 154 inverse quantizes the coefficient block, inverse transform processing unit 156 may apply one or more inverse transforms to the coefficient block to generate a residual block associated with the TU. For example, inverse transform processing unit 156 may apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotational transform, an inverse directional transform, or another inverse transform to the coefficient block.

如果使用帧内预测编码PU,则帧内预测处理单元166可执行帧内预测以产生PU的预测性块。帧内预测处理单元166可使用帧内预测模式以基于空间相邻PU的预测块产生PU的预测性明度块、Cb块及Cr块。帧内预测处理单元166可基于从位流解码的一或多个语法元素确定用于PU的帧内预测模式。If the PU is encoded using intra prediction, intra-prediction processing unit 166 may perform intra prediction to generate predictive blocks for the PU. Intra-prediction processing unit 166 may use an intra prediction mode to generate predictive luma blocks, Cb blocks, and Cr blocks for the PU based on prediction blocks of spatially-neighboring PUs. Intra-prediction processing unit 166 may determine the intra prediction mode for the PU based on one or more syntax elements decoded from the bitstream.

预测处理单元152可基于从位流提取的语法元素来建构第一参考图片列表(RefPicList0)及第二参考图片列表(RefPicList1)。此外,如果PU是使用帧间预测编码,则熵解码单元150可获得PU的运动信息。运动补偿单元164可基于PU的运动信息来确定PU的一或多个参考区。运动补偿单元164可基于在PU的一或多个参考块处的样本产生PU的预测性明度块、Cb块及Cr块。Prediction processing unit 152 may construct a first reference picture list (RefPicList0) and a second reference picture list (RefPicList1) based on syntax elements extracted from the bitstream. Furthermore, if the PU is encoded using inter-frame prediction, entropy decoding unit 150 may obtain the motion information of the PU. Motion compensation unit 164 may determine one or more reference regions for the PU based on the motion information of the PU. Motion compensation unit 164 may generate predictive luma blocks, Cb blocks, and Cr blocks for the PU based on samples at one or more reference blocks of the PU.

在一些实例中,运动补偿单元164可产生PU的合并候选者列表。作为产生合并候选者列表的部分,运动补偿单元164可确定IPMVC及/或纹理合并候选者。当确定IPMVC及/或纹理合并候选者时,运动补偿单元164可将PU分割成子PU并根据特定次序处理子PU以确定子PU中的每一者的运动参数。根据本发明的一或多种技术,如果不使用运动补偿预测译码相应子PU的参考块,则运动补偿单元164响应于后续确定使用运动补偿预测译码特定次序中的任何稍后子PU的参考块,并不设定相应子PU的运动参数。实际上,如果不使用运动补偿预测译码相应子PU的参考块,则运动补偿单元164可将相应子PU的运动参数设定成默认运动参数。如果IPMVC或纹理合并候选者为合并候选者列表中的所选择合并候选者,则运动补偿单元164可基于由IPMVC或纹理合并候选者所指定的运动参数确定相应PU的预测性块。In some examples, motion compensation unit 164 may generate a merge candidate list for a PU. As part of generating the merge candidate list, motion compensation unit 164 may determine IPMVC and/or texture merge candidates. When determining the IPMVC and/or texture merge candidates, motion compensation unit 164 may partition the PU into sub-PUs and process the sub-PUs according to a specific order to determine motion parameters for each of the sub-PUs. According to one or more techniques of this disclosure, if a reference block of a corresponding sub-PU is not coded using motion compensated prediction, motion compensation unit 164 does not set the motion parameters of the corresponding sub-PU in response to a subsequent determination to use motion compensated prediction to code the reference blocks of any later sub-PU in the specific order. In fact, if the reference block of the corresponding sub-PU is not coded using motion compensated prediction, motion compensation unit 164 may set the motion parameters of the corresponding sub-PU to default motion parameters. If an IPMVC or texture merge candidate is the selected merge candidate in the merge candidate list, motion compensation unit 164 may determine the predictive block of the corresponding PU based on the motion parameters specified by the IPMVC or texture merge candidate.

重建构单元158可使用来自与CU的TU相关联的明度、Cb及Cr变换块以及CU的PU的预测性明度、Cb及Cr块的残余值(即,帧内预测数据或帧间预测数据(如可适用))来重建构CU的明度、Cb及Cr译码块。举例来说,重建构单元158可将明度变换块、Cb变换块及Cr变换块的样本添加到预测性明度块、Cb块及Cr块的对应样本以重建构CU的明度译码块、Cb译码块及Cr译码块。Reconstruction unit 158 may use residual values (i.e., intra prediction data or inter prediction data, as applicable) from the luma, Cb, and Cr transform blocks associated with the TUs of the CU and the predictive luma, Cb, and Cr blocks of the PUs of the CU to reconstruct the luma, Cb, and Cr coding blocks of the CU. For example, reconstruction unit 158 may add samples of the luma, Cb, and Cr transform blocks to corresponding samples of the predictive luma, Cb, and Cr blocks to reconstruct the luma, Cb, and Cr coding blocks of the CU.

滤波器单元160可执行解块操作以减少与CU的明度、Cb及Cr译码块相关联的成块假象。视频解码器30可将CU的明度、Cb及Cr译码块存储于经解码图片缓冲器162中。经解码图片缓冲器162可提供参考图片以用于后续运动补偿、帧内预测及呈现于显示装置(例如,图1的显示装置32)上。举例来说,视频解码器30可基于经解码图片缓冲器162中的明度、Cb及Cr块对其它CU的PU执行帧内预测或帧间预测操作。以此方式,视频解码器30可从位流提取大量明度系数块的变换系数层级,反量化变换系数层级,对变换系数层级应用变换以产生变换块,至少部分基于变换块产生译码块并输出译码块以用于显示。Filter unit 160 may perform a deblocking operation to reduce blocking artifacts associated with the luma, Cb, and Cr coding blocks of a CU. Video decoder 30 may store the luma, Cb, and Cr coding blocks of the CU in decoded picture buffer 162. Decoded picture buffer 162 may provide reference pictures for subsequent motion compensation, intra prediction, and presentation on a display device (e.g., display device 32 of FIG. 1 ). For example, video decoder 30 may perform intra prediction or inter prediction operations on PUs of other CUs based on the luma, Cb, and Cr blocks in decoded picture buffer 162. In this manner, video decoder 30 may extract transform coefficient levels for a large number of luma coefficient blocks from the bitstream, inverse quantize the transform coefficient levels, apply a transform to the transform coefficient levels to generate transform blocks, generate coded blocks based at least in part on the transform blocks, and output the coded blocks for display.

以下章节提供对3D-HEVC(其为公开可用的)的实例解码过程改变。在用于子PU时间视图间运动向量候选者的导出过程中,视频译码器可首先产生PU层级经视图间预测的运动向量候选者。如果以帧间预测模式译码中心参考子PU(即,视图间参考块中的中心子PU)且中心参考子PU在参考图片列表X中的参考图片具有相同于当前切片的参考图片列表X中的一个项目的POC值(对于X=0或1),则其运动向量及参考图片用作PU层级经预测运动向量候选者。否则,视频译码器可使用对于参考图片列表0及参考图片列表1(如果当前切片为B切片)具有等于0的参考图片索引的零运动作为PU层级经预测运动向量候选者。视频译码器可接着将PU层级经预测运动向量候选者用作如下子PU的运动:所述子PU的对应参考块以帧内预测模式译码,或虽以帧间预测模式译码但其参考图片不包含于当前切片的参考图片列表中。The following sections provide example decoding process changes for 3D-HEVC (which is publicly available). In the derivation process for sub-PU temporal inter-view motion vector candidates, the video coder may first generate a PU-level inter-view predicted motion vector candidate. If the center reference sub-PU (i.e., the center sub-PU in an inter-view reference block) is coded in inter-prediction mode and its reference picture in reference picture list X has the same POC value as an entry in reference picture list X of the current slice (for X=0 or 1), its motion vector and reference picture are used as the PU-level predicted motion vector candidate. Otherwise, the video coder may use zero motion with a reference picture index equal to 0 for reference picture list 0 and reference picture list 1 (if the current slice is a B slice) as the PU-level predicted motion vector candidate. The video coder may then use the PU-level predicted motion vector candidate as the motion of each sub-PU whose corresponding reference block is either coded in intra-prediction mode, or is coded in inter-prediction mode but with a reference picture that is not included in a reference picture list of the current slice.
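The per-list fallback rule described in the preceding paragraph can be sketched as follows. This is an illustrative simplification, not the normative 3D-HEVC process; the function name `derive_pu_level_candidate` and the way its inputs are encoded are our assumptions.

```python
# Sketch of the PU-level inter-view predicted motion vector candidate
# fallback described above. All names are illustrative, not taken from
# the 3D-HEVC draft text.

def derive_pu_level_candidate(center_ref_sub_pu, current_ref_poc_lists, is_b_slice):
    """Return {X: (motion_vector, ref_idx)} per reference picture list X.

    center_ref_sub_pu: None when the center reference sub-PU is
    intra-coded; otherwise a dict mapping list index X to a
    (motion_vector, poc) pair of the center reference sub-PU.
    current_ref_poc_lists: per-list tuples of POC values in the current
    slice's reference picture lists.
    """
    num_lists = 2 if is_b_slice else 1
    candidate = {}
    for x in range(num_lists):
        inherited = None
        if center_ref_sub_pu is not None and x in center_ref_sub_pu:
            mv, poc = center_ref_sub_pu[x]
            # Inherit the center reference sub-PU's motion only if its
            # reference picture's POC appears in the current list X.
            if poc in current_ref_poc_lists[x]:
                inherited = (mv, current_ref_poc_lists[x].index(poc))
        # Otherwise fall back to zero motion with reference index 0.
        candidate[x] = inherited if inherited is not None else ((0, 0), 0)
    return candidate
```

For a B slice, the fallback fills both list 0 and list 1 with zero motion and reference index 0, matching the text above.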

本发明的实例可改变3D-HEVC草案文本2(即,文件JCT3V-F1001v2)中所定义的子PU层级视图间运动预测过程(或用于子预测块时间视图间运动向量候选者的导出过程)。根据本发明的一或多个实例,添加到3D-HEVC草案文本2的文本带下划线且从3D-HEVC草案文本2删除的文本为斜体并封闭于双方括号中。Examples of this disclosure may modify the sub-PU level inter-view motion prediction process (or the derivation process for sub-prediction block temporal inter-view motion vector candidates) defined in 3D-HEVC Draft Text 2 (i.e., document JCT3V-F1001v2). According to one or more examples of this disclosure, text added to 3D-HEVC Draft Text 2 is underlined and text deleted from 3D-HEVC Draft Text 2 is italicized and enclosed in double square brackets.

解码过程Decoding process

H.8.5.3.2.16用于子预测块时间视图间运动向量候选者的导出过程H.8.5.3.2.16 Derivation process for sub-prediction block temporal inter-view motion vector candidates

当iv_mv_pred_flag[nuh_layer_id]等于0时,不调用此过程。When iv_mv_pred_flag[nuh_layer_id] is equal to 0, this process is not called.

此过程的输入为:The inputs to this process are:

-当前预测单元的左上方明度样本相对于当前图片的左上方明度样本的明度位置(xPb,yPb),- the luma position (xPb, yPb) of the top-left luma sample of the current prediction unit relative to the top-left luma sample of the current picture,

-分别指定当前预测单元的宽度及高度的变量nPbW及nPbH,- variables nPbW and nPbH specifying the width and height of the current prediction unit respectively,

-参考视图索引refViewIdx。- Reference view index refViewIdx.

-视差向量mvDisp,- disparity vector mvDisp,

此过程的输出为:The output of this process is:

-指定时间视图间运动向量候选者是否可用的旗标availableFlagLXInterView(其中X在0到1的范围内(包含0与1)),- a flag availableFlagLXInterView specifying whether a temporal inter-view motion vector candidate is available (where X is in the range of 0 to 1 inclusive),

-时间视图间运动向量候选者mvLXInterView(其中X在0到1的范围内(包含0与1))。- Temporal inter-view motion vector candidate mvLXInterView (where X is in the range of 0 to 1 inclusive).

-指定参考图片列表RefPicListLX中的参考图片的参考索引refIdxLXInterView(其中X在0到1的范围内(包含0与1)),- a reference index refIdxLXInterView (where X is in the range of 0 to 1, inclusive) of a reference picture in the reference picture list RefPicListLX,

对于在0到1的范围内的X(包含0与1),以下适用:For X in the range 0 to 1 (inclusive), the following applies:

-将旗标availableFlagLXInterView设定成等于0。-Set the flag availableFlagLXInterView to 0.

-将运动向量mvLXInterView设定成等于(0,0)。-Set the motion vector mvLXInterView equal to (0,0).

-将参考索引refIdxLXInterView设定成等于-1。- Set the reference index refIdxLXInterView equal to -1.

变量nSbW及nSbH导出为:The variables nSbW and nSbH are derived as:

nSbW=Min(nPbW,SubPbSize[nuh_layer_id])(H-173)nSbW=Min(nPbW,SubPbSize[nuh_layer_id])(H-173)

nSbH=Min(nPbH,SubPbSize[nuh_layer_id])(H-174)nSbH=Min(nPbH,SubPbSize[nuh_layer_id])(H-174)
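The sub-block size derivation in (H-173)/(H-174) simply clips the signaled sub-PU size to the PU dimensions. A direct transcription (the function name is an illustrative assumption):

```python
def derive_sub_block_size(nPbW, nPbH, sub_pb_size):
    # (H-173)/(H-174): a sub-block never exceeds the PU itself.
    nSbW = min(nPbW, sub_pb_size)
    nSbH = min(nPbH, sub_pb_size)
    return nSbW, nSbH
```

For example, a 16x4 PU with SubPbSize equal to 8 is split into 8x4 sub-blocks.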

变量ivRefPic设定成等于当前接入单元中具有等于refViewIdx的ViewIdx的图片。The variable ivRefPic is set equal to the picture in the current access unit with ViewIdx equal to refViewIdx.

以下适用以导出旗标centerPredFlagLX、运动向量centerMvLX及参考索引centerRefIdxLX。 The following applies to derive the flag centerPredFlagLX, the motion vector centerMvLX, and the reference index centerRefIdxLX.

-变量centerAvailableFlag设定成等于0。 -The variable centerAvailableFlag is set equal to 0.

-对于在0到1的范围内的X(包含0与1),以下适用: - For X in the range 0 to 1 (inclusive), the following applies:

-将旗标centerPredFlagLX设定成等于0。- Set the flag centerPredFlagLX to 0.

-将运动向量centerMvLX设定成等于(0,0)。- Set the motion vector centerMvLX equal to (0,0).

-将参考索引centerRefIdxLX设定成等于-1。- Set the reference index centerRefIdxLX equal to -1.

-通过如下导出参考层明度位置(xRef,yRef) - Derive the reference layer luminance position (xRef, yRef) by

xRef=Clip3(0,PicWidthInSamplesL-1,xRef=Clip3(0,PicWidthInSamplesL-1,

xPb+(nPbW/nSbW/2)*nSbW+nSbW/2+((mvDisp[0]+2)>>2))xPb+(nPbW/nSbW/2)*nSbW+nSbW/2+((mvDisp[0]+2)>>2))

(H-175)(H-175)

yRef=Clip3(0,PicHeightInSamplesL-1,yRef=Clip3(0,PicHeightInSamplesL-1,

yPb+(nPbH/nSbH/2)*nSbH+nSbH/2+((mvDisp[1]+2)>>2))yPb+(nPbH/nSbH/2)*nSbH+nSbH/2+((mvDisp[1]+2)>>2))

(H-176)(H-176)

-变量ivRefPb指定覆盖由ivRefPic所指定的视图间参考图片内部由(xRef,yRef) 给出的位置的明度预测块。 The variable ivRefPb specifies the luma prediction block covering the position given by (xRef, yRef) inside the inter-view reference picture specified by ivRefPic .

-相对于由ivRefPic指定的视图间参考图片的左上方明度样本,明度位置 (xIvRefPb,yIvRefPb)设定成等于由ivRefPb指定的视图间参考明度预测块的左上方样本。 - Relative to the top-left luma sample of the inter-view reference picture specified by ivRefPic, the luma position (xIvRefPb, yIvRefPb) is set equal to the top-left sample of the inter-view reference luma prediction block specified by ivRefPb.

-当不以帧内预测模式译码ivRefPb时,对于在0到1的范围内的X(包含0与1),以下 适用: When ivRefPb is not coded in intra prediction mode, for X in the range of 0 to 1 (inclusive), the following applies:

-当X等于0或当前切片为B切片时,对于在X到(1-X)的范围内的Y(包含X与(1-X)), 以下适用: - When X is equal to 0 or the current slice is a B slice, for Y in the range of X to (1-X), inclusive, the following applies:

-变量refPicListLYIvRef、predFlagLYIvRef、mvLYIvRef及refIdxLYIvRef设定成 分别等于图片ivRefPic的RefPicListLY、PredFlagLY、MvLY及RefIdxLY。 - The variables refPicListLYIvRef, predFlagLYIvRef, mvLYIvRef, and refIdxLYIvRef are set equal to RefPicListLY, PredFlagLY, MvLY, and RefIdxLY of the picture ivRefPic, respectively.

-当predFlagLYIvRef[xIvRefPb][yIvRefPb]等于1时,对于从0到num_ref_idx_lX_active_minus1的每一i(包含0与num_ref_idx_lX_active_minus1),以下适用: - When predFlagLYIvRef[xIvRefPb][yIvRefPb] is equal to 1, for each i from 0 to num_ref_idx_lX_active_minus1 (inclusive), the following applies:

-当PicOrderCnt(refPicListLYIvRef[refIdxLYIvRef[xIvRefPb][yIvRefPb]]) 等于PicOrderCnt(RefPicListLX[i])且centerPredFlagLX等于0时,以下适用。 -When PicOrderCnt(refPicListLYIvRef[refIdxLYIvRef[xIvRefPb][yIvRefPb]]) is equal to PicOrderCnt(RefPicListLX[i]) and centerPredFlagLX is equal to 0, the following applies.

centerMvLX=mvLYIvRef[xIvRefPb][yIvRefPb](H-177)centerMvLX=mvLYIvRef[xIvRefPb][yIvRefPb](H-177)

centerRefIdxLX=i(H-178)centerRefIdxLX=i(H-178)

centerPredFlagLX=1(H-179)centerPredFlagLX=1(H-179)

centerAvailableFlag=1(H-180)centerAvailableFlag=1(H-180)

-如果centerAvailableFlag等于0且ivRefPic并非I切片,则对于在0到1的范围内 的X(包含0与1),以下适用: If centerAvailableFlag is equal to 0 and ivRefPic is not an I-slice, then for X in the range of 0 to 1 (inclusive), the following applies:

centerMvLX=(0,0)(H-181)centerMvLX=(0,0)(H-181)

centerRefIdxLX=0(H-182)centerRefIdxLX=0(H-182)

centerPredFlagLX=1(H-183)centerPredFlagLX=1(H-183)

对于在0到1的范围内的X(包含0与1),以下适用:For X in the range 0 to 1 (inclusive), the following applies:

-将旗标availableFlagLXInterView设定成等于centerPredFlagLX。 -Set the flag availableFlagLXInterView equal to centerPredFlagLX.

-将运动向量mvLXInterView设定成等于centerMvLX。 - Set the motion vector mvLXInterView equal to centerMvLX.

-将参考索引refIdxLXInterView设定成等于centerRefIdxLX。 - Set the reference index refIdxLXInterView equal to centerRefIdxLX.
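The center-candidate derivation above, from the reference position formulas (H-175)/(H-176) through the zero-motion fallback (H-181)-(H-183), can be sketched as follows. `clip3` mirrors the draft's Clip3 operation; `center_ref_position` follows the draft formulas, while `derive_center_candidate` and its input encoding are illustrative assumptions rather than the normative text.

```python
def clip3(lo, hi, v):
    # Clip3(lo, hi, v) as used throughout the HEVC-family specs.
    return max(lo, min(hi, v))

def center_ref_position(xPb, yPb, nPbW, nPbH, nSbW, nSbH, mvDisp,
                        pic_w, pic_h):
    # (H-175)/(H-176): the center of the center sub-block, plus the
    # disparity vector rounded from quarter-sample to full-sample
    # precision via (mvDisp[i] + 2) >> 2, clipped to the picture.
    xRef = clip3(0, pic_w - 1, xPb + (nPbW // nSbW // 2) * nSbW
                 + nSbW // 2 + ((mvDisp[0] + 2) >> 2))
    yRef = clip3(0, pic_h - 1, yPb + (nPbH // nSbH // 2) * nSbH
                 + nSbH // 2 + ((mvDisp[1] + 2) >> 2))
    return xRef, yRef

def derive_center_candidate(ref_block_motion, current_list_pocs,
                            ref_is_i_slice=False):
    """ref_block_motion: None when ivRefPb is intra-coded, otherwise a
    list of (mv, ref_poc) pairs for the reference block's prediction
    lists. Returns (centerPredFlag, centerMv, centerRefIdx)."""
    if ref_block_motion is not None:
        for mv, ref_poc in ref_block_motion:
            # (H-177)-(H-180): take the first entry of the current
            # reference picture list whose POC matches.
            for i, poc in enumerate(current_list_pocs):
                if poc == ref_poc:
                    return 1, mv, i
    if not ref_is_i_slice:
        # (H-181)-(H-183): fall back to zero motion, reference index 0.
        return 1, (0, 0), 0
    return 0, (0, 0), -1  # candidate unavailable
```

Note that Python's `>>` performs an arithmetic (floor) shift on negative operands, matching the spec's shift semantics for negative disparity components.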

对于在0到(nPbH/nSbH-1)的范围内的yBlk(包含0与(nPbH/nSbH-1))且对于在0到(nPbW/nSbW-1)的范围内的xBlk(包含0与(nPbW/nSbW-1)),以下适用:For yBlk in the range of 0 to (nPbH/nSbH-1), inclusive, and for xBlk in the range of 0 to (nPbW/nSbW-1), inclusive, the following applies:

-将变量curAvailableFlag设定成等于0。-Set the variable curAvailableFlag to 0.

-对于在0到1的范围内的X(包含0与1),以下适用:- For X in the range 0 to 1 (inclusive), the following applies:

-将旗标spPredFlagLX[xBlk][yBlk]设定成等于0。- Set the flag spPredFlagLX[xBlk][yBlk] to 0.

-将运动向量spMvLX[xBlk][yBlk]设定成等于(0,0)。- Set the motion vector spMvLX[xBlk][yBlk] equal to (0,0).

-将参考索引spRefIdxLX[xBlk][yBlk]设定成等于-1。- Set the reference index spRefIdxLX[xBlk][yBlk] equal to -1.

-通过如下导出参考层明度位置(xRef,yRef)- Derive the reference layer luminance position (xRef, yRef) by

xRef=Clip3(0,PicWidthInSamplesL-1,xPb+xBlk*nSbW+nSbW/2+((mvDisp[0]+2)>>2))(H-184[[175]])xRef=Clip3(0,PicWidthInSamplesL-1,xPb+xBlk*nSbW+nSbW/2+((mvDisp[0]+2)>>2))(H-184[[175]])

yRef=Clip3(0,PicHeightInSamplesL-1,yPb+yBlk*nSbH+nSbH/2+((mvDisp[1]+2)>>2))(H-185[[176]])yRef=Clip3(0,PicHeightInSamplesL-1,yPb+yBlk*nSbH+nSbH/2+((mvDisp[1]+2)>>2))(H-185[[176]])

-变量ivRefPb指定覆盖由ivRefPic所指定的视图间参考图片内部由(xRef,yRef)给出的位置的明度预测块。- The variable ivRefPb specifies the luma prediction block covering the position given by (xRef, yRef) inside the inter-view reference picture specified by ivRefPic.

-相对于由ivRefPic指定的视图间参考图片的左上方明度样本,明度位置(xIvRefPb,yIvRefPb)设定成等于由ivRefPb指定的视图间参考明度预测块的左上方样本。- Relative to the top-left luma sample of the inter-view reference picture specified by ivRefPic, the luma position (xIvRefPb, yIvRefPb) is set equal to the top-left sample of the inter-view reference luma prediction block specified by ivRefPb.

-当不以帧内预测模式译码ivRefPb时,对于在0到1的范围内的X(包含0与1),以下适用:When ivRefPb is not coded in intra prediction mode, for X in the range of 0 to 1 (inclusive), the following applies:

-当X等于0或当前切片为B切片时,对于在X到(1-X)的范围内的Y(包含X与(1-X)),以下适用:- When X is equal to 0 or the current slice is a B slice, for Y in the range of X to (1-X), inclusive, the following applies:

-变量refPicListLYIvRef、predFlagLYIvRef[x][y]、mvLYIvRef[x][y]及refIdxLYIvRef[x][y]设定成分别等于图片ivRefPic的RefPicListLY、PredFlagLY[x][y]、MvLY[x][y]及RefIdxLY[x][y]。- The variables refPicListLYIvRef, predFlagLYIvRef[x][y], mvLYIvRef[x][y], and refIdxLYIvRef[x][y] are set equal to RefPicListLY, PredFlagLY[x][y], MvLY[x][y], and RefIdxLY[x][y] of the picture ivRefPic, respectively.

-当predFlagLYIvRef[xIvRefPb][yIvRefPb]等于1时,对于从0到num_ref_idx_lX_active_minus1的每一i(包含0与num_ref_idx_lX_active_minus1),以下适用:- When predFlagLYIvRef[xIvRefPb][yIvRefPb] is equal to 1, for each i from 0 to num_ref_idx_1X_active_minus1 (inclusive), the following applies:

-当PicOrderCnt(refPicListLYIvRef[refIdxLYIvRef[xIvRefPb][yIvRefPb]])等于PicOrderCnt(RefPicListLX[i])且spPredFlagLX[xBlk][yBlk]等于0时,以下适用。-When PicOrderCnt(refPicListLYIvRef[refIdxLYIvRef[xIvRefPb][yIvRefPb]]) is equal to PicOrderCnt(RefPicListLX[i]) and spPredFlagLX[xBlk][yBlk] is equal to 0, the following applies.

spMvLX[xBlk][yBlk]=mvLYIvRef[xIvRefPb][yIvRefPb](H-186[[177]])spMvLX[xBlk][yBlk]=mvLYIvRef[xIvRefPb][yIvRefPb](H-186[[177]])

spRefIdxLX[xBlk][yBlk]=i(H-187[[178]])spRefIdxLX[xBlk][yBlk]=i(H-187[[178]])

spPredFlagLX[xBlk][yBlk]=1(H-188[[179]])spPredFlagLX[xBlk][yBlk]=1(H-188[[179]])

curAvailableFlag=1(H-189[[180]])curAvailableFlag=1(H-189[[180]])

-[[取决于curAvailableFlag,以下适用:-[[Depending on curAvailableFlag, the following applies:

-如果curAvailableFlag等于1,则以下有序步骤适用:If curAvailableFlag is equal to 1, the following ordered steps apply:

1.当lastAvailableFlag等于0时,以下适用:1. When lastAvailableFlag is equal to 0, the following applies:

-对于在0到1的范围内的X(包含0与1),以下适用:- For X in the range 0 to 1 (inclusive), the following applies:

mvLXInterView=spMvLX[xBlk][yBlk](H-181)mvLXInterView=spMvLX[xBlk][yBlk](H-181)

refIdxLXInterView=spRefIdxLX[xBlk][yBlk](H-182)refIdxLXInterView=spRefIdxLX[xBlk][yBlk](H-182)

availableFlagLXInterView=spPredFlag[xBlk][yBlk](H-183)availableFlagLXInterView=spPredFlag[xBlk][yBlk](H-183)

-当curSubBlockIdx大于0时,对于在0到(curSubBlockIdx-1)的范围内的k(包含0与(curSubBlockIdx-1)),以下适用:- When curSubBlockIdx is greater than 0, for k in the range of 0 to (curSubBlockIdx-1), inclusive, the following applies:

-如下文中指定地导出变量i及j:- Derive the variables i and j as specified below:

i=k%(nPSW/nSbW)(H-184)i=k%(nPSW/nSbW)(H-184)

j=k/(nPSW/nSbW)(H-185)j=k/(nPSW/nSbW)(H-185)

-对于在0到1的范围内的X(包含0与1),以下适用:- For X in the range 0 to 1 (inclusive), the following applies:

spMvLX[i][j]=spMvLX[xBlk][yBlk](H-186)spMvLX[i][j]=spMvLX[xBlk][yBlk](H-186)

spRefIdxLX[i][j]=spRefIdxLX[xBlk][yBlk](H-187)spRefIdxLX[i][j]=spRefIdxLX[xBlk][yBlk](H-187)

spPredFlagLX[i][j]=spPredFlagLX[xBlk][yBlk](H-188)spPredFlagLX[i][j]=spPredFlagLX[xBlk][yBlk](H-188)

2.将变量lastAvailableFlag设定成等于1。2. Set the variable lastAvailableFlag to 1.

3.将变量xLastAvail及yLastAvail设定成分别等于xBlk及yBlk。]]3. Set the variables xLastAvail and yLastAvail to xBlk and yBlk respectively.]]

-[[否则(]]如果curAvailableFlag等于0[[),当lastAvailable旗标等于1时,]]对于在0到1的范围内的X(包含0与1),以下适用:-[[else (]]If curAvailableFlag is equal to 0[[), when lastAvailable flag is equal to 1,]]for X in the range of 0 to 1 (inclusive), the following applies:

[[spMvLX[xBlk][yBlk]=spMvLX[xLastAvail][yLastAvail](H-189)[[spMvLX[xBlk][yBlk]=spMvLX[xLastAvail][yLastAvail](H-189)

spRefIdxLX[xBlk][yBlk]=spRefIdxLX[xLastAvail][yLastAvail](H-190)spRefIdxLX[xBlk][yBlk]=spRefIdxLX[xLastAvail][yLastAvail](H-190)

spPredFlagLX[xBlk][yBlk]=spPredFlagLX[xLastAvail][yLastAvail]spPredFlagLX[xBlk][yBlk]=spPredFlagLX[xLastAvail][yLastAvail]

(H-191)]](H-191)]]

spMvLX[xBlk][yBlk]=centerMvLX(H-190)spMvLX[xBlk][yBlk]=centerMvLX(H-190)

spRefIdxLX[xBlk][yBlk]=centerRefIdxLX(H-191)spRefIdxLX[xBlk][yBlk]=centerRefIdxLX(H-191)

spPredFlagLX[xBlk][yBlk]=centerPredFlagLX(H-192)spPredFlagLX[xBlk][yBlk]=centerPredFlagLX(H-192)

-[[将变量curSubBlockIdx设定成等于curSubBlockIdx+1。]]-[[Set the variable curSubBlockIdx equal to curSubBlockIdx+1. ]]
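With the bracketed last-available-candidate copying removed, the sub-block loop above reduces to: each sub-PU either inherits motion from its own inter-view reference block, or takes the center candidate per (H-190)-(H-192). A simplified sketch with illustrative names:

```python
def derive_sub_pu_motion(nPbW, nPbH, nSbW, nSbH, ref_motion_at,
                         center_candidate):
    """Return {(xBlk, yBlk): (mv, ref_idx)} for every sub-block.

    ref_motion_at(xBlk, yBlk) returns (mv, ref_idx) when the
    sub-block's inter-view reference block supplies usable motion
    (inter-coded, with a POC match in the current reference list),
    or None otherwise. Names are illustrative, not normative.
    """
    motion = {}
    for yBlk in range(nPbH // nSbH):
        for xBlk in range(nPbW // nSbW):
            inherited = ref_motion_at(xBlk, yBlk)
            # (H-190)-(H-192): an unavailable sub-PU falls back to the
            # center candidate rather than the last available sub-PU.
            motion[(xBlk, yBlk)] = (inherited if inherited is not None
                                    else center_candidate)
    return motion
```

Because every sub-PU's fallback is the center candidate, sub-PUs can be processed independently, which is the point of the deleted last-available logic.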

为用于稍后调用于解码过程中的变量的导出过程,进行以下指派,对于x=0..nPbW-1且对于y=0..nPbH-1:For the derivation process of variables to be used later in the decoding process, the following assignments are made, for x = 0..nPbW-1 and for y = 0..nPbH-1:

-对于在0到1的范围内的X(包含0与1),以下适用:- For X in the range 0 to 1 (inclusive), the following applies:

-如下文中指定地导出变量SubPbPredFlagLX、SubPbMvLX及SubPbRefIdxLX:- Export the variables SubPbPredFlagLX, SubPbMvLX, and SubPbRefIdxLX as specified below:

SubPbPredFlagLX[xPb+x][yPb+y]=spPredFlagLX[x/nSbW][y/nSbH](H-193[[192]])SubPbPredFlagLX[xPb+x][yPb+y]=spPredFlagLX[x/nSbW][y/nSbH](H-193[[192]])

SubPbMvLX[xPb+x][yPb+y]=spMvLX[x/nSbW][y/nSbH](H-194[[193]])SubPbMvLX[xPb+x][yPb+y]=spMvLX[x/nSbW][y/nSbH](H-194[[193]])

SubPbRefIdxLX[xPb+x][yPb+y]=spRefIdxLX[x/nSbW][y/nSbH](H-195[[194]])SubPbRefIdxLX[xPb+x][yPb+y]=spRefIdxLX[x/nSbW][y/nSbH](H-195[[194]])

-通过SubPbMvLX[xPb+x][yPb+y]作为输入调用子条款8.5.3.2.9中用于色度运动向量的导出过程且输出为SubPbMvCLX[xPb+x][yPb+y]。- Invoke the derivation process for chroma motion vectors in subclause 8.5.3.2.9 with SubPbMvLX[xPb+x][yPb+y] as input and output as SubPbMvCLX[xPb+x][yPb+y].
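Assignments (H-193)-(H-195) fan the per-sub-block arrays out to every luma sample position of the PU; the mapping is a pair of integer divisions by the sub-block width and height. A sketch (function and index names are illustrative):

```python
def expand_sub_block_motion(nPbW, nPbH, nSbW, nSbH, sp_motion):
    """Map each luma sample (x, y) inside the PU to the motion of the
    sub-block that contains it, mirroring (H-193)-(H-195). sp_motion
    is indexed by (xBlk, yBlk)."""
    return {(x, y): sp_motion[(x // nSbW, y // nSbH)]
            for y in range(nPbH) for x in range(nPbW)}
```

In practice the coder adds the PU origin (xPb, yPb) to (x, y) when writing into the picture-level SubPb arrays; the sketch keeps PU-local coordinates for brevity.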

图16A为根据本发明的实例说明使用帧间预测来编码CU的视频编码器20的实例操作的流程图。在图16A的实例中,视频编码器20可产生用于当前CU的当前PU的合并候选者列表(200)。根据本发明的一或多个实例,视频编码器20可产生合并候选者列表,使得合并候选者列表包含基于当前PU的子PU的运动信息的时间视图间合并候选者。在一些实例中,当前PU可为深度PU,且视频编码器20可产生合并候选者列表,使得合并候选者列表包含基于当前深度PU的子PU的运动信息的纹理合并候选者。此外,在一些实例中,视频编码器20可执行图17的操作来产生用于当前PU的合并候选者列表。FIG16A is a flowchart illustrating an example operation of video encoder 20 that encodes a CU using inter prediction according to an example of this disclosure. In the example of FIG16A , video encoder 20 may generate a merge candidate list for a current PU of the current CU (200). According to one or more examples of this disclosure, video encoder 20 may generate the merge candidate list such that the merge candidate list includes temporal inter-view merge candidates based on motion information of sub-PUs of the current PU. In some examples, the current PU may be a depth PU, and video encoder 20 may generate the merge candidate list such that the merge candidate list includes texture merge candidates based on motion information of sub-PUs of the current depth PU. Furthermore, in some examples, video encoder 20 may perform the operations of FIG17 to generate the merge candidate list for the current PU.

在产生用于当前PU的合并候选者列表之后,视频编码器20可从合并候选者列表选择合并候选者(202)。在一些实例中,视频编码器20可基于速率/失真分析选择合并候选者。此外,视频编码器20可使用所选择合并候选者的运动信息(例如,运动向量及参考索引)以确定当前PU的预测性块(204)。视频编码器20可用信号通知指示所选择合并候选者在合并候选者列表内的位置的合并候选者索引(206)。After generating a merge candidate list for the current PU, video encoder 20 may select a merge candidate from the merge candidate list (202). In some examples, video encoder 20 may select the merge candidate based on a rate/distortion analysis. Furthermore, video encoder 20 may use motion information (e.g., a motion vector and a reference index) of the selected merge candidate to determine a predictive block for the current PU (204). Video encoder 20 may signal a merge candidate index indicating a position of the selected merge candidate within the merge candidate list (206).

如果所选择合并候选者为如描述于本发明的实例中的使用子PU建构的IPMVC或MVI候选者(即,纹理合并候选者),则IPMVC或MVI候选者可指定当前PU的每一子PU的单独运动参数集合(例如,一或多个运动向量的集合及一或多个参考索引的集合)。当视频编码器20正确定当前PU的预测性块时,视频编码器20可使用当前PU的子PU的运动参数以确定子PU的预测性块。视频编码器20可通过汇编当前PU的子PU的预测性块确定当前PU的预测性块。If the selected merge candidate is an IPMVC or MVI candidate constructed using sub-PUs (i.e., a texture merge candidate), as described in the examples of this disclosure, the IPMVC or MVI candidate may specify a separate set of motion parameters (e.g., a set of one or more motion vectors and a set of one or more reference indices) for each sub-PU of the current PU. When video encoder 20 is determining the predictive block of the current PU, video encoder 20 may use the motion parameters of the sub-PUs of the current PU to determine the predictive block of the sub-PU. Video encoder 20 may determine the predictive block of the current PU by assembling the predictive blocks of the sub-PUs of the current PU.
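The assembly step described above can be illustrated as follows: each sub-PU's predictive block is produced from that sub-PU's own motion parameters and copied into the corresponding region of the PU-sized predictive block. This is a sketch only; `predict_sub_block` is a stand-in for motion compensation, not an actual codec API.

```python
def assemble_pu_prediction(nPbW, nPbH, nSbW, nSbH, sub_pu_motion,
                           predict_sub_block):
    """Build the PU predictive block by tiling sub-PU predictions.

    sub_pu_motion maps (xBlk, yBlk) to that sub-PU's motion parameters;
    predict_sub_block(motion, w, h) returns a h-by-w sample grid.
    Illustrative only.
    """
    pred = [[0] * nPbW for _ in range(nPbH)]
    for yBlk in range(nPbH // nSbH):
        for xBlk in range(nPbW // nSbW):
            sub = predict_sub_block(sub_pu_motion[(xBlk, yBlk)], nSbW, nSbH)
            # Copy the sub-PU's predictive samples into place.
            for dy in range(nSbH):
                for dx in range(nSbW):
                    pred[yBlk * nSbH + dy][xBlk * nSbW + dx] = sub[dy][dx]
    return pred
```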

视频编码器20可确定当前CU中是否存在任何剩余PU(208)。如果当前CU中存在一或多个剩余PU(208的“是”),则视频编码器20可将当前CU的另一PU作为当前PU重复动作200到208。以此方式,视频编码器20可针对当前CU的每一PU重复动作200到208。Video encoder 20 may determine whether there are any remaining PUs in the current CU (208). If there are one or more remaining PUs in the current CU ("YES" of 208), video encoder 20 may repeat actions 200-208 with another PU of the current CU as the current PU. In this manner, video encoder 20 may repeat actions 200-208 for each PU of the current CU.

当并不存在当前CU的剩余PU(208的“否”)时,视频编码器20可确定当前CU的残余数据(210)。在一些实例中,残余数据的每一样本可指示当前CU的译码块中的样本与当前CU的PU的预测性块中的对应样本之间的差。在其它实例中,视频编码器20可使用ARP以确定当前CU的残余数据。视频编码器20可在位流中用信号通知残余数据(212)。举例来说,视频编码器20可通过将一或多个变换应用于残余数据来产生系数块、量化系数、对指示经量化系数的语法元素进行熵编码并将经熵编码语法元素包含于位流中而在位流中用信号通知残余数据。When there are no remaining PUs for the current CU ("No" of 208), video encoder 20 may determine residual data for the current CU (210). In some examples, each sample of the residual data may indicate a difference between a sample in a coding block of the current CU and a corresponding sample in a predictive block of a PU of the current CU. In other examples, video encoder 20 may use ARP to determine the residual data for the current CU. Video encoder 20 may signal the residual data in the bitstream (212). For example, video encoder 20 may signal the residual data in the bitstream by applying one or more transforms to the residual data to generate coefficient blocks, quantizing the coefficients, entropy encoding syntax elements indicating the quantized coefficients, and including the entropy-encoded syntax elements in the bitstream.

图16B为根据本发明的实例说明使用帧间预测解码CU的视频解码器30的实例操作的流程图。在图16B的实例中,视频解码器30可产生用于当前CU的当前PU的合并候选者列表(220)。根据本发明的一或多个实例,视频解码器30可产生合并候选者列表,使得合并候选者列表包含基于当前PU的子PU的运动信息的时间视图间合并候选者。在一些实例中,当前PU可为深度PU,且视频解码器30可产生合并候选者列表,使得合并候选者列表包含基于当前深度PU的子PU的运动信息的纹理合并候选者。此外,在一些实例中,视频解码器30可执行图17的操作来产生用于当前PU的合并候选者列表。FIG16B is a flowchart illustrating an example operation of video decoder 30 for decoding a CU using inter-frame prediction according to an example of this disclosure. In the example of FIG16B , video decoder 30 may generate a merge candidate list for a current PU of the current CU (220). According to one or more examples of this disclosure, video decoder 30 may generate the merge candidate list such that the merge candidate list includes temporal inter-view merge candidates based on motion information of sub-PUs of the current PU. In some examples, the current PU may be a depth PU, and video decoder 30 may generate the merge candidate list such that the merge candidate list includes texture merge candidates based on motion information of sub-PUs of the current depth PU. Furthermore, in some examples, video decoder 30 may perform the operations of FIG17 to generate the merge candidate list for the current PU.

在产生用于当前PU的合并候选者列表之后,视频解码器30可从合并候选者列表确定所选择合并候选者(222)。在一些实例中,视频解码器30可基于位流中用信号通知的合并候选者索引确定所选择合并候选者。此外,视频解码器30可使用所选择合并候选者的运动参数(例如,运动向量及参考索引)以确定当前PU的预测性块(224)。举例来说,视频解码器30可使用所选择合并候选者的运动参数以确定当前PU的明度预测性块、Cb预测性块及Cr预测性块。After generating the merge candidate list for the current PU, video decoder 30 may determine a selected merge candidate from the merge candidate list (222). In some examples, video decoder 30 may determine the selected merge candidate based on a merge candidate index signaled in the bitstream. Furthermore, video decoder 30 may use motion parameters (e.g., motion vector and reference index) of the selected merge candidate to determine the predictive blocks for the current PU (224). For example, video decoder 30 may use the motion parameters of the selected merge candidate to determine the luma predictive blocks, Cb predictive blocks, and Cr predictive blocks for the current PU.

如果所选择合并候选者为如描述于本发明的实例中的使用子PU建构的IPMVC或MVI候选者,则IPMVC或MVI候选者可指定当前PU的每一子PU的单独运动参数集合(例如,一或多个运动向量的集合及一或多个参考索引的集合)。当视频解码器30正确定当前PU的预测性块时,视频解码器30可使用当前PU的子PU的运动参数以确定子PU的预测性块。视频解码器30可通过汇编当前PU的子PU的预测性块而确定当前PU的预测性块。If the selected merge candidate is an IPMVC or MVI candidate constructed using sub-PUs, as described in the examples of this disclosure, the IPMVC or MVI candidate may specify a separate set of motion parameters (e.g., a set of one or more motion vectors and a set of one or more reference indices) for each sub-PU of the current PU. When video decoder 30 is determining the predictive block of the current PU, video decoder 30 may use the motion parameters of the sub-PUs of the current PU to determine the predictive blocks of the sub-PUs. Video decoder 30 may determine the predictive block of the current PU by assembling the predictive blocks of the sub-PUs of the current PU.

视频解码器30可接着确定当前CU中是否存在任何剩余PU(226)。如果当前CU中存在一或多个剩余PU(226的“是”),则视频解码器30可通过将当前CU的另一PU作为当前PU重复动作220到226。以此方式,视频解码器30可针对当前CU的每一PU重复动作220到226。Video decoder 30 may then determine whether there are any remaining PUs in the current CU (226). If there are one or more remaining PUs in the current CU ("YES" of 226), video decoder 30 may repeat actions 220-226 by using another PU of the current CU as the current PU. In this manner, video decoder 30 may repeat actions 220-226 for each PU of the current CU.

当并不存在当前CU的剩余PU(226的“否”)时,视频解码器30可确定当前CU的残余数据(228)。在一些实例中,视频解码器30可与确定当前CU的PU的运动参数并行地确定残余数据。在一些实例中,视频解码器30可使用ARP以确定当前CU的残余数据。另外,视频解码器30可基于当前CU的PU的预测性块及当前CU的残余数据重建构当前CU的译码块(230)。When there are no remaining PUs for the current CU ("NO" of 226), video decoder 30 may determine residual data for the current CU (228). In some examples, video decoder 30 may determine the residual data in parallel with determining the motion parameters of the PUs of the current CU. In some examples, video decoder 30 may use ARP to determine the residual data for the current CU. In addition, video decoder 30 may reconstruct the coding blocks of the current CU based on the predictive blocks of the PUs of the current CU and the residual data of the current CU (230).

图17为根据本发明的实例说明建构用于当前视图分量中的当前PU的合并候选者列表的视频译码器的实例操作的流程图。在图17的实例中,视频译码器(例如,视频编码器20或视频解码器30)可确定空间合并候选者(250)。空间合并候选者可包含指定覆盖图3中的位置A0、A1、B0、B1及B2的PU的运动参数的合并候选者。在一些实例中,视频译码器可通过执行MV-HEVC测试模型4的子条款G.8.5.2.1.2中所描述的操作确定空间合并候选者。此外,在图17的实例中,视频译码器可确定时间合并候选者(252)。时间合并候选者可指定在不同于当前视图分量的时间实例中的参考视图分量的PU的运动参数。在一些实例中,视频译码器可通过执行3D-HEVC测试模型4的子条款H.8.5.2.1.7中所描述的操作确定时间合并候选者。FIG17 is a flowchart illustrating example operation of a video coder to construct a merge candidate list for a current PU in a current view component according to an example of this disclosure. In the example of FIG17 , a video coder (e.g., video encoder 20 or video decoder 30) may determine spatial merge candidates (250). The spatial merge candidates may include merge candidates that specify motion parameters of PUs covering positions A0 , A1 , B0 , B1 , and B2 in FIG3 . In some examples, the video coder may determine the spatial merge candidates by performing the operations described in subclause G.8.5.2.1.2 of MV-HEVC Test Model 4. Furthermore, in the example of FIG17 , the video coder may determine temporal merge candidates (252). The temporal merge candidates may specify motion parameters of a PU of a reference view component in a time instance different from the current view component. In some examples, the video coder may determine the temporal merge candidates by performing the operations described in subclause H.8.5.2.1.7 of 3D-HEVC Test Model 4.

另外,视频译码器可确定IPMVC及IDMVC(254)。根据本发明的实例,视频译码器可使用子PU层级视图间运动预测技术产生IPMVC。之后,IPMVC可指定当前PU的每一子PU的运动参数。在一些实例中,视频译码器可执行图19的操作以确定IPMVC。IDMVC可指定当前PU的视差向量。在一些实例中,视频译码器仅当当前层的视图间运动预测旗标(例如,iv_mv_pred_flag)指示针对当前层启用视图间运动预测时确定IPMVC及IDMVC。当前层可为当前视图分量属于的层。In addition, the video coder may determine the IPMVC and the IDMVC (254). According to an example of the present disclosure, the video coder may generate the IPMVC using a sub-PU level inter-view motion prediction technique. The IPMVC may then specify motion parameters for each sub-PU of the current PU. In some examples, the video coder may perform the operations of FIG. 19 to determine the IPMVC. The IDMVC may specify a disparity vector for the current PU. In some examples, the video coder determines the IPMVC and the IDMVC only when an inter-view motion prediction flag (e.g., iv_mv_pred_flag) of the current layer indicates that inter-view motion prediction is enabled for the current layer. The current layer may be the layer to which the current view component belongs.

此外,在图17的实例中,视频译码器可确定VSP合并候选者(256)。在一些实例中,视频译码器可通过执行3D-HEVC测试模型4的子条款H.8.5.2.1.12中所描述的操作确定VSP合并候选者。在一些实例中,视频译码器仅当当前层的视图合成预测旗标指示针对当前层启用视图合成预测时确定VSP合并候选者。17 , the video coder may determine a VSP merge candidate ( 256 ). In some examples, the video coder may determine the VSP merge candidate by performing operations described in subclause H.8.5.2.1.12 of 3D-HEVC Test Model 4. In some examples, the video coder determines the VSP merge candidate only when the view synthesis prediction flag of the current layer indicates that view synthesis prediction is enabled for the current layer.

另外,视频译码器可确定当前视图分量是否为深度视图分量(258)。响应于确定当前视图分量为深度视图分量(258的“是”),视频译码器可确定纹理合并候选者(260)。纹理合并候选者可指定对应于当前(深度)视图分量的纹理视图分量中的一或多个PU的运动信息。根据本发明的一或多个实例,视频译码器可使用子PU层级运动预测技术产生纹理合并候选者。之后,纹理合并候选者可指定当前PU的每一子PU的运动参数。在一些实例中,视频译码器可执行图19的操作以确定纹理合并候选者。视频译码器可接着确定纹理合并候选者是否可用(262)。响应于确定纹理合并候选者可用(262的“是”),视频译码器可将纹理合并候选者插入到合并候选者列表(264)。In addition, the video coder may determine whether the current view component is a depth view component (258). In response to determining that the current view component is a depth view component ("YES" of 258), the video coder may determine a texture merge candidate (260). The texture merge candidate may specify motion information of one or more PUs in the texture view component corresponding to the current (depth) view component. According to one or more examples of the present disclosure, the video coder may generate the texture merge candidate using a sub-PU level motion prediction technique. Thereafter, the texture merge candidate may specify motion parameters for each sub-PU of the current PU. In some examples, the video coder may perform the operations of FIG. 19 to determine the texture merge candidate. The video coder may then determine whether the texture merge candidate is available (262). In response to determining that the texture merge candidate is available ("YES" of 262), the video coder may insert the texture merge candidate into the merge candidate list (264).

响应于确定当前图片并非深度图片(258的“否”)、响应于确定纹理合并候选者并不可用(262的“否”)或在将纹理合并候选者插入到合并候选者列表之后,视频译码器可确定IPMVC是否可用(266)。当视频译码器不能够确定IPMVC时(例如当当前PU在基础视图中时),IPMVC可不可用。响应于确定IPMVC可用(266的“是”),视频译码器可将IPMVC插入到合并候选者列表(268)。In response to determining that the current picture is not a depth picture ("No" of 258), in response to determining that a texture merge candidate is not available ("No" of 262), or after inserting a texture merge candidate into the merge candidate list, the video coder may determine whether an IPMVC is available (266). When the video coder is unable to determine the IPMVC (e.g., when the current PU is in the base view), the IPMVC may not be available. In response to determining that the IPMVC is available ("Yes" of 266), the video coder may insert the IPMVC into the merge candidate list (268).

响应于确定IPMVC不可用(266的“否”)或在将IPMVC插入到合并候选者列表之后,视频译码器可确定位置A1处的空间合并候选者(即,A1空间合并候选者)是否可用(270)。当覆盖与空间合并候选者相关联的位置(例如,位置A0、A1、B0、B1或B2)的PU以帧内预测模式译码或在当前切片或图片边界外部时,例如A1空间合并候选者的空间合并候选者可为不可用。响应于确定A1空间合并候选者可用(270的“是”),视频译码器可确定A1空间合并候选者的运动向量及参考索引是否匹配IPMVC的代表性运动向量及代表性参考索引(272)。响应于确定A1空间合并候选者的运动向量及参考索引并不匹配IPMVC的代表性运动向量及代表性参考索引(272的“否”),视频译码器可将A1空间合并候选者插入到合并候选者列表(274)。In response to determining that the IPMVC is not available ("No" of 266), or after inserting the IPMVC into the merge candidate list, the video coder may determine whether a spatial merge candidate at position A1 (i.e., the A1 spatial merge candidate) is available (270). A spatial merge candidate, such as the A1 spatial merge candidate, may be unavailable when the PU covering the position associated with the spatial merge candidate (e.g., position A0, A1, B0, B1, or B2) is coded using intra prediction or lies outside the current slice or picture boundary. In response to determining that the A1 spatial merge candidate is available ("Yes" of 270), the video coder may determine whether the motion vector and reference index of the A1 spatial merge candidate match the representative motion vector and representative reference index of the IPMVC (272). In response to determining that the motion vector and reference index of the A1 spatial merge candidate do not match the representative motion vector and representative reference index of the IPMVC ("No" of 272), the video coder may insert the A1 spatial merge candidate into the merge candidate list (274).

如上文所指示,视频译码器可使用子PU层级运动预测技术产生IPMVC及/或纹理合并候选者。之后,IPMVC及/或纹理合并候选者可指定多个运动向量及多个参考索引。因此,视频译码器可确定A1空间合并候选者的运动向量是否匹配IPMVC及/或纹理合并候选者的代表性运动向量,及A1空间合并候选者的参考索引是否匹配IPMVC及/或纹理合并候选者的代表性参考索引。IPMVC的代表性运动向量及代表性参考索引可在本文中被称作“PU层级IPMVC”。纹理合并候选者的代表性运动向量及代表性参考索引可在本文中被称作“PU层级运动参数继承(MPI)候选者”。视频译码器可以不同方式确定PU层级IPMVC及PU层级MPI候选者。本发明中在其它地方描述视频译码器可如何确定PU层级IPMVC及PU层级MPI候选者的实例。As indicated above, the video coder may use sub-PU level motion prediction techniques to generate IPMVC and/or texture merge candidates. The IPMVC and/or texture merge candidates may then specify multiple motion vectors and multiple reference indices. Thus, the video coder may determine whether the motion vector of the A1 spatial merge candidate matches the representative motion vector of the IPMVC and/or texture merge candidate, and whether the reference index of the A1 spatial merge candidate matches the representative reference index of the IPMVC and/or texture merge candidate. The representative motion vector and representative reference index of the IPMVC may be referred to herein as a “PU-level IPMVC”. The representative motion vector and representative reference index of the texture merge candidate may be referred to herein as a “PU-level motion parameter inheritance (MPI) candidate”. The video coder may determine the PU-level IPMVC and PU-level MPI candidates in different ways. Examples of how the video coder may determine the PU-level IPMVC and PU-level MPI candidates are described elsewhere in this disclosure.
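The pruning check described above compares a spatial candidate's motion to the representative (PU-level) motion of the sub-PU candidate, and inserts the spatial candidate only on a mismatch. A sketch, with illustrative names and a simplified candidate encoding:

```python
def maybe_insert_spatial_candidate(merge_list, spatial_cand, pu_level_cand):
    """Append spatial_cand unless its motion vector and reference index
    match the representative PU-level candidate (a sketch of the A1/B1
    pruning described above; not the normative process)."""
    if spatial_cand is None:
        return merge_list  # spatial candidate unavailable
    if pu_level_cand is not None and spatial_cand == pu_level_cand:
        return merge_list  # pruned: duplicate of the sub-PU candidate
    merge_list.append(spatial_cand)
    return merge_list
```

The same comparison serves for both the PU-level IPMVC and the PU-level MPI candidate, since each contributes a single representative motion vector and reference index.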

响应于确定A1空间合并候选者不可用(270的“否”)、响应于确定A1空间合并候选者的运动向量及参考索引匹配IPMVC的代表性运动向量及代表性参考索引(272的“是”)或在将A1空间合并候选者插入到合并候选者列表之后,视频译码器可确定位置B1处的空间合并候选者(即,B1空间合并候选者)是否可用(276)。响应于确定B1空间合并候选者可用(276的“是”),视频译码器可确定B1空间合并候选者的运动向量及参考索引是否匹配IPMVC的代表性运动向量及代表性参考索引(278)。响应于确定B1空间合并候选者的运动向量及参考索引并不匹配IPMVC的代表性运动向量及代表性参考索引(278的“否”),视频译码器可将B1空间合并候选者包含于合并候选者列表中(280)。In response to determining that the A1 spatial merge candidate is not available ("No" of 270), in response to determining that the motion vector and reference index of the A1 spatial merge candidate match the representative motion vector and representative reference index of the IPMVC ("Yes" of 272), or after inserting the A1 spatial merge candidate into the merge candidate list, the video coder may determine whether a spatial merge candidate at position B1 (i.e., the B1 spatial merge candidate) is available (276). In response to determining that the B1 spatial merge candidate is available ("Yes" of 276), the video coder may determine whether the motion vector and reference index of the B1 spatial merge candidate match the representative motion vector and representative reference index of the IPMVC (278). In response to determining that the motion vector and reference index of the B1 spatial merge candidate do not match the representative motion vector and representative reference index of the IPMVC ("No" of 278), the video coder may include the B1 spatial merge candidate in the merge candidate list (280).

In response to determining that the B1 spatial merge candidate is not available ("NO" of 276), in response to determining that the motion vector and reference index of the B1 spatial merge candidate match the representative motion vector and representative reference index of the IPMVC ("YES" of 278), or after inserting the B1 spatial merge candidate into the merge candidate list, the video coder may determine whether a spatial merge candidate at position B0 (i.e., the B0 spatial merge candidate) is available (282). In response to determining that the B0 spatial merge candidate is available ("YES" of 282), the video coder may insert the B0 spatial merge candidate into the merge candidate list (284).

As indicated above, the video coder may determine the representative motion vector and representative reference index of the IPMVC in different ways. In one example, the video coder may determine a center sub-PU from among the sub-PUs of the current PU. In this example, the center sub-PU is the sub-PU closest to the center pixel of the luma prediction block of the current PU. Because the height and/or width of a prediction block may be an even number of samples, the "center" pixel of a prediction block may be a pixel adjacent to the true center of the prediction block. Furthermore, in this example, the video coder may then determine an inter-view reference block for the center sub-PU by adding the disparity vector of the current PU to the coordinates of the center of the luma prediction block of the center sub-PU. If the inter-view reference block of the center sub-PU is coded using motion-compensated prediction (i.e., the inter-view reference block of the center sub-PU has one or more motion vectors and reference indices), the video coder may set the motion information of the PU-level IPMVC to the motion information of the inter-view reference block of the center sub-PU. Thus, the PU-level IPMVC provided by the center sub-PU may be used to prune this sub-PU candidate as well as other candidates (e.g., the spatially neighboring candidates A1 and B1). For example, if a regular candidate (e.g., the A1 spatial merge candidate or the B1 spatial merge candidate) is equal to the candidate generated from the center sub-PU, that regular candidate is not added to the merge candidate list.
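The pruning rule just described can be sketched as follows. This is a minimal illustration, not the 3D-HEVC reference software: the dictionary representation of a candidate and the helper name are assumptions made for the example.

```python
# Sketch of the pruning check described above: a regular spatial candidate
# (e.g., A1 or B1) is added to the merge list only when its motion vector
# or reference index differs from the representative (PU-level) IPMVC.
def prune_against_pu_level_candidate(candidate, pu_level_candidate):
    """Return True when the candidate should be kept (i.e., added)."""
    if pu_level_candidate is None:  # no PU-level IPMVC available -> keep
        return True
    return (candidate["mv"] != pu_level_candidate["mv"]
            or candidate["ref_idx"] != pu_level_candidate["ref_idx"])

ipmvc = {"mv": (3, -1), "ref_idx": 0}
a1 = {"mv": (3, -1), "ref_idx": 0}  # identical to the IPMVC -> pruned
b1 = {"mv": (2, -1), "ref_idx": 0}  # differs -> kept
print(prune_against_pu_level_candidate(a1, ipmvc))  # False
print(prune_against_pu_level_candidate(b1, ipmvc))  # True
```

A matching check (motion vector and reference index both equal) suppresses the candidate; any difference lets it through.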

In response to determining that the B0 spatial merge candidate is not available ("NO" of 282), or after inserting the B0 spatial merge candidate into the merge candidate list, the video coder may determine whether the IDMVC is available and whether the motion vector and reference index of the IDMVC differ from the motion vectors and reference indices of the A1 and B1 spatial merge candidates (286). In response to determining that the IDMVC is available and that the motion vector and reference index of the IDMVC differ from the motion vectors and reference indices of the A1 and B1 spatial merge candidates ("YES" of 286), the video coder may insert the IDMVC into the merge candidate list (288).

In response to determining that the IDMVC is not available or that the motion vector and reference index of the IDMVC are not different from the motion vectors and reference indices of the A1 or B1 spatial merge candidates ("NO" of 286), or after inserting the IDMVC into the merge candidate list, the video coder may perform the portion of the reference picture list construction operation shown in FIG. 18 (denoted "A" in FIG. 17).

FIG. 18 is a flowchart illustrating a continuation of the reference picture list construction operation of FIG. 17, in accordance with an example of this disclosure. In the example of FIG. 18, the video coder may determine whether a VSP merge candidate is available (300). In response to determining that the VSP merge candidate is available ("YES" of 300), the video coder may insert the VSP merge candidate into the merge candidate list (302).

In response to determining that the VSP merge candidate is not available ("NO" of 300), or after inserting the VSP merge candidate into the merge candidate list, the video coder may determine whether a spatial merge candidate at position A0 (i.e., the A0 spatial merge candidate) is available (304). In response to determining that the A0 spatial merge candidate is available ("YES" of 304), the video coder may insert the A0 spatial merge candidate into the merge candidate list (306).

Furthermore, in response to determining that the A0 spatial merge candidate is not available ("NO" of 304), or after inserting the A0 spatial merge candidate into the merge candidate list, the video coder may determine whether a spatial merge candidate at position B2 (i.e., the B2 spatial merge candidate) is available (308). In response to determining that the B2 spatial merge candidate is available ("YES" of 308), the video coder may insert the B2 spatial merge candidate into the merge candidate list (310).

In response to determining that the B2 spatial merge candidate is not available ("NO" of 308), or after inserting the B2 spatial merge candidate into the merge candidate list, the video coder may determine whether a temporal merge candidate is available (312). In response to determining that the temporal merge candidate is available ("YES" of 312), the video coder may insert the temporal merge candidate into the merge candidate list (314).

Furthermore, in response to determining that the temporal merge candidate is not available ("NO" of 312), or after inserting the temporal merge candidate into the merge candidate list, the video coder may determine whether the current slice is a B slice (316). In response to determining that the current slice is a B slice ("YES" of 316), the video coder may derive combined bi-predictive merge candidates (318). In some examples, the video coder may derive the combined bi-predictive merge candidates by performing the operations described in sub-clause H.8.5.2.1.3 of 3D-HEVC Test Model 4.

In response to determining that the current slice is not a B slice ("NO" of 316), or after deriving the combined bi-predictive merge candidates, the video coder may derive zero motion vector merge candidates (320). A zero motion vector merge candidate may specify a motion vector having a horizontal component and a vertical component equal to 0. In some examples, the video coder may derive the zero motion vector candidates by performing the operations described in sub-clause 8.5.2.1.4 of 3D-HEVC Test Model 4.
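Taken together, the flowchart steps above describe an overall insertion order for the merge candidate list. The sketch below models only that ordering; the availability flags are assumed inputs into which the pruning comparisons (e.g., A1/B1 against the PU-level IPMVC, IDMVC against A1/B1) have already been folded, and the candidate names and list-size cap are illustrative.

```python
# Simplified sketch of the merge-candidate insertion order walked through
# in the flowchart text above. Pruning is assumed to be reflected in the
# `available` flags; combined bi-predictive candidates apply to B slices.
INSERTION_ORDER = ["IPMVC", "A1", "B1", "B0", "IDMVC", "VSP",
                   "A0", "B2", "temporal", "combined_bi", "zero_mv"]

def build_merge_list(available, is_b_slice, max_num_candidates=6):
    merge_list = []
    for name in INSERTION_ORDER:
        if len(merge_list) == max_num_candidates:
            break
        if name == "combined_bi" and not is_b_slice:
            continue  # combined bi-predictive candidates only for B slices
        # zero-MV candidates are always derivable as a fallback
        if available.get(name, name == "zero_mv"):
            merge_list.append(name)
    return merge_list

print(build_merge_list({"IPMVC": True, "A1": True, "temporal": True},
                       is_b_slice=False))
# ['IPMVC', 'A1', 'temporal', 'zero_mv']
```

The zero-MV fallback at the end guarantees the list can always be populated, mirroring the unconditional final derivation step (320).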

FIG. 19 is a flowchart illustrating an operation of a video coder to determine an IPMVC or a texture merge candidate, in accordance with an example of this disclosure. In the example of FIG. 19, the video coder (e.g., video encoder 20 or video decoder 30) may partition the current PU into multiple sub-PUs (348). In different examples, the block size of each of the sub-PUs may be 4x4, 8x8, 16x16, or another size.

Furthermore, in the example of FIG. 19, the video coder may set a default motion vector and a default reference index (350). In different examples, the video coder may set the default motion vector and the default reference index in different ways. In some examples, the default motion parameters (i.e., the default motion vector and the default reference index) are equal to the PU-level motion vector candidate. Furthermore, in some examples, the video coder may determine the default motion information differently depending on whether the video coder is determining an IPMVC or a texture merge candidate.

In some examples where the video coder is determining an IPMVC, the video coder may derive a PU-level IPMVC from the center position of the corresponding region of the current PU, as defined in 3D-HEVC Test Model 4. Furthermore, in this example, the video coder may set the default motion vector and default reference index equal to the PU-level IPMVC.

In another example where the video coder is determining an IPMVC, the video coder may set the default motion parameters to the motion parameters of the inter-view reference block that covers the pixel at coordinates (xRef, yRef) of the reference picture in the reference view. The video coder may determine the coordinates (xRef, yRef) as follows:

xRef = Clip3(0, PicWidthInSamplesL - 1, xP + ((nPSW[[-1]]) >> 1) + ((mvDisp[0] + 2) >> 2))

yRef = Clip3(0, PicHeightInSamplesL - 1, yP + ((nPSH[[-1]]) >> 1) + ((mvDisp[1] + 2) >> 2))

In the above equations, (xP, yP) indicates the coordinates of the top-left sample of the current PU, mvDisp is the disparity vector, nPSW x nPSH is the size of the current PU, and PicWidthInSamplesL and PicHeightInSamplesL define the resolution of the picture in the reference view (the same as that of the current view). In the above equations, the text in double square brackets indicates text deleted from equations H-124 and H-125 in sub-clause H.8.5.2.1.10 of 3D-HEVC Test Model 4.

As discussed above, sub-clause H.8.5.2.1.10 of 3D-HEVC Test Model 4 describes the derivation process for a temporal inter-view motion vector candidate. Furthermore, as discussed above, equations H-124 and H-125 are used in sub-clause H.8.5.2.1.10 of 3D-HEVC Test Model 4 to determine the luma location of a reference block in a reference picture. In contrast to equations H-124 and H-125 of 3D-HEVC Test Model 4, the equations of this example do not subtract 1 from nPSW and nPSH. As a result, xRef and yRef indicate the coordinates of the pixel immediately below and to the right of the true center of the prediction block of the current PU. Because the width and height, in samples, of the prediction block of the current PU may be even numbers, there may be no sample exactly at the true center of the prediction block of the current PU. A coding gain may result when xRef and yRef indicate the coordinates of the pixel immediately below and to the right of the true center of the prediction block of the current PU, relative to the case where xRef and yRef indicate the coordinates of the pixel immediately above and to the left of the true center. In other examples, the video coder may use other blocks covering different pixels (xRef, yRef) to derive the default motion vector and reference index.
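The modified equations can be sketched as below. This is a minimal illustration, assuming mvDisp is expressed in quarter-sample units (hence the (v + 2) >> 2 rounding to full samples) and with Clip3 behaving as defined in the HEVC specification; the function names are ours, not from the test model.

```python
def clip3(lo, hi, v):
    """Clip3 as in HEVC: clamp v into [lo, hi]."""
    return lo if v < lo else hi if v > hi else v

def reference_sample_position(xP, yP, nPSW, nPSH, mv_disp, pic_w, pic_h):
    """(xRef, yRef) without the -1 terms of equations H-124/H-125, so the
    position lands just below-right of the prediction block's true center."""
    x_ref = clip3(0, pic_w - 1, xP + (nPSW >> 1) + ((mv_disp[0] + 2) >> 2))
    y_ref = clip3(0, pic_h - 1, yP + (nPSH >> 1) + ((mv_disp[1] + 2) >> 2))
    return x_ref, y_ref

# 16x16 PU at (32, 32); disparity (6, 0) quarter-samples rounds to 2 samples.
print(reference_sample_position(32, 32, 16, 16, (6, 0), 1024, 768))  # (42, 40)
```

The clipping keeps the derived position inside the reference picture even for PUs at the picture border.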

In another example of how the video coder may set the default motion parameters when determining an IPMVC, before setting the motion parameters of the sub-PUs of the current PU, the video coder may select, from among all the sub-PUs of the current PU, the sub-PU closest to the center pixel of the luma prediction block of the current PU. The video coder may then determine a reference block in the reference view component for the selected sub-PU. In other words, the video coder may determine the inter-view reference block of the selected sub-PU. When the inter-view reference block of the selected sub-PU is coded using motion-compensated prediction, the video coder may use the inter-view reference block of the selected sub-PU to derive the default motion vector and reference index. In other words, the video coder may set the default motion parameters to the motion parameters of the reference block of the sub-PU closest to the center pixel of the luma prediction block of the current PU.

In this way, the video coder may determine a reference block in the reference picture, the reference block having the same size as the prediction block of the current PU. In addition, the video coder may determine, from among the sub-PUs of the reference block, the sub-PU closest to the center pixel of the reference block. The video coder may derive the default motion parameters from the motion parameters of the determined sub-PU of the reference block.

The video coder may determine the sub-PU closest to the center pixel of the reference block in different ways. For example, assuming the sub-PU size is 2^u x 2^u, the video coder may select the sub-PU having the following coordinates relative to the top-left sample of the luma prediction block of the current PU: (((nPSW >> (u+1)) - 1) << u, ((nPSH >> (u+1)) - 1) << u). In other words, the sub-PU closest to the center pixel of the reference block contains the pixel having the following coordinates relative to the top-left sample of the reference block: (((nPSW >> (u+1)) - 1) << u, ((nPSH >> (u+1)) - 1) << u). Alternatively, the video coder may select the sub-PU having the following coordinates relative to the top-left sample of the luma prediction block of the current PU: ((nPSW >> (u+1)) << u, (nPSH >> (u+1)) << u). In other words, the sub-PU closest to the center pixel of the reference block contains the pixel having the following coordinates relative to the top-left sample of the reference block: ((nPSW >> (u+1)) << u, (nPSH >> (u+1)) << u). In these equations, nPSW and nPSH are the width and height, respectively, of the luma prediction block of the current PU. Thus, in one example, the video coder may determine, from among the multiple sub-PUs of the current PU, the sub-PU closest to the center pixel of the luma prediction block of the current PU. In this example, the video coder may derive the default motion parameters from the inter-view reference block of the determined sub-PU.
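The two alternative center sub-PU formulas above can be sketched as follows, assuming a sub-PU size of 2^u x 2^u and a PU prediction block of nPSW x nPSH samples; the helper names are ours.

```python
# Top-left corner of the center sub-PU, per the two alternatives above.
def center_sub_pu_top_left(nPSW, nPSH, u):
    # Variant 1: the sub-PU just above-left of the block center.
    return ((((nPSW >> (u + 1)) - 1) << u),
            (((nPSH >> (u + 1)) - 1) << u))

def center_sub_pu_bottom_right(nPSW, nPSH, u):
    # Variant 2: the sub-PU just below-right of the block center.
    return (((nPSW >> (u + 1)) << u),
            ((nPSH >> (u + 1)) << u))

# 16x16 PU with 8x8 sub-PUs (u = 3): the two variants pick the top-left
# and bottom-right of the four sub-PUs surrounding the center.
print(center_sub_pu_top_left(16, 16, 3))      # (0, 0)
print(center_sub_pu_bottom_right(16, 16, 3))  # (8, 8)
```

Because the block center falls on a sub-PU boundary when nPSW and nPSH are multiples of the sub-PU size, the two variants differ by exactly one sub-PU in each dimension.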

In other examples where the video coder is determining an IPMVC, the default motion vector is a zero motion vector. Furthermore, in some examples, the default reference index is equal to the first available temporal reference picture in the current reference picture list (i.e., a reference picture in a time instance different from that of the current picture), or the default reference index may be equal to 0. In other words, the default motion parameters may include a default motion vector and a default reference index. The video coder may set the default motion vector to a zero motion vector, and may set the default reference index to 0 or to the first available temporal reference picture in the current reference picture list.

For example, if the current slice is a P slice, the default reference index may indicate the first available temporal reference picture in RefPicList0 of the current picture (i.e., the temporal reference picture having the smallest reference index in RefPicList0 of the current picture). Furthermore, if the current slice is a B slice and inter prediction from RefPicList0 is enabled but inter prediction from RefPicList1 of the current picture is not enabled, the default reference index may indicate the first available temporal reference picture in RefPicList0 of the current picture. If the current slice is a B slice and inter prediction from RefPicList1 of the current picture is enabled but inter prediction from RefPicList0 of the current picture is not enabled, the default reference index may indicate the first available temporal reference picture in RefPicList1 of the current picture (i.e., the temporal reference picture having the smallest reference index in RefPicList1 of the current picture). If the current slice is a B slice and inter prediction from both RefPicList0 and RefPicList1 of the current picture is enabled, the default RefPicList0 reference index may indicate the first available temporal reference picture in RefPicList0 of the current picture, and the default RefPicList1 reference index may indicate the first available temporal reference picture in RefPicList1 of the current picture.
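The slice-type cases above can be sketched as a small selection function. The inputs `first_temporal_l0` / `first_temporal_l1` stand in for "index of the first available temporal reference picture" in RefPicList0 / RefPicList1 and are assumed to be precomputed; the function name and tuple return shape are illustrative.

```python
# Sketch of the default-reference-index selection described above.
# Returns (default_ref_idx_list0, default_ref_idx_list1); None marks a
# list from which inter prediction is not used.
def default_ref_indices(slice_type, l0_enabled, l1_enabled,
                        first_temporal_l0, first_temporal_l1):
    if slice_type == "P":
        return first_temporal_l0, None
    # B slice cases:
    if l0_enabled and not l1_enabled:
        return first_temporal_l0, None
    if l1_enabled and not l0_enabled:
        return None, first_temporal_l1
    return first_temporal_l0, first_temporal_l1

print(default_ref_indices("P", True, False, 0, None))  # (0, None)
print(default_ref_indices("B", True, True, 1, 0))      # (1, 0)
```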

Furthermore, in some of the examples provided above for determining the default motion parameters when the video coder is determining an IPMVC, the video coder may set the default motion parameters to the motion parameters of the sub-PU closest to the center pixel of the luma prediction block of the current PU. However, in these and other examples, the default motion parameters may remain unavailable. For example, if the inter-view reference block corresponding to the sub-PU closest to the center pixel of the luma prediction block of the current PU is intra predicted, the default motion parameters may remain unavailable. In that case, in some examples, when the default motion parameters are unavailable and the inter-view reference block of the first sub-PU is coded using motion-compensated prediction (i.e., the inter-view reference block of the first sub-PU has available motion information), the video coder may set the default motion parameters to the motion parameters of the first sub-PU. In this example, the first sub-PU may be the first sub-PU of the current PU in the raster scan order of the sub-PUs of the current PU. Thus, when determining the default motion parameters, in response to determining that the first sub-PU in the raster scan order of the multiple sub-PUs has available motion parameters, the video coder may set the default motion parameters to the available motion parameters of the first sub-PU in the raster scan order of the multiple sub-PUs.

Otherwise, when the default motion information is unavailable (e.g., when the motion parameters of the inter-view reference block of the first sub-PU are unavailable), if the first sub-PU of the current sub-PU row has available motion parameters, the video coder may set the default motion information to the motion parameters of the first sub-PU of the current sub-PU row. When the default motion parameters are still unavailable (e.g., when the inter-view reference block of the first sub-PU of the current sub-PU row is unavailable), the video coder may set the default motion vector to a zero motion vector and may set the default reference index equal to the first available temporal reference picture in the current reference picture list. In this way, when the video coder is determining the default motion parameters, in response to determining that the first sub-PU of the sub-PU row containing the respective sub-PU has available motion parameters, the video coder may set the default motion parameters to the available motion parameters of the first sub-PU of the sub-PU row containing the respective sub-PU.
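The fallback chain described in the last two paragraphs can be sketched as follows. A `None` input models "motion parameters unavailable" (e.g., an intra-predicted inter-view reference block); the dictionary representation and function name are assumptions for the example.

```python
# Sketch of the default-motion fallback chain described above:
# center sub-PU -> first sub-PU in raster order -> first sub-PU of the
# current sub-PU row -> zero MV with a default reference index.
ZERO_MV_DEFAULT = {"mv": (0, 0), "ref_idx": 0}

def default_motion(center, first_raster, first_in_row):
    for candidate in (center, first_raster, first_in_row):
        if candidate is not None:  # first source with available motion wins
            return candidate
    return ZERO_MV_DEFAULT

print(default_motion(None, None, {"mv": (1, 2), "ref_idx": 1}))
# {'mv': (1, 2), 'ref_idx': 1}
print(default_motion(None, None, None) == ZERO_MV_DEFAULT)  # True
```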

Furthermore, as described above with respect to the example of FIG. 17, the video coder may use sub-PU-level motion prediction techniques to determine a texture merge candidate. In such examples, the current PU may be referred to herein as the "current depth PU." The video coder may perform the operation of FIG. 19 to determine the texture merge candidate. In that case, when the video coder is determining the texture merge candidate, the video coder may partition the current depth PU into several sub-PUs, and each sub-PU uses the motion information of its co-located texture block for motion compensation. Furthermore, when the video coder is determining the texture merge candidate, if the corresponding texture block of a sub-PU is intra coded, or if the picture in the same access unit as the reference picture of the corresponding texture block is not included in the reference picture list of the current depth PU, the video coder may assign the default motion vector and reference index to that sub-PU. Thus, in general, the video coder may determine that a co-located texture block has available motion information when the co-located texture block is not intra coded and the reference picture used by the co-located texture block is in the reference picture list of the current depth picture. Conversely, when the co-located texture block is intra coded, or the co-located texture block uses a reference picture that is not in the reference picture list of the current depth picture, the motion parameters of the co-located texture block may be unavailable.
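The availability rule just stated can be written as a one-line predicate. Representing reference pictures by POC values and the helper name are assumptions made for this sketch.

```python
# A co-located texture block's motion is usable only when the block is not
# intra coded and its reference picture also appears in the current depth
# picture's reference picture list (pictures identified by POC here).
def texture_motion_available(is_intra_coded, ref_poc, depth_ref_list_pocs):
    return (not is_intra_coded) and ref_poc in depth_ref_list_pocs

print(texture_motion_available(False, 8, [0, 8, 16]))  # True
print(texture_motion_available(True, 8, [0, 8, 16]))   # False
print(texture_motion_available(False, 4, [0, 8, 16]))  # False
```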

As indicated above, the video coder may determine the default motion information differently depending on whether the video coder is determining an IPMVC or a texture merge candidate. For example, when the video coder is determining a texture merge candidate, the video coder may determine the default motion vector and the default reference index according to one of the following examples, or other examples. In one example, the co-located texture block may be co-located with the current depth PU and may have the same size as the current depth PU. In this example, the video coder sets the default motion vector and the default reference index to the motion information of the block covering the center pixel of the co-located texture block.

Thus, in some examples where the current picture is a depth view component and the reference picture is a texture view component in the same view and access unit as the current picture, the video coder may set the default motion parameters to the motion parameters associated with the block covering a pixel of a reference block, the reference block being in the reference picture, co-located with the current PU, and having the same size as the current PU. In such examples, the pixel may be the center pixel of the reference block or another pixel of the reference block.

In another example where the video coder is determining a texture merge candidate, the co-located texture block may have the same size as the current depth PU. In this example, the video coder may set the default motion vector and the default reference index to the motion information of the block (e.g., PU) covering any given pixel within the co-located texture block.

In another example where the video coder is determining a texture merge candidate, the video coder may first select a center sub-PU of the current depth PU. Among all the sub-PUs of the current depth PU, the center sub-PU may be located closest to (or may contain) the center pixel of the prediction block of the current depth PU. The video coder may then use the texture block co-located with the center sub-PU to derive the default motion vector and reference index. Assuming the sub-PU size is 2^u x 2^u, the video coder may determine the center sub-PU to be the sub-PU having the following coordinates relative to the top-left sample of the prediction block of the current depth PU (and hence of the top-left sample of the co-located texture block): (((nPSW >> (u+1)) - 1) << u, ((nPSH >> (u+1)) - 1) << u). Alternatively, the video coder may determine the relative coordinates of the center sub-PU to be: ((nPSW >> (u+1)) << u, (nPSH >> (u+1)) << u). In these equations, nPSW and nPSH are the width and height, respectively, of the prediction block of the current depth PU.

Thus, in one example, the video coder may determine, from among the multiple sub-PUs of the current PU, the sub-PU closest to the center of the prediction block of the current PU. In this example, the video coder may derive the default motion parameters from the co-located texture block of the determined sub-PU.

In some examples where the video coder is determining a texture merge candidate and the default motion information is unavailable (e.g., when the motion parameters of the co-located texture block of the center sub-PU are unavailable), the video coder may determine whether the co-located texture block of the first sub-PU of the current depth PU has available motion information. The first sub-PU of the current depth PU may be the first sub-PU of the current depth PU in the raster scan order of the sub-PUs of the current depth PU. If the motion parameters of the co-located texture block of the first sub-PU of the current depth PU are available, the video coder may set the default motion parameters to the motion parameters of the first sub-PU of the current depth PU.

Furthermore, in some examples where the video coder is determining a texture merge candidate, when the default motion information is unavailable (e.g., when the motion parameters of the co-located texture block of the first sub-PU are unavailable), if the first sub-PU of the current sub-PU row has available motion information, the video coder sets the default motion information to the motion information of the first sub-PU of the current sub-PU row. Furthermore, when the default motion information is still unavailable (e.g., when the motion information of the first sub-PU of the current sub-PU row is unavailable), the default motion vector is a zero motion vector, and the default reference index is equal to the first available temporal reference picture in the current reference picture list, or 0.

In some examples where the video coder is determining a texture merge candidate, the default motion vector is a zero motion vector, and the default reference index is equal to the first available temporal reference picture in the current reference picture list, or 0.

Regardless of whether the video coder is determining an IPMVC or a texture merge candidate, the video coder may set the default motion information for the entire current PU. Thus, the video coder does not need to store additional motion vectors for the current PU for use in predicting spatially neighboring blocks, predicting temporally neighboring blocks (when the picture containing this PU is used as a co-located picture during TMVP), or deblocking.

Furthermore, the video coder may determine a PU-level motion vector candidate (352). For example, the video coder may determine a PU-level IPMVC or a PU-level motion parameter inheritance (MPI) candidate (i.e., a PU-level texture merge candidate), depending on whether the video coder is determining an IPMVC or a texture merge candidate. The video coder may determine, based on the PU-level motion vector candidate, whether to include one or more spatial merge candidates in the candidate list. In some examples, the PU-level motion vector candidate specifies the same motion parameters as the default motion parameters.

In some examples where the video coder is determining an IPMVC, the video coder may derive the PU-level IPMVC from the center position of the corresponding region of the current PU, as defined in 3D-HEVC Test Model 4. As described in the example of FIG. 17, the video coder may use the representative motion vector and representative reference index of the IPMVC (i.e., the PU-level IPMVC) to determine whether to include the A1 and B1 spatial merge candidates in the merge candidate list.

In another example where the video coder is determining an IPMVC, the video coder may determine a reference block in an inter-view reference picture based on the disparity vector of the current PU. The video coder may then determine the sub-PU that covers the center pixel of the reference block (i.e., the sub-PU closest to the center pixel of the reference block). In this example, the video coder may determine that the PU-level IPMVC specifies the motion parameters of the determined sub-PU of the reference block. As indicated elsewhere in this disclosure, the video coder may determine the sub-PU closest to the center pixel of the reference block in different ways. For example, assuming a sub-PU size of 2^U x 2^U, the sub-PU closest to the center pixel of the reference block contains the pixel with the following coordinates relative to the top-left sample of the reference block: (((nPSW >> (u+1)) - 1) << u, ((nPSH >> (u+1)) - 1) << u). Alternatively, the sub-PU closest to the center pixel of the reference block contains the pixel with the following coordinates relative to the top-left sample of the reference block: ((nPSW >> (u+1)) << u, (nPSH >> (u+1)) << u). In these equations, nPSW and nPSH are the width and height, respectively, of the luma prediction block of the current PU. The video coder may use the motion parameters of the determined sub-PU as the PU-level IPMVC. The PU-level IPMVC may specify the representative motion vector and representative reference index of the IPMVC. In this way, the video coder may use the motion parameters of the sub-PU closest to the center pixel of the reference block to determine the PU-level IPMVC. In other words, the video coder may derive the PU-level IPMVC from the center position of the corresponding region of the current PU and determine, based on the PU-level IPMVC, whether to include a spatial merge candidate in the candidate list. The motion parameters used from this sub-PU may be the same as the motion parameters the video coder uses to establish the IPMVC.
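The two coordinate formulas quoted above can be checked with a short worked example. The function names below are illustrative stand-ins, not specification identifiers; the formulas themselves are transcribed directly from the text.

```python
# Worked sketch of the two center-pixel coordinate formulas above.
# nPSW/nPSH are the luma prediction block width/height in pixels, and the
# sub-PU size is 2**u x 2**u.
def center_subpu_coords_v1(nPSW: int, nPSH: int, u: int) -> tuple:
    # (((nPSW >> (u+1)) - 1) << u, ((nPSH >> (u+1)) - 1) << u)
    return (((nPSW >> (u + 1)) - 1) << u, ((nPSH >> (u + 1)) - 1) << u)

def center_subpu_coords_v2(nPSW: int, nPSH: int, u: int) -> tuple:
    # ((nPSW >> (u+1)) << u, (nPSH >> (u+1)) << u)
    return ((nPSW >> (u + 1)) << u, (nPSH >> (u + 1)) << u)
```

For a 32x32 block with 8x8 sub-PUs (u = 3), the first formula yields (8, 8) and the second yields (16, 16): the two variants pick the sub-PU just above-left of the block center and the one just below-right of it, respectively.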

In some examples where the video coder is determining a texture merge candidate, the motion information taken from the center sub-PU for the default motion parameters may be the same as the motion information used to establish the PU-level motion parameter inheritance (MPI) candidate. The video coder may determine, based on the PU-level MPI candidate, whether to include particular spatial merge candidates in the merge candidate list. For example, if the A1 spatial merge candidate has the same motion vectors and the same reference indices as the PU-level MPI candidate, the video coder does not insert the A1 spatial merge candidate into the merge candidate list. Similarly, if the B1 spatial merge candidate has the same motion vectors and the same reference indices as the A1 spatial merge candidate or the PU-level MPI candidate, the video coder does not insert B1 into the merge candidate list.
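The A1/B1 pruning rule described above can be sketched as follows. The tuple layout (motion vector, reference index) and the function name are assumptions made for this example, not 3D-HEVC data structures; candidates are compared for identical motion vectors and reference indices via plain equality.

```python
# Illustrative sketch of the pruning rule above: A1 is dropped if it matches
# the PU-level candidate, and B1 is dropped if it matches either A1 or the
# PU-level candidate.
def build_merge_list(pu_level_candidate, a1, b1):
    """Return the merge candidate list after pruning A1 and B1."""
    merge_list = [pu_level_candidate]
    if a1 is not None and a1 != pu_level_candidate:
        merge_list.append(a1)
    if b1 is not None and b1 != pu_level_candidate and b1 != a1:
        merge_list.append(b1)
    return merge_list
```

For instance, when A1 equals the PU-level candidate it is pruned, and B1 is kept only if it differs from both earlier entries.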

In the example of FIG. 19, the video coder may determine, for a current sub-PU of the current PU, a reference sample location in a reference picture (354). The reference picture may be in a different view than the picture containing the current PU (i.e., the current picture). In some examples, the video coder may determine the reference sample location by adding the disparity vector of the current PU to the coordinates of the center pixel of the current sub-PU. In other examples (e.g., when the current PU is a depth PU), the reference sample location may be co-located with a sample of the prediction block of the current depth PU.

In addition, the video coder may determine a reference block for the current sub-PU (356). The reference block may be a PU of the reference picture and may cover the determined reference sample location. Next, the video coder may determine whether the reference block was coded using motion-compensated prediction (358). For example, if the reference block was coded using intra prediction, the video coder may determine that the reference block was not coded using motion-compensated prediction. If the reference block was coded using motion-compensated prediction, the reference block has one or more motion vectors.

In response to determining that the reference block was coded using motion-compensated prediction ("YES" of 358), the video coder may set the motion parameters of the current sub-PU based on the motion parameters of the reference block (360). For example, the video coder may set the RefPicList0 motion vector of the current sub-PU to the RefPicList0 motion vector of the reference block, set the RefPicList0 reference index of the current sub-PU to the RefPicList0 reference index of the reference block, set the RefPicList1 motion vector of the current sub-PU to the RefPicList1 motion vector of the reference block, and set the RefPicList1 reference index of the current sub-PU to the RefPicList1 reference index of the reference block.

On the other hand, in response to determining that the reference block was not coded using motion-compensated prediction ("NO" of 358), the video coder may set the motion parameters of the current sub-PU to the default motion parameters (362). Thus, in the example of FIG. 19, when the reference block of the current sub-PU was not coded using motion-compensated prediction, the video coder does not set the motion parameters of the current sub-PU to the motion parameters of the closest sub-PU whose reference block was coded using motion-compensated prediction. Rather, the video coder may set the motion parameters of the current sub-PU directly to the default motion parameters. This may simplify and speed up the coding process.

After setting the motion parameters of the current sub-PU, the video coder may determine whether the current PU has any additional sub-PUs (364). In response to determining that the current PU has one or more additional sub-PUs ("YES" of 364), the video coder may perform actions 354 through 364 with another of the sub-PUs of the current PU as the current sub-PU. In this manner, the video coder may set the motion parameters of each of the sub-PUs of the current PU. On the other hand, in response to determining that there are no additional sub-PUs of the current PU ("NO" of 364), the video coder may include a candidate (e.g., the IPMVC) in the merge candidate list for the current PU (366). The candidate may specify the motion parameters of each of the sub-PUs of the current PU.
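The per-sub-PU loop of steps 354 through 366 can be sketched as follows. The `find_reference_block` helper, the dictionary layout, and the tuple-shaped motion parameters are stand-ins assumed for this example, not specification structures; the sketch covers only the reference-sample lookup and the copy-or-default decision.

```python
# Sketch of the sub-PU loop (steps 354-366): for each sub-PU, locate its
# reference block via the disparity vector; copy the block's motion
# parameters if it was coded with motion-compensated prediction, otherwise
# fall back to the default motion parameters.
def derive_subpu_candidate(sub_pus, disparity_vector, default_params,
                           find_reference_block):
    """Return per-sub-PU motion parameters forming one merge candidate."""
    candidate = []
    for sub_pu in sub_pus:
        # Step 354/356: reference sample = sub-PU center + disparity vector.
        ref_x = sub_pu["center"][0] + disparity_vector[0]
        ref_y = sub_pu["center"][1] + disparity_vector[1]
        ref_block = find_reference_block(ref_x, ref_y)
        # Steps 358/360/362: copy motion parameters or use the defaults.
        if ref_block is not None and ref_block.get("mcp_coded"):
            candidate.append(ref_block["motion_params"])
        else:
            candidate.append(default_params)
    return candidate
```

Note that a sub-PU whose reference block lacks motion-compensated prediction falls back to the defaults immediately, rather than waiting on any later sub-PU in the processing order, which is the simplification the text emphasizes.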

In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media, which is non-transitory, or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

The instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Various examples have been described. These and other examples are within the scope of the following claims.

Claims (43)

1.一种用于解码多视图视频数据的方法,所述方法包括:1. A method for decoding multi-view video data, the method comprising: 将当前预测单元PU划分成多个子PU,其中所述当前PU在当前图片中;The current prediction unit (PU) is divided into multiple sub-PUs, wherein the current PU is in the current image; 确定默认运动参数;Determine the default motion parameters; 以特定次序处理来自所述多个子PU的子PU,Process the sub-PUs from the plurality of sub-PUs in a specific order. 其中对于来自所述多个子PU的每一相应子PU,如果不使用运动补偿预测译码所述相应子PU的参考块,则响应于后续确定使用运动补偿预测译码所述次序中的任何稍后子PU的参考块,不设定所述相应子PU的运动参数,其中不使用运动补偿预测译码所述子PU中的至少一者的参考块,且其中处理所述子PU包括,对于来自所述多个子PU的每一相应子PU:Wherein, for each corresponding subPU from the plurality of subPUs, if motion compensation prediction decoding of the reference block of the corresponding subPU is not used, then in response to subsequently determining the reference block of any later subPU in the order of motion compensation prediction decoding, the motion parameters of the corresponding subPU are not set, wherein motion compensation prediction decoding of the reference block of at least one of the subPUs is not used, and wherein processing the subPU includes, for each corresponding subPU from the plurality of subPUs: 确定所述相应子PU的参考块,其中参考图片包含所述相应子PU的所述参考块;Determine the reference block of the corresponding sub-PU, wherein the reference image contains the reference block of the corresponding sub-PU; 如果使用运动补偿预测译码所述相应子PU的所述参考块,则基于所述相应子PU的所述参考块的运动参数设定所述相应子PU的运动参数;及If motion compensation predictive decoding is used for the reference block of the corresponding sub-PU, then the motion parameters of the corresponding sub-PU are set based on the motion parameters of the reference block of the corresponding sub-PU; and 如果不使用运动补偿预测译码所述相应子PU的所述参考块,则将所述相应子PU的所述运动参数设定为所述默认运动参数;及If motion compensation prediction decoding of the reference block of the corresponding sub-PU is not used, then the motion parameters of the corresponding sub-PU are set to the default motion parameters; and 将候选者包含于所述当前PU的候选者列表中,其中所述候选者基于所述多个子PU的所述运动参数;The candidate is included in the candidate list of 
the current PU, wherein the candidate is based on the motion parameters of the plurality of subPUs; 从位流获得指示所述候选者列表中的所选择候选者的语法元素;及Obtain the syntax element indicating the selected candidate from the candidate list from the bit stream; and 使用所述所选择候选者的运动参数以重建构所述当前PU的预测性块。The motion parameters of the selected candidates are used to reconstruct the predictive block of the current PU. 2.根据权利要求1所述的方法,其中所述参考图片在不同于所述当前图片的视图中;且2. The method according to claim 1, wherein the reference image is in a view different from the current image; and 确定所述相应子PU的所述参考块包括基于所述当前PU的视差向量确定所述参考图片中的参考样本位置,其中所述相应子PU的所述参考块覆盖所述参考样本位置。Determining the reference block of the corresponding sub-PU includes determining the location of a reference sample in the reference image based on the disparity vector of the current PU, wherein the reference block of the corresponding sub-PU covers the location of the reference sample. 3.根据权利要求1所述的方法,其中:3. The method according to claim 1, wherein: 所述当前图片为深度视图分量且所述参考图片为在相同于所述当前图片的视图及接入单元中的纹理视图分量;且The current image is a depth view component, and the reference image is a texture view component in the same view and access unit as the current image; and 确定所述相应子PU的所述参考块包括确定所述相应子PU的所述参考块为与所述相应子PU共置的所述参考图片的PU。Determining the reference block of the corresponding sub-PU includes determining that the reference block of the corresponding sub-PU is the PU of the reference image co-located with the corresponding sub-PU. 4.根据权利要求1所述的方法,其中:4. 
The method according to claim 1, wherein: 所述当前图片为深度视图分量且所述参考图片为在相同于所述当前图片的视图及接入单元中的纹理视图分量;且The current image is a depth view component, and the reference image is a texture view component in the same view and access unit as the current image; and 确定所述默认运动参数包括将所述默认运动参数设定为与覆盖参考块的像素的块相关联的运动参数,所述参考块在所述参考图片中、与所述当前PU共置且具有相同于所述当前PU的大小。Determining the default motion parameters includes setting the default motion parameters to motion parameters associated with a block of pixels covering a reference block, which is co-located with the current PU in the reference image and has the same size as the current PU. 5.根据权利要求4所述的方法,其中所述像素为所述参考块的中心像素。5. The method of claim 4, wherein the pixel is the center pixel of the reference block. 6.根据权利要求1所述的方法,其进一步包括基于PU层级运动向量候选者确定是否将空间合并候选者包含于所述候选者列表中,其中所述PU层级运动向量候选者指定相同于所述默认运动参数的运动参数。6. The method of claim 1, further comprising determining whether to include spatial merging candidates in the candidate list based on PU-level motion vector candidates, wherein the PU-level motion vector candidates specify motion parameters that are the same as the default motion parameters. 7.根据权利要求1所述的方法,其中:7. The method according to claim 1, wherein: 所述默认运动参数包含默认运动向量及默认参考索引,且The default motion parameters include default motion vectors and default reference indices, and 确定所述默认运动参数包括:Determining the default motion parameters includes: 将默认运动向量设定为零运动向量;及Set the default motion vector to zero; and 将所述默认参考索引设定为0或当前参考图片列表中的第一可用时间参考图片。Set the default reference index to 0 or the first available time reference image in the current reference image list. 8.根据权利要求1所述的方法,其中确定所述默认运动参数包括响应于确定在所述多个子PU的光栅扫描次序上的第一子PU具有可用运动参数,将所述默认运动参数设定为所述多个子PU的所述光栅扫描次序上的所述第一子PU的所述可用运动参数。8. 
The method of claim 1, wherein determining the default motion parameter includes, in response to determining that a first subPU in the raster scan sequence of the plurality of subPUs has available motion parameters, setting the default motion parameter to the available motion parameter of the first subPU in the raster scan sequence of the plurality of subPUs. 9.根据权利要求1所述的方法,其中确定所述默认运动参数包括响应于确定包含所述相应子PU的子PU行的第一子PU具有可用运动参数,将所述默认运动参数设定为包含所述相应子PU的所述子PU行的所述第一子PU的所述可用运动参数。9. The method of claim 1, wherein determining the default motion parameter includes, in response to determining that a first subPU comprising the subPU row containing the corresponding subPU has available motion parameters, setting the default motion parameter to the available motion parameters of the first subPU comprising the subPU row containing the corresponding subPU. 10.根据权利要求1所述的方法,其中所述默认运动参数对于所述多个子PU中的每一子PU相同。10. The method of claim 1, wherein the default motion parameters are the same for each of the plurality of sub-PUs. 11.根据权利要求1所述的方法,其中确定所述默认运动参数包括:11. The method of claim 1, wherein determining the default motion parameters comprises: 确定所述参考图片中的参考块,所述参考块具有相同于所述当前PU的预测块的大小;A reference block is determined in the reference image, the reference block having the same size as the prediction block of the current PU; 从所述参考块的子PU当中确定最接近于所述参考块的中心像素的子PU;及From the sub-PUs of the reference block, determine the sub-PU closest to the center pixel of the reference block; and 从所述参考块的所述所确定子PU的运动参数导出所述默认运动参数。The default motion parameters are derived from the motion parameters of the determined sub-PU in the reference block. 12.根据权利要求11所述的方法,其中所述参考块的所述子PU中的每一者的大小为2Ux2U,且最接近于所述参考块的所述中心像素的所述子PU包含相对于所述参考块的左上方样本具有以下坐标的像素:(((nPSW>>(u+1))-1)<<u,(((nPSH>>(u+1))-1)<<u),其中nPSW以像素表示所述参考块的宽度且nPSH以像素表示所述参考块的高度。12. 
The method of claim 11, wherein each of the sub-PUs of the reference block is 2 U x 2 U , and the sub-PU closest to the center pixel of the reference block comprises pixels with the following coordinates relative to the upper left sample of the reference block: (((nPSW>>(u+1))-1)<<u, (((nPSH>>(u+1))-1)<<u), where nPSW represents the width of the reference block in pixels and nPSH represents the height of the reference block in pixels. 13.根据权利要求11所述的方法,其中所述参考块的所述子PU中的每一者的大小为2Ux2U,且最接近于所述参考块的所述中心像素的所述子PU包含相对于所述参考块的左上方样本具有以下坐标的像素:((nPSW>>(u+1))<<u,(nPSH>>(u+1))<<u),其中nPSW以像素表示所述参考块的宽度且nPSH以像素表示所述参考块的高度。13. The method of claim 11, wherein each of the sub-PUs of the reference block is 2 U x 2 U , and the sub-PU closest to the center pixel of the reference block comprises pixels having the following coordinates relative to the upper left sample of the reference block: ((nPSW>>(u+1))<<u,(nPSH>>(u+1))<<u), where nPSW represents the width of the reference block in pixels and nPSH represents the height of the reference block in pixels. 14.一种编码视频数据的方法,所述方法包括:14. A method for encoding video data, the method comprising: 将当前预测单元PU划分成多个子PU,其中所述当前PU在当前图片中;The current prediction unit (PU) is divided into multiple sub-PUs, wherein the current PU is in the current image; 确定默认运动参数;Determine the default motion parameters; 以特定次序处理来自所述多个子PU的子PU,Process the sub-PUs from the plurality of sub-PUs in a specific order. 
其中对于来自所述多个子PU的每一相应子PU,如果不使用运动补偿预测译码所述相应子PU的参考块,则响应于后续确定使用运动补偿预测译码所述次序中的任何稍后子PU的参考块,不设定所述相应子PU的运动参数,其中不使用运动补偿预测译码所述子PU中的至少一者的参考块,且其中处理所述子PU包括,对于来自所述多个子PU的每一相应子PU:Wherein, for each corresponding subPU from the plurality of subPUs, if motion compensation prediction decoding of the reference block of the corresponding subPU is not used, then in response to subsequently determining the reference block of any later subPU in the order of motion compensation prediction decoding, the motion parameters of the corresponding subPU are not set, wherein motion compensation prediction decoding of the reference block of at least one of the subPUs is not used, and wherein processing the subPU includes, for each corresponding subPU from the plurality of subPUs: 确定所述相应子PU的参考块,其中参考图片包含所述相应子PU的所述参考块;Determine the reference block of the corresponding sub-PU, wherein the reference image contains the reference block of the corresponding sub-PU; 如果使用运动补偿预测译码所述相应子PU的所述参考块,则基于所述相应子PU的所述参考块的运动参数设定所述相应子PU的运动参数;及If motion compensation predictive decoding is used for the reference block of the corresponding sub-PU, then the motion parameters of the corresponding sub-PU are set based on the motion parameters of the reference block of the corresponding sub-PU; and 如果不使用运动补偿预测译码所述相应子PU的所述参考块,则将所述相应子PU的所述运动参数设定为所述默认运动参数;及If motion compensation prediction decoding of the reference block of the corresponding sub-PU is not used, then the motion parameters of the corresponding sub-PU are set to the default motion parameters; and 将候选者包含于所述当前PU的候选者列表中,其中所述候选者基于所述多个子PU的所述运动参数;及The candidates are included in the candidate list of the current PU, wherein the candidates are based on the motion parameters of the plurality of sub-PUs; and 在位流中用信号通知指示所述候选者列表中的所选择候选者的语法元素。The syntax element of the selected candidate in the candidate list is signaled in the bit stream. 15.根据权利要求14所述的方法,其中:15. 
The method of claim 14, wherein: 所述参考图片在不同于所述当前图片的视图中;且The reference image is in a view different from the current image; and 确定所述相应子PU的所述参考块包括基于所述当前PU的视差向量确定所述参考图片中的参考样本位置,其中所述相应子PU的所述参考块覆盖所述参考样本位置。Determining the reference block of the corresponding sub-PU includes determining the location of a reference sample in the reference image based on the disparity vector of the current PU, wherein the reference block of the corresponding sub-PU covers the location of the reference sample. 16.根据权利要求14所述的方法,其中:16. The method of claim 14, wherein: 所述当前图片为深度视图分量且所述参考图片为在相同于所述当前图片的视图及接入单元中的纹理视图分量;且The current image is a depth view component, and the reference image is a texture view component in the same view and access unit as the current image; and 确定所述相应子PU的所述参考块包括确定所述相应子PU的所述参考块为与所述相应子PU共置的所述参考图片的PU。Determining the reference block of the corresponding sub-PU includes determining that the reference block of the corresponding sub-PU is the PU of the reference image co-located with the corresponding sub-PU. 17.根据权利要求14所述的方法,其中:17. The method of claim 14, wherein: 所述当前图片为深度视图分量且所述参考图片为在相同于所述当前图片的视图及接入单元中的纹理视图分量;且The current image is a depth view component, and the reference image is a texture view component in the same view and access unit as the current image; and 确定所述默认运动参数包括将所述默认运动参数设定为与覆盖参考块的像素的块相关联的运动参数,所述参考块在所述参考图片中、与所述当前PU共置且具有相同于所述当前PU的大小。Determining the default motion parameters includes setting the default motion parameters to motion parameters associated with a block of pixels covering a reference block, which is co-located with the current PU in the reference image and has the same size as the current PU. 18.根据权利要求17所述的方法,其中所述像素为所述参考块的中心像素。18. The method of claim 17, wherein the pixel is the center pixel of the reference block. 19.根据权利要求14所述的方法,其进一步包括基于PU层级运动向量候选者确定是否将空间合并候选者包含于所述候选者列表中,其中所述PU层级运动向量候选者指定相同于所述默认运动参数的运动参数。19. 
The method of claim 14, further comprising determining whether to include a spatial merging candidate in the candidate list based on PU-level motion vector candidates, wherein the PU-level motion vector candidates specify motion parameters identical to the default motion parameters. 20.根据权利要求14所述的方法,其中:20. The method of claim 14, wherein: 所述默认运动参数包含默认运动向量及默认参考索引,且The default motion parameters include default motion vectors and default reference indices, and 确定所述默认运动参数包括:Determining the default motion parameters includes: 将默认运动向量设定为零运动向量;及Set the default motion vector to zero; and 将所述默认参考索引设定为0或当前参考图片列表中的第一可用时间参考图片。Set the default reference index to 0 or the first available time reference image in the current reference image list. 21.根据权利要求14所述的方法,其中确定所述默认运动参数包括响应于确定在所述多个子PU的光栅扫描次序上的第一子PU具有可用运动参数,将所述默认运动参数设定为所述多个子PU的所述光栅扫描次序上的所述第一子PU的所述可用运动参数。21. The method of claim 14, wherein determining the default motion parameter includes, in response to determining that a first subPU in the raster scan sequence of the plurality of subPUs has available motion parameters, setting the default motion parameter to the available motion parameters of the first subPU in the raster scan sequence of the plurality of subPUs. 22.根据权利要求14所述的方法,其中确定所述默认运动参数包括响应于确定包含所述相应子PU的子PU行的第一子PU具有可用运动参数,将所述默认运动参数设定为包含所述相应子PU的所述子PU行的所述第一子PU的所述可用运动参数。22. The method of claim 14, wherein determining the default motion parameter includes, in response to determining that a first subPU comprising the subPU row containing the corresponding subPU has available motion parameters, setting the default motion parameter to the available motion parameters of the first subPU comprising the subPU row containing the corresponding subPU. 23.根据权利要求14所述的方法,其中所述默认运动参数对于所述多个子PU中的每一子PU相同。23. The method of claim 14, wherein the default motion parameters are the same for each of the plurality of sub-PUs. 24.根据权利要求14所述的方法,其中确定所述默认运动参数包括:24. 
The method of claim 14, wherein determining the default motion parameters comprises: 确定所述参考图片中的参考块,所述参考块具有相同于所述当前PU的预测块的大小;A reference block is determined in the reference image, the reference block having the same size as the prediction block of the current PU; 从所述参考块的子PU当中确定最接近于所述参考块的中心像素的子PU;及From the sub-PUs of the reference block, determine the sub-PU closest to the center pixel of the reference block; and 从所述参考块的所述所确定子PU的运动参数导出所述默认运动参数。The default motion parameters are derived from the motion parameters of the determined sub-PU in the reference block. 25.根据权利要求24所述的方法,其中所述参考块的所述子PU中的每一者的大小为2Ux2U,且最接近于所述参考块的所述中心像素的所述子PU包含相对于所述参考块的左上方样本具有以下坐标的像素:(((nPSW>>(u+1))-1)<<u,(((nPSH>>(u+1))-1)<<u),其中nPSW以像素表示所述参考块的宽度且nPSH以像素表示所述参考块的高度。25. The method of claim 24, wherein each of the sub-PUs of the reference block is 2 U x 2 U , and the sub-PU closest to the center pixel of the reference block comprises pixels with the following coordinates relative to the upper left sample of the reference block: (((nPSW>>(u+1))-1)<<u, (((nPSH>>(u+1))-1)<<u), where nPSW represents the width of the reference block in pixels and nPSH represents the height of the reference block in pixels. 26.根据权利要求24所述的方法,其中所述参考块的所述子PU中的每一者的大小为2Ux2U,且最接近于所述参考块的所述中心像素的所述子PU包含相对于所述参考块的左上方样本具有以下坐标的像素:((nPSW>>(u+1))<<u,(nPSH>>(u+1))<<u),其中nPSW以像素表示所述参考块的宽度且nPSH以像素表示所述参考块的高度。26. The method of claim 24, wherein each of the sub-PUs of the reference block is 2 U x 2 U , and the sub-PU closest to the center pixel of the reference block comprises pixels having the following coordinates relative to the upper left sample of the reference block: ((nPSW>>(u+1))<<u,(nPSH>>(u+1))<<u), where nPSW represents the width of the reference block in pixels and nPSH represents the height of the reference block in pixels. 27.一种用于译码视频数据的装置,所述装置包括:27. 
An apparatus for decoding video data, the apparatus comprising: 存储器,其用于存储经解码图片;及A memory, used to store decoded images; and 一或多个处理器,其经配置以:One or more processors, configured to: 将当前预测单元PU划分成多个子PU,其中所述当前PU在当前图片中;The current prediction unit (PU) is divided into multiple sub-PUs, wherein the current PU is in the current image; 确定默认运动参数;Determine the default motion parameters; 以特定次序处理来自所述多个子PU的子PU,Process the sub-PUs from the plurality of sub-PUs in a specific order. 其中对于来自所述多个子PU的每一相应子PU,如果不使用运动补偿预测译码所述相应子PU的参考块,则响应于后续确定使用运动补偿预测译码所述次序中的任何稍后子PU的参考块,不设定所述相应子PU的运动参数,其中不使用运动补偿预测译码所述子PU中的至少一者的参考块,且其中处理所述子PU包括,对于来自所述多个子PU的每一相应子PU:Wherein, for each corresponding subPU from the plurality of subPUs, if motion compensation prediction decoding of the reference block of the corresponding subPU is not used, then in response to subsequently determining the reference block of any later subPU in the order of motion compensation prediction decoding, the motion parameters of the corresponding subPU are not set, wherein motion compensation prediction decoding of the reference block of at least one of the subPUs is not used, and wherein processing the subPU includes, for each corresponding subPU from the plurality of subPUs: 确定所述相应子PU的参考块,其中参考图片包含所述相应子PU的所述参考块;Determine the reference block of the corresponding sub-PU, wherein the reference image contains the reference block of the corresponding sub-PU; 如果使用运动补偿预测译码所述相应子PU的所述参考块,则基于所述相应子PU的所述参考块的运动参数设定所述相应子PU的运动参数;及If motion compensation predictive decoding is used for the reference block of the corresponding sub-PU, then the motion parameters of the corresponding sub-PU are set based on the motion parameters of the reference block of the corresponding sub-PU; and 如果不使用运动补偿预测译码所述相应子PU的所述参考块,则将所述相应子PU的所述运动参数设定为所述默认运动参数;及If motion compensation prediction decoding of the reference block of the corresponding sub-PU is not used, then the motion parameters of the corresponding sub-PU are set to the default motion parameters; and 
include a candidate in a candidate list of the current PU, wherein the candidate is based on the motion parameters of the plurality of sub-PUs.

28. The device of claim 27, wherein:
the reference picture is in a different view than the current picture; and
the one or more processors are configured to determine a reference sample location in the reference picture based on a disparity vector of the current PU, wherein the reference block of the respective sub-PU covers the reference sample location.

29. The device of claim 27, wherein:
the current picture is a depth view component and the reference picture is a texture view component in the same view and access unit as the current picture; and
the one or more processors are configured to determine that the reference block of the respective sub-PU is a PU of the reference picture co-located with the respective sub-PU.

30. The device of claim 29, wherein:
the current picture is a depth view component and the reference picture is a texture view component in the same view and access unit as the current picture; and
the one or more processors are configured to set the default motion parameters to motion parameters associated with a block covering a pixel of a reference block that is in the reference picture, is co-located with the current PU, and has the same size as the current PU.

31. The device of claim 30, wherein the pixel is a center pixel of the reference block.
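As a non-normative illustration of the inter-view case of claim 28 (the reference sample location is obtained by shifting the sub-PU's position by the current PU's disparity vector, and the sub-PU's reference block is the block covering that location), here is a minimal Python sketch. It assumes integer-sample disparity and a reference picture partitioned into a sub-PU-aligned grid; all names are illustrative and not taken from the patent:

```python
def reference_block_origin(sub_pu_x, sub_pu_y, sub_pu_size, disparity):
    """Return the top-left corner of the grid block covering the reference
    sample location reached by shifting the sub-PU position by the disparity
    vector (illustrative simplification: integer-sample disparity, aligned grid).
    """
    ref_x = sub_pu_x + disparity[0]  # reference sample location, x component
    ref_y = sub_pu_y + disparity[1]  # reference sample location, y component
    # The covering reference block is the grid block containing that sample.
    return ((ref_x // sub_pu_size) * sub_pu_size,
            (ref_y // sub_pu_size) * sub_pu_size)
```

For example, a sub-PU at (8, 8) with disparity (5, -3) gives the reference sample location (13, 5), which an 8x8 grid covers with the block at (8, 0).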
32. The device of claim 27, wherein the one or more processors are configured to determine, based on a PU-level motion vector candidate, whether to include a spatial merging candidate in the candidate list, wherein the PU-level motion vector candidate specifies motion parameters identical to the default motion parameters.

33. The device of claim 27, wherein:
the default motion parameters include a default motion vector and a default reference index, and
the one or more processors are configured to determine the default motion parameters at least in part by:
setting the default motion vector to a zero motion vector; and
setting the default reference index to 0 or to a first available temporal reference picture in a current reference picture list.

34. The device of claim 27, wherein the one or more processors are configured such that, in response to determining that a first sub-PU in a raster scan order of the plurality of sub-PUs has available motion parameters, the one or more processors set the default motion parameters to the available motion parameters of the first sub-PU in the raster scan order of the plurality of sub-PUs.

35. The device of claim 27, wherein the one or more processors are configured such that, in response to determining that a first sub-PU of a sub-PU row containing the respective sub-PU has available motion parameters, the one or more processors set the default motion parameters to the available motion parameters of the first sub-PU of the sub-PU row containing the respective sub-PU.
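The default-parameter derivations of claims 33 and 34 admit a compact sketch. The following Python rendering is illustrative only (the per-sub-PU dictionary layout is an assumption, not the patent's data structure): scan the sub-PUs in raster scan order and take the first available motion parameters, falling back to a zero motion vector with reference index 0 in the spirit of claim 33:

```python
def default_motion_params(sub_pu_params):
    """Pick default motion parameters for a PU's sub-PUs.

    sub_pu_params: per-sub-PU motion parameters in raster scan order, with
    None marking sub-PUs whose reference block has no available motion.
    """
    for params in sub_pu_params:
        if params is not None:
            # Claim 34 variant: first sub-PU in raster scan order with
            # available motion parameters supplies the defaults.
            return params
    # Claim 33 variant as fallback: zero motion vector, reference index 0.
    return {"mv": (0, 0), "ref_idx": 0}
```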
36. The device of claim 27, wherein the default motion parameters are the same for each sub-PU of the plurality of sub-PUs.

37. The device of claim 27, wherein the one or more processors are configured to:
determine a reference block in the reference picture, the reference block having the same size as a prediction block of the current PU;
determine, from among sub-PUs of the reference block, a sub-PU closest to a center pixel of the reference block; and
derive the default motion parameters from motion parameters of the determined sub-PU of the reference block.

38. The device of claim 37, wherein each of the sub-PUs of the reference block has a size of 2^U x 2^U, and the sub-PU closest to the center pixel of the reference block contains a pixel with the following coordinates relative to a top-left sample of the reference block: ( ((nPSW>>(u+1))-1)<<u , ((nPSH>>(u+1))-1)<<u ), where nPSW is the width of the reference block in pixels and nPSH is the height of the reference block in pixels.

39. The device of claim 37, wherein each of the sub-PUs of the reference block has a size of 2^U x 2^U, and the sub-PU closest to the center pixel of the reference block contains a pixel with the following coordinates relative to a top-left sample of the reference block: ( (nPSW>>(u+1))<<u , (nPSH>>(u+1))<<u ), where nPSW is the width of the reference block in pixels and nPSH is the height of the reference block in pixels.
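The pixel coordinates recited in claims 38 and 39 can be checked numerically. Below is a small Python sketch of the two formulas, assuming u is the sub-PU size exponent (sub-PUs of size 2^u x 2^u); the function name and `variant` parameter are illustrative, not from the patent:

```python
def center_sub_pu_coords(nPSW, nPSH, u, variant=38):
    """Coordinates, relative to the reference block's top-left sample, of the
    pixel used to locate the sub-PU closest to the center of an nPSW x nPSH
    reference block with 2**u x 2**u sub-PUs."""
    if variant == 38:
        # Claim 38 formula.
        return (((nPSW >> (u + 1)) - 1) << u, ((nPSH >> (u + 1)) - 1) << u)
    # Claim 39 formula.
    return ((nPSW >> (u + 1)) << u, (nPSH >> (u + 1)) << u)
```

For a 32x16 reference block with 8x8 sub-PUs (u = 3), the claim 38 formula selects a pixel at (8, 0) and the claim 39 formula a pixel at (16, 8): the two formulas pick sub-PUs on opposite sides of the block center when the center falls on a sub-PU boundary.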
40. The device of claim 27, wherein the one or more processors are further configured to:
obtain, from a bitstream, a syntax element indicating a selected candidate in the candidate list; and
use motion parameters of the selected candidate to reconstruct a predictive block of the current PU.

41. The device of claim 27, wherein the one or more processors are configured to signal, in a bitstream, a syntax element indicating a selected candidate in the candidate list.

42. A device for coding video data, the device comprising:
means for partitioning a current prediction unit (PU) into a plurality of sub-PUs, wherein the current PU is in a current picture;
means for determining default motion parameters;
means for processing sub-PUs from the plurality of sub-PUs in a particular order, wherein, for each respective sub-PU from the plurality of sub-PUs, if a reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that a reference block of any later sub-PU in the order is coded using motion compensated prediction, wherein the reference block of at least one of the sub-PUs is not coded using motion compensated prediction, and wherein the means for processing the sub-PUs comprises, for each respective sub-PU from the plurality of sub-PUs:
means for determining a reference block of the respective sub-PU, wherein the
reference picture contains the reference block of the respective sub-PU;
means for setting, if the reference block of the respective sub-PU is coded using motion compensated prediction, motion parameters of the respective sub-PU based on motion parameters of the reference block of the respective sub-PU; and
means for setting, if the reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU to the default motion parameters; and
means for including a candidate in a candidate list of the current PU, wherein the candidate is based on the motion parameters of the plurality of sub-PUs.

43. A non-transitory computer-readable data storage medium having instructions stored thereon that, when executed, cause a device to:
partition a current prediction unit (PU) into a plurality of sub-PUs, wherein the current PU is in a current picture;
determine default motion parameters;
process sub-PUs from the plurality of sub-PUs in a particular order,
wherein, for each respective sub-PU from the plurality of sub-PUs, if a reference block of the respective sub-PU is not coded using motion compensated prediction, the motion parameters of the respective sub-PU are not set in response to a subsequent determination that a reference block of any later sub-PU in the order is coded using motion compensated prediction, wherein the reference block of at least one of the sub-PUs is not coded using motion compensated prediction, and wherein processing the sub-PUs comprises, for each respective sub-PU from the plurality of sub-PUs:
determine a reference block of the respective sub-PU, wherein a reference picture contains the reference block of the respective sub-PU;
if the reference block of the respective sub-PU is coded using motion compensated prediction, set motion parameters of the respective sub-PU based on motion parameters of the reference block of the respective sub-PU; and
if the reference block of the respective sub-PU is not coded using motion compensated prediction, set the motion parameters of the respective sub-PU to the default motion parameters; and
include a candidate in a candidate list of the current PU, wherein the candidate is based on the motion parameters of the plurality of sub-PUs.
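Read procedurally, claims 27, 42, and 43 describe a single pass over the sub-PUs in which a sub-PU whose reference block is not coded using motion compensated prediction takes the default motion parameters immediately, and is not revisited when a later sub-PU turns out to have motion. A minimal Python sketch of that pass follows; the data layout (motion info as dictionaries, None marking non-MCP reference blocks) is an assumption for illustration, not the patent's implementation:

```python
def build_sub_pu_candidate(ref_block_motion, default_params):
    """One pass over the sub-PUs in their processing order.

    ref_block_motion: per-sub-PU motion parameters of each sub-PU's reference
    block, with None marking reference blocks not coded using motion
    compensated prediction.
    Returns the per-sub-PU motion parameters forming the candidate that is
    included in the current PU's candidate list.
    """
    candidate = []
    for motion in ref_block_motion:
        if motion is not None:
            # Reference block coded with motion compensated prediction:
            # inherit its motion parameters.
            candidate.append(motion)
        else:
            # Not coded with motion compensated prediction: take the default
            # motion parameters at once; later sub-PUs do not change this.
            candidate.append(default_params)
    return candidate
```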
HK16104546.3A 2013-07-24 2013-12-24 Methods, devices, and computer-readable storage mediums for video encoding and decoding HK1216571B (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US201361858089P 2013-07-24 2013-07-24
US61/858,089 2013-07-24
US201361872540P 2013-08-30 2013-08-30
US61/872,540 2013-08-30
US201361913031P 2013-12-06 2013-12-06
US61/913,031 2013-12-06
PCT/CN2013/001639 WO2015010226A1 (en) 2013-07-24 2013-12-24 Simplified advanced motion prediction for 3d-hevc

Publications (2)

Publication Number Publication Date
HK1216571A1 HK1216571A1 (en) 2016-11-18
HK1216571B true HK1216571B (en) 2019-10-04
