HK40000797B - Intra video coding using a decoupled tree structure - Google Patents
- Publication number
- HK40000797B, HK19124169.4A
- Authority
- HK
- Hong Kong
- Prior art keywords
- mode
- candidate list
- video
- chroma
- modes
Description
This application claims the benefit of U.S. Provisional Application No. 62/375,383, filed August 15, 2016, and U.S. Provisional Application No. 62/404,572, filed October 5, 2016, each of which is hereby incorporated by reference in its entirety.
Technical Field
The present invention relates to video coding.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called "smart phones," video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in various video coding standards. Video coding standards include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multi-view Video Coding (MVC) extensions.
In addition, a new video coding standard, namely High Efficiency Video Coding (HEVC), has recently been developed by the Joint Collaboration Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The latest HEVC draft specification, referred to hereinafter as "HEVC WD," is available from http://phenix.int-evry.fr/jct/doc_end_user/documents/14_Vienna/wg11/JCTVC-N1003-v1.zip. The specification of HEVC and its extensions (the Format Range (RExt), Scalability (SHVC), and Multi-View (MV-HEVC) extensions and the screen content extensions) is available from http://phenix.int-evry.fr/jct/doc_end_user/current_document.php?id=10481. ITU-T VCEG (Q6/16) and ISO/IEC MPEG (JTC 1/SC 29/WG 11) are now studying the potential need for standardization of future video coding technology with a compression capability that significantly exceeds that of the current HEVC standard (including its current extensions and near-term extensions for screen content coding and high-dynamic-range coding).
The groups are working together on this exploration activity in a joint collaboration effort, known as the Joint Video Exploration Team (JVET), to evaluate compression technology designs proposed by their experts in this area. The JVET first met during October 19-21, 2015. The latest version of the reference software, i.e., Joint Exploration Model 3 (JEM 3), can be downloaded from https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/tags/HM-16.6-JEM-3.0/. The algorithm description for JEM 3 is further described in J. Chen, E. Alshina, G. J. Sullivan, J.-R. Ohm, and J. Boyce, "Algorithm description of Joint Exploration Test Model 3," JVET-C1001, Geneva, January 2016.
Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques. Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video frame or a portion of a video frame) may be partitioned into video blocks, which for some techniques may also be referred to as treeblocks, coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.
Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which then may be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.
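The residual, quantization, and scan steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the codec's actual procedure: the transform step is omitted, the quantizer is a plain uniform one, the scan walks anti-diagonals, and all function names and sample values are hypothetical.

```python
def residual_block(original, predicted):
    """Per-sample difference between the original block and the predictive block."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]


def quantize(block, step):
    """Uniform quantization (assumed for illustration): divide by step, truncate toward zero."""
    return [[int(v / step) for v in row] for row in block]


def diagonal_scan(block):
    """Flatten a 2-D array into a 1-D list by walking its anti-diagonals."""
    rows, cols = len(block), len(block[0])
    out = []
    for d in range(rows + cols - 1):
        for r in range(rows):
            c = d - r
            if 0 <= c < cols:
                out.append(block[r][c])
    return out


# Hypothetical 2x2 sample values.
original = [[52, 55], [61, 59]]
predicted = [[50, 50], [60, 60]]

res = residual_block(original, predicted)
coeffs = diagonal_scan(quantize(res, 2))
print(res)     # [[2, 5], [1, -1]]
print(coeffs)  # [1, 2, 0, 0]
```

In a real codec the residual would be transformed before quantization, and the resulting one-dimensional vector would then be entropy coded.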
Summary
In general, the present invention describes techniques related to coding (e.g., decoding or encoding) video data using intra prediction, in some cases according to tree structures that provide different splitting information for the luma components and the chroma components. That is, according to various partitioning schemes with which the described techniques are compatible, the luma partitioning tree structure may be decoupled from the corresponding chroma partitioning tree structure. The described techniques may be used in the context of advanced video codecs, such as extensions of HEVC or the next generation of video coding standards.
In one example, a device for coding video data includes a memory and processing circuitry in communication with the memory. The memory of the device is configured to store video data. The processing circuitry is configured to determine that a plurality of derived modes (DMs) available for predicting a luma block of the video data stored to the memory are also available for predicting a chroma block of the video data stored to the memory, the chroma block corresponding to the luma block. The processing circuitry is further configured to form a candidate list of prediction modes for the chroma block, the candidate list including one or more DMs of the plurality of DMs that are available for predicting the chroma block. The processing circuitry is further configured to determine to code the chroma block using any DM of the one or more DMs of the candidate list, and to code, based on the determination to code the chroma block using any DM of the one or more DMs of the candidate list, an indication identifying a selected DM of the candidate list to be used for coding the chroma block. The processing circuitry is further configured to code the chroma block according to the selected DM of the candidate list.
In another example, a method of coding video data includes determining that a plurality of derived modes (DMs) available for predicting a luma block of the video data are also available for predicting a chroma block of the video data, the chroma block corresponding to the luma block. The method further includes forming a candidate list of prediction modes for the chroma block, the candidate list including one or more DMs of the plurality of DMs that are available for predicting the chroma block, and determining to code the chroma block using any DM of the one or more DMs of the candidate list. The method further includes coding, based on the determination to code the chroma block using any DM of the one or more DMs of the candidate list, an indication identifying a selected DM of the candidate list to be used for coding the chroma block, and coding the chroma block according to the selected DM of the candidate list.
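The method above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes the DMs are gathered as the luma intra modes found at a few positions of the corresponding luma block, that duplicates are dropped, and that the "indication" is a flag plus a list index. The function names are hypothetical; the mode numbers follow HEVC numbering (0 = planar, 10 = horizontal, 26 = vertical).

```python
def build_dm_candidate_list(luma_modes):
    """Collect the distinct luma intra modes (DMs), preserving the order given."""
    candidates = []
    for mode in luma_modes:
        if mode not in candidates:
            candidates.append(mode)
    return candidates


def encode_chroma_mode(candidates, chosen_mode):
    """Return (use_dm_flag, index); the index is meaningful only when the flag is True."""
    if chosen_mode in candidates:
        return True, candidates.index(chosen_mode)
    return False, None


def decode_chroma_mode(candidates, use_dm_flag, index):
    """Recover the selected DM from the signaled index."""
    if not use_dm_flag:
        raise ValueError("non-DM chroma modes are outside the scope of this sketch")
    return candidates[index]


# Hypothetical luma modes sampled at four positions of the corresponding luma block.
candidates = build_dm_candidate_list([26, 10, 26, 0])
flag, index = encode_chroma_mode(candidates, 10)
print(candidates)                                     # [26, 10, 0]
print(decode_chroma_mode(candidates, flag, index))    # 10
```

Because the encoder and decoder build the same list from already-coded luma information, only the flag and the index need to be signaled.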
In another example, an apparatus includes means for determining that a plurality of derived modes (DMs) available for predicting a luma block of video data are also available for predicting a chroma block of the video data, the chroma block corresponding to the luma block. The apparatus further includes means for forming a candidate list of prediction modes for the chroma block, the candidate list including one or more DMs of the plurality of DMs that are available for predicting the chroma block, and means for determining to code the chroma block using any DM of the one or more DMs of the candidate list. The apparatus further includes means for coding, based on the determination to code the chroma block using any DM of the one or more DMs of the candidate list, an indication identifying a selected DM of the candidate list to be used for coding the chroma block, and means for coding the chroma block according to the selected DM of the candidate list.
In another example, a non-transitory computer-readable storage medium is encoded with instructions that, when executed, cause a processor of a computing device to determine that a plurality of derived modes (DMs) available for predicting a luma block of video data are also available for predicting a chroma block of the video data, the chroma block corresponding to the luma block. The instructions, when executed, further cause the processor to form a candidate list of prediction modes for the chroma block, the candidate list including one or more DMs of the plurality of DMs that are available for predicting the chroma block, and to determine to code the chroma block using any DM of the one or more DMs of the candidate list. The instructions, when executed, further cause the processor to code, based on the determination to code the chroma block using any DM of the one or more DMs of the candidate list, an indication identifying a selected DM of the candidate list to be used for coding the chroma block, and to code the chroma block according to the selected DM of the candidate list.
In another example, a device for coding video data includes a memory and processing circuitry in communication with the memory. The memory of the device is configured to store video data. The processing circuitry is configured to form a most probable mode (MPM) candidate list for a chroma block of the video data stored to the memory, such that the MPM candidate list includes one or more derived modes (DMs) associated with a luma block of the video data that is associated with the chroma block, and a plurality of luma prediction modes available for coding luma components of the video data. The processing circuitry is further configured to select a mode from the MPM candidate list, and to code the chroma block according to the mode selected from the MPM candidate list.
In another example, a method of coding video data includes forming a most probable mode (MPM) candidate list for a chroma block of the video data, such that the MPM candidate list includes one or more derived modes (DMs) associated with a luma block of the video data that is associated with the chroma block, and a plurality of luma prediction modes available for coding luma components of the video data. The method further includes selecting a mode from the MPM candidate list, and coding the chroma block according to the mode selected from the MPM candidate list.
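A minimal sketch of the MPM-list method above, under stated assumptions: the DMs are placed first, the luma prediction modes follow, duplicates are pruned, and the list is capped at a fixed size. The ordering, the pruning rule, the cap, and the function name are all hypothetical choices for illustration; the mode numbers follow HEVC numbering.

```python
# HEVC intra mode numbers used for illustration.
PLANAR, DC, HORIZONTAL, VERTICAL = 0, 1, 10, 26


def build_chroma_mpm_list(dm_modes, luma_modes, max_size=6):
    """Form an MPM candidate list: DMs first, then luma prediction modes, without duplicates."""
    mpm = []
    for mode in list(dm_modes) + list(luma_modes):
        if mode not in mpm:
            mpm.append(mode)
        if len(mpm) == max_size:
            break
    return mpm


# Hypothetical inputs: two DMs from the associated luma block, four luma prediction modes.
mpm = build_chroma_mpm_list([VERTICAL, 18], [PLANAR, DC, VERTICAL, HORIZONTAL], max_size=5)
selected = mpm[2]  # the mode chosen from the list, e.g. by a signaled index

print(mpm)       # [26, 18, 0, 1, 10]
print(selected)  # 0
```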
In another example, an apparatus includes means for forming a most probable mode (MPM) candidate list for a chroma block of video data, such that the MPM candidate list includes one or more derived modes (DMs) associated with a luma block of the video data that is associated with the chroma block, and a plurality of luma prediction modes available for coding luma components of the video data. The apparatus further includes means for selecting a mode from the MPM candidate list, and means for coding the chroma block according to the mode selected from the MPM candidate list.
In another example, a non-transitory computer-readable storage medium is encoded with instructions that, when executed, cause a processor of a computing device to form a most probable mode (MPM) candidate list for a chroma block of the video data stored to the memory, such that the MPM candidate list includes one or more derived modes (DMs) associated with a luma block of the video data that is associated with the chroma block, and a plurality of luma prediction modes available for coding luma components of the video data. The instructions, when executed, further cause the processor of the computing device to select a mode from the MPM candidate list, and to code the chroma block according to the mode selected from the MPM candidate list.
The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.
Brief Description of the Drawings
FIG. 1 is a block diagram illustrating an example video encoding and decoding system that may be configured to perform the techniques of the present invention.
FIG. 2 is a block diagram illustrating an example of a video encoder that may be configured to perform the techniques of the present invention.
FIG. 3 is a block diagram illustrating an example of a video decoder that may be configured to perform the techniques of the present invention.
FIG. 4 is a conceptual diagram illustrating aspects of intra prediction.
FIG. 5 is a conceptual diagram illustrating intra prediction modes for a luma block.
FIG. 6 is a conceptual diagram illustrating aspects of the planar mode.
FIG. 7 is a conceptual diagram illustrating aspects of an angular mode according to HEVC.
FIG. 8 is a conceptual diagram illustrating an example of the nominal vertical and horizontal locations of luma samples and chroma samples in a picture.
FIG. 9 is a conceptual diagram illustrating the locations of the samples used to derive the parameters used in prediction according to the linear model (LM) mode.
FIG. 10 is a conceptual diagram illustrating a quadtree binary tree (QTBT) structure.
FIGS. 11A and 11B illustrate examples of independent partitioning structures for corresponding luma and chroma blocks according to the QTBT partitioning scheme.
FIGS. 12A and 12B illustrate neighboring block selection for adaptive ordering of chroma prediction modes, according to one or more aspects of the present invention.
FIGS. 13A and 13B are conceptual diagrams illustrating examples of block positions that video encoding and decoding devices may use to select a chroma intra prediction mode according to the multiple-DM-mode-selection-based techniques described above.
FIG. 14 is a flowchart illustrating an example process that processing circuitry of a video decoding device may perform, according to aspects of the present invention.
FIG. 15 is a flowchart illustrating an example process that processing circuitry of a video encoding device may perform, according to aspects of the present invention.
FIG. 16 is a flowchart illustrating an example process that processing circuitry of a video decoding device may perform, according to aspects of the present invention.
FIG. 17 is a flowchart illustrating an example process that processing circuitry of a video encoding device may perform, according to aspects of the present invention.
Detailed Description
FIG. 1 is a block diagram illustrating an example video encoding and decoding system 10 that may be configured to perform the techniques of the present invention regarding motion vector prediction. As shown in FIG. 1, system 10 includes a source device 12 that provides encoded video data to be decoded at a later time by a destination device 14. In particular, source device 12 provides the video data to destination device 14 via a computer-readable medium 16. Source device 12 and destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" pads, televisions, cameras, display devices, digital media players, video game consoles, video streaming devices, and the like. In some cases, source device 12 and destination device 14 may be equipped for wireless communication.
Destination device 14 may receive the encoded video data to be decoded via computer-readable medium 16. Computer-readable medium 16 may comprise any type of medium or device capable of moving the encoded video data from source device 12 to destination device 14. In one example, computer-readable medium 16 may comprise a communication medium that enables source device 12 to transmit encoded video data directly to destination device 14 in real time. The encoded video data may be modulated according to a communication standard, such as a wireless communication protocol, and transmitted to destination device 14. The communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 12 to destination device 14.
In some examples, encoded data may be output from output interface 22 to a storage device. Similarly, encoded data may be accessed from the storage device by an input interface. The storage device may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data. In a further example, the storage device may correspond to a file server or another intermediate storage device that may store the encoded video generated by source device 12. Destination device 14 may access stored video data from the storage device via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting that encoded video data to destination device 14. Example file servers include a web server (e.g., for a website), an FTP server, network attached storage (NAS) devices, or a local disk drive. Destination device 14 may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.
The techniques of the present invention are not necessarily limited to wireless applications or settings. The techniques may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications. In some examples, system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
In the example of FIG. 1, source device 12 includes video source 18, video encoder 20, and output interface 22. Destination device 14 includes input interface 28, video decoder 30, and display device 32. In accordance with the present invention, video encoder 20 of source device 12 may be configured to apply the techniques of the present invention regarding motion vector prediction. In other examples, a source device and a destination device may include other components or arrangements. For example, source device 12 may receive video data from an external video source 18, such as an external camera. Likewise, destination device 14 may interface with an external display device, rather than including an integrated display device.
The illustrated system 10 of FIG. 1 is merely one example. The techniques of the present invention regarding motion vector prediction may be performed by any digital video encoding and/or decoding device. Although the techniques of the present invention are generally performed by a video encoding device, the techniques may also be performed by a video encoder/decoder, typically referred to as a "CODEC." Moreover, the techniques of the present invention may also be performed by a video preprocessor. Source device 12 and destination device 14 are merely examples of such coding devices, in which source device 12 generates coded video data for transmission to destination device 14. In some examples, devices 12, 14 may operate in a substantially symmetrical manner such that each of devices 12, 14 includes video encoding and decoding components. Hence, system 10 may support one-way or two-way video transmission between video devices 12, 14, e.g., for video streaming, video playback, video broadcasting, or video telephony.
Video source 18 of source device 12 may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or a video feed interface to receive video from a video content provider. As a further alternative, video source 18 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. In some cases, if video source 18 is a video camera, source device 12 and destination device 14 may form so-called camera phones or video phones. As mentioned above, however, the techniques described in the present invention may be applicable to video coding in general, and may be applied to wireless and/or wired applications. In each case, the captured, pre-captured, or computer-generated video may be encoded by video encoder 20. The encoded video information may then be output by output interface 22 onto computer-readable medium 16.
Computer-readable medium 16 may include transient media, such as a wireless broadcast or wired network transmission, or storage media (that is, non-transitory storage media), such as a hard disk, flash drive, compact disc, digital video disc, Blu-ray disc, or other computer-readable media. In some examples, a network server (not shown) may receive encoded video data from source device 12 and provide the encoded video data to destination device 14, e.g., via network transmission. Similarly, a computing device of a medium production facility, such as a disc stamping facility, may receive encoded video data from source device 12 and produce a disc containing the encoded video data. Therefore, computer-readable medium 16 may be understood to include one or more computer-readable media of various forms, in various examples.
Input interface 28 of destination device 14 receives information from computer-readable medium 16. The information of computer-readable medium 16 may include syntax information defined by video encoder 20, which is also used by video decoder 30, that includes syntax elements that describe characteristics and/or processing of blocks and other coded units, e.g., GOPs. Display device 32 displays the decoded video data to a user, and may comprise any of a variety of display devices such as a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.
Video encoder 20 and video decoder 30 may operate according to a video coding standard, such as the High Efficiency Video Coding (HEVC) standard, extensions to the HEVC standard, or subsequent standards such as ITU-T H.266. Alternatively, video encoder 20 and video decoder 30 may operate according to other proprietary or industry standards, such as the ITU-T H.264 standard, alternatively referred to as MPEG-4, Part 10, Advanced Video Coding (AVC), or extensions of such standards. The techniques of this disclosure, however, are not limited to any particular coding standard. Other examples of video coding standards include MPEG-2 and ITU-T H.263. Although not shown in FIG. 1, in some aspects, video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder, and may include appropriate MUX-DEMUX units, or other hardware and software, to handle encoding of both audio and video in a common data stream or separate data streams. If applicable, MUX-DEMUX units may conform to the ITU H.223 multiplexer protocol, or other protocols such as the User Datagram Protocol (UDP).
Video encoder 20 and video decoder 30 each may be implemented as any of a variety of suitable encoder circuitry, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combination thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device.
Video coding standards include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. One joint draft of MVC is described in "Advanced video coding for generic audiovisual services," ITU-T Recommendation H.264, March 2010.
In addition, there is a newly developed video coding standard, namely High Efficiency Video Coding (HEVC), developed by the Joint Collaboration Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). A recent draft of HEVC is available from http://phenix.int-evry.fr/jct/doc_end_user/documents/12_Geneva/wg11/JCTVC-L1003-v34.zip. The HEVC standard is also presented jointly in Recommendation ITU-T H.265 and International Standard ISO/IEC 23008-2, both entitled "High efficiency video coding" and both published in October 2014.
The JCT-VC developed the HEVC standard. The HEVC standardization efforts are based on an evolving model of a video coding device referred to as the HEVC Test Model (HM). The HM presumes several additional capabilities of video coding devices relative to existing devices according to, e.g., ITU-T H.264/AVC. For example, whereas H.264 provides nine intra-prediction encoding modes, the HEVC HM may provide as many as thirty-three intra-prediction encoding modes.
In general, the working model of the HM describes that a video frame or picture may be divided into a sequence of treeblocks or largest coding units (LCUs) that include both luma and chroma samples. Syntax data within a bitstream may define a size for the LCU, which is the largest coding unit in terms of the number of pixels. A slice includes a number of consecutive treeblocks in coding order. A video frame or picture may be partitioned into one or more slices. Each treeblock may be split into coding units (CUs) according to a quadtree. In general, a quadtree data structure includes one node per CU, with a root node corresponding to the treeblock. If a CU is split into four sub-CUs, the node corresponding to the CU includes four leaf nodes, each of which corresponds to one of the sub-CUs.
Each node of the quadtree data structure may provide syntax data for the corresponding CU. For example, a node in the quadtree may include a split flag, indicating whether the CU corresponding to the node is split into sub-CUs. Syntax elements for a CU may be defined recursively and may depend on whether the CU is split into sub-CUs. If a CU is not split further, it is referred to as a leaf-CU. In this disclosure, four sub-CUs of a leaf-CU will also be referred to as leaf-CUs even if there is no explicit splitting of the original leaf-CU. For example, if a CU of size 16×16 is not split further, the four 8×8 sub-CUs will also be referred to as leaf-CUs although the 16×16 CU was never split.
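The recursive CU quadtree described above can be sketched in a few lines. This is a minimal illustration only, not the HM implementation: the names `CUNode`, `build_quadtree`, `leaf_cus`, and the `should_split` callback are hypothetical, and the split decision is supplied by the caller rather than derived from any rate-distortion analysis.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CUNode:
    """One quadtree node; hypothetical structure for illustration."""
    x: int
    y: int
    size: int
    split_flag: bool = False            # True if this CU is split into four sub-CUs
    children: List["CUNode"] = field(default_factory=list)


def build_quadtree(x: int, y: int, size: int, min_size: int,
                   should_split: Callable[[int, int, int], bool]) -> CUNode:
    """Recursively split a treeblock; never split below min_size (the SCU)."""
    node = CUNode(x, y, size)
    if size > min_size and should_split(x, y, size):
        node.split_flag = True
        half = size // 2
        node.children = [
            build_quadtree(x + dx, y + dy, half, min_size, should_split)
            for dy in (0, half) for dx in (0, half)
        ]
    return node


def leaf_cus(node: CUNode) -> List[CUNode]:
    """Collect the unsplit nodes (leaf-CUs) of the quadtree."""
    if not node.split_flag:
        return [node]
    leaves: List[CUNode] = []
    for child in node.children:
        leaves.extend(leaf_cus(child))
    return leaves


# Assumed demo rule: split the 64x64 treeblock, then only its top-left 32x32 CU.
def demo_rule(x: int, y: int, size: int) -> bool:
    return size == 64 or (size == 32 and x == 0 and y == 0)


root = build_quadtree(0, 0, 64, 8, demo_rule)
leaves = leaf_cus(root)
```

With the demo rule, the treeblock ends up with four 16×16 leaf-CUs and three 32×32 leaf-CUs, which together tile the full 64×64 area.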
A CU has a similar purpose as a macroblock of the H.264 standard, except that a CU does not have a size distinction. For example, a treeblock may be split into four child nodes (also referred to as sub-CUs), and each child node may in turn be a parent node and be split into another four child nodes. A final, unsplit child node, referred to as a leaf node of the quadtree, comprises a coding node, also referred to as a leaf-CU. Syntax data associated with a coded bitstream may define a maximum number of times a treeblock may be split, referred to as a maximum CU depth, and may also define a minimum size of the coding nodes. Accordingly, a bitstream may also define a smallest coding unit (SCU). This disclosure uses the term "block" to refer to any of a CU, PU, or TU in the context of HEVC, or similar data structures in the context of other standards (e.g., macroblocks and sub-blocks thereof in H.264/AVC).
A CU includes a coding node and prediction units (PUs) and transform units (TUs) associated with the coding node. A size of the CU corresponds to a size of the coding node and must be square in shape. The size of the CU may range from 8×8 pixels up to the size of the treeblock, with a maximum of 64×64 pixels or greater. Each CU may contain one or more PUs and one or more TUs. Syntax data associated with a CU may describe, for example, partitioning of the CU into one or more PUs. Partitioning modes may differ depending on whether the CU is skip or direct mode encoded, intra-prediction mode encoded, or inter-prediction mode encoded. PUs may be partitioned to be non-square in shape. Syntax data associated with a CU may also describe, for example, partitioning of the CU into one or more TUs according to a quadtree. A TU can be square or non-square (e.g., rectangular) in shape.
The HEVC standard allows for transformations according to TUs, which may be different for different CUs. The TUs are typically sized based on the size of PUs within a given CU defined for a partitioned LCU, although this may not always be the case. The TUs are typically the same size as or smaller than the PUs. In some examples, residual samples corresponding to a CU may be subdivided into smaller units using a quadtree structure known as a "residual quadtree" (RQT). The leaf nodes of the RQT may be referred to as transform units (TUs). Pixel difference values associated with the TUs may be transformed to produce transform coefficients, which may be quantized.
A leaf-CU may include one or more prediction units (PUs). In general, a PU represents a spatial area corresponding to all or a portion of the corresponding CU, and may include data for retrieving a reference sample for the PU. Moreover, a PU includes data related to prediction. For example, when the PU is intra-mode encoded, data for the PU may be included in a residual quadtree (RQT), which may include data describing an intra-prediction mode for a TU corresponding to the PU. As another example, when the PU is inter-mode encoded, the PU may include data defining one or more motion vectors for the PU. The data defining the motion vector for a PU may describe, for example, a horizontal component of the motion vector, a vertical component of the motion vector, a resolution for the motion vector (e.g., one-quarter pixel precision or one-eighth pixel precision), a reference picture to which the motion vector points, and/or a reference picture list (e.g., List 0, List 1, or List C) for the motion vector.
A leaf-CU having one or more PUs may also include one or more transform units (TUs). The transform units may be specified using an RQT (also referred to as a TU quadtree structure), as discussed above. For example, a split flag may indicate whether a leaf-CU is split into four transform units. Then, each transform unit may be split further into further sub-TUs. When a TU is not split further, it may be referred to as a leaf-TU. Generally, for intra coding, all the leaf-TUs belonging to a leaf-CU share the same intra-prediction mode. That is, the same intra-prediction mode is generally applied to calculate predicted values for all TUs of a leaf-CU. For intra coding, a video encoder may calculate a residual value for each leaf-TU using the intra-prediction mode, as a difference between the portion of the CU corresponding to the TU and the original block. A TU is not necessarily limited to the size of a PU. Thus, TUs may be larger or smaller than a PU. For intra coding, a PU may be collocated with a corresponding leaf-TU for the same CU. In some examples, the maximum size of a leaf-TU may correspond to the size of the corresponding leaf-CU.
Moreover, TUs of leaf-CUs may also be associated with respective quadtree data structures, referred to as residual quadtrees (RQTs). That is, a leaf-CU may include a quadtree indicating how the leaf-CU is partitioned into TUs. The root node of a TU quadtree generally corresponds to a leaf-CU, while the root node of a CU quadtree generally corresponds to a treeblock (or LCU). TUs of the RQT that are not split are referred to as leaf-TUs. In general, this disclosure uses the terms CU and TU to refer to a leaf-CU and a leaf-TU, respectively, unless noted otherwise.
A video sequence typically includes a series of video frames or pictures. A group of pictures (GOP) generally comprises a series of one or more of the video pictures. A GOP may include syntax data in a header of the GOP, a header of one or more of the pictures, or elsewhere, that describes a number of pictures included in the GOP. Each slice of a picture may include slice syntax data that describes an encoding mode for the respective slice. Video encoder 20 typically operates on video blocks within individual video slices in order to encode the video data. A video block may correspond to a coding node within a CU. The video blocks may have fixed or varying sizes, and may differ in size according to a specified coding standard.
As an example, the HM supports prediction in various PU sizes. Assuming that the size of a particular CU is 2N×2N, the HM supports intra-prediction in PU sizes of 2N×2N or N×N (in the case of an 8×8 CU), and inter-prediction in symmetric PU sizes of 2N×2N, 2N×N, N×2N, or N×N. The HM also supports asymmetric partitioning for inter-prediction in PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N. In asymmetric partitioning, one direction of a CU is not partitioned, while the other direction is partitioned into 25% and 75%. The portion of the CU corresponding to the 25% partition is indicated by an "n" followed by an indication of "Up," "Down," "Left," or "Right." Thus, for example, "2N×nU" refers to a 2N×2N CU that is partitioned horizontally with a 2N×0.5N PU on top and a 2N×1.5N PU on bottom.
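The partition shapes listed above follow directly from the CU size. The sketch below tabulates them; the function name `pu_partitions` and the string keys are hypothetical labels chosen for illustration, and sizes are returned as (width, height) pairs in top-to-bottom or left-to-right order.

```python
def pu_partitions(mode: str, cu_size: int):
    """Return the (width, height) of each PU for a 2Nx2N CU of side cu_size."""
    full = cu_size           # 2N
    half = cu_size // 2      # N
    quarter = cu_size // 4   # 0.5N, the 25% portion of an asymmetric split
    table = {
        "2Nx2N": [(full, full)],
        "2NxN":  [(full, half), (full, half)],
        "Nx2N":  [(half, full), (half, full)],
        "NxN":   [(half, half)] * 4,
        "2NxnU": [(full, quarter), (full, full - quarter)],   # small PU on top
        "2NxnD": [(full, full - quarter), (full, quarter)],   # small PU on bottom
        "nLx2N": [(quarter, full), (full - quarter, full)],   # small PU on left
        "nRx2N": [(full - quarter, full), (quarter, full)],   # small PU on right
    }
    return table[mode]


# A 32x32 CU (N = 16) in 2NxnU mode: a 32x8 PU above a 32x24 PU.
example = pu_partitions("2NxnU", 32)
```

Every mode covers exactly the CU area, which is a quick sanity check on the table.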
In this disclosure, "N×N" and "N by N" may be used interchangeably to refer to the pixel dimensions of a video block in terms of vertical and horizontal dimensions, e.g., 16×16 pixels or 16 by 16 pixels. In general, a 16×16 block will have 16 pixels in a vertical direction (y=16) and 16 pixels in a horizontal direction (x=16). Likewise, an N×N block generally has N pixels in a vertical direction and N pixels in a horizontal direction, where N represents a nonnegative integer value. The pixels in a block may be arranged in rows and columns. Moreover, blocks need not necessarily have the same number of pixels in the horizontal direction as in the vertical direction. For example, blocks may comprise N×M pixels, where M is not necessarily equal to N.
Following intra-predictive or inter-predictive coding using the PUs of a CU, video encoder 20 may calculate residual data for the TUs of the CU. The PUs may comprise syntax data describing a method or mode of generating predictive pixel data in the spatial domain (also referred to as the pixel domain), and the TUs may comprise coefficients in the transform domain following application of a transform, e.g., a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform, to residual video data. The residual data may correspond to pixel differences between pixels of the unencoded picture and prediction values corresponding to the PUs. Video encoder 20 may form the TUs including the residual data for the CU, and then transform the TUs to produce transform coefficients for the CU.
Following any transforms to produce transform coefficients, video encoder 20 may perform quantization of the transform coefficients. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the coefficients, providing further compression. The quantization process may reduce the bit depth associated with some or all of the coefficients. For example, an n-bit value may be rounded down to an m-bit value during quantization, where n is greater than m.
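The bit-depth reduction mentioned above can be illustrated with a right shift, which discards the n − m least significant bits. This is a simplified sketch under the assumption of unsigned values; the helper name `round_down_bits` is hypothetical, and a practical quantizer would also involve a quantization step size and rounding offsets.

```python
def round_down_bits(value: int, n: int, m: int) -> int:
    """Round an unsigned n-bit value down to an m-bit value (requires n > m > 0)."""
    if not 0 < m < n:
        raise ValueError("expected 0 < m < n")
    if not 0 <= value < (1 << n):
        raise ValueError("value does not fit in n bits")
    return value >> (n - m)   # drop the (n - m) least significant bits


# 8-bit 0b10110111 (183) becomes 4-bit 0b1011 (11).
reduced = round_down_bits(0b10110111, 8, 4)
```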
Following quantization, the video encoder may scan the transform coefficients, producing a one-dimensional vector from the two-dimensional matrix including the quantized transform coefficients. The scan may be designed to place higher energy (and therefore lower frequency) coefficients at the front of the array and to place lower energy (and therefore higher frequency) coefficients at the back of the array. In some examples, video encoder 20 may utilize a predefined scan order to scan the quantized transform coefficients to produce a serialized vector that can be entropy encoded. In other examples, video encoder 20 may perform an adaptive scan. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy encode the one-dimensional vector, e.g., according to context-adaptive variable length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), Probability Interval Partitioning Entropy (PIPE) coding, or another entropy encoding methodology. Video encoder 20 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 30 in decoding the video data.
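As one possible predefined scan order, the sketch below serializes a block with a classic zigzag scan, which visits low-frequency (top-left) positions first. The zigzag pattern is an assumption chosen for illustration; the text does not name a specific scan order, and the function name `zigzag_scan` is hypothetical.

```python
def zigzag_scan(block):
    """Serialize a 2-D coefficient block into a 1-D list in zigzag order."""
    rows, cols = len(block), len(block[0])

    def order(pos):
        r, c = pos
        diag = r + c
        # Odd anti-diagonals run top-to-bottom, even ones bottom-to-top.
        return (diag, r if diag % 2 else -r)

    positions = sorted(((r, c) for r in range(rows) for c in range(cols)), key=order)
    return [block[r][c] for r, c in positions]


quantized = [
    [9, 5, 1, 0],
    [4, 2, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
serialized = zigzag_scan(quantized)
```

For this block the nonzero coefficients cluster at the front of the vector and the trailing entries are all zero, which is the property an entropy coder can exploit.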
To perform CABAC, video encoder 20 may assign a context within a context model to a symbol to be transmitted. The context may relate to, for example, whether neighboring values of the symbol are non-zero. To perform CAVLC, video encoder 20 may select a variable length code for a symbol to be transmitted.
Codewords in VLC may be constructed such that relatively shorter codes correspond to more probable symbols, while longer codes correspond to less probable symbols. In this way, the use of VLC may achieve a bit savings over, for example, using equal-length codewords for each symbol to be transmitted. The probability determination may be based on a context assigned to the symbol.
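One well-known way to obtain codewords with this property is Huffman's construction, sketched below. This is offered only as an illustration of the shorter-code-for-likelier-symbol principle; it is not the CAVLC code tables of any standard, and the function name `huffman_code` and the example probabilities are assumptions.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Build a prefix code mapping each symbol to a bit string."""
    tiebreak = count()   # unique counter so the heap never compares dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)   # two least probable subtrees
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
average_bits = sum(probs[s] * len(code[s]) for s in probs)
```

For these probabilities the average codeword length is 1.75 bits per symbol, compared with 2 bits for a fixed-length code over four symbols.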
Video encoder 20 and/or video decoder 30 may implement one or more of the techniques of this disclosure. For example, video encoder 20 and/or video decoder 30 may use affine models in motion estimation and compensation.
FIG. 2 is a block diagram illustrating an example of video encoder 20 that may be configured to perform the techniques of this disclosure for motion vector prediction. Video encoder 20 may perform intra- and inter-coding of video blocks within video slices. Intra-coding relies on spatial prediction to reduce or remove spatial redundancy in video within a given video frame or picture. Inter-coding relies on temporal prediction to reduce or remove temporal redundancy in video within adjacent frames or pictures of a video sequence. Intra-mode (I mode) may refer to any of several spatial-based coding modes. Inter-modes, such as uni-directional prediction (P mode) or bi-prediction (B mode), may refer to any of several temporal-based coding modes.
As shown in FIG. 2, video encoder 20 receives a current video block within a video slice to be encoded. In the example of FIG. 2, video encoder 20 includes mode select unit 40, reference picture memory 64, summer 50, transform processing unit 52, quantization unit 54, and entropy encoding unit 56. Mode select unit 40, in turn, includes motion compensation unit 44, motion estimation unit 42, intra-prediction unit 46, and partition unit 48. For video block reconstruction, video encoder 20 also includes inverse quantization unit 58, inverse transform unit 60, and summer 62. A deblocking filter (not shown in FIG. 2) may also be included to filter block boundaries to remove blockiness artifacts from reconstructed video. If desired, the deblocking filter would typically filter the output of summer 62. Additional filters (in loop or post loop) may also be used in addition to the deblocking filter. Such filters are not shown for brevity, but if desired, may filter the output of summer 50 (as an in-loop filter).
During the encoding process, video encoder 20 receives a video frame or slice to be coded. The frame or slice may be divided into multiple video blocks. Motion estimation unit 42 and motion compensation unit 44 perform inter-predictive coding of the received video block relative to one or more blocks in one or more reference frames to provide temporal prediction. Intra-prediction unit 46 may alternatively perform intra-predictive coding of the received video block relative to one or more neighboring blocks in the same frame or slice as the block to be coded to provide spatial prediction. Video encoder 20 may perform multiple coding passes, e.g., to select an appropriate coding mode for each block of video data.
Moreover, partition unit 48 may partition blocks of video data into sub-blocks, based on evaluation of previous partitioning schemes in previous coding passes. For example, partition unit 48 may initially partition a frame or slice into LCUs, and partition each of the LCUs into sub-CUs based on rate-distortion analysis (e.g., rate-distortion optimization). Mode select unit 40 may further produce a quadtree data structure indicative of partitioning of an LCU into sub-CUs. Leaf-node CUs of the quadtree may include one or more PUs and one or more TUs.
Mode select unit 40 may select one of the coding modes, intra or inter, e.g., based on error results, and provide the resulting intra- or inter-coded block to summer 50 to generate residual block data and to summer 62 to reconstruct the encoded block for use as a reference frame. Mode select unit 40 also provides syntax elements, such as motion vectors, intra-mode indicators, partition information, and other such syntax information, to entropy encoding unit 56.
Motion estimation unit 42 and motion compensation unit 44 may be highly integrated, but are illustrated separately for conceptual purposes. Motion estimation, performed by motion estimation unit 42, is the process of generating motion vectors, which estimate motion for video blocks. A motion vector, for example, may indicate the displacement of a PU of a video block within a current video frame or picture relative to a predictive block within a reference picture (or other coded unit) relative to the current block being coded within the current picture (or other coded unit). A predictive block is a block that is found to closely match the block to be coded in terms of pixel difference, which may be determined by sum of absolute differences (SAD), sum of squared differences (SSD), or other difference metrics. In some examples, video encoder 20 may calculate values for sub-integer pixel positions of reference pictures stored in reference picture memory 64. For example, video encoder 20 may interpolate values of one-quarter pixel positions, one-eighth pixel positions, or other fractional pixel positions of the reference picture. Therefore, motion estimation unit 42 may perform a motion search relative to the full pixel positions and fractional pixel positions and output a motion vector with fractional pixel precision.
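The block-matching idea above can be sketched as an exhaustive integer-pixel search that minimizes SAD. This is a minimal illustration under stated assumptions: frames are plain lists of rows, the search is limited to full pixel positions (fractional-pixel interpolation is omitted), and the names `sad`, `extract`, and `full_search` are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def extract(frame, x, y, width, height):
    """Copy a width x height block whose top-left corner is (x, y)."""
    return [row[x:x + width] for row in frame[y:y + height]]


def full_search(cur_block, ref_frame, x0, y0, search_range):
    """Return (best_sad, dx, dy) over a square window around (x0, y0)."""
    height, width = len(cur_block), len(cur_block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = x0 + dx, y0 + dy
            # Skip candidates that fall outside the reference picture.
            if x < 0 or y < 0 or x + width > len(ref_frame[0]) or y + height > len(ref_frame):
                continue
            cost = sad(cur_block, extract(ref_frame, x, y, width, height))
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Reference picture: zeros except a distinctive 2x2 patch at (x=3, y=4).
ref = [[0] * 8 for _ in range(8)]
ref[4][3], ref[4][4] = 10, 20
ref[5][3], ref[5][4] = 30, 40
current_block = [[10, 20], [30, 40]]   # located at (2, 2) in the current picture
best_match = full_search(current_block, ref, 2, 2, 2)
```

Here the search finds a perfect match (SAD of 0) at displacement (dx, dy) = (1, 2), which would serve as the integer-precision motion vector.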
Motion estimation unit 42 calculates a motion vector for a PU of a video block in an inter-coded slice by comparing the position of the PU to the position of a predictive block of a reference picture. The reference picture may be selected from a first reference picture list (List 0) or a second reference picture list (List 1), each of which identifies one or more reference pictures stored in reference picture memory 64. Motion estimation unit 42 sends the calculated motion vector to entropy encoding unit 56 and motion compensation unit 44.
Motion compensation, performed by motion compensation unit 44, may involve fetching or generating the predictive block based on the motion vector determined by motion estimation unit 42. Again, motion estimation unit 42 and motion compensation unit 44 may be functionally integrated, in some examples. Upon receiving the motion vector for the PU of the current video block, motion compensation unit 44 may locate the predictive block to which the motion vector points in one of the reference picture lists. Summer 50 forms a residual video block by subtracting pixel values of the predictive block from the pixel values of the current video block being coded, forming pixel difference values, as discussed below. In general, motion estimation unit 42 performs motion estimation relative to luma components, and motion compensation unit 44 uses motion vectors calculated based on the luma components for both chroma components and luma components. Mode select unit 40 may also generate syntax elements associated with the video blocks and the video slice for use by video decoder 30 in decoding the video blocks of the video slice.
Video encoder 20 may be configured to perform any of the various techniques of this disclosure discussed above with respect to FIG. 1, and as will be described in more detail below. For example, motion compensation unit 44 may be configured to code motion information for a block of video data using AMVP or merge mode in accordance with the techniques of this disclosure.
Assuming that motion compensation unit 44 elects to perform merge mode, motion compensation unit 44 may form a candidate list including a set of merge candidates. Motion compensation unit 44 may add candidates to the candidate list based on a particular, predetermined order. Motion compensation unit 44 may also add additional candidates and perform pruning of the candidate list, as discussed above. Ultimately, mode select unit 40 may determine which of the candidates is to be used to encode motion information of the current block, and encode a merge index representing the selected candidate.
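The candidate-list steps above (add in a predetermined order, append additional candidates, prune duplicates, then signal an index) might look roughly like the following. This is a hedged sketch, not the candidate derivation of any particular standard: the tuple layout `(mv_x, mv_y, ref_idx)`, the use of `None` for an unavailable neighbor, and the name `build_merge_list` are all assumptions made for illustration.

```python
def build_merge_list(ordered_candidates, additional_candidates, max_candidates):
    """Build a pruned merge candidate list, preserving the given order."""
    merge_list = []
    for cand in list(ordered_candidates) + list(additional_candidates):
        if len(merge_list) >= max_candidates:
            break
        if cand is None:          # neighbor unavailable (assumed convention)
            continue
        if cand in merge_list:    # pruning: skip identical motion information
            continue
        merge_list.append(cand)
    return merge_list


# Hypothetical motion information as (mv_x, mv_y, ref_idx).
neighbors = [(1, 0, 0), None, (1, 0, 0), (2, -1, 0)]
extras = [(0, 0, 0)]
candidates = build_merge_list(neighbors, extras, max_candidates=5)

# The encoder would signal the position of the chosen candidate in the list.
merge_index = candidates.index((2, -1, 0))
```

Because the decoder can rebuild the same list from already decoded data, only the index needs to be transmitted rather than the full motion information.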
Intra-prediction unit 46 may intra-predict a current block, as an alternative to the inter-prediction performed by motion estimation unit 42 and motion compensation unit 44, as described above. In particular, intra-prediction unit 46 may determine an intra-prediction mode to use to encode a current block. In some examples, intra-prediction unit 46 may encode a current block using various intra-prediction modes, e.g., during separate encoding passes, and intra-prediction unit 46 (or mode select unit 40, in some examples) may select an appropriate intra-prediction mode to use from the tested modes.
For example, intra-prediction unit 46 may calculate rate-distortion values using a rate-distortion analysis for the various tested intra-prediction modes, and select the intra-prediction mode having the best rate-distortion characteristics among the tested modes. Rate-distortion analysis generally determines an amount of distortion (or error) between an encoded block and an original, unencoded block that was encoded to produce the encoded block, as well as the bit rate (that is, the number of bits) used to produce the encoded block. Intra-prediction unit 46 may calculate ratios from the distortions and rates for the various encoded blocks to determine which intra-prediction mode exhibits the best rate-distortion value for the block.
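A common way to combine distortion and rate into a single figure of merit is the Lagrangian cost J = D + λ·R, with the mode of lowest cost selected. The sketch below uses that formulation as an assumption; the text only says that a value is computed from the distortions and rates, and the multiplier `lam`, the mode names, and the numbers are hypothetical.

```python
def select_intra_mode(results, lam):
    """Pick the mode minimizing J = distortion + lam * bits.

    results maps a mode name to a (distortion, bits) pair measured
    by encoding the block with that mode.
    """
    return min(results, key=lambda mode: results[mode][0] + lam * results[mode][1])


# Hypothetical measurements for three tested modes.
tested = {
    "planar":     (120, 10),
    "dc":         (150, 6),
    "angular_26": (90, 25),
}
choice_low_lambda = select_intra_mode(tested, lam=4)    # costs: 160, 174, 190
choice_high_lambda = select_intra_mode(tested, lam=10)  # costs: 220, 210, 340
```

A larger multiplier penalizes bits more heavily, so the selection shifts from "planar" to the cheaper "dc" mode in this example.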
在针对块选择帧内预测模式之后,帧内预测单元46可将指示用于块的所选帧内预测模式的信息提供至熵编码单元56。熵编码单元56可编码指示所选帧内预测模式的信息。视频编码器20可在经发射位流中包含以下各者:配置数据,其可包含多个帧内预测模式索引表及多个经修改帧内预测模式索引表(也被称作码字映射表);各种块的编码上下文的定义;及待用于所述上下文中的每一者的最可能帧内预测模式、帧内预测模式索引表及经修改帧内预测模式索引表的指示。After selecting an intra-prediction mode for a block, the intra-prediction unit 46 may provide information indicating the selected intra-prediction mode for the block to the entropy coding unit 56. The entropy coding unit 56 may encode the information indicating the selected intra-prediction mode. The video encoder 20 may include the following in the transmitted bitstream: configuration data, which may include multiple intra-prediction mode index tables and multiple modified intra-prediction mode index tables (also called codeword maps); definitions of the coding contexts for various blocks; and indications of the most probable intra-prediction mode to be used for each of the contexts, the intra-prediction mode index tables, and the modified intra-prediction mode index tables.
Video encoder 20 forms a residual video block by subtracting the prediction data from mode selection unit 40 from the original video block being coded. Summer 50 represents the component or components that perform this subtraction operation. Transform processing unit 52 applies a transform, such as a discrete cosine transform (DCT) or a conceptually similar transform, to the residual block, producing a video block comprising residual transform coefficient values. Transform processing unit 52 may perform other transforms that are conceptually similar to DCT. Wavelet transforms, integer transforms, sub-band transforms, or other types of transforms could also be used.
在任何情况下,变换处理单元52将变换应用于残余块,从而产生残余变换系数块。变换可将残余信息从像素值域转换至变换域,例如频域。变换处理单元52可将所得变换系数发送至量化单元54。量化单元54量化变换系数以进一步减小位率。量化过程可减小与系数中的一些或全部相关联的位深度。量化程度可通过调整量化参数来修改。在一些实例中,量化单元54可接着执行对包含经量化变换系数的矩阵的扫描。替代地,熵编码单元56可执行扫描。In any case, transform processing unit 52 applies a transform to the residual block, thereby producing a residual transform coefficient block. The transform can convert residual information from the pixel value domain to the transform domain, such as the frequency domain. Transform processing unit 52 can send the resulting transform coefficients to quantization unit 54. Quantization unit 54 quantizes the transform coefficients to further reduce the bit rate. The quantization process can reduce the bit depth associated with some or all of the coefficients. The degree of quantization can be modified by adjusting the quantization parameters. In some instances, quantization unit 54 may then perform a scan of a matrix containing the quantized transform coefficients. Alternatively, entropy coding unit 56 may perform the scan.
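As a rough illustration of how a quantization parameter controls the degree of quantization, the sketch below uses the H.264/HEVC-style relation in which the step size doubles every 6 QP values. That relation, the rounding rule, and all names are assumptions for illustration; the text above does not specify them.

```python
def qstep(qp):
    """Quantization step size; doubles every 6 QP (assumed H.264/HEVC-style relation)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Uniform scalar quantization with round-half-away-from-zero."""
    step = qstep(qp)
    return [int(abs(c) / step + 0.5) * (1 if c >= 0 else -1) for c in coeffs]


def dequantize(levels, qp):
    """Inverse quantization: scale the levels back by the step size."""
    step = qstep(qp)
    return [lv * step for lv in levels]


coeffs = [100, -37, 5, 0]
levels = quantize(coeffs, 22)      # step size 8
print(levels, dequantize(levels, 22))
```

Raising the QP enlarges the step, so more coefficients collapse to zero and fewer bits are needed, at the cost of a larger reconstruction error.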
After quantization, entropy coding unit 56 entropy codes the quantized transform coefficients. For example, entropy coding unit 56 may perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding technique. In the case of context-based entropy coding, the context may be based on neighboring blocks. Following the entropy coding by entropy coding unit 56, the encoded bitstream may be transmitted to another device (e.g., video decoder 30) or archived for later transmission or retrieval.
Inverse quantization unit 58 and inverse transform unit 60 apply inverse quantization and inverse transformation, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block. Motion compensation unit 44 may calculate a reference block by adding the residual block to a predictive block of one of the frames of reference picture memory 64. Motion compensation unit 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for use in motion estimation. Summer 62 adds the reconstructed residual block to the motion-compensated prediction block produced by motion compensation unit 44 to produce a reconstructed video block for storage in reference picture memory 64. The reconstructed video block may be used by motion estimation unit 42 and motion compensation unit 44 as a reference block to inter-code a block in a subsequent video frame.
图3为绘示可经配置以执行本发明的运动向量预测技术的视频解码器30的实例的框图。在图3的实例中,视频解码器30包括熵解码单元70、运动补偿单元72、帧内预测单元74、反量化单元76、反变换单元78、参考图片存储器82及求和器80。在一些实例中,视频解码器30可执行与关于视频编码器20(图2)所描述的编码遍次大体上互逆的解码遍次。运动补偿单元72可基于从熵解码单元70接收的运动向量来产生预测数据,而帧内预测单元74可基于从熵解码单元70接收的帧内预测模式指示符来产生预测数据。Figure 3 is a block diagram illustrating an example of a video decoder 30 that can be configured to perform the motion vector prediction technique of the present invention. In the example of Figure 3, the video decoder 30 includes an entropy decoding unit 70, a motion compensation unit 72, an intra-frame prediction unit 74, an inverse quantization unit 76, an inverse transform unit 78, a reference image memory 82, and a summer 80. In some instances, the video decoder 30 may perform a decoding pass that is substantially the inverse of the encoding pass described with respect to the video encoder 20 (Figure 2). The motion compensation unit 72 may generate prediction data based on motion vectors received from the entropy decoding unit 70, while the intra-frame prediction unit 74 may generate prediction data based on intra-frame prediction mode indicators received from the entropy decoding unit 70.
During the decoding process, video decoder 30 receives from video encoder 20 an encoded video bitstream that represents video blocks of an encoded video slice and associated syntax elements. Entropy decoding unit 70 of video decoder 30 entropy decodes the bitstream to generate quantized coefficients, motion vectors or intra-prediction mode indicators, and other syntax elements. Entropy decoding unit 70 forwards the motion vectors and other syntax elements to motion compensation unit 72. Video decoder 30 may receive the syntax elements at the video slice level and/or the video block level.
When a video slice is coded as an intra-coded (I) slice, intra-prediction processing unit 74 may generate prediction data for a video block of the current video slice based on a signaled intra-prediction mode and data from previously decoded blocks of the current frame or picture. When the video frame is coded as an inter-coded (i.e., B, P, or GPB) slice, motion compensation unit 72 generates predictive blocks for a video block of the current video slice based on the motion vectors and other syntax elements received from entropy decoding unit 70. The predictive blocks may be produced from one of the reference pictures within one of the reference picture lists. Video decoder 30 may construct the reference frame lists, List 0 and List 1, using default construction techniques based on reference pictures stored in reference picture memory 82.
Motion compensation unit 72 determines prediction information for a video block of the current video slice by parsing the motion vectors and other syntax elements, and uses the prediction information to produce the predictive blocks for the current video block being decoded. For example, motion compensation unit 72 uses some of the received syntax elements to determine a prediction mode (e.g., intra- or inter-prediction) used to code the video blocks of the video slice, an inter-prediction slice type (e.g., B slice or P slice), construction information for one or more of the reference picture lists for the slice, motion vectors for each inter-encoded video block of the slice, inter-prediction status for each inter-coded video block of the slice, and other information to decode the video blocks in the current video slice.
Motion compensation unit 72 may also perform interpolation based on interpolation filters. Motion compensation unit 72 may use the interpolation filters used by video encoder 20 during encoding of the video blocks to calculate interpolated values for sub-integer pixels of reference blocks. In this case, motion compensation unit 72 may determine the interpolation filters used by video encoder 20 from the received syntax elements and use those interpolation filters to produce predictive blocks.
视频解码器30可经配置以执行上文关于图1所论述的本发明的各种技术中的任一者,如下文将更详细地所论述。举例来说,运动补偿单元72可经配置以确定根据本发明的技术使用AMVP或合并模式来执行运动向量预测。熵解码单元70可解码表示运动信息如何用于当前块的译码的一或多个语法元素。The video decoder 30 can be configured to perform any of the various techniques of the invention discussed above with respect to FIG. 1, as will be discussed in more detail below. For example, the motion compensation unit 72 can be configured to determine whether to perform motion vector prediction using AMVP or merging mode according to the technique of the invention. The entropy decoding unit 70 can decode one or more syntax elements representing how motion information is used for decoding the current block.
Assuming that the syntax elements indicate that merge mode is performed, motion compensation unit 72 may form a candidate list that includes a set of merge candidates. Motion compensation unit 72 may add candidates to the candidate list in a particular, predetermined order. As discussed above, motion compensation unit 72 may also add additional candidates and prune the candidate list. Finally, motion compensation unit 72 may decode a merge index representing which of the candidates is used to code the motion information of the current block.
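The list construction described above (ordered insertion, pruning of duplicates, additional candidates, then selection by merge index) might be sketched as follows. The candidate representation, the list size of 5, and the use of zero-motion candidates as the "additional candidates" are assumptions for illustration only.

```python
def build_merge_list(spatial, temporal, max_candidates=5):
    """Build a merge candidate list of (mv_x, mv_y, ref_idx) tuples.

    Candidates are visited in a fixed order (spatial first, then temporal).
    Unavailable neighbors are given as None, and exact duplicates are pruned.
    """
    merged = []
    for cand in list(spatial) + list(temporal):
        if cand is None or cand in merged:   # skip unavailable, prune duplicates
            continue
        merged.append(cand)
        if len(merged) == max_candidates:
            return merged
    # Fill the remaining slots with zero-motion candidates (assumed padding rule).
    ref_idx = 0
    while len(merged) < max_candidates:
        zero = (0, 0, ref_idx)
        if zero not in merged:
            merged.append(zero)
        ref_idx += 1
    return merged


spatial = [(1, 0, 0), None, (1, 0, 0), (2, -1, 0)]
temporal = [(0, 3, 1)]
merge_list = build_merge_list(spatial, temporal)
merge_idx = 2                      # the value a decoder would parse from the bitstream
print(merge_list, merge_list[merge_idx])
```

Because encoder and decoder build the list with the same rule, only the index needs to be signaled.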
Inverse quantization unit 76 inverse quantizes (i.e., de-quantizes) the quantized transform coefficients provided in the bitstream and entropy decoded by entropy decoding unit 70. The inverse quantization process may include use of a quantization parameter $QP_Y$ calculated by video decoder 30 for each video block in the video slice to determine a degree of quantization and, likewise, a degree of inverse quantization that should be applied.
反变换单元78将反变换(例如,反DCT、反整数变换或在概念上类似的反变换过程)应用于变换系数,以便在像素域中产生残余块。The inverse transform unit 78 applies an inverse transform (e.g., inverse DCT, inverse integer transform, or a conceptually similar inverse transform process) to the transform coefficients in order to produce a residual block in the pixel domain.
在运动补偿单元72基于运动向量及其它语法元素产生当前视频块的预测性块之后,视频解码器30通过对来自反变换单元78的残余块与由运动补偿单元72产生的对应预测性块求和而形成经解码视频块。求和器80表示执行此求和运算的一或多个组件。必要时,还可应用解块滤波器对经解码块进行滤波以便移除块效应假影。还可使用其它环路滤波器(在译码环路内或在译码环路后)以使像素转变平滑,或以其它方式改善视频质量。接着将给定帧或图片中的经解码视频块存储于参考图片存储器82中,所述参考图片存储器存储用于后续运动补偿的参考图片。参考图片存储器82还存储经解码视频以用于稍后在显示装置(例如图1的显示装置32)上呈现。After the motion compensation unit 72 generates a predictive block for the current video block based on motion vectors and other syntax elements, the video decoder 30 forms a decoded video block by summing the residual block from the inverse transform unit 78 with the corresponding predictive block generated by the motion compensation unit 72. The summer 80 represents one or more components performing this summation operation. If necessary, a deblocking filter can be applied to the decoded block to remove block artifacts. Other loop filters (within or after the decoding loop) can also be used to smooth pixel transitions or otherwise improve video quality. The decoded video block in a given frame or image is then stored in a reference image memory 82, which stores reference images for subsequent motion compensation. The reference image memory 82 also stores the decoded video for later presentation on a display device (e.g., display device 32 of FIG. 1).
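The summation performed by summer 80 can be illustrated with a short sketch. Clipping the sum to the valid sample range is an assumption added here (the text describes only the summation), and the function name and the 8-bit default are hypothetical.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add a residual block to a predictive block, sample by sample.

    The result is clipped to [0, 2**bit_depth - 1] (assumed; not stated in the text).
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


pred = [[100, 250], [3, 128]]
resid = [[5, 10], [-7, 0]]
print(reconstruct_block(pred, resid))
```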
图4为绘示帧内预测的方面的概念图。视频编码器20及/或视频解码器30可实施帧内预测以通过使用块的空间相邻经重构图像样本来执行图像块预测。用于16×16图像块的帧内预测的典型实例展示于图4中。如图4中所绘示,通过帧内预测,16×16图像块(呈实线正方形)是根据沿着选定预测方向(如箭头所指示)位于最近上方行及左边列中的上方及左边相邻经重构样本(参考样本)来预测。在HEVC,对于亮度块的帧内预测,包含35个模式。Figure 4 is a conceptual diagram illustrating aspects of intra-frame prediction. The video encoder 20 and/or video decoder 30 can implement intra-frame prediction to perform image block prediction using spatially adjacent reconstructed image samples of the block. A typical example of intra-frame prediction for a 16×16 image block is shown in Figure 4. As illustrated in Figure 4, through intra-frame prediction, a 16×16 image block (shown as a solid-lined square) is predicted based on the upper and left adjacent reconstructed samples (reference samples) located in the nearest upper row and left column along a selected prediction direction (as indicated by the arrows). In HEVC, intra-frame prediction for luma blocks includes 35 modes.
图5为绘示用于亮度块的帧内预测模式的概念图。所述模式包含平面模式、DC模式及33个角度模式,如图5中所指示。定义于HEVC中的帧内预测的35个模式被加索引,如下表1中所展示:Figure 5 is a conceptual diagram illustrating the intra-prediction modes used for luma blocks. These modes include a planar mode, a DC mode, and 33 angular modes, as indicated in Figure 5. The 35 intra-prediction modes defined in HEVC are indexed and shown in Table 1 below:
表1-帧内预测模式及相关联名称的规范Table 1 - Specifications of Intra-Frame Prediction Modes and Associated Names
FIG. 6 is a conceptual diagram illustrating aspects of the planar mode. For the planar mode, which is typically the most frequently used intra-prediction mode, the prediction sample is generated as shown in FIG. 6. To perform planar prediction for an N×N block, for each sample located at (x, y), video encoder 20 and/or video decoder 30 may calculate the prediction value using four specific neighboring reconstructed samples (i.e., reference samples) with a bilinear filter. The four reference samples include the top-right reconstructed sample TR, the bottom-left reconstructed sample BL, and the two reconstructed samples located in the same column as the current sample, at $r_{x,-1}$ (denoted T), and in the same row as the current sample, at $r_{-1,y}$ (denoted L). The planar mode is formulated as shown in the following equation: $p_{xy} = (N - x - 1)\cdot L + (N - y - 1)\cdot T + x\cdot TR + y\cdot BL$.
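A minimal sketch of the planar formula follows. The weighted sum is evaluated exactly as written in the paragraph above; dividing by the total weight 2N-2 with rounding is an assumption added here so that the output stays in the sample range, since the text gives no normalization. The function and argument names are hypothetical, and the block size must be at least 2.

```python
def planar_predict(top, left, tr, bl):
    """Planar prediction for an N x N block (N >= 2).

    top[x]  : reconstructed sample above column x      (T in the text)
    left[y] : reconstructed sample left of row y       (L in the text)
    tr, bl  : top-right and bottom-left reference samples
    """
    n = len(top)
    total = 2 * n - 2                      # sum of the four weights (assumed normalizer)
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            s = ((n - x - 1) * left[y] + (n - y - 1) * top[x]
                 + x * tr + y * bl)
            pred[y][x] = (s + total // 2) // total
    return pred


flat = planar_predict([100] * 4, [100] * 4, 100, 100)
ramp = planar_predict([20] * 4, [10] * 4, 20, 10)
print(ramp)
```

With identical reference samples the block is flat; with differing top and left references the prediction blends smoothly between them.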
对于DC模式,简单地用相邻经重构样本的平均值填充预测块。一般来说,针对模型化平滑地变化及恒定图片区域应用平面模式及DC模式两者。For DC mode, the prediction block is simply filled with the average of neighboring reconstructed samples. Generally, both planar mode and DC mode are applied to modeled, smoothly varying, and constant image regions.
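The DC mode can be sketched in a few lines. Which neighbors enter the average (here the row above and the column to the left) and the rounding rule are assumptions for illustration; the text says only that the block is filled with the average of neighboring reconstructed samples.

```python
def dc_predict(top, left):
    """Fill an N x N block with the rounded mean of the top and left neighbors."""
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)   # +n rounds to nearest
    return [[dc] * n for _ in range(n)]


block = dc_predict([10, 20, 30, 40], [50, 60, 70, 80])
print(block[0])
```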
FIG. 7 is a conceptual diagram illustrating aspects of the angular modes according to HEVC. For the angular intra-prediction modes in HEVC, which include 33 different prediction directions in total, the intra-prediction process is described as follows. For each given angular intra-prediction mode, the intra-prediction direction can be identified accordingly. For example, according to FIG. 5, intra mode 18 corresponds to a pure horizontal prediction direction, and intra mode 26 corresponds to a pure vertical prediction direction. Given a specific intra-prediction direction, for each sample of the prediction block, the coordinate (x, y) of the sample is first projected along the prediction direction onto the row/column of neighboring reconstructed samples, as shown in the example of FIG. 7. Supposing that (x, y) is projected to the fractional position α between two neighboring reconstructed samples L and R, the prediction value for (x, y) is calculated using a two-tap bilinear interpolation filter, formulated as shown in the following equation: $p_{xy} = (1 - \alpha)\cdot L + \alpha\cdot R$. To avoid floating-point operations, HEVC actually approximates the above calculation using integer arithmetic, as $p_{xy} = ((32 - a)\cdot L + a\cdot R + 16) \gg 5$, where $a$ is an integer equal to $32\cdot\alpha$.
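The integer approximation can be checked with a short sketch. The function names are hypothetical; the arithmetic is the ((32-a)·L + a·R + 16) >> 5 expression from the paragraph above, compared against the floating-point form.

```python
def interpolate_angular(l_sample, r_sample, a):
    """Two-tap interpolation in 1/32-sample units; a is an integer in [0, 31]."""
    if not 0 <= a < 32:
        raise ValueError("a must be in the range 0..31")
    return ((32 - a) * l_sample + a * r_sample + 16) >> 5


def interpolate_float(l_sample, r_sample, alpha):
    """Floating-point reference: (1 - alpha) * L + alpha * R."""
    return (1.0 - alpha) * l_sample + alpha * r_sample


for a in (0, 8, 16):
    print(a, interpolate_angular(100, 200, a), interpolate_float(100, 200, a / 32))
```

The +16 term rounds to nearest before the right shift by 5 divides by 32.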
下文大体上描述了色度编码及解码的方面。色度信号中的结构常常遵循对应亮度信号的结构。如所描述,根据HEVC,每一亮度块对应于一个色度块,而每一色度预测块可基于等于2N×2N或N×N的亮度预测块的分区大小而对应于一个或四个亮度预测块。利用色度信号结构的这些特性及一般趋势,HEVC提供视频编码器20可用以向视频解码器30指示色度PU是与对应选定亮度PU使用同一预测模式预测的情况或例子的机制。下表2规定视频编码器20可使用以用信号传送用于色度PU的色度模式的模式布置。举例来说,一个经帧内编码的色度PU可使用选自五个(5)模式中的一者的模式来预测,所述模式包含平面模式(INTRA_PLANAR)、竖直模式(INTRA_ANGULAR26)、水平模式(INTRA_ANGULAR10)、DC模式(INTRA_DC)及导出模式(DM)。DM经设定为用于预测对应选定亮度PU的帧内预测模式。举例来说,如果对应选定亮度PU是用具有等于11的索引的帧内模式译码,那么DM经设定为具有等于11的索引的帧内模式。The following describes the aspects of chroma encoding and decoding in general. The structure in a chroma signal often follows the structure of the corresponding luminance signal. As described, according to HEVC, each luminance block corresponds to one chroma block, and each chroma prediction block may correspond to one or four luminance prediction blocks based on a partition size equal to 2N×2N or N×N luminance prediction blocks. Taking advantage of these characteristics and general trends in the structure of chroma signals, HEVC provides a mechanism for the video encoder 20 to indicate to the video decoder 30 that the chroma PU is predicted using the same prediction mode as the corresponding selected luminance PU. Table 2 below specifies the mode arrangement of the chroma modes that the video encoder 20 can use to signal the chroma modes for the chroma PU. For example, an intra-coded chroma PU may be predicted using a mode selected from five (5) modes, including planar mode (INTRA_PLANAR), vertical mode (INTRA_ANGULAR26), horizontal mode (INTRA_ANGULAR10), DC mode (INTRA_DC), and derived mode (DM). The DM is set to the intra-prediction mode for predicting the corresponding selected luminance PU. For example, if the corresponding selected luminance PU is decoded using an intra-prediction mode with an index of 11, then the DM is set to the intra-prediction mode with an index of 11.
表2-色度帧内预测模式及相关联名称的规范Table 2 - Specifications of Chroma Intra-Frame Prediction Modes and Associated Names
If the encoded video bitstream indicates that the derived mode is to be used for a PU, video decoder 30 may perform prediction for the chroma PU using the prediction mode that was used for the corresponding luma PU. To alleviate the redundancy that may arise when the derived mode refers to one of the always-present prediction modes, video encoder 20 and video decoder 30 may use a designated alternative mode as a substitute for the duplicate mode. As shown in Table 2 above, video encoder 20 and video decoder 30 may use the "INTRA_ANGULAR34" chroma alternative mode, also referred to as the "angular (34) mode," as the substitute to remove the redundancy. For example, where the relationship between chroma PUs and luma PUs is one-to-one or many-to-one, video encoder 20 and video decoder 30 may determine the prediction mode for the chroma PU by selecting the prediction mode that applies to the single corresponding luma PU.
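The derived-mode rule and the substitution with angular mode 34 can be sketched as below. Table 2 is not reproduced in this text, so the ordering of the four fixed candidates follows the HEVC convention (planar, vertical, horizontal, DC, then DM) as an assumption; the function name is hypothetical.

```python
PLANAR, DC, HORIZONTAL, VERTICAL, ANGULAR34 = 0, 1, 10, 26, 34
DM_INDEX = 4                                   # signaled index meaning "derived mode"
FIXED_CANDIDATES = (PLANAR, VERTICAL, HORIZONTAL, DC)   # assumed order of indices 0..3


def derive_chroma_mode(chroma_idx, luma_mode):
    """Map a signaled chroma index (0..4) to an intra-prediction mode.

    Index 4 copies the luma mode (DM). If a fixed candidate would duplicate
    the luma mode, it is replaced by angular mode 34 to remove the redundancy.
    """
    if chroma_idx == DM_INDEX:
        return luma_mode
    candidate = FIXED_CANDIDATES[chroma_idx]
    return ANGULAR34 if candidate == luma_mode else candidate


print(derive_chroma_mode(4, 11))   # DM follows the luma mode
print(derive_chroma_mode(1, 26))   # vertical duplicates luma, so mode 34 is used
```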
然而,在一些情况下,一个色度PU可对应于多个亮度PU。认为单个色度PU对应于多个亮度PU的情境是关于色度编码及解码的例外或“特殊情况”。举例来说,在这些特殊情况中的一些中,一个色度PU可对应于四个亮度PU。在色度-亮度关系是一对多的特殊情况下,视频编码器20及视频解码器30可通过选择用于对应左上方亮度PU的预测模式来确定用于色度PU的预测模式。However, in some cases, one chroma PU can correspond to multiple luma PUs. The scenario where a single chroma PU corresponds to multiple luma PUs is an exception or "special case" regarding chroma encoding and decoding. For example, in some of these special cases, one chroma PU can correspond to four luma PUs. In the special case where the chroma-luma relationship is one-to-many, the video encoder 20 and video decoder 30 can determine the prediction mode for the chroma PU by selecting the prediction mode corresponding to the upper left luma PU.
Video encoder 20 and video decoder 30 may entropy code (entropy encode and entropy decode, respectively) data indicating the chroma prediction mode for a block of video data. Under chroma mode coding, video encoder 20 may assign a 1-bit syntax element (0) to the single most frequently occurring mode, the derived mode, while assigning 3-bit syntax elements (100, 101, 110, and 111, respectively) to each of the remaining four modes. Video encoder 20 and video decoder 30 may code only the first bin with one context model, and may bypass code the remaining two bins (when needed).
Video encoder 20 and video decoder 30 may entropy code (entropy encode and entropy decode, respectively) video data according to context-adaptive binary arithmetic coding (CABAC). CABAC is an entropy coding method first introduced in H.264/AVC and described in D. Marpe, H. Schwarz, and T. Wiegand, "Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard" (IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 620-636, July 2003). CABAC is now used in the High Efficiency Video Coding (HEVC) video coding standard. Video encoder 20 and video decoder 30 may use CABAC for entropy coding in a manner similar to the CABAC performed for HEVC.
CABAC involves three main functions: binarization, context modeling, and arithmetic coding. The binarization function maps syntax elements to binary symbols (bins), which are referred to as bin strings. The context modeling function estimates the probability of the bins. The arithmetic coding function (also referred to as binary arithmetic coding) compresses the bins to bits based on the estimated probability.
Video encoder 20 and video decoder 30 may perform binarization for CABAC using one or more of several different binarization processes provided in HEVC. The binarization processes provided in HEVC include unary (U), truncated unary (TU), k-th order Exp-Golomb (EGk), and fixed-length (FL) techniques. Details of these binarization processes are described in V. Sze and M. Budagavi, "High throughput CABAC entropy coding in HEVC" (IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 22, no. 12, pp. 1778-1791, December 2012).
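The codeword assignment above can be illustrated with a small binarizer and parser. The text fixes only the codeword set (0, 100, 101, 110, 111); which of the four non-derived modes receives which 3-bit codeword is an assumption here, as are the function names. Arithmetic coding of the bins is omitted.

```python
DM_INDEX = 4   # index used here for the derived mode (assumed numbering)


def binarize_chroma_mode(chroma_idx):
    """Return the bin string for a chroma mode index in 0..4."""
    if chroma_idx == DM_INDEX:
        return "0"                               # one context-coded bin
    return "1" + format(chroma_idx, "02b")       # one context-coded bin + two bypass bins


def parse_chroma_mode(bins):
    """Consume bins from an iterable and return the chroma mode index."""
    it = iter(bins)
    if next(it) == "0":
        return DM_INDEX
    return int(next(it) + next(it), 2)


for idx in range(5):
    print(idx, binarize_chroma_mode(idx))
```

Only the first bin carries skewed statistics (derived mode or not), which is why it alone is given a context model.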
根据基于一元的编码,视频编码器20可用信号传送长度N+1的二进位串,其中“N”表示整数值,其中前N个二进位(值)为1,且其中最后二进位(值)为0。根据基于一元的解码,视频解码器30可搜索二进位的0值。在到0值二进位后,视频解码器30可确定语法元素是完整的。According to unary-based encoding, video encoder 20 can transmit a binary string of length N+1, where "N" represents an integer value, the first N bits (values) are 1, and the last bit (value) is 0. According to unary-based decoding, video decoder 30 can search for the binary value of 0. After finding the binary value of 0, video decoder 30 can determine that the syntax element is complete.
According to truncated unary coding, video encoder 20 may encode one fewer bin than in the case of unary coding. For example, video encoder 20 may set a maximum on the largest possible value of the syntax element. The maximum is denoted herein by "cMax." When (N+1)<cMax, video encoder 20 may implement the same signaling as in unary coding. However, when (N+1)=cMax, video encoder 20 may set all bins to the value 1. Video decoder 30 may search for a 0-valued bin until cMax bins have been checked, to determine when the syntax element is complete. Aspects of the bin strings used in unary and truncated unary coding, and the contrast between them, are illustrated in Table 3 below. The contrasting bin values are illustrated in Table 3, called out using bold italics.
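Unary and truncated unary binarization can be sketched as below. The sketch adopts one common convention, in which c_max is the largest codable value and that value is sent as c_max ones with no terminating zero; this is an assumption, and the paragraph above phrases the boundary in terms of N+1. All names are hypothetical.

```python
def unary(n):
    """Unary bin string: n ones followed by a terminating zero."""
    return "1" * n + "0"


def truncated_unary(n, c_max):
    """Truncated unary: the terminating zero is dropped when n equals c_max."""
    if not 0 <= n <= c_max:
        raise ValueError("value out of range")
    return "1" * n if n == c_max else "1" * n + "0"


def decode_truncated_unary(bins, c_max):
    """Count ones until a zero is seen or c_max bins have been examined."""
    value = 0
    for b in bins:
        if b == "0" or value == c_max:
            break
        value += 1
    return value


for n in range(4):
    print(n, unary(n), truncated_unary(n, 3))
```

The two schemes differ only for the largest value, where truncated unary saves one bin.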
表3-一元及截短一元的二进位串实例Table 3 - Examples of unary and truncated unary binary strings
视频编码器20及视频解码器30还可执行CABAC的上下文模型化方面。上下文模型化提供相对准确的概率估计,其是实现高效率译码的方面。因此,上下文模型化是自适应过程,且有时被描述为“高度自适应”。不同上下文模型可用于不同二进位,其中上下文模型的概率可基于先前已译码二进位的值来更新。具有类似分布的二进位常常共享同一上下文模型。视频编码器20及/或视频解码器30可基于包含以下各者的一或多个因素来选择用于每一二进位的上下文模型化:语法元素的类型、语法元素中的二进位位置(binIdx)、亮度/色度、相邻信息等等。The video encoder 20 and video decoder 30 also perform the context modeling aspect of CABAC. Context modeling provides relatively accurate probability estimates, which is an aspect of achieving efficient decoding. Therefore, context modeling is an adaptive process and is sometimes described as “highly adaptive.” Different context models can be used for different binary bits, where the probabilities of the context model can be updated based on the values of previously decoded binary bits. Binaries with similar distributions often share the same context model. The video encoder 20 and/or video decoder 30 can select the context modeling for each binary bit based on one or more factors including: the type of syntax element, the binary position (binIdx) in the syntax element, luma/chroma, adjacency information, etc.
Video encoder 20 and video decoder 30 may perform context switching after each instance of bin coding (bin encoding or bin decoding, as the case may be). Video encoder 20 and video decoder 30 may store the probability models as 7-bit entries (6 bits for the probability state and 1 bit for the most probable symbol (MPS)) in context memory, and may address the probability models using the context index computed by context selection logic. HEVC provides the same probability update method as H.264/AVC. However, the HEVC-based context selection logic is modified with respect to the H.264/AVC context selection logic, to improve throughput. Video encoder 20 and video decoder 30 may also use a probability representation for CABAC entropy encoding and decoding, respectively. For CABAC, 64 representative probability values $p_\sigma \in [0.01875, 0.5]$ are derived for the least probable symbol (LPS) by the following recursive equation:
$p_\sigma = \alpha \cdot p_{\sigma-1}$ for all $\sigma = 1, \ldots, 63$,
where $\alpha = \left(\frac{0.01875}{0.5}\right)^{1/63}$ and $p_0 = 0.5$.
在上述方程式中,所述组概率中的所选缩放因数α≈0.9492与基数N=64两者表示概率表示的准确度与适应速度之间的折中方案。上述方程式中所使用的参数已展示出概率表示准确度与较快适应需要之间的相对良好折中方案。MPS的概率等于1减去LPS的概率(即,(1-LPS))。因此,可由CABAC表示的概率范围是[0.01875,0.98125]。范围的上限(MPS概率)等于一减去下限(即,一减去LPS概率)。即,1-0.01875=0.98125。In the above equation, the chosen scaling factor α ≈ 0.9492 and the cardinality N = 64 in the group probabilities represent a trade-off between the accuracy of the probability representation and the speed of adaptation. The parameters used in the above equation demonstrate a relatively good trade-off between the accuracy of the probability representation and the need for faster adaptation. The probability of MPS is equal to 1 minus the probability of LPS (i.e., (1-LPS)). Therefore, the probability range that can be represented by CABAC is [0.01875, 0.98125]. The upper limit of the range (MPS probability) is equal to one minus the lower limit (i.e., one minus the LPS probability). That is, 1 - 0.01875 = 0.98125.
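The probability table can be reproduced numerically. The sketch assumes the scaling factor is α = (0.01875/0.5)^(1/63) with p_0 = 0.5, which is consistent with the value α ≈ 0.9492 and the range [0.01875, 0.5] quoted above; the function name is hypothetical.

```python
ALPHA = (0.01875 / 0.5) ** (1.0 / 63)     # assumed closed form; approximately 0.9492


def lps_probabilities(n=64, p0=0.5, alpha=ALPHA):
    """Return the n representative LPS probabilities p_0 .. p_(n-1)."""
    probs = [p0]
    for _ in range(1, n):
        probs.append(alpha * probs[-1])   # p_sigma = alpha * p_(sigma-1)
    return probs


probs = lps_probabilities()
print(round(ALPHA, 4), probs[0], round(probs[-1], 5), round(1.0 - probs[-1], 5))
```

The last entry lands on the lower bound 0.01875, and one minus that value gives the upper bound of the MPS probability range.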
在编码或解码特定切片之前,视频编码器20及视频解码器30可基于一些预定义值来初始化概率模型。举例来说,给定由“qp”指示的输入量化参数及由“initVal”指示的预定义值,视频编码器20及/或视频解码器30可导出概率模型(由“state”及“MPS”指示)的7位条目如下:Before encoding or decoding a specific slice, the video encoder 20 and video decoder 30 can initialize a probabilistic model based on some predefined values. For example, given the input quantization parameter indicated by "qp" and the predefined value indicated by "initVal", the video encoder 20 and/or video decoder 30 can derive the following 7-bit entries of the probabilistic model (indicated by "state" and "MPS"):
qp = Clip3(0, 51, qp);
slope = (initVal >> 4) * 5 - 45;
offset = ((initVal & 15) << 3) - 16;
initState = min(max(1, (((slope * qp) >> 4) + offset)), 126);
MPS = (initState >= 64);
stateIdx = ((MPS ? (initState - 64) : (63 - initState)) << 1) + MPS;
The derived state index implicitly includes the MPS information. That is, when the state index is an even value, the MPS value is equal to 0. Conversely, when the state index is an odd value, the MPS value is equal to 1. The value of "initVal" is in the range [0, 255] with 8-bit precision.
The predefined initVal is slice-dependent. That is, video encoder 20 may use three sets of context initialization parameters for the probability models used specifically in the coding of I slices, P slices, and B slices, respectively. In this way, video encoder 20 is enabled to choose among three initialization tables for these three slice types, so that a better fit to different coding scenarios and/or different types of video content can potentially be achieved.
Recent progress in JEM3.0 includes developments in intra mode coding. According to these developments, video encoder 20 and video decoder 30 may perform intra mode coding with 6 most probable modes (MPMs). As described in V. Seregin, X. Zhao, A. Said, and M. Karczewicz, "Neighbor based intra most probable modes list derivation" (JVET-C0055, Geneva, May 2016), the 33 angular modes of HEVC have been extended to 65 angular modes, plus the DC and planar modes, with 6 most probable modes (MPMs). Video encoder 20 may encode a one-bit flag (e.g., an "MPM flag") to indicate whether the intra luma mode is included in the MPM candidate list, which includes 6 modes (as described in JVET-C0055, cited above). If the intra luma mode is included in the MPM candidate list (thereby causing video encoder 20 to set the MPM flag to a positive value), video encoder 20 may further encode and signal an index of the MPM candidate, to indicate which MPM candidate in the list is the intra luma mode. Otherwise (i.e., if video encoder 20 sets the MPM flag to a negative value), video encoder 20 may further signal an index of the remaining intra luma mode.
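The initialization steps above translate directly into runnable form. This sketch assumes that the selector in the last step is the MPS bit and relies on Python's right shift on negative integers being an arithmetic (flooring) shift; the function names are hypothetical.

```python
def clip3(lo, hi, value):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def init_context(init_val, qp):
    """Return the 7-bit context entry (probability state << 1) + MPS."""
    qp = clip3(0, 51, qp)
    slope = (init_val >> 4) * 5 - 45
    offset = ((init_val & 15) << 3) - 16
    init_state = min(max(1, ((slope * qp) >> 4) + offset), 126)
    mps = 1 if init_state >= 64 else 0
    p_state = init_state - 64 if mps else 63 - init_state
    return (p_state << 1) + mps


entry = init_context(154, 26)
print(entry, entry >> 1, entry & 1)   # entry, 6-bit probability state, MPS bit
```

With initVal = 154 the slope is zero, so the entry is the same for every QP; other initVal values make the starting probability depend on the slice QP.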
根据JEM3.0进步的这些方面,视频解码器30可在接收到用信号传送的经编码视频位流后解码MPM旗标,以确定帧内亮度模式是否包含于MPM候选者列表中。如果视频解码器30确定MPM旗标经设定为正值,那么视频解码器30可解码接收的索引以从MPM候选者列表识别帧内亮度模式。相反地,如果视频解码器30确定MPM旗标经设定为负值,那么视频解码器30可接收且解码剩余帧内亮度模式的索引。Based on these advancements in JEM 3.0, the video decoder 30 can decode the MPM flag after receiving the encoded video bitstream transmitted as a signal to determine whether an intra-frame luma mode is included in the MPM candidate list. If the video decoder 30 determines that the MPM flag is set to a positive value, then the video decoder 30 can decode the received index to identify the intra-frame luma mode from the MPM candidate list. Conversely, if the video decoder 30 determines that the MPM flag is set to a negative value, then the video decoder 30 can receive and decode the indices of the remaining intra-frame luma modes.
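The MPM flag and index exchange described above can be sketched as follows. This is a minimal illustration, not the JEM 3.0 syntax: the function names, the 67-mode numbering, and the ascending ordering of the remaining modes are assumptions made for the example, and the MPM list is taken as given rather than derived from neighboring blocks.

```python
def encode_luma_mode(mode, mpm_list, num_modes=67):
    """Return (mpm_flag, index) for an intra luma mode (hypothetical helper)."""
    if mode in mpm_list:
        # Flag set: the index selects an entry of the MPM candidate list.
        return 1, mpm_list.index(mode)
    # Flag not set: the index selects among the remaining (non-MPM) modes.
    # Ascending order of the remaining modes is an assumption of this sketch.
    remaining = [m for m in range(num_modes) if m not in mpm_list]
    return 0, remaining.index(mode)


def decode_luma_mode(mpm_flag, index, mpm_list, num_modes=67):
    """Invert encode_luma_mode using the same MPM candidate list."""
    if mpm_flag == 1:
        return mpm_list[index]
    remaining = [m for m in range(num_modes) if m not in mpm_list]
    return remaining[index]
```

Because both sides build the same six-entry list, the decoder recovers the mode from the flag and index alone, for example `decode_luma_mode(*encode_luma_mode(18, [0, 1, 50, 18, 2, 34]), [0, 1, 50, 18, 2, 34])` yields 18.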
关于自适应多核心变换还已实现新近JEM3.0进展。除了用于HEVC中的DCT-II及4×4DST-VII之外,自适应多变换(AMT)方案还被用于经帧间译码块及经帧内译码块两者的残余译码。AMT利用除了HEVC目前所定义的变换之外的来自DCT/DST家族的多个选定变换。JEM3.0的新引入变换矩阵为DST-VII、DCT-VIII、DST-I及DCT-V。Recent advancements in JEM3.0 have also been made regarding Adaptive Multi-Core Transform (AMT). In addition to DCT-II and 4×4 DST-VII used in HEVC, the AMT scheme is also applied to residual decoding of both inter-frame and intra-frame decoding blocks. AMT utilizes several selected transforms from the DCT/DST family, in addition to those currently defined in HEVC. The newly introduced transform matrices in JEM3.0 are DST-VII, DCT-VIII, DST-I, and DCT-V.
对于帧内残余译码,由于不同帧内预测模式的不同残余统计数据,视频编码器20及视频解码器30可使用模式相依性变换候选者选择过程。三个变换子集已如下表4中所展示而定义,且视频编码器20及/或视频解码器30可基于帧内预测模式来选择变换子集,如下表5中所指定。For intra-frame residual decoding, due to the different residual statistics of different intra-frame prediction modes, the video encoder 20 and video decoder 30 can use a mode-dependent transform candidate selection process. Three transform subsets are defined as shown in Table 4 below, and the video encoder 20 and/or video decoder 30 can select transform subsets based on intra-frame prediction modes, as specified in Table 5 below.
表4:三个预定义的变换候选者集合Table 4: Three predefined sets of transformation candidates
Table 5: Selected horizontal (H) and vertical (V) transform sets for each intra prediction mode
根据子集合概念,视频解码器30可首先基于下表6识别变换子集。举例来说,为了识别变换子集,视频解码器30可使用CU的帧内预测模式,其在CU层级AMT旗标设定为值1的情况下用信号传送。随后,针对水平及竖直变换中的每一者,视频解码器30可根据下表7选择经识别变换子集中的两个变换候选者中的一者。用于水平及竖直变换中的每一者的选定变换候选者是基于明确地用信号传送具有旗标的数据而选择。然而,对于帧间预测残余,视频解码器30可针对所有帧间模式且针对水平及竖直变两者使用仅一个变换集合,其由DST-VII及DCT-VIII组成。Based on the subset concept, the video decoder 30 can first identify transform subsets based on Table 6 below. For example, to identify transform subsets, the video decoder 30 can use the intra-frame prediction mode of the CU, which is signaled with the CU-level AMT flag set to value 1. Subsequently, for each of the horizontal and vertical transforms, the video decoder 30 can select one of two transform candidates from the identified transform subset according to Table 7 below. The selected transform candidate for each of the horizontal and vertical transforms is selected based on the explicit signaling of flagged data. However, for the inter-frame prediction residual, the video decoder 30 can use only one transform set for all inter-frame modes and for both horizontal and vertical transforms, consisting of DST-VII and DCT-VIII.
表6-色度帧内预测模式及相关联名称的规范Table 6 - Specifications of Chroma Intra-Frame Prediction Modes and Associated Names
表7-用于每一色度模式的二进位串Table 7 - Binary strings for each chroma mode
关于用于视频译码的LM(线性模型)预测模式已实现新近JEM3.0进展。本发明的视频译码装置(例如视频编码器20及视频解码器30)在视频编码及视频解码时可处理颜色空间及颜色格式的方面。颜色视频在多媒体系统中发挥主要作用,其中各种颜色空间用以有效表示颜色。颜色空间使用多个分量利用数字值指定颜色。常用颜色空间为“RGB”颜色空间,其中将颜色表示为三原色分量值(即,红色、绿色及蓝色)的组合。对于颜色视频压缩,已广泛地使用YCbCr颜色空间,如A.Ford及A.Roberts的“Colour space conversions”(Tech.Rep,伦敦,威斯敏斯特大学,1998年8月)中所描述。YCbCr可经由线性变换从RGB颜色空间相对容易地转换。在RGB至YCbCr转换中,不同分量之间的冗余(即,交叉分量冗余)在所得YCbCr颜色空间中显著减小。Recent advancements in JEM 3.0 have been made regarding LM (Linear Model) prediction modes for video decoding. The video decoding apparatus of this invention (e.g., video encoder 20 and video decoder 30) can handle aspects of color space and color format during video encoding and decoding. Color video plays a major role in multimedia systems, where various color spaces are used to efficiently represent colors. Color spaces use multiple components to specify colors using digital values. A commonly used color space is the "RGB" color space, where colors are represented as combinations of the three primary color component values (i.e., red, green, and blue). For color video compression, the YCbCr color space has been widely used, as described in A. Ford and A. Roberts' "Colour Space Conversions" (Tech. Rep, London, University of Westminster, August 1998). YCbCr can be relatively easily converted from the RGB color space via linear transformation. In the RGB to YCbCr conversion, redundancy between different components (i.e., cross-component redundancy) is significantly reduced in the resulting YCbCr color space.
YCbCr的一个优点为与黑白TV的反向兼容性,这是因为Y信号传达亮度信息。另外,色度带宽可通过以4:2:0色度取样格式次取样Cb及Cr分量而减少,与RGB中的次取样相比,主观影响显著较小。由于这些优点,YCbCr已为视频压缩中的主要颜色空间。还存在可用于视频压缩的其它颜色空间,例如YCoCg。出于说明目的,不管所使用的实际颜色空间如何,在整个本发明中使用Y、Cb、Cr信号来表示视频压缩方案中的三个颜色分量。在4:2:0取样中,两个色度阵列(Cb及Cr)中的每一者的高度及宽度均为亮度阵列(Y)的一半。One advantage of YCbCr is its backward compatibility with black and white TVs, because the Y signal conveys luminance information. Additionally, the chrominance bandwidth can be reduced by subsampling the Cb and Cr components in a 4:2:0 chrominance sampling format, resulting in significantly less subjective impact compared to subsampling in RGB. Due to these advantages, YCbCr has become the primary color space in video compression. Other color spaces, such as YCoCg, also exist for video compression. For illustrative purposes, regardless of the actual color space used, Y, Cb, and Cr signals are used throughout this invention to represent the three color components in the video compression scheme. In 4:2:0 sampling, the height and width of each of the two chrominance arrays (Cb and Cr) are half that of the luminance array (Y).
图8为绘示图片中的标称竖直及水平位置亮度样本及色度样本的实例的概念图。图片中的亮度样本及色度样本的标称竖直及水平相对位置是大体上对应于如4:2:0取样格式所提供的位置而展示于图8中。Figure 8 is a conceptual diagram illustrating examples of nominal vertical and horizontal luminance and chrominance samples in an image. The nominal vertical and horizontal relative positions of the luminance and chrominance samples in the image are roughly corresponding to the positions provided by a 4:2:0 sampling format, as shown in Figure 8.
用于视频译码的LM预测模式的方面将在以下段落中论述。尽管交叉分量冗余在YCbCr颜色空间中显著减小,但三个颜色分量之间的相关性在YCbCr颜色空间中仍然存在。已研究各种技术以通过进一步减小颜色分量之间的相关性来改善视频译码性能。关于4:2:0色度视频译码,在HEVC标准开发期间研究了线性模型(LM)预测模式。LM预测模式的方面描述于J.Chen、V.Seregin、W.-J.Han、J.-S.Kim及B.-M.Joen的“CE6.a.4:Chroma intraprediction by reconstructed luma samples”(ITU-T SG16WP3及ISO/IEC JTC1/SC29/WG11的视频译码联合合作小组(Joint Collaborative Team on Video Coding,JCT-VC),JCTVC-E266,第5次会议,日内瓦,2011年3月16日至23日)中。Aspects of the LM prediction mode used for video decoding will be discussed in the following paragraphs. Although cross-component redundancy is significantly reduced in the YCbCr color space, correlation between the three color components still exists in the YCbCr color space. Various techniques have been investigated to improve video decoding performance by further reducing the correlation between color components. Regarding 4:2:0 chroma video decoding, the linear model (LM) prediction mode was studied during the development of the HEVC standard. The aspects of the LM prediction model are described in “CE6.a.4: Chroma intraprediction by reconstructed luma samples” by J. Chen, V. Seregin, W.-J. Han, J.-S. Kim and B.-M. Joen (Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16WP3 and ISO/IEC JTC1/SC29/WG11, JCTVC-E266, 5th meeting, Geneva, March 16-23, 2011).
在根据LM预测模式执行预测时,视频编码器20及视频解码器30可通过使用以下方程式(1)中所展示的线性模型,基于同一块的减少取样的经重构亮度样本来预测色度样本。When performing prediction according to the LM prediction mode, the video encoder 20 and the video decoder 30 can predict chroma samples based on the reduced-sampled reconstructed luminance samples of the same block by using the linear model shown in the following equation (1).
$$\mathrm{pred}_C(i,j) = \alpha \cdot \mathrm{rec}_L(i,j) + \beta \tag{1}$$
where pred_C(i,j) denotes the prediction of the chroma samples in a block and rec_L(i,j) denotes the downsampled reconstructed luma samples of the same block. The parameters α and β are derived from causal reconstructed samples around the current block.
图9为绘示用于导出在根据线性模型(LM)模式的预测中所使用的参数的样本的位置的概念图。图9中所描绘的选定参考样本的实例是关于如上文的方程式(1)中所使用的α及β的导出。如果色度块大小由N×N(其中N为整数)表示,那么i及j均在范围[0,N]内。Figure 9 is a conceptual diagram illustrating the location of samples used to derive the parameters used in predictions based on the linear model (LM) pattern. The instance of the selected reference sample depicted in Figure 9 is a derivation of α and β used in equation (1) above. If the chroma block size is represented by N×N (where N is an integer), then i and j are both in the range [0, N].
视频编码器20及视频解码器30可通过根据以下方程式(2)减小或潜在地最小化当前块周围的相邻经重构亮度样本及色度样本之间的回归误差来导出方程式(1)中的参数α及β。The video encoder 20 and the video decoder 30 can derive the parameters α and β in equation (1) by reducing or potentially minimizing the regression error between adjacent reconstructed luminance and chrominance samples around the current block according to the following equation (2).
参数α及β如下所述而解出。The parameters α and β are solved as described below.
$$\beta = \frac{\sum y_i - \alpha \cdot \sum x_i}{I} \tag{4}$$
where x_i denotes a downsampled reconstructed luma reference sample, y_i denotes a reconstructed chroma reference sample, and I denotes the number of reference samples. For a target N×N chroma block, the total number of samples involved (I) is equal to 2N when both the left and above causal samples are available. When only the left or only the above causal samples are available, the total number of samples involved (I) is equal to N.
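For orientation, the ordinary least-squares form that is consistent with equation (4) can be written out as below. This is an assumption: it takes the regression error to be the sum of squared differences over the I reference samples, and it is not copied from the original equations.

```latex
% Assumed regression error between neighboring luma (x_i) and chroma (y_i) samples
E(\alpha,\beta) = \sum_{i} \bigl( y_i - (\alpha \cdot x_i + \beta) \bigr)^2

% Setting the partial derivatives of E to zero gives the slope
\alpha = \frac{I \cdot \sum x_i y_i - \sum x_i \cdot \sum y_i}
              {I \cdot \sum x_i^2 - \bigl(\sum x_i\bigr)^2}

% and the offset, which matches equation (4)
\beta = \frac{\sum y_i - \alpha \cdot \sum x_i}{I}
```

Setting the derivative of E with respect to β to zero yields the offset expression directly, which is why the form above agrees term by term with equation (4).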
总体来说,当应用LM预测模式时,视频编码器20及/或视频解码器30可按列于如下的次序调用以下步骤:In general, when applying LM prediction mode, the video encoder 20 and/or video decoder 30 may invoke the following steps in the following order:
a)减少取样相邻亮度样本;a) Reduce the sampling of adjacent brightness samples;
b)导出线性参数(即,α及β);及b) Derive the linear parameters (i.e., α and β); and
c)减少取样当前亮度块且从减少取样的亮度块及线性参数导出预测。c) Reduce the sampling of the current luminance block and derive the prediction from the reduced-sampled luminance block and linear parameters.
为了进一步改善译码效率,视频编码器20及/或视频解码器30可利用减少取样滤波器(1,2,1)及(1,1)来导出对应亮度块内的相邻样本xi及减少取样的亮度样本recL(i,j)。To further improve decoding efficiency, the video encoder 20 and/or the video decoder 30 may use downsampling filters (1,2,1) and (1,1) to derive adjacent samples x i and downsampled luminance samples rec L (i,j) within the corresponding luminance block.
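Steps a) through c) can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the fit is floating-point least squares rather than the integer arithmetic of a real codec, the neighboring reference samples are passed in already downsampled, and the current luma block is downsampled with a plain vertical (1,1) average only.

```python
def derive_lm_params(x_ref, y_ref):
    """Least-squares fit of y ≈ alpha * x + beta over the reference samples."""
    count = len(x_ref)
    sum_x, sum_y = sum(x_ref), sum(y_ref)
    sum_xy = sum(x * y for x, y in zip(x_ref, y_ref))
    sum_xx = sum(x * x for x in x_ref)
    denom = count * sum_xx - sum_x * sum_x
    # Flat luma neighborhood: fall back to alpha = 0 (an assumption of this sketch).
    alpha = (count * sum_xy - sum_x * sum_y) / denom if denom else 0.0
    beta = (sum_y - alpha * sum_x) / count  # same form as equation (4)
    return alpha, beta


def downsample_luma(luma):
    """Halve a 2H x 2W luma block to H x W with a vertical (1,1) average."""
    return [
        [(luma[2 * i][2 * j] + luma[2 * i + 1][2 * j] + 1) >> 1
         for j in range(len(luma[0]) // 2)]
        for i in range(len(luma) // 2)
    ]


def lm_predict(luma_block, x_ref, y_ref):
    """Predict a chroma block as alpha * rec_L(i,j) + beta, as in equation (1)."""
    alpha, beta = derive_lm_params(x_ref, y_ref)
    return [[alpha * v + beta for v in row] for row in downsample_luma(luma_block)]
```

With reference pairs that lie exactly on the line y = 2x + 5, `derive_lm_params` returns α = 2 and β = 5, and every predicted chroma sample is twice its downsampled luma sample plus 5.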
关于色度分量之间的预测还已实现新近JEM3.0进展。在JEM中,LM预测模式经扩展至两个色度分量之间的预测。举例来说,可从Cb分量预测Cr分量。替代使用经重构样本信号,视频编码器20及/或视频解码器30在剩余域中可应用交叉分量预测。举例来说,视频编码器20及/或视频解码器30可通过将加权的经重构Cb残余与原始Cr帧内预测相加从而形成最终Cr预测来实施交叉分量预测的剩余域应用。此操作的实例展示于以下方程式(3)中:Recent advancements in JEM 3.0 have also enabled predictions between chrominance components. In JEM, the LM prediction mode has been extended to predictions between two chrominance components. For example, the Cr component can be predicted from the Cb component. Instead of using reconstructed sample signals, the video encoder 20 and/or video decoder 30 can apply cross-component predictions in the residual domain. For example, the video encoder 20 and/or video decoder 30 can implement the residual domain application of cross-component predictions by adding the weighted reconstructed Cb residual to the original Cr intra-frame prediction to form the final Cr prediction. An example of this operation is shown in the following equation (3):
视频编码器20及/或视频解码器30可导出缩放因数α,如在LM模式中的导出。然而,一个不同之处在于相对于误差函数中的默认α值增加了回归成本,使得导出的缩放因数偏向默认值(-0.5)。LM预测模式是作为一个额外色度帧内预测模式而添加。就此来说,视频编码器20可针对色度分量增加多一次RD成本检查,以用于选择色度帧内预测模式。Video encoder 20 and/or video decoder 30 can derive a scaling factor α, as in LM mode. However, one difference is that a regression cost is added relative to the default α value in the error function, causing the derived scaling factor to be biased towards the default value (-0.5). LM prediction mode is added as an additional chroma intra-prediction mode. In this regard, video encoder 20 can add an extra RD cost check for the chroma component to select the chroma intra-prediction mode.
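The residual-domain step described above, adding a weighted reconstructed Cb residual to the original Cr intra prediction, can be sketched as follows. The function name and the list-of-rows block layout are assumptions of this example, and the scaling factor is simply passed in (defaulting to the -0.5 value named in the text) instead of being derived by the regularized regression.

```python
def predict_cr_from_cb(pred_cr, resi_cb, alpha=-0.5):
    """Final Cr prediction = original Cr intra prediction + alpha * Cb residual."""
    return [
        [p + alpha * r for p, r in zip(pred_row, resi_row)]
        for pred_row, resi_row in zip(pred_cr, resi_cb)
    ]
```

For example, a Cr prediction of 100 with a reconstructed Cb residual of 4 becomes 98 under the default factor.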
Aspects of the quadtree binary tree (QTBT) structure are described in the following paragraphs. In VCEG proposal COM16-C966 (J. An, Y.-W. Chen, K. Zhang, H. Huang, Y.-W. Huang, and S. Lei, "Block partitioning structure for next generation video coding," International Telecommunication Union, COM16-C966, September 2015), a QTBT partitioning scheme was proposed for future video coding standards beyond HEVC. Simulations showed the QTBT structure proposed in COM16-C966 to be more efficient than the quadtree structure used in HEVC. In the QTBT structure proposed in COM16-C966, a coding tree block (CTB) is first partitioned according to a quadtree structure, in which the quadtree splitting of a node may be iterated until the node reaches the minimum allowed quadtree leaf node size (MinQTSize).
According to the QTBT structure, if the quadtree leaf node size is not larger than the maximum allowed binary tree root node size (MaxBTSize), the quadtree leaf node may be further partitioned according to a binary tree structure. The binary tree splitting of a given node may be iterated until the node reaches the minimum allowed binary tree leaf node size (MinBTSize) or until the iterative splitting reaches the maximum allowed binary tree depth (MaxBTDepth). The binary tree leaf node is a CU (coding unit), which may be used for prediction (e.g., intra-picture or inter-picture prediction) and transform without any further partitioning.
根据二叉树拆分,视频编码器20及/或视频解码器30可实施两个拆分类型,即,对称水平拆分及对称竖直拆分。在QTBT分割结构的一个实例中,CTU大小经设定为128×128(即,128×128亮度样本及两个对应64×64色度样本),MinQTSize经设定为16×16,MaxBTSize经设定为64×64,MinBTSize(对于宽度及高度两者)经设定为4,且MaxBTDepth经设定为4。视频编码器20及/或视频解码器30可首先将QTBT方案的四叉树分割部分应用于CTU,以产生四叉树叶节点。四叉树叶节点可具有从16×16(即,MinQTSize)至128×128(即,CTU大小)的大小。Based on the binary tree partitioning, the video encoder 20 and/or video decoder 30 can implement two partitioning types: symmetrical horizontal partitioning and symmetrical vertical partitioning. In an example of the QTBT partitioning structure, the CTU size is set to 128×128 (i.e., 128×128 luminance samples and two corresponding 64×64 chrominance samples), the MinQTSize is set to 16×16, the MaxBTSize is set to 64×64, the MinBTSize (for both width and height) is set to 4, and the MaxBTDepth is set to 4. The video encoder 20 and/or video decoder 30 can first apply the quadtree partitioning portion of the QTBT scheme to the CTU to generate quadtree leaf nodes. The quadtree leaf nodes can have sizes ranging from 16×16 (i.e., MinQTSize) to 128×128 (i.e., CTU size).
如果叶四叉树节点为128×128,那么视频编码器20及/或视频解码器30不能使用QTBT方案的二叉树部分将叶四叉树节点进一步拆分,这是因为节点大小超过MaxBTSize(在此情况下,64×64)。在其它方面(即,如果节点大小不超过64×64的MaxBTSize),视频编码器20及/或视频解码器30可使用QTBT结构的二叉树分割部分将叶四叉树节点进一步分割。因此,四叉树叶节点还为QTBT方案的二叉树部分的根节点,且因此具有二叉树深度0。当反复二叉树分割达到使二叉树深度达到MaxBTDepth(即,4)时,视频编码器20及/或视频解码器30关于叶节点不执行任何种类的进一步拆分。当QTBT方案的二叉树部分产生具有等于MinBTSize(即,4)的宽度的二叉树节点时,视频编码器20及/或视频解码器30可不执行节点的进一步水平拆分。类似地,当QTBT方案的二叉树部分产生具有等于MinBTSize(即,4)的高度的二叉树节点时,视频编码器20及/或视频解码器30可不执行节点的进一步竖直拆分。QTBT方案的二叉树部分的叶节点(在分割完全达到二叉树分割的情况下)即无任何其它分割的情况下通过预测及变换进一步处理的CU。If the leaf quadtree node is 128×128, then the video encoder 20 and/or video decoder 30 cannot further split the leaf quadtree node using the binary tree portion of the QTBT scheme because the node size exceeds MaxBTSize (64×64 in this case). Otherwise (i.e., if the node size does not exceed MaxBTSize of 64×64), the video encoder 20 and/or video decoder 30 can further split the leaf quadtree node using the binary tree segmentation portion of the QTBT structure. Therefore, the leaf node of the quadtree is also the root node of the binary tree portion of the QTBT scheme, and thus has a binary tree depth of 0. When repeated binary tree segmentation reaches a binary tree depth of MaxBTDepth (i.e., 4), the video encoder 20 and/or video decoder 30 does not perform any kind of further splitting with respect to the leaf node. When the binary tree portion of the QTBT scheme produces a binary tree node with a width equal to MinBTSize (i.e., 4), the video encoder 20 and/or video decoder 30 may not perform further horizontal splitting of the node. Similarly, when the binary tree portion of the QTBT scheme produces binary tree nodes with a height equal to MinBTSize (i.e., 4), the video encoder 20 and/or video decoder 30 may not perform further vertical splitting of the nodes. 
The leaf nodes of the binary tree portion of the QTBT scheme (in cases where partitioning reaches the binary tree stage at all) are the CUs, which are further processed by prediction and transform without any other partitioning.
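The split constraints in this example configuration can be sketched as a pair of checks. This is an illustration only: the function names are hypothetical, the constants are the example values given above, and the labels "horizontal" and "vertical" follow the wording of the text (width equal to MinBTSize rules out a further horizontal split, height equal to MinBTSize rules out a further vertical split).

```python
MIN_QT_SIZE = 16   # MinQTSize in the example configuration
MAX_BT_SIZE = 64   # MaxBTSize
MIN_BT_SIZE = 4    # MinBTSize, applied to both width and height
MAX_BT_DEPTH = 4   # MaxBTDepth


def can_quadtree_split(size):
    """A square quadtree node may split again until it reaches MinQTSize."""
    return size > MIN_QT_SIZE


def allowed_binary_splits(width, height, bt_depth):
    """Return the binary split types still permitted for a node."""
    if max(width, height) > MAX_BT_SIZE:
        return []  # e.g., a 128x128 quadtree leaf exceeds MaxBTSize
    if bt_depth >= MAX_BT_DEPTH:
        return []  # binary tree depth limit reached
    splits = []
    if width > MIN_BT_SIZE:
        splits.append("horizontal")
    if height > MIN_BT_SIZE:
        splits.append("vertical")
    return splits
```

A 128×128 quadtree leaf therefore gets no binary split, while a 64×64 leaf at binary tree depth 0 may be split either way.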
图10为绘示QTBT分割方案的方面的概念图。图10的左侧的框图绘示根据QTBT分割结构分割块162的实例。QTBT分割方案的四叉树分割方面是使用块162中的实线绘示,而QTBT分割方案的二叉树分割方面是使用块162中的虚线绘示。块162在仅调用QTBT方案的四叉树部分的情况下分割成正方形叶节点,在调用QTBT方案的二叉树部分(无论其是否与四叉树分割部分组合地调用)的任何情况下分割成非正方形的矩形叶节点。与HEVC的分割技术(其中多个变换是可能的)相比,QTBT分割方案提供一系统,由此,PU大小始终等于CU大小。Figure 10 is a conceptual diagram illustrating aspects of the QTBT partitioning scheme. The block diagram on the left side of Figure 10 illustrates an example of partitioning block 162 according to the QTBT partitioning structure. The quadtree partitioning aspect of the QTBT partitioning scheme is depicted using solid lines in block 162, while the binary tree partitioning aspect is depicted using dashed lines in block 162. Block 162 partitions into square leaf nodes when only the quadtree portion of the QTBT scheme is invoked, and into non-square rectangular leaf nodes in any case where the binary tree portion of the QTBT scheme is invoked (whether or not it is invoked in combination with the quadtree partitioning portion). Compared to HEVC partitioning techniques (where multiple transformations are possible), the QTBT partitioning scheme provides a system where the PU size is always equal to the CU size.
图10的右侧的示意图绘示树结构164。树结构164是用于关于图10中的块162所绘示的分割的对应树结构。同样在树结构164的情况下,在图10的QTBT分割方案的支持下,实线指示四叉树拆分,且虚线指示二叉树拆分。对于使用树结构164中的虚线所绘示的二叉树型部的每一拆分(即,非叶)节点,视频编码器20可用信号传送相应一位旗标以指示哪个拆分类型(即,水平或竖直)被使用。根据QTBT分割的一些实施方案,视频编码器20可将旗标设定为值零(0)以指示水平拆分,且设定为值一(1)以指示竖直拆分。应了解,对于QTBT分割结构的四叉树拆分部分,不需要指示拆分类型,这是因为四叉树拆分始终将块水平地且竖直地拆分成大小相等的4个子块。The schematic diagram on the right side of Figure 10 illustrates tree structure 164. Tree structure 164 is the corresponding tree structure for the segmentation shown with respect to block 162 in Figure 10. Also in the case of tree structure 164, with the support of the QTBT segmentation scheme in Figure 10, solid lines indicate quadtree splits, and dashed lines indicate binary tree splits. For each split (i.e., non-leaf) node of the binary tree type shown by the dashed lines in tree structure 164, the video encoder 20 can signal a corresponding one-bit flag to indicate which split type (i.e., horizontal or vertical) is used. According to some implementations of QTBT segmentation, the video encoder 20 can set the flag to a value of zero (0) to indicate a horizontal split and to a value of one (1) to indicate a vertical split. It should be understood that for the quadtree split portion of the QTBT segmentation structure, it is not necessary to indicate the split type, because quadtree splitting always splits the block horizontally and vertically into four equal-sized sub-blocks.
图11A及11B绘示用于根据QTBT分割方案的对应亮度块及色度块的独立分割结构的实例。QTBT块分割技术准许且支持具有独立的基于QTBT的分割结构的对应亮度块及色度块的特征。根据QTBT分割方案,对于P切片及B切片,一个CTU中的对应亮度CTU及色度CTU共享同一基于QTBT的分割结构。然而,对于I切片,亮度CTU可根据第一基于QTBT的分割结构分割成CU,且色度CTU是根据第二基于QTBT的分割结构分割成色度CU,第二基于QTBT的分割结构与第一基于QTBT的分割结构可不相同或可并非不相同。因此,I切片中的CU可由亮度分量的译码块或两个色度分量的译码块组成,而对于P切片及B切片中的CU,CU可由所有三个颜色分量的译码块组成。Figures 11A and 11B illustrate examples of independent segmentation structures for corresponding luma and chroma blocks according to a QTBT segmentation scheme. The QTBT block segmentation technique permits and supports the feature of corresponding luma and chroma blocks having independent QTBT-based segmentation structures. According to the QTBT segmentation scheme, for P and B slices, corresponding luma CTUs and chroma CTUs within a CTU share the same QTBT-based segmentation structure. However, for I slices, the luma CTU can be segmented into CUs according to a first QTBT-based segmentation structure, and the chroma CTU is segmented into chroma CUs according to a second QTBT-based segmentation structure, which may or may not be the same as the first QTBT-based segmentation structure. Therefore, a CU in an I slice can consist of a decoding block for the luma component or decoding blocks for the two chroma components, while for CUs in P and B slices, a CU can consist of decoding blocks for all three color components.
用于I切片的由QTBT支持的独立树结构包含与色度译码有关的方面。举例来说,JEM允许每个PU六个(6)色度模式。DM模式的使用指示:视频编码器20及/或视频解码器30将与用于对应亮度PU相同的预测模式用于色度PU。如上所述,对于I切片,用于亮度块及对应色度的基于QTBT的分割结构可不同。因而,当DM模式用于I切片中时,视频编码器20及/或视频解码器30可继承覆盖左上方位置的PU的亮度预测模式以针对色度PU执行预测。与HEVC的分割技术(其中亮度块及其对应色度块始终共享同一树结构)相比,JEM3.0的基于的QTBT分割准许如图11A及11B中所展示的亮度树结构与色度树结构之间的可能差异。The independent tree structure supported by QTBT for I slices contains aspects related to chroma decoding. For example, JEM allows six (6) chroma modes per PU. The use of DM mode indicates that the video encoder 20 and/or video decoder 30 will use the same prediction mode for the corresponding luma PU for the chroma PU. As mentioned above, for I slices, the QTBT-based segmentation structures for luma blocks and their corresponding chroma blocks can be different. Therefore, when the DM mode is used in an I slice, the video encoder 20 and/or video decoder 30 can inherit the luma prediction mode covering the PU in the upper left position to perform prediction for the chroma PU. Compared to HEVC's segmentation technique (where luma blocks and their corresponding chroma blocks always share the same tree structure), JEM3.0's QTBT-based segmentation allows for possible differences between the luma tree structure and the chroma tree structure as shown in Figures 11A and 11B.
Figures 11A and 11B illustrate an example of the QTBT partitioning structure of one CTU in an I slice. Figure 11A shows luma block 172, in which left partition 174 is called out using upper and lower brackets. Figure 11B shows the corresponding chroma block 176, in which left partition 178 is called out using upper and lower brackets. The respective left partitions 174 and 178 contain finer partitions, as shown in Figures 11A and 11B. L(i) (where "i" denotes the integer value shown within the respective partition) indicates that the luma intra prediction mode used for that partition has an index equal to i. In the example illustrated in Figures 11A and 11B, video encoder 20 and/or video decoder 30 may encode/decode the left partition of chroma block 176 according to the DM mode. Thus, video encoder 20 and/or video decoder 30 may select the DM mode to predict left partition 178 of chroma block 176 from the top-left corresponding luma block partition. In the use case illustrated in Figures 11A and 11B, video encoder 20 and/or video decoder 30 may select the intra prediction mode with an index equal to 1 to encode/decode left partition 178 of chroma block 176, because "i" has the value 1 in the top-left partition of luma block 172.
Table 7 above specifies the mode arrangement that video encoder 20 may use to signal the chroma mode. To remove the possible redundancy in chroma mode signaling that can arise when the derived mode (DM) refers to one of the always-present modes, video encoder 20 may use an angular mode (mode 66 when there are 67 intra modes in total) to replace the duplicate mode, as shown in Table 7.1 below. In the use case illustrated in Table 7.1, the angular mode (denoted INTRA_ANGULAR66) is referred to as the "alternative mode."
表7.1-色度帧内预测模式及相关联名称的规范Table 7.1 - Specifications of Chroma Intra-Frame Prediction Modes and Associated Names
As discussed above, video encoder 20 and video decoder 30 may perform entropy coding of the chroma prediction mode. In chroma mode coding, a one-bin string (0) is assigned to the most frequently occurring derived mode, a two-bin string (10) is assigned to the LM mode, and four-bin strings (1100, 1101, 1110, 1111) are assigned to the remaining four modes. The first two bins are coded with one context model, and the remaining two bins (when needed) may be bypass coded.
表7.2-用于每一色度模式的二进位串Table 7.2 - Binary strings for each chroma mode
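The bin strings described above form a prefix-free code, which the following sketch makes concrete. It covers binarization only and does not model context or bypass coding. The labels `REMAINING_0` through `REMAINING_3` are placeholders for the four remaining modes, whose identities depend on the table and are not assumed here.

```python
# Bin strings as listed in the text; the REMAINING_* labels are placeholders.
BIN_STRINGS = {
    "DM": "0",
    "LM": "10",
    "REMAINING_0": "1100",
    "REMAINING_1": "1101",
    "REMAINING_2": "1110",
    "REMAINING_3": "1111",
}


def binarize_chroma_mode(mode_name):
    """Map a chroma mode label to its bin string."""
    return BIN_STRINGS[mode_name]


def parse_chroma_mode(bins):
    """Consume bins ('0'/'1' characters) and return the chroma mode label."""
    it = iter(bins)
    if next(it) == "0":
        return "DM"                      # one bin
    if next(it) == "0":
        return "LM"                      # two bins
    index = int(next(it) + next(it), 2)  # two further bins select one of four modes
    return f"REMAINING_{index}"
```

Since no bin string is a prefix of another, the parser never needs a length field: it stops as soon as a complete string has been read.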
本发明的技术涉及改善上文所论述的各种技术的性能。如上所述,JEM3.0支持用于针对同一CTU的色度块分割及亮度块分割的独立树结构。然而,一个色度PU可对应于多个亮度PU。根据JEM3.0的QTBT分割方面而从用于色度译码的多个亮度PU仅继承亮度帧内预测模式中的一者可提供次佳结果,所述结果可通过本发明的各种技术来改善或可能优化。另外,对于JEM中的给定PU,可能色度模式的总数为六(6)。然而,针对亮度译码,可能模式的总数为六十七(67)。本发明的各种技术可通过增大色度模式的总数来改善译码效率。The techniques of this invention relate to improving the performance of the various techniques discussed above. As mentioned above, JEM3.0 supports independent tree structures for chroma block segmentation and luma block segmentation for the same CTU. However, one chroma PU can correspond to multiple luma PUs. According to the QTBT segmentation aspect of JEM3.0, inheriting only one of the luma intra-prediction modes from multiple luma PUs used for chroma decoding can provide suboptimal results, which can be improved or possibly optimized by the various techniques of this invention. In addition, for a given PU in JEM, the total number of possible chroma modes is six (6). However, for luma decoding, the total number of possible modes is sixty-seven (67). The various techniques of this invention can improve decoding efficiency by increasing the total number of chroma modes.
本发明的各种技术在下文以详细列举方式列出。应了解,视频编码器20及/或视频解码器30可应用下文所论述的各种技术,个别地或以所描述技术中的两者或多者的各种组合。尽管描述为由视频编码器20及/或视频解码器30执行,但应了解,图2中所绘示的视频编码器20的一或多个组件及/或图3中所绘示的视频解码器30的一或多个组件可执行本发明的各种技术。Various techniques of the present invention are listed below by way of detailed enumeration. It should be understood that the video encoder 20 and/or video decoder 30 may employ the various techniques discussed below, individually or in various combinations of two or more of the described techniques. Although described as being performed by the video encoder 20 and/or video decoder 30, it should be understood that one or more components of the video encoder 20 illustrated in FIG. 2 and/or one or more components of the video decoder 30 illustrated in FIG. 3 may perform the various techniques of the present invention.
下文的描述将一个色度块的尺寸表示为W*H(其中“W”为色度块的宽度且“H”为色度块的高度)。色度块中的左上方像素相对于整个切片的位置是由元组(x,y)表示,其中“x”及“y”分别为水平偏移及竖直偏移。对应于给定色度块的亮度块具有等于2W*2H的大小(对于4:2:0颜色格式)或W*H(对于4:4:4颜色格式)。对应亮度块中的左上方像素相对于整个切片的位置是由元组(2x,2y)(对于4:2:0)或(x,y)(对于4:4:4)表示。下文给出的实例是关于4:2:0颜色格式来描述。应了解,本文中所描述的技术还可扩展至其它颜色格式。The following description represents the size of a chroma block as W*H (where "W" is the width of the chroma block and "H" is the height of the chroma block). The position of the top-left pixel in a chroma block relative to the entire slice is represented by a tuple (x, y), where "x" and "y" are the horizontal and vertical offsets, respectively. A luma block corresponding to a given chroma block has a size equal to 2W*2H (for a 4:2:0 color format) or W*H (for a 4:4:4 color format). The position of the top-left pixel in a corresponding luma block relative to the entire slice is represented by a tuple (2x, 2y) (for 4:2:0) or (x, y) (for 4:4:4). The examples given below are for the 4:2:0 color format. It should be understood that the techniques described herein can be extended to other color formats.
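The chroma-to-luma correspondence above can be expressed as a small helper. The function name and the tuple layout `(x, y, width, height)` are assumptions of this sketch, and only the two color formats named in the text are handled.

```python
def corresponding_luma_block(x, y, w, h, chroma_format="4:2:0"):
    """Return (x, y, width, height) of the luma block for a W*H chroma block at (x, y)."""
    if chroma_format == "4:2:0":
        # Chroma is subsampled by two in both directions.
        return 2 * x, 2 * y, 2 * w, 2 * h
    if chroma_format == "4:4:4":
        # No subsampling: positions and sizes coincide.
        return x, y, w, h
    raise ValueError(f"unsupported chroma format: {chroma_format}")
```

For instance, an 8×4 chroma block at (16, 8) in 4:2:0 corresponds to a 16×8 luma block at (32, 16).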
根据本发明的某些方面,关于色度译码可添加多个DM模式,由此增大可供视频编码器20及视频解码器30使用(来自亮度块)的可用色度编码及解码模式的数目。即,根据本发明的这些方面,视频编码器20及视频解码器30可具有比单个选项更多的DM选项用以继承用于对应亮度块的译码模式。举例来说,根据本发明的技术,视频编码器20及/或视频解码器30可基于对应亮度块中所使用的帧内预测模式而产生含有用于色度块的DM帧内预测模式的候选者列表。尽管通过在DM候选者列表中维持相同总数的可能色度模式保持了译码及带宽效率,针对应用多个DM的本发明的技术提供潜在的精度增强,这是因为所述DM与现有技术中所使用的默认模式相比提供较佳准确度。According to certain aspects of the invention, multiple DM modes can be added for chroma decoding, thereby increasing the number of available chroma encoding and decoding modes (from the luma block) for use by the video encoder 20 and video decoder 30. That is, according to these aspects of the invention, the video encoder 20 and video decoder 30 can have more DM options than a single option to inherit the decoding mode for the corresponding luma block. For example, according to the technique of the invention, the video encoder 20 and/or video decoder 30 can generate a candidate list containing DM intra-prediction modes for the chroma block based on the intra-prediction mode used in the corresponding luma block. While maintaining the same total number of possible chroma modes in the DM candidate list preserves decoding and bandwidth efficiency, the technique of the invention provides potential accuracy enhancement for the application of multiple DMs because the DMs offer better accuracy compared to the default modes used in the prior art.
在此实例中,视频编码器20可用信号传送如目前在JEM3.0中所阐述的色度模式。然而,如果视频编码器20选择DM模式用于色度块的色度译码,那么视频编码器20可实施额外用信号传送。更具体地说,根据此实例,视频编码器20可编码及用信号传送指示DM模式经选择用于色度块的编码的旗标。基于色度块已在DM模式下编码,那么视频编码器20可编码及用信号传送索引值,以指示候选者列表的哪个模式被用作DM模式。基于候选者列表的大小,视频编码器20可编码及用信号传送零(0)与五(5)之间的索引值。即,视频编码器20可产生色度预测模式的候选者列表,其包含总共六个(6)候选者,即,导致候选者列表大小为六(6)。In this example, the video encoder 20 can signal the chroma mode as currently described in JEM3.0. However, if the video encoder 20 selects the DM mode for chroma decoding of the chroma block, then the video encoder 20 can implement additional signal transmission. More specifically, according to this example, the video encoder 20 can encode and signal a flag indicating that the DM mode has been selected for encoding the chroma block. Based on the fact that the chroma block has been encoded in DM mode, the video encoder 20 can encode and signal an index value to indicate which mode in the candidate list is used as the DM mode. Based on the size of the candidate list, the video encoder 20 can encode and signal an index value between zero (0) and five (5). That is, the video encoder 20 can generate a candidate list of chroma prediction modes containing a total of six (6) candidates, resulting in a candidate list size of six (6).
基于接收到设定为指示经编码色度块是使用DM模式来编码的值的旗标,视频解码器30可确定用于色度块的解码模式包含于候选者列表中。随后,视频解码器30可接收及解码识别色度模式候选者列表中的条目的索引。基于指示经编码色度块是使用DM模式来编码的旗标,且使用针对经编码色度块的所接收的索引值,视频解码器30可从色度模式候选者列表选择特定模式用于解码色度块。以此方式,在DM模式经选择用于译码色度块的例子中,视频编码器20及视频解码器30可增大可用于编码及解码色度块的候选者模式的数目。基于候选者列表的大小,视频解码器30可解码零(0)与五(5)之间的索引值。即,视频解码器30可产生色度预测模式的候选者列表,其包含总共六个(6)候选者,即,导致候选者列表大小为六(6)。Based on a received flag indicating that the encoded chroma block is encoded using a DM mode, the video decoder 30 can determine that the decoding mode for the chroma block is included in the candidate list. Subsequently, the video decoder 30 can receive and decode the index of the entry in the chroma mode candidate list. Based on the flag indicating that the encoded chroma block is encoded using a DM mode, and using the received index value for the encoded chroma block, the video decoder 30 can select a specific mode from the chroma mode candidate list for decoding the chroma block. In this way, in the example where a DM mode is selected for decoding the chroma block, the video encoder 20 and the video decoder 30 can increase the number of candidate modes available for encoding and decoding the chroma block. Based on the size of the candidate list, the video decoder 30 can decode index values between zero (0) and five (5). That is, the video decoder 30 can generate a candidate list of chroma prediction modes containing a total of six (6) candidates, resulting in a candidate list size of six (6).
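The flag-plus-index exchange for the DM candidate list can be sketched as follows. The syntax element names `dm_flag` and `dm_index` and the dictionary representation are hypothetical, chosen only to illustrate the encoder and decoder agreeing on a six-entry list. Entropy coding of the values is not modeled.

```python
def signal_chroma_dm(selected_mode, dm_candidates):
    """Encoder side: emit a DM flag and, if set, an index into the candidate list."""
    if selected_mode in dm_candidates:
        return {"dm_flag": 1, "dm_index": dm_candidates.index(selected_mode)}
    return {"dm_flag": 0}  # a non-DM chroma mode is signaled by other syntax


def parse_chroma_dm(syntax, dm_candidates):
    """Decoder side: pick the DM mode named by the index, or None if the flag is not set."""
    if syntax["dm_flag"] == 1:
        return dm_candidates[syntax["dm_index"]]
    return None
```

With six candidates the index ranges from 0 to 5, matching the list size described above.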
In some examples, video encoder 20 may first encode and signal a flag indicating whether the chroma block is encoded in linear model (LM) mode. In these examples, video encoder 20 may follow the signaled flag (which indicates whether the chroma block is LM-encoded) with data indicating all of the DM candidates in the candidate list. According to this implementation, video decoder 30 may receive, in the encoded video bitstream, the encoded flag indicating whether the chroma block is encoded in LM mode. Video decoder 30 may parse the data indicating all of the DM candidates in the candidate list starting from the position that follows the LM flag in the encoded video bitstream. It will therefore be appreciated that, according to various examples of this disclosure, video decoder 30 may construct the DM candidate list or, alternatively, may receive the entire DM candidate list in the encoded video bitstream. In either scenario, video decoder 30 may use the signaled index to select the appropriate DM mode from the candidate list.
Video encoder 20 may also prune the DMs of the DM candidate list. That is, video encoder 20 may determine whether any two of the DMs included in the list are identical. If video encoder 20 determines that multiple instances of a single DM (i.e., multiple identical DMs) are included in the candidate list, video encoder 20 may remove the redundancy by removing all other instances of that DM. In other words, video encoder 20 may prune the list so that only one instance of the duplicated DM remains in the candidate list.
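The duplicate removal described above can be sketched as an order-preserving de-duplication. The text does not say which instance of a repeated DM survives; keeping the first occurrence is an assumption of this sketch, as is the function name.

```python
def prune_duplicates(candidates):
    """Return the candidate modes with repeats removed, keeping first occurrences."""
    seen = set()
    pruned = []
    for mode in candidates:
        if mode not in seen:
            seen.add(mode)
            pruned.append(mode)
    return pruned
```

Keeping the first occurrence leaves the relative order of the surviving candidates unchanged, so earlier (presumably more likely) candidates keep the smaller indices.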
In some examples of the DM-candidate-list-based techniques of this disclosure, video encoder 20 may prune the DM candidates in the candidate list against one or more of the default modes. According to the pruning techniques of this disclosure, if video encoder 20 determines that one of the default modes (e.g., the K-th mode in the default mode list) is identical to one of the DM modes in the DM candidate list, video encoder 20 may replace that DM mode in the candidate list with a replacement mode. In replacing the pruned DM mode in the candidate list, video encoder 20 may set the replacement mode to the mode whose index equals ((maximum intra mode index) - 1 - K). In some implementations in which video encoder 20 signals data indicating all of the DM modes included in the candidate list, video encoder 20 may signal data reflecting the pruned DM candidate list.
In some examples in which video decoder 30 also constructs the DM candidate list, video decoder 30 may likewise perform pruning to finalize the DM candidate list. For example, if video decoder 30 determines that one of the default modes (e.g., the K-th mode in the default mode list) is identical to one of the DM modes in the DM candidate list, video decoder 30 may replace that DM mode in the candidate list with a replacement mode. In replacing the pruned DM mode in the candidate list, video decoder 30 may set the replacement mode to the mode whose index equals ((maximum intra mode index) - 1 - K).
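The replacement rule above can be sketched as follows. Treating K as the zero-based position of the colliding mode in the default mode list is an assumption of this sketch, as are the function name and the example default list; the text gives only the formula ((maximum intra mode index) - 1 - K).

```python
def replace_default_collisions(dm_candidates, default_modes, max_intra_mode_index):
    """Replace each DM candidate that equals a default mode.

    A candidate equal to the K-th default mode (K assumed zero-based) is
    replaced by the mode with index max_intra_mode_index - 1 - K.
    """
    result = []
    for mode in dm_candidates:
        if mode in default_modes:
            k = default_modes.index(mode)
            result.append(max_intra_mode_index - 1 - k)
        else:
            result.append(mode)
    return result
```

For instance, with a hypothetical default list `[0, 1, 50, 18]` and a maximum intra mode index of 66, a DM candidate equal to 50 (K = 2) becomes mode 63. The sketch does not check whether a replacement mode collides with another list entry; the text leaves that case open.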
By implementing one or more of the DM-candidate-list-based techniques described above, video encoder 20 and video decoder 30 may increase the number of possible chroma prediction modes. The increased number of chroma modes made available by these techniques may improve coding efficiency while maintaining accuracy. As described above, in various examples, video decoder 30 may receive the entire DM candidate list via the encoded video bitstream. Alternatively, video decoder 30 may construct the DM candidate list and use the signaled index to select a prediction mode for the chroma block from the DM candidate list. Because video decoder 30 may either receive an explicitly signaled DM candidate list or construct the DM candidate list itself, the various DM-candidate-list-based techniques are described herein as being performed by video encoder 20 and, optionally, by video decoder 30.
In some examples, video encoder 20 may fix the size of the DM candidate list (i.e., the total number of candidates included in the DM candidate list) within a particular scope, such as within a tile, a slice, a picture, or a sequence. In some such examples, if video decoder 30 is configured to construct the DM candidate list and to select a candidate using the signaled index, video decoder 30 may also fix the size of the DM candidate list within a particular scope, such as within a tile, a slice, a picture, or a sequence.
In some examples, video encoder 20 may signal the size of the candidate list in a data structure containing metadata, which may be signaled out-of-band with respect to the corresponding encoded video data. As some non-limiting examples, video encoder 20 may signal the size of the candidate list in any of a slice header, a picture parameter set (PPS), or a sequence parameter set (SPS). According to some examples, video encoder 20 (and optionally video decoder 30) may be configured to predefine the size of the candidate list such that the size is the same for all block sizes. Alternatively, video encoder 20 (and optionally video decoder 30) may be configured to predefine the size of the candidate list such that the size varies with the block size.
According to some examples, video encoder 20 (and optionally video decoder 30) may construct the DM candidate list to include (e.g., contain) up to three parts. In these examples, the three parts of the DM candidate list are as follows: (i) a first part that includes candidates that are luma intra-prediction modes associated with particular positions relative to the corresponding luma block; (ii) a second part that includes candidates derived from a function of all of the luma blocks within the corresponding luma block, e.g., the most frequently used luma intra-prediction mode as described in one example above; and (iii) a third part that includes candidates derived from selected luma intra-prediction modes by applying particular offsets to the mode index.
In one example, video encoder 20 (and optionally video decoder 30) may insert candidates from the first two parts into the DM candidate list in order until the total number of candidates equals the predefined list size (i.e., the predefined total number of DM modes). After performing the pruning process on the modes included in the DM candidate list, if the size of the candidate list is still smaller than the predefined total number of DM modes, video encoder 20 (and optionally video decoder 30) may insert candidates from the third part of the list. In one such example, video encoder 20 (and optionally video decoder 30) may insert the candidates from the three parts (or two parts, depending on the result of the pruning) into the candidate list in the order of the first part, followed by the second part, followed by the third part. In another alternative example, video encoder 20 (and optionally video decoder 30) may insert the candidates from the second part before the candidates from the first part. In yet another alternative example, video encoder 20 (and optionally video decoder 30) may insert the candidates from the second part among the candidates from the first part (e.g., by interleaving the candidates of the first part and the second part).
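The first-part, second-part, third-part ordering described above can be sketched as follows. Folding the pruning into the insertion step (a mode is skipped if it is already in the list) is a simplification assumed for this sketch, and the function and parameter names are hypothetical.

```python
def build_dm_list(part1, part2, part3, list_size):
    """Fill the DM candidate list from up to three parts, in order, without duplicates."""
    candidates = []

    def append_unique(modes):
        for mode in modes:
            if len(candidates) == list_size:
                return  # the predefined list size has been reached
            if mode not in candidates:
                candidates.append(mode)

    append_unique(part1)  # modes inherited from particular luma positions
    append_unique(part2)  # modes derived from a function of all luma blocks
    append_unique(part3)  # offset-derived modes, reached only if the list is still short
    return candidates
```

The alternative orderings mentioned in the text (second part first, or the two parts interleaved) would only change the order of the `append_unique` calls or the way `part1` and `part2` are merged before insertion.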
According to some examples, the candidates of the first part of the DM candidate list are modes inherited from particular positions of the corresponding luma block. For example, the first part of the candidate list may include modes inherited from the following positions in the corresponding luma block: the center position, the top-left position, the top-right position, the bottom-left position, and the bottom-right position. That is, in this example, the first part of the candidate list may include modes inherited from the center and the four corners of the corresponding luma block. In one such example, video encoder 20 (and optionally video decoder 30) may insert the modes inherited from these positions of the corresponding luma block into the DM candidate list in the following order: center, top-left, top-right, bottom-left, and bottom-right. In another such example, video encoder 20 (and optionally video decoder 30) may insert the inherited modes into the DM candidate list in the following order: center, top-left, bottom-right, bottom-left, and top-right. In other examples the order may vary, and it will be appreciated that the orders described above are non-limiting examples.
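As a rough illustration of the first insertion order above, the following sketch lists the five positions for a luma block given by its top-left sample and its dimensions. The exact sample coordinates used for each position (in particular for the center) are assumptions of this sketch and are not taken from the text.

```python
def dm_sample_positions(x, y, width, height):
    """Return (name, (px, py)) pairs in the order center, TL, TR, BL, BR.

    (x, y) is the top-left luma sample of the corresponding luma block.
    The coordinate conventions here are illustrative assumptions.
    """
    right = x + width - 1
    bottom = y + height - 1
    return [
        ("center", (x + width // 2, y + height // 2)),
        ("top_left", (x, y)),
        ("top_right", (right, y)),
        ("bottom_left", (x, bottom)),
        ("bottom_right", (right, bottom)),
    ]
```

A caller would look up the luma intra-prediction mode covering each returned sample and pass the resulting modes, in this order, to the list construction.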
In one example, video encoder 20 (and optionally video decoder 30) may form the first part of the DM candidate list to include the intra-prediction modes of all positions of the corresponding luma block. In this example, the second part may become unnecessary, because the first part already includes all of the intra-prediction modes of the corresponding luma block. In addition, video encoder 20 (and optionally video decoder 30) may traverse all of the units within the corresponding luma block in a certain order. Alternatively or additionally, video encoder 20 (and optionally video decoder 30) may add the additional modes to the DM candidate list in descending order of their number of occurrences within the corresponding luma block.
In one example, to form the third part, video encoder 20 (and optionally video decoder 30) may apply offsets to the first one or more candidates already inserted into the list. In addition, when forming the third part, video encoder 20 (and optionally video decoder 30) may further prune against the already-inserted candidates. In one alternative example, video encoder 20 (and optionally video decoder 30) may form the third part to include one or more intra chroma modes from neighboring blocks.
According to some implementations of the techniques described herein, video encoder 20 (and optionally video decoder 30) may adaptively change the size of the candidate list from CU to CU, from PU to PU, or from TU to TU. In one example, video encoder 20 (and optionally video decoder 30) may add only the candidates from the first part, as described with respect to the three-part DM candidate list formation. Alternatively, video encoder 20 (and optionally video decoder 30) may add only the candidates from the first part and the second part to the DM candidate list. In some examples, video encoder 20 (and optionally video decoder 30) may perform pruning to remove identical intra-prediction modes.
In examples in which video encoder 20 prunes the DM candidate list, if the number of candidates in the final, pruned DM candidate list equals 1, video encoder 20 need not signal the DM index. In some examples, video encoder 20 (and optionally video decoder 30) may binarize the DM index value within the DM candidate list using truncated unary binarization. Alternatively, video encoder 20 (and optionally video decoder 30) may binarize the DM index value within the DM candidate list using unary binarization.
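Truncated unary binarization of the DM index can be sketched as follows. The bin polarity (a run of ones terminated by a zero) follows the common HEVC-style convention and is an assumption here, as is the function name.

```python
def truncated_unary(index, max_index):
    """Binarize index in [0, max_index] as `index` ones followed by a zero.

    The terminating zero is omitted when index == max_index, since the
    decoder can infer it from the known maximum.
    """
    if not 0 <= index <= max_index:
        raise ValueError("index out of range")
    bins = [1] * index
    if index < max_index:
        bins.append(0)
    return bins
```

For a six-entry list, `max_index` is 5. For a single-entry list, `max_index` is 0 and the binarization is empty, which is consistent with not signaling the DM index at all in that case. Plain unary binarization would differ only in always appending the terminating zero.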
In some examples, video encoder 20 (and optionally video decoder 30) may set the context model index equal to the bin index. Alternatively, the total number of context models used to code the DM index value may be smaller than the maximum number of candidates. In this case, video encoder 20 may set the context model index equal to min(K, bin index), where K denotes a positive integer. Alternatively, video encoder 20 may encode only the first few bins with context models and may encode the remaining bins in bypass mode. In this example, video decoder 30 may decode only the first few bins with context models and may decode the remaining bins in bypass mode.
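The context assignment options above can be sketched as a per-bin plan. Combining the min(K, bin index) rule with a bypass cutoff in one function is a choice made for this sketch; the text presents them as separate alternatives, and the names and the tuple representation are hypothetical.

```python
def context_plan(num_bins, k, num_context_coded):
    """Return, for each bin, ("context", ctx_idx) or ("bypass", None).

    The first num_context_coded bins use context index min(k, bin_idx);
    all later bins are coded in bypass mode.
    """
    plan = []
    for bin_idx in range(num_bins):
        if bin_idx < num_context_coded:
            plan.append(("context", min(k, bin_idx)))
        else:
            plan.append(("bypass", None))
    return plan
```

Setting `k` at least as large as the last bin index and `num_context_coded` equal to `num_bins` reproduces the simplest option, in which the context model index equals the bin index.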
Alternatively, video encoder 20 (and optionally video decoder 30) may determine the number of context-coded bins depending on one or more of the total number of DM candidates or the CU, PU, or TU size. Alternatively, for the first M bins (e.g., M equal to 1), the context modeling may further depend on the total number of DM candidates in the final (e.g., pruned) DM candidate list, on the CU/PU/TU size, or on the splitting information of the corresponding luma block.
In some examples, video encoder 20 (and optionally video decoder 30) may further reorder the candidates in the candidate list before binarization. In one example, when the width of the CU/PU/TU is greater than its height, the reordering may be based on the intra-prediction mode index difference between the candidate's actual intra mode and the horizontal intra-prediction mode. The smaller the difference, the smaller the index assigned to the candidate in the DM candidate list. In another example, when the height of the CU/PU/TU is greater than its width, the reordering may be based on the intra-prediction mode index difference between the candidate's actual intra mode and the vertical intra-prediction mode. In this example too, the smaller the difference, the smaller the index assigned to the candidate in the DM candidate list.
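The shape-dependent reordering above can be sketched as a stable sort on the mode index difference. The horizontal and vertical mode indices (18 and 50) assume the 67-mode numbering used in JEM; the function name, the handling of square blocks, and the treatment of non-angular modes as plain integers are assumptions of this sketch.

```python
HOR_IDX = 18  # horizontal mode index, assuming 67-mode JEM numbering
VER_IDX = 50  # vertical mode index, assuming 67-mode JEM numbering


def reorder_candidates(candidates, width, height):
    """Reorder DM candidates so modes nearest the preferred direction come first."""
    if width > height:
        reference = HOR_IDX
    elif height > width:
        reference = VER_IDX
    else:
        return list(candidates)  # square block: order left unchanged (assumption)
    # sorted() is stable, so candidates with equal differences keep their order.
    return sorted(candidates, key=lambda mode: abs(mode - reference))
```

Because smaller list indices receive shorter truncated unary codes, moving the likelier candidates to the front is what makes the reordering useful before binarization.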
Alternatively or additionally, video encoder 20 (and optionally video decoder 30) may prune all of the DM candidates in the list against the default modes. If one of the default modes (e.g., the K-th mode in the default mode list) is identical to one of the DM modes in the DM candidate list, video encoder 20 (and optionally video decoder 30) may replace that DM mode in the DM candidate list with a replacement mode. In replacing the pruned DM mode in the candidate list, video encoder 20 (and optionally video decoder 30) may set the replacement mode to the mode whose index equals ((maximum intra mode index) - 1 - K).
According to some techniques of this disclosure, video encoder 20 and video decoder 30 may unify the luma and chroma intra-prediction modes. That is, for each chroma block, in addition to the linear model (LM) mode and other modes specific to coding chroma components, video encoder 20 and/or video decoder 30 may select a prediction mode from the pool of available luma prediction modes. The pool of available luma prediction modes is described herein as including a total of "N" prediction modes, where "N" denotes a positive integer value. In some examples, the value of "N" is sixty-seven (67), corresponding to 67 different available luma prediction modes.
In addition, with respect to the encoding and signaling of the chroma intra-prediction mode, video encoder 20 may also signal a most probable mode (MPM) flag and, depending on the value of the MPM flag, an MPM index (corresponding to the index of an MPM candidate in an MPM candidate list). For example, video encoder 20 may construct the MPM candidate list by first adding one or more DM modes for the chroma block to the MPM candidate list. As described above, video encoder 20 may identify multiple DM modes for the chroma block. It will be appreciated, however, that in some scenarios video encoder 20 may identify a single DM mode for the chroma block. After adding the DM modes to the MPM candidate list, video encoder 20 may add other chroma modes from neighboring blocks to the MPM candidate list. Alternatively or additionally, video encoder 20 may add default modes, for example by using the luma MPM candidate list construction process described in V. Seregin, X. Zhao, A. Said, M. Karczewicz, "Neighbor based intra most probable modes list derivation" (JVET-C0055, Geneva, May 2016) (hereinafter, "Seregin").
Alternatively, video encoder 20 may construct the chroma MPM candidate list in the same way as the MPM candidate list for luma modes. For example, video encoder 20 may check several neighboring blocks in the order described in Seregin. In these implementations, video encoder 20 may treat the LM mode and/or other chroma-specific intra-prediction modes in the same way that video encoder 20 treats other intra-prediction modes. In addition, video encoder 20 may prune the MPM candidate list to remove redundancy that results from the same intra-prediction mode being added from multiple sources.
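The DM-first MPM list construction and the pruning of modes added from multiple sources can be sketched as follows. The fixed list size, the function and parameter names, and the example mode values are assumptions; the neighbor-checking order of Seregin is not reproduced here, and the neighbor modes are simply taken as an already-ordered input.

```python
def build_chroma_mpm_list(dm_modes, neighbor_chroma_modes, default_modes, list_size):
    """Build a chroma MPM list: DM modes first, then neighbor modes, then defaults.

    A mode already present in the list is skipped, which removes the
    redundancy of the same mode arriving from more than one source.
    """
    mpm = []
    for source in (dm_modes, neighbor_chroma_modes, default_modes):
        for mode in source:
            if len(mpm) < list_size and mode not in mpm:
                mpm.append(mode)
    return mpm
```

For example, `build_chroma_mpm_list([50, 18], [18, 2], [0, 1, 50, 34], 6)` keeps mode 18 once even though both the DM set and a neighboring block contribute it.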
In one example, video encoder 20 may first signal a flag indicating the use of one or more chroma-specific modes that apply only to chroma components (e.g., the LM mode and/or other prediction modes used only for coding chroma components). If the selected prediction mode is not a chroma-specific mode (i.e., video encoder 20 sets the above flag to a disabled state), video encoder 20 may further signal the MPM flag. In this example implementation, when adding chroma prediction modes inherited from neighboring blocks to the MPM list, video encoder 20 may disregard a chroma-specific mode (e.g., the LM mode) in cases where that chroma-specific mode is taken from a neighboring block.
An example use case of this implementation is described below. Video encoder 20 may intra-predict a chroma block using the LM mode, and may therefore signal the LM flag set to an enabled state. Based on the chroma block having been encoded using the LM prediction mode, video encoder 20 may signal an MPM index indicating a position within the MPM candidate list for the chroma block. This example use case illustrates that video encoder 20 may use a one-bit flag to first indicate to video decoder 30 whether the prediction mode for the chroma block is a candidate in the MPM candidate list at all. If, and only if, the prediction mode for the chroma block is a candidate from the MPM candidate list, video encoder 20 may signal an index to indicate to video decoder 30 which mode of the MPM candidate list is used to predict the chroma block. In this way, video encoder 20 may conserve bandwidth by first using a one-bit flag and then determining, based on the value of the flag, whether to signal an index value at all.
The decoder-side aspects of the above techniques are discussed below. Video decoder 30 may receive the MPM flag in the encoded video bitstream. If the value of the MPM flag is set to an enabled state, video decoder 30 may also receive an MPM index for the relevant chroma block, corresponding to the index of a particular MPM candidate in the MPM candidate list. For example, video decoder 30 may construct the MPM candidate list by first adding one or more DM modes for the chroma block to the MPM candidate list. As described above, video decoder 30 may identify multiple DM modes for the reconstruction of the chroma block. It will be appreciated, however, that in some scenarios video decoder 30 may identify a single DM mode for the chroma block. After adding the DM modes to the MPM candidate list, video decoder 30 may add other chroma modes from neighboring blocks to the MPM candidate list. Alternatively or additionally, video decoder 30 may add default modes, for example by using the luma MPM candidate list construction process described in Seregin.
Alternatively, video decoder 30 may construct the chroma MPM candidate list in the same way as the MPM candidate list for luma modes. For example, video decoder 30 may check several neighboring blocks in the order described in Seregin. In these implementations, video decoder 30 may treat the LM mode and/or other chroma-specific intra-prediction modes in the same way that video decoder 30 treats other intra-prediction modes. In addition, video decoder 30 may prune the MPM candidate list to remove redundancy that results from the same intra-prediction mode being added from multiple sources.
In one example, video encoder 20 may first signal a flag indicating the use of one or more chroma-specific modes that apply only to chroma components (e.g., the LM mode and/or other prediction modes used only for coding chroma components). If the selected prediction mode is not a chroma-specific mode (i.e., video decoder 30 determines that the above flag is set to a disabled state), video decoder 30 may further receive the MPM flag. In this example implementation, when adding chroma prediction modes inherited from neighboring blocks to the MPM list, video decoder 30 may disregard a chroma-specific mode (e.g., the LM mode) in cases where that chroma-specific mode is taken from a neighboring block.
An example use case of this implementation is described below. Video decoder 30 may receive the LM flag set to an enabled state, and may therefore reconstruct the chroma block using LM mode intra prediction. Based on the chroma block having been encoded using the LM prediction mode, video decoder 30 may receive an MPM index indicating a position within the MPM candidate list for the chroma block. This example use case illustrates that video decoder 30 may use a one-bit flag to first determine whether the prediction mode for the chroma block is a candidate in the MPM candidate list at all. If the prediction mode is not a candidate from the MPM candidate list, video decoder 30 avoids the need for video encoder 20 to signal an index indicating which mode of the MPM candidate list is used to predict the chroma block. In this way, video decoder 30 may conserve bandwidth by reducing the number of instances in which video encoder 20 needs to signal an index value, which can be more bandwidth-intensive than signaling a one-bit flag.
In some examples, in addition to the LM mode, video encoder 20 and/or video decoder 30 may add other chroma-specific intra-prediction modes to the MPM list, and may add the remaining intra-prediction modes as default modes of the list. Alternatively, video encoder 20 may first signal the MPM flag, and when constructing the MPM list, video encoder 20 and/or video decoder 30 may always consider the chroma prediction modes of the neighboring blocks, regardless of whether the neighboring blocks were predicted using the LM mode. In another example, if the LM mode has not been added to the MPM list, video encoder 20 and/or video decoder 30 may add the LM mode as the first default mode. In another example, video encoder 20 and/or video decoder 30 may use only the LM mode and the modes from the MPM candidate list, and may remove the default modes altogether. In some examples, video encoder 20 (and optionally video decoder 30) may add the existing default modes only if the total number of added default modes is less than a predetermined integer value denoted by "K". In one such example, K is set to a value of four (4).
In some examples, when only one DM is allowed, instead of taking the luma intra-prediction mode from the top-left corner of the corresponding luma block, video encoder 20 and/or video decoder 30 may select the luma intra-prediction mode to use as the DM mode according to one or more of the following rules. In one example of such a rule, the luma intra-prediction mode is the mode most frequently used within the corresponding luma block. In one example, based on a certain scan order, video encoder 20 and/or video decoder 30 may traverse the intra-prediction mode of each unit within the corresponding luma block and record the number of occurrences of each existing luma prediction mode. Video encoder 20 and/or video decoder 30 may select the mode with the largest number of occurrences. That is, video encoder 20 and/or video decoder 30 may select the luma intra-prediction mode that covers the largest portion (i.e., area) of the corresponding luma block. When two prediction modes have the same amount of usage in the corresponding luma block, video encoder 20 and/or video decoder 30 may select the prediction mode that is detected first in the scan order. Here, a unit is defined as the minimum PU/TU size for luma/chroma intra prediction. In some examples, the scan order may be a raster, Z-scan, diagonal, or zigzag scan order, or the coding order.
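The most-frequently-used rule with its first-detected tie-break can be sketched as follows. The input is assumed to be the list of unit modes already arranged in the chosen scan order, with every unit the same size so that a count is proportional to covered area; the function name is hypothetical.

```python
def most_used_luma_mode(unit_modes_in_scan_order):
    """Return the mode with the most occurrences; ties go to the first one detected."""
    counts = {}
    for mode in unit_modes_in_scan_order:
        counts[mode] = counts.get(mode, 0) + 1
    best_mode, best_count = None, 0
    # Python dicts keep insertion order, i.e. the order of first detection,
    # so a strict ">" comparison keeps the earliest mode among equal counts.
    for mode, count in counts.items():
        if count > best_count:
            best_mode, best_count = mode, count
    return best_mode
```

Changing the scan order (raster, center-outward, and so on) only changes how the input list is produced, not the selection logic.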
Alternatively, video encoder 20 and/or video decoder 30 may start the scan from the center position of the luma block and traverse toward the boundary in a certain order. Alternatively or additionally, the scan/unit may depend on the PU/TU size. Alternatively, based on a certain scan order, video encoder 20 and/or video decoder 30 may traverse the intra prediction mode of each PU/TU/CU within the corresponding luma block and record the number of occurrences of each existing luma prediction mode. Video encoder 20 and/or video decoder 30 may select the mode with the largest number of occurrences. When two modes have the same usage in the luma block, video encoder 20 and/or video decoder 30 may select the prediction mode that appears first (i.e., is detected first) according to the scan order. In some examples, the scan order may be a raster, zig-zag, or diagonal scan order, or the coding order. Alternatively, the scan may depend on the PU/TU size.
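The most-frequently-used-mode rule with its scan-order tie-break can be illustrated with a minimal Python sketch. The function name `most_used_luma_mode` and the flat list input are hypothetical; the sketch assumes the caller has already listed the intra prediction mode of each minimum-size unit of the corresponding luma block in the chosen scan order.

```python
from collections import Counter

def most_used_luma_mode(unit_modes):
    """Pick the luma mode that occurs most often among the units.

    unit_modes: intra prediction mode of each unit of the corresponding
    luma block, listed in scan order (an assumed input layout).
    Ties are resolved in favor of the mode detected first in scan order.
    """
    if not unit_modes:
        raise ValueError("the corresponding luma block has no units")
    counts = Counter(unit_modes)
    best = max(counts.values())
    # Walk the units in scan order and return the first mode that
    # reaches the maximum occurrence count.
    for mode in unit_modes:
        if counts[mode] == best:
            return mode
```

Because every unit has the same size in this sketch, the occurrence count is proportional to the covered area.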
In another alternative, for the examples described above with respect to a single allowed DM mode, if video encoder 20 and/or video decoder 30 determines that two or more modes have an equal number of occurrences in the corresponding luma block, video encoder 20 and/or video decoder 30 may select one of those modes. The selection may depend on the mode indices and/or the PU/TU sizes of these luma modes. Alternatively, for certain block sizes, such as block sizes larger than 32×32, video encoder 20 and/or video decoder 30 may evaluate only a portion (e.g., a partial subset) of the luma intra prediction modes of the corresponding luma block according to this single-DM-based rule.
As another example of such a rule for the single-DM-mode scenario, video encoder 20 and/or video decoder 30 may select the luma intra prediction mode associated with the center position of the corresponding luma block. In one example, video encoder 20 and/or video decoder 30 may define the center position according to the coordinate tuple (2x+W-1, 2y+H-1) for the 4:2:0 color format. Alternatively, video encoder 20 and/or video decoder 30 may define the center position as follows:
- If W and H are both equal to 2, video encoder 20 and/or video decoder 30 may use position (2x, 2y) as the center position.
- Otherwise, if H is equal to 2, video encoder 20 and/or video decoder 30 may use position (2x+(2*W/4/2-1)*4, 2y) as the center position.
- Otherwise, if W is equal to 2, video encoder 20 and/or video decoder 30 may use position (2x, 2y+(2*H/4/2-1)*4) as the center position.
- Otherwise (e.g., neither H nor W is equal to 4), position (2x+(2*W/4/2-1)*4, 2y+(2*H/4/2-1)*4) is used as the center position.
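The two center-position definitions above can be sketched in Python as follows. This is only an illustration under stated assumptions: (x, y) is taken to be the top-left chroma coordinate of a W×H chroma block in 4:2:0 format, the divisions in the formulas are taken to be integer divisions, and the function names are hypothetical.

```python
def simple_center(x, y, w, h):
    """Center position given directly by the tuple (2x+W-1, 2y+H-1)."""
    return (2 * x + w - 1, 2 * y + h - 1)

def rule_based_center(x, y, w, h):
    """Center position given by the case-by-case rules.

    Assumes integer division in (2*size/4/2 - 1) * 4.
    """
    def offset(size):
        return (2 * size // 4 // 2 - 1) * 4

    if w == 2 and h == 2:
        return (2 * x, 2 * y)
    if h == 2:
        return (2 * x + offset(w), 2 * y)
    if w == 2:
        return (2 * x, 2 * y + offset(h))
    return (2 * x + offset(w), 2 * y + offset(h))
```

For an 8×8 chroma block at the origin, the rule-based definition yields luma position (4, 4), while the tuple definition yields (7, 7).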
According to some examples of the techniques of this disclosure, instead of using the same default modes for all blocks, video encoder 20 and/or video decoder 30 may treat the modes derived from the corresponding luma block as default modes. In one example, the total number of default modes is increased to include more modes derived from the corresponding luma block. In another example, the existing default modes are added only when the total number of added default modes is less than K (in one non-limiting example, K is set to 4).
FIGS. 12A and 12B illustrate neighboring block selection for adaptive ordering of chroma prediction modes, according to one or more aspects of this disclosure. According to some examples of the techniques of this disclosure, video encoder 20 and/or video decoder 30 may apply adaptive ordering of chroma modes, such that the order may depend on the chroma modes of neighboring blocks. In one example, video encoder 20 and/or video decoder 30 apply the adaptive ordering only to specific modes, such as the DM mode and/or the LM mode. In another example, the neighboring blocks are five neighboring blocks, as depicted in FIG. 12A. Alternatively, video encoder 20 and/or video decoder 30 may use only two neighboring blocks, e.g., A1 and B1 as shown in FIG. 12A, or the above block (A) and the left block (L) as shown in FIG. 12B. In one example, when all available neighboring intra-coded blocks are coded with the LM mode, video encoder 20 and/or video decoder 30 may place the LM mode ahead of the DM mode. Alternatively, when at least one of the available neighboring intra-coded blocks is coded with the LM mode, video encoder 20 and/or video decoder 30 may place the LM mode ahead of the DM mode.
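The two LM-before-DM conditions can be sketched as a small Python function. The function name, the string labels "LM" and "DM", and the `require_all` switch are hypothetical. The fallback order (DM ahead of LM when the condition is not met, or when no intra-coded neighbor is available) is an assumption that the text does not state.

```python
def order_dm_lm(neighbor_modes, require_all=True):
    """Decide the relative order of the LM and DM modes.

    neighbor_modes: chroma modes of the available intra-coded neighboring
    blocks, e.g. ["LM", "DM", 50].
    require_all=True  -> LM goes first only if every neighbor used LM.
    require_all=False -> LM goes first if at least one neighbor used LM.
    """
    used_lm = [mode == "LM" for mode in neighbor_modes]
    if require_all:
        lm_first = bool(used_lm) and all(used_lm)
    else:
        lm_first = any(used_lm)
    # Assumed fallback: DM ahead of LM when the condition is not met.
    return ["LM", "DM"] if lm_first else ["DM", "LM"]
```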
According to some examples of this disclosure, video encoder 20 and/or video decoder 30 may use luma information to reorder chroma syntax values before entropy coding. In one example, the NSST index of the luma block may be used to update the coding order of the chroma NSST index. In this case, video encoder 20 and/or video decoder 30 may first encode/decode a bin indicating whether the index of the chroma block is the same as the NSST index of the corresponding luma block. In another example, video encoder 20 and/or video decoder 30 may use the adaptive multiple transform (AMT) index of the luma block to update the coding order of the chroma AMT index. In this case, video encoder 20 and/or video decoder 30 may first encode/decode a bin to indicate whether the index of the chroma block is the same as the AMT index of the corresponding luma block. Video encoder 20 and/or video decoder 30 may use another (e.g., similar) approach for any other syntax element for which the method is applicable to both the luma component and the chroma component, while the index/mode may differ between the luma component and the chroma component.
According to some examples of this disclosure, video encoder 20 and/or video decoder 30 may derive multiple sets of LM parameters for one chroma block, such that the derivation is based on the luma intra prediction modes of the corresponding luma block. In one example, video encoder 20 and/or video decoder 30 may derive up to K sets of parameters, where "K" denotes an integer value. In one example, "K" is set to a value of two (2). In another example, video encoder 20 and/or video decoder 30 may classify the neighboring luma/chroma samples into K sets based on the intra prediction modes of the samples located in the corresponding luma block. Video encoder 20 and/or video decoder 30 may classify the luma samples within the corresponding luma block into K sets based on the intra prediction modes of the samples located in the corresponding luma block. In another example, when two intra prediction modes are considered "far apart", e.g., when the absolute difference of the mode indices is greater than a threshold, video encoder 20 and/or video decoder 30 may treat the corresponding sub-blocks and neighboring samples as using different parameters.
According to some examples of this disclosure, video encoder 20 and/or video decoder 30 may use a composite DM mode to encode/decode the current chroma block. According to the composite DM mode of this disclosure, video encoder 20 may generate a prediction block using a weighted sum of prediction blocks generated from two or more identified intra prediction modes. Video encoder 20 may identify two or more intra prediction modes that are used to encode the co-located luma block, the neighboring chroma blocks, or the neighboring blocks of the corresponding luma block. Video encoder 20 may then generate a prediction block for each of the identified intra prediction modes, and may derive a weighted sum of the two or more generated prediction blocks as the prediction block of this composite DM mode.
In one example, the weights used to generate the prediction block of this composite DM mode depend on the size of the area of the corresponding luma block to which each identified intra prediction mode is applied. Alternatively, the weight of the prediction block of each identified intra prediction mode may depend on the position of the current pixel and on whether the current identified intra prediction mode covers the current pixel. In another alternative, the weights are the same for each identified intra prediction mode. In another alternative, video encoder 20 and/or video decoder 30 may still utilize a set of predefined weights. In yet another alternative, or additionally, video encoder 20 may signal an index of the weights for each CTU/CU/PU/TU. When signaling the default modes (the non-DM and non-LM modes shown in Table 7.1), if a default mode has been identified for generating the composite DM mode, video encoder 20 may replace that default mode with another intra prediction mode that has not been identified for generating the composite DM mode.
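The area-based weighting alternative can be illustrated with a short Python sketch. The function name and the list-of-rows block layout are hypothetical. The sketch assumes floating-point weights normalized to sum to one and applies no rounding or clipping, neither of which is specified in the text.

```python
def composite_dm_prediction(pred_blocks, areas):
    """Weighted sum of prediction blocks for a composite DM mode.

    pred_blocks: one prediction block (list of rows) per identified
                 intra prediction mode; all blocks share one size.
    areas:       area of the corresponding luma block covered by each
                 identified mode, used here as unnormalized weights.
    """
    total = sum(areas)
    weights = [area / total for area in areas]
    rows, cols = len(pred_blocks[0]), len(pred_blocks[0][0])
    return [
        [sum(w * block[r][c] for w, block in zip(weights, pred_blocks))
         for c in range(cols)]
        for r in range(rows)
    ]
```

Passing equal areas reproduces the alternative in which the weights are the same for each identified mode.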
FIGS. 13A and 13B are conceptual diagrams illustrating examples of block positions that video encoder 20 and video decoder 30 may use to select chroma intra prediction modes according to the multiple-DM-mode-based selection techniques described above. One example implementation of the multiple-DM-mode-based selection for chroma coding is described below. As described above, according to aspects of this disclosure, video encoder 20 (and optionally, video decoder 30) may perform the selection of DM modes. That is, in some examples, video encoder 20 may explicitly signal the DM candidate list, thereby removing the need for video decoder 30 to also form the DM candidate list. In other examples, video encoder 20 may signal only the index of the selected candidate from the DM candidate list, thereby enabling video decoder 30 to select the candidate from a DM candidate list that video decoder 30 also forms.
FIG. 13A illustrates the prediction modes used in the sub-blocks of the luma component (luma block 202). FIG. 13B illustrates the luma mode inheritance with respect to chroma block 204 according to HEVC techniques. As shown, according to HEVC techniques, the prediction mode from the top-left sub-block of luma block 202 (i.e., mode L(1)) is inherited with respect to the left region of chroma block 204. As shown in FIG. 13A, the luma modes for the sub-blocks located at the center (C0), top-left (TL), top-right (TR), bottom-left (BL), and bottom-right (BR) positions are obtained (e.g., by video encoder 20, and optionally, video decoder 30). These modes are denoted by the acronyms DMC, DMTL, DMTR, DMBL, and DMBR. In some alternatives, video encoder 20 (and optionally, video decoder 30) may replace the C0 selection with a selection of the mode used at positions C1 and/or C2 and/or C3. In addition, video encoder 20 (and optionally, video decoder 30) may add the luma mode that covers the largest area of luma block 202 to the DM candidate list as an additional DM mode. The luma mode that covers the largest area of luma block 202 is denoted by the acronym "DMM".
Video encoder 20 (and optionally, video decoder 30) may construct the DM candidate list using one or more of the techniques discussed below. A number of candidates (denoted by "N") from the candidate group that includes DMC, DMTL, DMTR, DMBL, and DMBR may be added to the DM candidate list according to a predetermined order. In one example, "N" is set to six (6) and the order may be as follows: DMC, DMM, DMTL, DMTR, DMBL, DMBR. In one alternative, "N" is set to five (5) and the order may be as follows: DMC, DMTL, DMTR, DMBL, DMBR. When forming the candidate list, before adding each such candidate to the DM candidate list, video encoder 20 (and optionally, video decoder 30) may prune each candidate against all, or a partial subset (e.g., a proper subset), of the previously added candidates. Although two example orders are discussed above, it will be appreciated that video encoder 20 (and optionally, video decoder 30) may also use various other orders in accordance with aspects of this disclosure. Assuming that the total number of DM modes in the candidate list is "M" (where "M" is a positive integer) and that the total number of default modes is denoted by "F", a particular candidate of the DM candidate list is denoted by DM_i. In this notation, the subscript "i" denotes an integer value ranging from 0 to M-1.
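The ordered construction with pruning can be sketched in Python as follows. The function name, the dictionary keys ("C", "M", "TL", "TR", "BL", "BR"), and the choice to prune against all previously added candidates (rather than a partial subset) are assumptions made for illustration.

```python
SIX_CANDIDATE_ORDER = ("C", "M", "TL", "TR", "BL", "BR")
FIVE_CANDIDATE_ORDER = ("C", "TL", "TR", "BL", "BR")

def build_dm_candidate_list(position_modes, order=SIX_CANDIDATE_ORDER):
    """Form the DM candidate list in a predetermined order.

    position_modes: maps a position label to the luma mode found there,
                    e.g. {"C": 50, "M": 50, "TL": 18, ...}.
    A candidate is skipped (pruned) when an identical mode was already
    added; unavailable positions (missing or None) are skipped as well.
    """
    dm_list = []
    for label in order:
        mode = position_modes.get(label)
        if mode is not None and mode not in dm_list:
            dm_list.append(mode)
    return dm_list
```

The resulting list has M entries, with M at most N, and DM_i is simply `dm_list[i]`.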
Video encoder 20 (and optionally, video decoder 30) may apply pruning among the DM candidates and the default modes. That is, when forming the DM candidate list, video encoder 20 (and optionally, video decoder 30) may prune the DM candidates against the default modes. In one alternative, for each DM_i, video encoder 20 (and optionally, video decoder 30) may compare DM_i with each of the default modes. If any default mode is found to be identical to DM_i, video encoder 20 (and optionally, video decoder 30) may replace the first such default mode (i.e., the one found to be identical to DM_i) with an alternative mode. For example, video encoder 20 (and optionally, video decoder 30) may replace the pruned default mode with the mode having an index value equal to (K-1-i), where "K" is the total number of luma prediction modes for the corresponding luma block. Example pseudocode for these operations is given below:
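One possible reading of this replacement rule, written as a Python sketch rather than as the pseudocode of the original publication, is shown here. The function name is hypothetical, and the default of 67 luma prediction modes (indices 0 to 66) is an assumption.

```python
def prune_defaults_against_dm(dm_list, default_modes, num_luma_modes=67):
    """Replace default modes that duplicate a DM candidate.

    For each DM_i, the first default mode equal to DM_i is replaced by
    the mode with index (K - 1 - i), where K = num_luma_modes.
    """
    defaults = list(default_modes)
    for i, dm in enumerate(dm_list):
        for j, default in enumerate(defaults):
            if default == dm:
                defaults[j] = num_luma_modes - 1 - i
                break  # only the first matching default mode is replaced
    return defaults
```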
For example, the default modes may be: mode 0 (planar), mode 50 (vertical direction), mode 18 (horizontal direction), and mode 1 (DC), and the DM candidate list is {mode 0, mode 63, mode 50, mode 1}. After the pruning process, the default modes are replaced by the following set: {mode 66, mode 64, mode 18, mode 63}. In another alternative, video encoder 20 (and optionally, video decoder 30) may apply full pruning, in which each default mode is pruned against all of the DM modes. That is, each default mode is compared with all of the DM modes. If the step-by-step comparison indicates that one of the DM modes is identical to the default mode currently under inspection, that default mode is replaced by the last non-DM mode. Example pseudocode for this example is given below:
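A Python sketch of full pruning, offered as one interpretation rather than as the pseudocode of the original publication, follows. It assumes 67 luma prediction modes (indices 0 to 66) and reads "the last non-DM mode" as the highest-index mode that is neither in the DM candidate list nor already among the default modes. Both points, and the function name, are assumptions.

```python
def fully_prune_defaults(dm_list, default_modes, num_luma_modes=67):
    """Compare every default mode against all DM modes.

    A default mode that equals any DM mode is replaced by the
    highest-index mode that is not a DM mode and is not already a
    default mode (an assumed reading of "last non-DM mode").
    """
    dm_set = set(dm_list)
    defaults = list(default_modes)
    for j, default in enumerate(defaults):
        if default in dm_set:
            replacement = num_luma_modes - 1
            while replacement >= 0 and (
                replacement in dm_set or replacement in defaults
            ):
                replacement -= 1
            defaults[j] = replacement
    return defaults
```

Under these assumptions, default modes {0, 50, 18, 1} pruned against the DM list {0, 63, 50, 1} become {66, 65, 18, 64}.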
Video encoder 20 may implement various aspects of the multiple-DM-mode-based techniques of this disclosure to signal chroma modes. Video encoder 20 may encode the chroma mode according to a process that includes the following parts. As one part, video encoder 20 may encode and signal a one-bit flag to indicate the use of any of the prediction modes that are applicable only to chroma components (e.g., LM, which is specific to chroma coding). If the chroma block is encoded according to such a chroma-specific mode (thereby causing video encoder 20 to set the flag to an enabled state), video encoder 20 may additionally encode and signal the index of the specific mode.
Additionally, video encoder 20 may encode and signal a flag to indicate the use of a mode derived from the corresponding luma block. That is, if video encoder 20 selects the prediction mode for encoding the chroma block based on a prediction mode used for the corresponding luma block, video encoder 20 may set the flag to an enabled state. Subsequently, if the chroma block is encoded using a prediction mode inherited from the corresponding luma block, video encoder 20 may additionally encode and signal the index of the mode selected from the corresponding luma block.
If video encoder 20 determines that the chroma block is encoded neither according to a chroma-specific prediction mode nor according to a prediction mode derived from the luma block, video encoder 20 may encode and signal information identifying the remaining mode. Video encoder 20 may implement the parts/options of chroma encoding listed above in different orders. Examples of the different orders are given in Table 7.3 and Table 7.4, or in Table 8, below.
Table 7.3 - Specification of chroma intra prediction modes and associated names
Table 7.4 - Bin string for each chroma mode
Table 8 - Bin string for each chroma mode
As described above, aspects of this disclosure are directed to the unification of luma mode and chroma mode. Example implementations of the unification of luma mode and chroma mode are described below. The total allowed number of most probable mode (MPM) candidates is denoted below by N_mpm. Video encoder 20 and/or video decoder 30 may construct the mode list of chroma intra modes to include the following parts:
- the LM mode; and
- the MPM modes.
The MPM modes part may include a DM candidate list and a chroma modes part. Video encoder 20 (and optionally, video decoder 30) may form the DM candidate list part of the unified candidate list using the same techniques described above with respect to the multiple DM modes. With respect to the chroma modes part of the MPM modes, video encoder 20 (and optionally, video decoder 30) may derive chroma modes from the neighboring blocks of the currently coded chroma block. For example, to derive the chroma modes from the neighboring blocks, video encoder 20 (and optionally, video decoder 30) may reuse the MPM construction process that is used for luma modes. If the total number of MPM candidates is still less than N_mpm after performing the list construction process described above, video encoder 20 (and optionally, video decoder 30) may implement various steps according to JVET-C0055, cited above.
For example, if the total number of MPM candidates is less than the value of N_mpm after performing the list construction process set forth above, video encoder 20 (and optionally, video decoder 30) may add the following modes: the left (L), above (A), planar, DC, below-left (BL), above-right (AR), and above-left (AL) modes. If the MPM candidate list is still not complete (i.e., if the total number of MPM candidates is less than the value of N_mpm), video encoder 20 (and optionally, video decoder 30) may add the modes obtained by applying -1 and +1 to the angular modes that are already included. If the MPM candidate list is still not complete (i.e., the total number of MPM candidates is less than the value of N_mpm), video encoder 20 (and optionally, video decoder 30) may add the default modes, namely, the vertical, horizontal, 2, and diagonal modes.
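The three-stage fill can be sketched in Python as follows. The sketch represents modes as integers in a 67-mode numbering (planar 0, DC 1, horizontal 18, vertical 50) and makes several assumptions not stated in the text: the diagonal mode is taken to be mode 34, angular modes span 2 to 66, and a -1/+1 neighbor that falls outside that range is skipped rather than wrapped. The function name and the neighbor dictionary keys are hypothetical.

```python
PLANAR, DC = 0, 1
HORIZONTAL, DIAGONAL, VERTICAL = 18, 34, 50  # DIAGONAL = 34 is assumed
MIN_ANGULAR, MAX_ANGULAR = 2, 66             # assumed angular range

def fill_mpm_list(mpm, neighbors, n_mpm):
    """Complete an MPM candidate list up to n_mpm entries.

    mpm:       candidates already in the list (integer mode indices).
    neighbors: modes of neighboring blocks keyed by "L", "A", "BL",
               "AR", "AL"; missing keys mean the neighbor is unavailable.
    """
    mpm = list(mpm)

    def add(mode):
        # Add only available, non-duplicate modes while there is room.
        if mode is not None and mode not in mpm and len(mpm) < n_mpm:
            mpm.append(mode)

    # Stage 1: neighbors, planar, and DC in the listed order.
    for mode in (neighbors.get("L"), neighbors.get("A"), PLANAR, DC,
                 neighbors.get("BL"), neighbors.get("AR"),
                 neighbors.get("AL")):
        add(mode)
    # Stage 2: -1 and +1 of each angular mode already included.
    for mode in [m for m in mpm if m >= MIN_ANGULAR]:
        for candidate in (mode - 1, mode + 1):
            if MIN_ANGULAR <= candidate <= MAX_ANGULAR:
                add(candidate)
    # Stage 3: default modes (vertical, horizontal, 2, diagonal).
    for mode in (VERTICAL, HORIZONTAL, MIN_ANGULAR, DIAGONAL):
        add(mode)
    return mpm
```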
The non-MPM modes that video encoder 20 and/or video decoder 30 may identify include any remaining intra prediction modes that are not included in the MPM candidate list construction process described above. One difference from the luma-based MPM list construction process described above (e.g., in the portions citing JVET-C0055) is that, when a candidate is added, the added candidate is not the LM mode. Alternatively or additionally, the planar and DC modes may be added after all of the spatial neighbors. Alternatively, video encoder 20 and/or video decoder 30 may implement one or more other MPM list construction techniques in place of the techniques of JVET-C0055.
With respect to the unification of luma mode and chroma mode, video encoder 20 may implement the various chroma mode signaling techniques of this disclosure. Video encoder 20 may encode the chroma mode according to a process that includes the following parts. As one part, video encoder 20 may encode and signal a one-bit flag to indicate the use of any of the prediction modes that are applicable only to chroma components (e.g., LM, which is specific to chroma coding). If the chroma block is encoded according to such a chroma-specific mode (thereby causing video encoder 20 to set the flag to an enabled state), video encoder 20 may additionally encode and signal the index of the specific mode.
Additionally, video encoder 20 may encode and signal a flag to indicate the use of a mode included in the MPM candidate list. That is, if video encoder 20 selects a prediction mode for encoding the chroma block and the selected prediction mode is included in the MPM candidate list, video encoder 20 may set the flag to an enabled state. Subsequently, if the chroma block is encoded using a prediction mode included in the MPM candidate list, video encoder 20 may additionally encode and signal the index of the mode, which indicates the position of the mode in the MPM candidate list.
If video encoder 20 determines that the chroma block is encoded neither according to a chroma-specific prediction mode nor according to a prediction mode included in the MPM candidate list, video encoder 20 may encode and signal information identifying the remaining mode. Video encoder 20 may implement the parts/options of chroma encoding listed above in different orders. Examples of the different orders are given in Table 8.1 or Table 9 below.
Table 8.1 - Bin string for each chroma mode
If the mode list of chroma intra modes includes only the LM part and the MPM part (which, like the luma MPM, includes the multiple DM modes and the modes from spatial neighbors), video encoder 20 may implement the signaling of the chroma mode in another modified manner, as shown in Table 9 below:
Table 9
In another alternative, video encoder 20 (and optionally, video decoder 30) may always add the default modes (e.g., the planar, DC, horizontal, and vertical modes) to the MPM candidate list. In one example, the N_mpm candidates of the MPM candidate list may first be constructed using one or more of the techniques described above. Then, the default modes that are missing from the list may replace the last one or more MPM candidates.
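The replacement of trailing candidates by missing default modes can be sketched in Python. The mode numbers (planar 0, DC 1, horizontal 18, vertical 50) follow the numbering used earlier in the text. The function name is hypothetical, and the choice to skip trailing positions that already hold a default mode, so that no default mode is overwritten, is an assumption.

```python
def force_default_modes(mpm, defaults=(0, 1, 18, 50)):
    """Ensure every default mode appears in a full MPM candidate list.

    Missing default modes overwrite candidates starting from the end of
    the list; positions already holding a default mode are skipped.
    """
    mpm = list(mpm)
    missing = [mode for mode in defaults if mode not in mpm]
    pos = len(mpm) - 1
    for mode in reversed(missing):
        while pos >= 0 and mpm[pos] in defaults:
            pos -= 1
        if pos < 0:
            break  # no replaceable position remains
        mpm[pos] = mode
        pos -= 1
    return mpm
```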
FIG. 14 is a flowchart illustrating an example process 220 that processing circuitry of video decoder 30 may perform, according to aspects of this disclosure. Process 220 may begin when the processing circuitry of video decoder 30 determines that a plurality of derived modes (DMs) available for predicting a luma block of the video data are also available for predicting a chroma block of the video data, the chroma block corresponding to the luma block (222). Video decoder 30 may form a candidate list of prediction modes for the chroma block, the candidate list including one or more DMs of the plurality of DMs available for predicting the chroma block (224). In some non-limiting examples, the processing circuitry of video decoder 30 may receive, in an encoded video bitstream, data indicating each respective DM of the one or more DMs of the candidate list, and may reconstruct the received data indicating each respective DM of the one or more DMs to form the candidate list. In other examples, the processing circuitry of video decoder 30 may construct the candidate list.
The processing circuitry of video decoder 30 may determine to decode the chroma block using any DM of the one or more DMs of the candidate list (226). In some non-limiting examples, the processing circuitry of video decoder 30 may receive, in the encoded video bitstream, a one-bit flag indicating that the chroma block is encoded using one of the DMs. Based on the determination to decode the chroma block using any DM of the one or more DMs of the candidate list, the processing circuitry of video decoder 30 may decode an indication identifying the selected DM of the candidate list to be used for decoding the chroma block (228). For example, the processing circuitry of video decoder 30 may reconstruct data (received in the encoded video bitstream) indicating an index value that identifies the position of the selected DM in the candidate list. Subsequently, the processing circuitry of video decoder 30 may decode the chroma block according to the selected DM (230). In various examples, the video data including the luma block and the chroma block may be stored to a memory of video decoder 30.
In some examples, the one or more DMs included in the candidate list may include one or more of: a first prediction mode associated with the center position of the corresponding luma block; a second prediction mode associated with the top-left position of the corresponding luma block; a third prediction mode associated with the top-right position of the corresponding luma block; a fourth prediction mode associated with the bottom-left position of the corresponding luma block; or a fifth prediction mode associated with the bottom-right position of the corresponding luma block. In some examples, the candidate list may further include one or more chroma intra prediction modes that are different from each of the one or more DMs. In some such examples, each of the chroma intra prediction modes corresponds to a mode used to predict a neighboring chroma block of the chroma block. In some examples, at least one respective chroma intra prediction mode of the candidate list is a chroma-specific prediction mode that is used only for predicting chroma data.
FIG. 15 is a flowchart illustrating an example process 240 that the processing circuitry of video encoder 20 may perform in accordance with aspects of this disclosure. Process 240 may begin when the processing circuitry of video encoder 20 determines that multiple derived modes (DMs) available for predicting a luma block of video data are also available for predicting a chroma block of the video data, the chroma block corresponding to the luma block (242). In various examples, the video data including the luma block and the chroma block may be stored to a memory of video encoder 20. Video encoder 20 may form a candidate list of prediction modes for the chroma block, the candidate list including one or more DMs of the multiple DMs available for predicting the chroma block (244).
The processing circuitry of video encoder 20 may determine to encode the chroma block using any DM of the one or more DMs of the candidate list (246). Based on the determination to encode the chroma block using any DM of the one or more DMs of the candidate list, the processing circuitry of video encoder 20 may encode an indication identifying the selected DM of the candidate list to be used for decoding the chroma block (248). For example, the processing circuitry of video encoder 20 may encode data indicating an index value that identifies the position of the selected DM in the candidate list, and may signal the encoded data in the encoded video bitstream. The processing circuitry of video encoder 20 may then encode the chroma block according to the selected DM (250). In some examples, the processing circuitry of video encoder 20 may signal, in the encoded video bitstream, a one-bit flag indicating whether the chroma block is encoded using a linear model (LM) mode. In these examples, the processing circuitry of video encoder 20 may signal, in the encoded video bitstream, data indicating each respective DM of the one or more DMs of the candidate list.
In some examples, the one or more DMs included in the candidate list may include one or more of: a first prediction mode associated with a center position of the corresponding luma block; a second prediction mode associated with a top-left position of the corresponding luma block; a third prediction mode associated with a top-right position of the corresponding luma block; a fourth prediction mode associated with a bottom-left position of the corresponding luma block; or a fifth prediction mode associated with a bottom-right position of the corresponding luma block. In some examples, the candidate list may further include one or more chroma intra prediction modes that are different from each of the one or more DMs. In some such examples, each of the chroma intra prediction modes corresponds to a mode used to predict a neighboring chroma block of the chroma block. In some examples, at least one respective chroma intra prediction mode of the candidate list is a chroma-specific prediction mode that is used only for predicting chroma data. In some examples, the processing circuitry of video encoder 20 may determine that at least two DMs of the one or more DMs are identical, and may include only one DM of the at least two identical DMs in the candidate list.
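The pruning of identical DMs described in the last sentence can be sketched as an order-preserving duplicate removal. The function name and the integer mode values are illustrative assumptions; keeping the first occurrence (rather than a later one) is also an assumption, since the text only requires that a single copy remain.

```python
from typing import Hashable, Iterable, List


def prune_identical_dms(dms: Iterable[Hashable]) -> List[Hashable]:
    """Keep only the first occurrence of each DM, preserving list order."""
    seen = set()
    pruned = []
    for mode in dms:
        if mode not in seen:
            seen.add(mode)
            pruned.append(mode)
    return pruned
```

If the center and bottom-right positions both yield mode 34, for instance, only one entry for mode 34 is placed in the candidate list, so no index value is spent on a redundant candidate.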
FIG. 16 is a flowchart illustrating an example process 260 that the processing circuitry of video decoder 30 may perform in accordance with aspects of this disclosure. Process 260 may begin when the processing circuitry of video decoder 30 forms a most probable mode (MPM) candidate list for a chroma block of video data stored to a memory of video decoder 30, such that the MPM candidate list includes one or more derived modes (DMs) associated with a luma block of the video data that is associated with the chroma block, and a plurality of luma prediction modes that can be used for decoding luma components of the video data (262). In some examples, the processing circuitry of video decoder 30 may add the one or more DMs to the MPM candidate list, and may add one or more chroma modes inherited from neighboring chroma blocks of the chroma block at positions of the MPM candidate list that follow the positions of all of the one or more DMs present in the MPM candidate list.
In some examples, the processing circuitry of video decoder 30 may omit any additional instances of the LM mode from the MPM candidate list in response to a determination that the LM mode was used to predict one or more neighboring chroma blocks of the chroma block. In some examples, the processing circuitry of video decoder 30 may receive, in the encoded video bitstream, a one-bit flag indicating whether the chroma block is encoded using the LM mode. In one scenario, the processing circuitry of video decoder 30 may determine that the received one-bit flag is set to a disabled state, may receive an MPM index corresponding to a particular mode of the MPM candidate list, and, based on the received one-bit flag being set to the disabled state, may select the particular mode corresponding to the received MPM index. In another scenario, the processing circuitry of video decoder 30 may determine that the received one-bit flag is set to an enabled state, and, based on the received one-bit flag being set to the enabled state, may select the LM mode from the MPM candidate list.
In some examples, the processing circuitry of video decoder 30 may determine whether a number of default modes associated with the chroma block meets a predetermined threshold. Based on a determination that the number of default modes does not meet the predetermined threshold, the processing circuitry of video decoder 30 may add each default mode of the default modes to the MPM candidate list; based on a determination that the number of default modes meets the predetermined threshold, it may instead omit all of the default modes from the MPM candidate list. The processing circuitry of video decoder 30 may select a mode from the MPM candidate list (264). The processing circuitry of video decoder 30 may then decode the chroma block according to the mode selected from the MPM candidate list (266).
In some examples, to form the MPM candidate list, the processing circuitry of video decoder 30 may add the one or more DMs to the MPM candidate list, and may add one or more chroma modes inherited from neighboring chroma blocks of the chroma block at positions of the MPM candidate list that follow the positions of all of the one or more DMs present in the MPM candidate list. In some examples, to form the MPM candidate list, the processing circuitry of video decoder 30 may add one or more linear model (LM) modes to the MPM candidate list. In one such example, the processing circuitry of video decoder 30 may determine that the one or more LM modes include a first instance of a first LM mode and one or more additional instances of the first LM mode, and may omit the one or more additional instances of the first LM mode from the MPM candidate list in response to a determination that the first LM mode was used to predict one or more neighboring chroma blocks of the chroma block.
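The list-formation order described above can be sketched as follows. Several details are assumptions made only for illustration: the function name, the use of strings such as `"LM"` for LM modes, placing the LM modes ahead of the DMs, and applying the same duplicate check to every entry. The text itself only requires that inherited neighboring chroma modes follow all DMs and that additional instances of an LM mode be omitted.

```python
from typing import Hashable, List, Sequence


def build_mpm_list(lm_modes: Sequence[Hashable],
                   dms: Sequence[Hashable],
                   neighbor_chroma_modes: Sequence[Hashable]) -> List[Hashable]:
    """Form an MPM candidate list: LM modes, then DMs, then neighbor modes."""
    mpm: List[Hashable] = []

    def push(mode: Hashable) -> None:
        # Skip a mode already present, e.g. an extra instance of LM
        # inherited from a neighboring chroma block.
        if mode not in mpm:
            mpm.append(mode)

    for mode in lm_modes:
        push(mode)
    for mode in dms:
        push(mode)
    # Inherited neighboring chroma modes are placed after all DMs.
    for mode in neighbor_chroma_modes:
        push(mode)
    return mpm
```

With `lm_modes=["LM"]`, `dms=[34, 10, 34]` and `neighbor_chroma_modes=["LM", 26, 10]`, the sketch yields `["LM", 34, 10, 26]`: the neighbor's LM instance is omitted and mode 26 lands after both DMs.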
In some examples, the processing circuitry of video decoder 30 may receive, in the encoded video bitstream, a one-bit flag indicating whether the chroma block is encoded using an LM mode, where selecting the mode from the MPM candidate list is based on the value of the one-bit flag. In some such examples, the processing circuitry of video decoder 30 may determine that the one or more LM modes include multiple LM modes, and may determine that the received one-bit flag is set to an enabled state. In some such examples, the processing circuitry of video decoder 30 may receive an LM index corresponding to the position of a particular LM mode of the multiple LM modes in the MPM candidate list, and, based on the received one-bit flag being set to the enabled state, may select the particular LM mode corresponding to the received LM index for coding the chroma block. In some examples, to select the mode from the MPM candidate list, the processing circuitry of video decoder 30 may determine that the received one-bit flag is set to a disabled state, may receive an MPM index corresponding to a particular mode of the MPM candidate list, and, based on the received one-bit flag being set to the disabled state, may select the particular mode corresponding to the received MPM index.
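The flag-driven selection can be sketched as below. The names are hypothetical, and two interpretations are assumed rather than taken from the text: the LM index is read as counting only the LM entries of the MPM candidate list, in list order, and the MPM index is read as a position in the whole list.

```python
from typing import Collection, Hashable, Sequence


def select_chroma_mode(lm_flag: bool, index: int,
                       mpm_list: Sequence[Hashable],
                       lm_modes: Collection[Hashable]) -> Hashable:
    """Pick the chroma mode from the MPM list using the one-bit LM flag.

    lm_flag enabled  -> `index` is an LM index among the LM entries
    lm_flag disabled -> `index` is an MPM index into the whole list
    """
    if lm_flag:
        lm_entries = [mode for mode in mpm_list if mode in lm_modes]
        return lm_entries[index]
    return mpm_list[index]
```

For a list `["LM0", "LM1", 34, 10]`, an enabled flag with LM index 1 selects `"LM1"`, while a disabled flag with MPM index 2 selects mode 34.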
In some examples, the processing circuitry of video decoder 30 may determine whether a number of default modes associated with the chroma block meets a predetermined threshold. In these examples, the processing circuitry of video decoder 30 may perform one of: (i) based on a determination that the number of default modes does not meet the predetermined threshold, adding each default mode of the default modes to the MPM candidate list; or (ii) based on a determination that the number of default modes meets the predetermined threshold, omitting all of the default modes from the MPM candidate list.
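The two alternatives (i) and (ii) can be sketched as a single branch. The function name is hypothetical, and reading "meets the predetermined threshold" as "greater than or equal to the threshold" is an assumption; the text does not define the comparison.

```python
from typing import Hashable, List, Sequence


def apply_default_modes(mpm_list: Sequence[Hashable],
                        default_modes: Sequence[Hashable],
                        threshold: int) -> List[Hashable]:
    """Add every default mode, or none, depending on the threshold test."""
    meets_threshold = len(default_modes) >= threshold  # assumed comparison
    if meets_threshold:
        # (ii) omit all default modes from the MPM candidate list
        return list(mpm_list)
    # (i) add each default mode to the MPM candidate list
    return list(mpm_list) + list(default_modes)
```

The sketch returns a new list in both cases, so the caller's MPM candidate list is left unmodified.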
FIG. 17 is a flowchart illustrating an example process 280 that the processing circuitry of video encoder 20 may perform in accordance with aspects of this disclosure. Process 280 may begin when the processing circuitry of video encoder 20 forms a most probable mode (MPM) candidate list for a chroma block of video data stored to a memory of video encoder 20, such that the MPM candidate list includes a linear model (LM) mode, one or more derived modes (DMs) associated with a luma block of the video data that is associated with the chroma block, and a plurality of luma prediction modes that can be used for decoding the luma block (282). In some examples, the processing circuitry of video encoder 20 may add the one or more DMs to the MPM candidate list, and may add one or more chroma modes inherited from neighboring chroma blocks of the chroma block at positions of the MPM candidate list that follow the positions of all of the one or more DMs present in the MPM candidate list.
In some examples, the processing circuitry of video encoder 20 may omit any additional instances of the LM mode from the MPM candidate list in response to a determination that the LM mode was used to predict one or more neighboring chroma blocks of the chroma block. In some examples, the processing circuitry of video encoder 20 may signal, in the encoded video bitstream, a one-bit flag indicating whether the chroma block is encoded using the LM mode. In one scenario, the processing circuitry of video encoder 20 may set the one-bit flag to a disabled state based on a determination that the chroma block is not encoded using the LM mode. In this scenario, based on the determination that the chroma block is not encoded using the LM mode and a determination that the chroma block is encoded using a particular mode of the MPM candidate list, the processing circuitry of video encoder 20 may signal, in the encoded video bitstream, an MPM index corresponding to the particular mode of the MPM candidate list. In another scenario, the processing circuitry of video encoder 20 may set the one-bit flag to an enabled state based on a determination that the chroma block is encoded using the LM mode.
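The two encoder scenarios can be sketched as a function that returns the syntax elements to be signaled. The element names `lm_flag` and `mpm_idx`, the default `"LM"` mode label, and the tuple representation are assumptions for illustration; binarization and entropy coding are not modeled.

```python
from typing import Hashable, List, Sequence, Tuple


def signal_chroma_mode(mode: Hashable, mpm_list: Sequence[Hashable],
                       lm_mode: Hashable = "LM") -> List[Tuple[str, int]]:
    """Return (name, value) pairs describing what the encoder would signal."""
    if mode == lm_mode:
        # Block coded with the LM mode: flag set to the enabled state.
        return [("lm_flag", 1)]
    # Block not coded with LM: flag disabled, followed by the MPM index
    # of the particular mode. list.index raises ValueError if absent.
    return [("lm_flag", 0), ("mpm_idx", list(mpm_list).index(mode))]
```

For an MPM candidate list `["LM", 34, 10]`, encoding with mode 10 produces a disabled flag followed by MPM index 2, whereas encoding with LM produces only the enabled flag.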
In some examples, the processing circuitry of video encoder 20 may determine whether a number of default modes associated with the chroma block meets a predetermined threshold. Based on a determination that the number of default modes does not meet the predetermined threshold, the processing circuitry of video encoder 20 may add each default mode of the default modes to the MPM candidate list; based on a determination that the number of default modes meets the predetermined threshold, it may instead omit all of the default modes from the MPM candidate list. The processing circuitry of video encoder 20 may select a mode from the MPM candidate list (284). The processing circuitry of video encoder 20 may then encode the chroma block according to the mode selected from the MPM candidate list.
In some examples, to form the MPM candidate list, the processing circuitry of video encoder 20 may add one or more linear model (LM) modes to the MPM candidate list. In some examples, the processing circuitry of video encoder 20 may signal, in the encoded video bitstream, a one-bit flag indicating whether the chroma block is encoded using any of the one or more LM modes of the MPM candidate list. In some examples, the processing circuitry of video encoder 20 may set the one-bit flag to a disabled state based on a determination that the chroma block is not encoded using any LM mode of the candidate list, and, based on the determination that the chroma block is not encoded using any LM mode of the MPM candidate list and a determination that the chroma block is encoded using a particular mode of the MPM candidate list, may signal, in the encoded video bitstream, an MPM index corresponding to the particular mode of the MPM candidate list. In some examples, the processing circuitry of video encoder 20 may set the one-bit flag to an enabled state based on a determination that the chroma block is encoded using a particular LM mode of the one or more LM modes of the MPM candidate list.
In some examples, the processing circuitry of video encoder 20 may determine whether a number of default modes associated with the chroma block meets a predetermined threshold. The processing circuitry of video encoder 20 may then perform one of: (i) based on a determination that the number of default modes does not meet the predetermined threshold, adding each default mode of the default modes to the MPM candidate list; or (ii) based on a determination that the number of default modes meets the predetermined threshold, omitting all of the default modes from the MPM candidate list.
It is to be recognized that, depending on the example, certain acts or events of any of the techniques described herein can be performed in a different sequence, or may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the techniques). Moreover, in certain examples, acts or events may be performed concurrently rather than sequentially, e.g., through multi-threaded processing, interrupt processing, or multiple processors.
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code, and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, which include any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media that are non-transitory, or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementing the techniques described in this disclosure. A computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit, or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
Various examples have been described. These and other examples are within the scope of the following claims.
Claims (18)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US62/375,383 | 2016-08-15 | ||
| US62/404,572 | 2016-10-05 | ||
| US15/676,345 | 2017-08-14 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK40000797A | 2020-02-14 |
| HK40000797B | 2024-05-17 |