HK1226568B - Decoding method for multi-layered frame-compatible video delivery

Info

Publication number: HK1226568B
Application number: HK16114646.1A
Authority: HK
Inventors: Athanasios Leontaris; Alexandros Tourapis; Peshala V. Pahalawatta; Kevin Stec; Walter Husak
Original assignee: Dolby Laboratories Licensing Corporation
Priority date: 2010-07-21
Filing date: 2016-12-23
Publication date: 2020-04-29

Description

Decoding method for multi-layer frame-compatible video transmission

This application is a divisional of invention patent application No. 201180035724.4, entitled "System and method for multi-layer frame-compatible video transmission", which has a filing date of July 20, 2011 and entered the Chinese national phase on January 21, 2013.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application No. 61/366,512, filed July 21, 2010, which is hereby incorporated by reference in its entirety.

Technical Field

The present disclosure relates to image processing and video compression. More particularly, embodiments of the present invention relate to encoding and decoding systems and methods for multi-layer frame-compatible video transmission.

Background Art

Recently, stereoscopic (3D) video transmission has attracted considerable interest and gained traction in the industry. High-grossing films have made 3D stereoscopic video mainstream, and major sporting events are also being produced and broadcast in 3D. Animated films in particular are increasingly being produced and presented in stereoscopic format.

However, while there is already a sufficiently large installed base of 3D-capable cinema screens, the same is not true for consumer 3D applications. Efforts in this area are still in their early stages, but several industry parties are investing considerable effort in the development and marketing of consumer 3D-capable displays [Reference 1].

Stereoscopic display technology and stereoscopic content creation are issues that must be properly addressed to ensure a sufficiently high-quality experience. The delivery of 3D content is equally critical. Content delivery comprises several components, including compression. Stereoscopic delivery is challenging because a stereoscopic delivery system handles twice as much information as a 2D delivery system. In addition, the computational and memory throughput requirements increase considerably.

In general, there are two main distribution channels through which stereoscopic content can be delivered to consumers: fixed media such as Blu-ray Disc, and streaming solutions in which the content is delivered first to a set-top box and then to a PC.

Most currently deployed Blu-ray players and set-top boxes support only codecs such as those based on the Annex A profiles of the existing ITU-T/ISO/IEC H.264/14496-10 [Reference 2] video coding standard (also known as MPEG-4 Part 10 AVC) and the SMPTE VC-1 standard [Reference 3].

Each of these codec solutions enables a service provider to transmit a single HD (high-definition) image sequence at a resolution of 1920×1080 pixels. Transmitting stereoscopic content, however, involves sending information for two sequences, a left one and a right one. The straightforward approach is to encode two separate bitstreams, one for each view, an approach also known as simulcast.

First, simulcast and similar approaches have low compression efficiency. They also require high bandwidth to maintain an acceptable quality level, because the left-view and right-view sequences are encoded independently even though they are correlated.

Second, the two separate bitstreams are demultiplexed and decoded in parallel in two properly synchronized decoders. Such a decoder can be implemented with two existing decoders that were not specially designed for this purpose. In addition, parallel decoding suits graphics processing unit architectures.

A codec that supports multiple layers can provide high compression efficiency for stereoscopic video while maintaining backward compatibility.

A multi-layer or scalable bitstream consists of multiple layers characterized by predefined dependency relationships. One or more of these layers are so-called base layers, which are decoded before any other layer and are independently decodable.

The other layers are usually known as enhancement layers, since their function is to improve the content obtained by parsing and decoding the base layer or layers. These enhancement layers are also dependent layers, since they depend on the base layers. The enhancement layers use some form of inter-layer prediction, and one or more of the enhancement layers may also depend on the decoding of other, higher-priority enhancement layers. Consequently, decoding may also be terminated at one of the intermediate layers.
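The dependency relationships described above can be illustrated with a minimal sketch. The layer names `BL`, `EL1`, and `EL2` and the dependency table are hypothetical and are not taken from any codec specification; the function only shows how a decoder might work out which layers it needs when it stops at an intermediate layer, assuming the dependencies contain no cycles.

```python
def layers_required(target, depends_on):
    """Return, in decoding order, the layers needed to output `target`.

    `depends_on` maps each layer name to the layers it predicts from.
    Assumes the dependency relationships contain no cycles.
    """
    order = []

    def visit(layer):
        if layer in order:
            return
        for dependency in depends_on.get(layer, []):
            visit(dependency)      # a layer is decoded after everything it depends on
        order.append(layer)

    visit(target)
    return order


# Hypothetical three-layer stream: one base layer and two enhancement layers.
depends_on = {"BL": [], "EL1": ["BL"], "EL2": ["BL", "EL1"]}

base_only = layers_required("BL", depends_on)       # base layer decodes on its own
intermediate = layers_required("EL1", depends_on)   # decoding stops at EL1
full = layers_required("EL2", depends_on)           # all layers are needed
```

In this sketch, stopping at `EL1` never requires `EL2`, which mirrors the statement that decoding may terminate at an intermediate layer.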

Multi-layer or scalable bitstreams enable scalability in terms of quality/signal-to-noise ratio (SNR), spatial resolution, temporal resolution, and/or even the availability of additional views. For example, codecs based on the Annex A profiles of H.264/MPEG-4 Part 10, VC-1, or VP8 can produce temporally scalable bitstreams.

A first base layer, if decoded, may provide a version of the image sequence at 15 frames per second (fps), while a second enhancement layer, if decoded in conjunction with the already decoded base layer, may provide the same image sequence at 30 fps.
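The 15 fps / 30 fps example can be sketched as follows. This is only an illustration under a simplifying assumption: even-indexed frames are assigned to the base layer and odd-indexed frames to the enhancement layer. Real codecs obtain temporal layers through their prediction structure, which this sketch does not model, and the function names are hypothetical.

```python
def split_temporal_layers(frames):
    """Assign even-indexed frames to the base layer, odd-indexed to the enhancement layer."""
    return frames[0::2], frames[1::2]


def merge_temporal_layers(base, enhancement):
    """Interleave base and enhancement frames back into display order."""
    merged = []
    for i, frame in enumerate(base):
        merged.append(frame)
        if i < len(enhancement):
            merged.append(enhancement[i])
    return merged


one_second = [f"frame{i}" for i in range(30)]        # one second of 30 fps video
base, enhancement = split_temporal_layers(one_second)
restored = merge_temporal_layers(base, enhancement)  # base + enhancement together
```

Decoding `base` alone yields 15 frames for that second; decoding both layers restores all 30.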

SNR and spatial scalability are also possible. For example, when the Scalable Video Coding (SVC) extension (Annex G) of the H.264/MPEG-4 Part 10 AVC video coding standard is used, the base layer (coded according to Annex A) produces a coarse-quality version of the image sequence. One or more enhancement layers can provide additional increments in visual quality. Similarly, the base layer can provide a low-resolution version of the image sequence, and the resolution can be increased, spatially and/or temporally, by decoding additional enhancement layers. Scalable or multi-layer bitstreams are also useful for providing multi-view scalability.

Recently, the Stereo High Profile of the Multiview Coding (MVC) extension (Annex H) of H.264/AVC was finalized and adopted as the video codec for the next generation of Blu-ray Disc (Blu-ray 3D) featuring stereoscopic content. This coding approach attempts to address, to some extent, the high bit rate requirements of stereoscopic video streams.

The Stereo High Profile uses a base layer that conforms to the High Profile of Annex A of H.264/AVC and compresses one of the views (usually the left view), termed the base view. An enhancement layer then compresses the other view, termed the dependent view. While the base layer is a valid H.264/AVC bitstream in its own right and is decodable independently of the enhancement layer, the same may not be, and usually is not, true of the enhancement layer. This is because the enhancement layer can use decoded pictures from the base layer as motion-compensated prediction references. As a result, the dependent view (enhancement layer) can benefit from inter-view prediction, and compression can improve considerably for scenes with high inter-view correlation (i.e., low stereo disparity). The MVC extension thus attempts to address the problem of increased bandwidth by exploiting stereo disparity.

However, this approach may not provide compatibility with the existing deployed infrastructure of set-top boxes and Blu-ray players. Although an existing H.264 decoder can decode and display the base view, it simply discards and ignores the dependent (right) view. Existing decoders are therefore unable to decode and display 3D content coded using MVC. Thus, while MVC retains 2D compatibility, it cannot deliver 3D content on legacy devices. This lack of backward compatibility is a further obstacle to the rapid adoption of consumer 3D stereoscopic video.

According to one aspect of the present application, a decoding method for multi-layer frame-compatible video transmission is provided, comprising: a) performing base layer processing on multiple base layer bitstream signals through a base layer, comprising: i) providing at least one frame-compatible base layer decoded image or video frame; and b) performing enhancement layer processing on multiple enhancement bitstream signals through one or more enhancement layers, comprising: ii) providing at least one enhancement layer decoded image or video frame for multiple views; iii) performing reference processing on the at least one frame-compatible base layer decoded image or video frame from the base layer or on at least one decoded image or video frame from a different enhancement layer; and iv) performing disparity compensation, wherein all of the multiple views are decoded and processed in the same enhancement layer.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate one or more embodiments of the present disclosure and, together with the description of example embodiments, serve to explain the principles and implementations of the present disclosure.

FIG. 1 depicts a checkerboard interleaved arrangement for the transmission of stereoscopic material.

FIG. 2 depicts a horizontal sampling/column interleaved arrangement for the transmission of stereoscopic material.

FIG. 3 depicts a vertical sampling/row interleaved arrangement for the transmission of stereoscopic material.

FIG. 4 depicts a horizontal sampling/side-by-side arrangement for the transmission of stereoscopic material.

FIG. 5 depicts a vertical sampling/over-under arrangement for the transmission of stereoscopic material.

FIG. 6 depicts a quincunx sampling/side-by-side arrangement for the transmission of stereoscopic material.

FIG. 7 depicts a frame-compatible full-resolution 3D stereoscopic scalable video encoding system with reference processing for inter-layer prediction.

FIG. 8 depicts a frame-compatible full-resolution 3D stereoscopic scalable video decoding system with reference processing for inter-layer prediction.

FIG. 9 depicts a scalable video coding system with a reference processing unit for inter-layer prediction.

FIG. 10 depicts a reconstruction module for a frame-compatible full-resolution two-layer transmission system.

FIG. 11 depicts a multi-layer resolution-scalable 3D stereoscopic video encoder according to an embodiment of the present disclosure, in which the enhancement layer maintains each of two reference picture buffers at enhanced resolution and performs motion/disparity compensation at some reduced resolution (frame compatible).

FIG. 12 depicts a multi-layer resolution-scalable 3D stereoscopic video decoder according to an embodiment of the present disclosure, in which the enhancement layer maintains each of two reference picture buffers at enhanced resolution and performs motion/disparity compensation at some reduced resolution (frame compatible).

FIG. 13 depicts a multi-layer resolution-scalable 3D stereoscopic video encoder according to an embodiment of the present disclosure, in which the enhancement layer maintains each of two reference picture buffers at enhanced resolution and performs motion/disparity compensation at the enhanced resolution.

FIG. 14 depicts a multi-layer resolution-scalable 3D stereoscopic video decoder according to an embodiment of the present disclosure, in which the enhancement layer maintains each of two reference picture buffers at enhanced resolution and performs motion/disparity compensation at the enhanced resolution.

FIG. 15 depicts a multi-layer resolution-scalable 3D stereoscopic video encoder according to an embodiment of the present disclosure, in which the base layer encodes a frame-compatible version of the data and two enhancement layers encode each of the enhanced-resolution data categories (each view, for 3D stereoscopic video transmission).

FIG. 16 depicts a multi-layer resolution-scalable 3D stereoscopic video decoder according to an embodiment of the present disclosure, in which the base layer encodes a frame-compatible version of the data and two enhancement layers encode each of the enhanced-resolution data categories (each view, for 3D stereoscopic video transmission).

FIG. 17 depicts a multi-layer resolution-scalable 3D stereoscopic video encoder according to an embodiment of the present disclosure, in which the enhancement layer encodes residuals, maintains each of two reference picture buffers at enhanced resolution, and performs motion/disparity compensation at some reduced resolution (frame compatible).

FIG. 18 depicts a multi-layer resolution-scalable 3D stereoscopic video decoder according to an embodiment of the present disclosure, in which the enhancement layer encodes residuals, maintains each of two reference picture buffers at enhanced resolution, and performs motion/disparity compensation at some reduced resolution (frame compatible).

FIG. 19 depicts a multi-layer resolution-scalable video encoder according to an embodiment of the present disclosure, in which the base layer encodes a frame-compatible version of the data and two enhancement layers encode residuals for each of the enhanced-resolution data categories (each view, for 3D stereoscopic video transmission).

FIG. 20 depicts a multi-layer resolution-scalable video decoder according to an embodiment of the present disclosure, in which the base layer encodes a frame-compatible version of the data and two enhancement layers encode residuals for each of the enhanced-resolution data categories (each view, for 3D stereoscopic video transmission).

DETAILED DESCRIPTION

According to a first aspect of the present disclosure, an encoding method for multi-layer frame-compatible video transmission is provided, the encoding method comprising: a) performing base layer processing of images or video frames of multiple data categories through a base layer, comprising: i) providing a base layer frame-compatible representation of the images or video frames of the multiple data categories; and b) performing enhancement layer processing of the images or video frames of the multiple data categories through one or more enhancement layers, comprising: i) providing an enhancement layer frame-compatible representation of the images or video frames of the multiple data categories, ii) maintaining at least one enhancement layer reference picture buffer, iii) performing reference processing for at least one dependency on the base layer or a different enhancement layer, and iv) performing motion or disparity compensation, wherein each of the one or more enhancement layers processes all of the multiple data categories.

According to a second aspect of the present disclosure, an encoding method for multi-layer frame-compatible video transmission is provided, comprising: a) performing base layer processing of images or video frames of multiple data categories through a base layer, comprising: i) providing a base layer frame-compatible representation of the images or video frames of the multiple data categories; and b) performing enhancement layer processing of the images or video frames of the multiple data categories through one or more enhancement layers, wherein each of the multiple data categories is processed separately in a separate enhancement layer, each of the one or more enhancement layers comprising: i) providing an enhancement layer representation of images or video for one of the multiple data categories, ii) maintaining an enhancement layer reference picture buffer in each enhancement layer, iii) performing reference processing for at least one dependency on the base layer or a different enhancement layer, and iv) performing motion or disparity compensation.

According to a third aspect of the present disclosure, a decoding method for multi-layer frame-compatible video transmission is provided, the decoding method comprising: a) performing base layer processing on multiple base layer bitstream signals through a base layer, comprising: i) providing at least one frame-compatible base layer decoded image or video frame; and b) performing enhancement layer processing on multiple enhancement bitstream signals through one or more enhancement layers, comprising: i) providing at least one enhancement layer decoded image or video frame for multiple data categories, ii) maintaining at least one enhancement layer reference picture buffer, iii) performing reference processing for at least one dependency on the base layer or a different enhancement layer, and iv) performing disparity compensation, wherein all of the multiple data categories are decoded and processed in the same enhancement layer.

According to a fourth aspect of the present disclosure, a decoding method for multi-layer frame-compatible video transmission is provided, the decoding method comprising: a) performing, through a base layer, base layer processing on multiple base layer bitstream signals passing through the base layer, comprising: i) providing at least one frame-compatible base layer decoded image or video frame; and b) performing, through one or more enhancement layers and for multiple data categories, enhancement layer processing on multiple enhancement bitstream signals passing through the one or more enhancement layers, wherein each of the multiple data categories is processed separately in a separate enhancement layer, each of the one or more enhancement layers comprising: i) providing at least one enhancement layer decoded image or video frame for one of the multiple data categories, ii) maintaining at least one enhancement layer reference picture buffer, iii) performing reference processing for at least one dependency on the base layer or a different enhancement layer, and iv) performing disparity compensation, wherein all of the multiple data categories are decoded and processed in the same enhancement layer.

Given the lack of backward compatibility of existing codecs, exploiting the installed base of set-top boxes, Blu-ray players, and high-definition televisions can accelerate consumer 3D deployment. Most display manufacturers offer high-definition televisions that support 3D stereoscopic display. These televisions include models of all of the major display technologies: LCD, plasma, and DLP [Reference 1]. The key is to provide the display with content that contains both views yet still fits within the confines of a single frame, while still using existing and deployed codecs such as VC-1 and H.264/AVC. Such an approach is the so-called frame-compatible approach, which formats the stereoscopic content so that it fits within a single picture or frame. The size of the frame-compatible representation need not be the same as that of the original view frames.

Similar to the MVC extension of H.264, Dolby's stereoscopic 3D consumer delivery system [Reference 4] features a base layer and an enhancement layer. In contrast to the MVC approach, the views may be multiplexed into both layers in order to provide consumers with a base layer that is frame compatible because it carries subsampled versions of both views, and an enhancement layer that, when combined with the base layer, yields a full-resolution reconstruction of both views.

A backward-compatible 3D video transmission system can deliver 3D video to homes or other venues through existing or legacy 2D video hardware and systems. Frame-compatible 3D video systems provide such a backward-compatible transmission architecture. In this case, a layered approach can be used in which the base layer provides a low-resolution version of the left-eye and right-eye views arranged in a "frame-compatible" format. Frame-compatible formats include side-by-side, over-under, and quincunx/checkerboard interleaved. FIGS. 1 to 6 show some illustrative examples. In addition, an additional pre-processing stage may be present that predicts the enhancement layer frame from the base layer decoded frame before the prediction is used as a motion-compensated reference for predicting the enhancement layer. FIGS. 7 and 8 show the encoder and decoder, respectively, of the system disclosed in [Reference 4].

Even non-frame-compatible coding arrangements such as that of MVC can be enhanced with a pre-processor (for example, a reference processing unit (RPU)/predictor) that improves the reference taken from the base view before it is used as a reference for prediction of the dependent view. This architecture is also disclosed in [Reference 4] and is shown in FIG. 9.

The frame-compatible techniques of [Reference 4] ensure a frame-compatible base layer. Through the use of the pre-processor/RPU element, these techniques reduce the overhead of achieving full-resolution reconstruction of the stereoscopic views. FIG. 10 shows the process of full-resolution reconstruction.

Depending on the availability of the enhancement layer, there are several ways to obtain the final reconstructed views. Some of these methods may encode actual pixel data in the enhancement layer, or may encode residual data or, more generally, data that differs from the base layer data (for example, high-frequency versus low-frequency data) and that, when combined in some form, enables a higher-quality or higher-resolution representation of the reconstructed signal. Any resolution can be used with these methods: some may operate at half resolution, while others may operate at full resolution, or at a resolution that is lower, higher, or somewhere in between. Embodiments of the present disclosure can be directed to any resolution. The reconstructed views can be interpolated from the frame-compatible output of the base layer (V_FC,BL (1002) in FIG. 10) and optionally post-processed to yield V_0,BL,out (1004) and V_1,BL,out (1006). Alternatively, they can be multiplexed with the appropriate samples of the enhancement layer to yield a higher-representation reconstruction of each view, V_0,FR,out (1008) and V_1,FR,out (1010). The reconstructed views obtained in the two cases may have the same resolution. However, in the second case information is coded for all samples, whereas in the first case half of the information of the reconstructed views is obtained by interpolation using intelligent algorithms, as disclosed in [Reference 4]. It can be observed from FIG. 10 that, after the base layer and enhancement layer have been decoded, additional and potentially memory-intensive and bandwidth-intensive operations are used to obtain the final full-resolution reconstructed views.
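The two reconstruction paths can be contrasted with a small sketch. It assumes a side-by-side frame-compatible layout in which the base layer carries the even columns of each view and the enhancement layer carries the odd columns; that layout, the function names, and the sample values are illustrative assumptions. The neighbour-averaging interpolator is a simple stand-in and is not the interpolation algorithm of [Reference 4].

```python
def unpack_side_by_side(frame):
    """Demultiplex a side-by-side frame into its two half-width views."""
    half = len(frame[0]) // 2
    return [row[:half] for row in frame], [row[half:] for row in frame]


def upsample_by_interpolation(half_view):
    """Base-layer-only path: estimate each missing column from its neighbours."""
    full_view = []
    for row in half_view:
        full_row = []
        for i, sample in enumerate(row):
            following = row[i + 1] if i + 1 < len(row) else sample
            full_row += [sample, (sample + following) / 2]  # decoded, then estimated
        full_view.append(full_row)
    return full_view


def interleave_columns(even_half, odd_half):
    """Two-layer path: interleave decoded base (even) and enhancement (odd) columns."""
    return [[s for pair in zip(e, o) for s in pair]
            for e, o in zip(even_half, odd_half)]


# One-row frames; view 0 is originally [10, 20, 30, 40], view 1 is [50, 60, 70, 80].
base_frame = [[10, 30, 50, 70]]   # even columns of view 0 | even columns of view 1
enh_frame = [[20, 40, 60, 80]]    # odd columns of view 0  | odd columns of view 1

v0_bl, v1_bl = unpack_side_by_side(base_frame)
v0_el, v1_el = unpack_side_by_side(enh_frame)

v0_bl_out = upsample_by_interpolation(v0_bl)   # half of the samples are estimated
v0_fr_out = interleave_columns(v0_bl, v0_el)   # every sample was coded
```

Both outputs have the same width, but only the two-layer path recovers the original samples exactly; the interpolated path has to guess the last column.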

本公开提供了使得帧兼容3D视频系统能够实现全分辨率3D传输的技术。本公开还提供了用于通过在一些较高表示/分辨率的样本域中执行运动和立体视差补偿来改善增强层中的内部预测精度的方法。这些域可以与以帧兼容表示的样本相比具有更高的空间或频率分辨率。在一些实施例中，这些域可以具有等于全分辨率的分辨率，即，在每个类别的帧被滤波、采样和复用到帧兼容表示之前的这些帧的原始分辨率。用于对通过这些布置而压缩的数据进行处理的额外的方法在[参考文献5]中可以找到。贯穿整个说明书，术语“数据类别”或“类别”指代数据组。不同的数据类别可以指代可能具有或不具有组间关系的不同的数据组。对于涉及3D或立体图像或视频传输的本公开的实施例，术语“数据类别”或“类别”指代 3D图像或视频的单个视点。The present disclosure provides techniques that enable frame-compatible 3D video systems to achieve full-resolution 3D transmission. The present disclosure also provides methods for improving the accuracy of internal predictions in enhancement layers by performing motion and stereo disparity compensation in some higher representation/resolution sample domains. These domains can have a higher spatial or frequency resolution than the samples represented in a frame-compatible representation. In some embodiments, these domains can have a resolution equal to the full resolution, that is, the original resolution of the frames of each category before they are filtered, sampled, and multiplexed into a frame-compatible representation. Additional methods for processing data compressed by these arrangements can be found in [Reference 5]. Throughout this specification, the term "data category" or "category" refers to a data group. Different data categories can refer to different data groups that may or may not have a relationship between the groups. For embodiments of the present disclosure related to 3D or stereoscopic image or video transmission, the term "data category" or "category" refers to a single viewpoint of a 3D image or video.

图11示出了根据本公开实施例的多层分辨率可伸缩3D立体视频编码器，其中，增强层以增强分辨率维持两个参考图片缓冲区中的每一个，并且以某个降低的分辨率(例如，二分之一水平或竖直分辨率)执行运动 /视差补偿。图12示出了根据本公开实施例的与图11中所示的编码器相对应的解码器。根据该实施方式，提供了用于视频序列的压缩的多层编解码器，该视频序列由属于给定时间实例的多个数据类别的帧组成。FIG11 illustrates a multi-layer resolution scalable 3D stereoscopic video encoder according to an embodiment of the present disclosure, wherein the enhancement layer maintains each of the two reference picture buffers at an enhanced resolution and performs motion/disparity compensation at a reduced resolution (e.g., half the horizontal or vertical resolution). FIG12 illustrates a decoder corresponding to the encoder shown in FIG11 according to an embodiment of the present disclosure. According to this embodiment, a multi-layer codec is provided for compressing a video sequence consisting of frames belonging to multiple data categories at a given time instance.

根据本公开的实施例，图11的基本层(1102)提供了多个数据类别的帧兼容表示。这里，帧兼容表示指的是将不同的数据类别采样和复用为单个帧。这种单个帧可以与包括原始类别的帧具有不同的大小。根据本公开内容的另外的实施例，图11的基本层(1102)可以使用任意可用的或未来的视频编解码器(如H.264/AVC、VP8和VC-1)来实现和编码。According to an embodiment of the present disclosure, the base layer (1102) of Figure 11 provides a frame-compatible representation of multiple data categories. Here, frame-compatible representation refers to sampling and multiplexing different data categories into a single frame. This single frame can have a different size than the frame comprising the original category. According to another embodiment of the present disclosure, the base layer (1102) of Figure 11 can be implemented and encoded using any available or future video codec (such as H.264/AVC, VP8, and VC-1).

With continued reference to FIG. 11, before the data are sent to the base layer encoder (1104), they are sampled by a sampler (1106) and multiplexed by a multiplexer (1108). In a further embodiment, sampling can also include filtering. Furthermore, filtering can be asymmetric among the different data categories. For example, in a still further embodiment, one category can be filtered and sampled such that less than half of the information (e.g., frequency content) is retained, while another category can be filtered and sampled such that more than half of the information is retained. FIGS. 1 through 6 show illustrative sampling and multiplexing arrangements for two categories of image data.
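As a minimal illustration of the sampling and multiplexing step described above, the following Python sketch decimates the columns of two views and packs them side by side. It assumes plain column decimation with no filtering and a side-by-side arrangement; the names `sample_columns` and `mux_side_by_side` and the tiny 2×4 frames are hypothetical and do not come from the disclosure.

```python
from typing import List

Frame = List[List[int]]  # a frame as rows of sample values


def sample_columns(frame: Frame, phase: int) -> Frame:
    """Keep every second column, starting at column `phase` (0 or 1)."""
    return [row[phase::2] for row in frame]


def mux_side_by_side(left: Frame, right: Frame) -> Frame:
    """Pack two sampled views next to each other in a single frame."""
    return [l_row + r_row for l_row, r_row in zip(left, right)]


# Hypothetical 2x4 views (assumption: real frames would be filtered first).
left_view = [[10, 11, 12, 13], [14, 15, 16, 17]]
right_view = [[20, 21, 22, 23], [24, 25, 26, 27]]

base_layer_frame = mux_side_by_side(
    sample_columns(left_view, 0),   # even columns of the left view
    sample_columns(right_view, 1),  # odd columns of the right view
)
print(base_layer_frame)
```

The resulting frame has the same size as one original view, which is what allows it to pass through a conventional single-view codec.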

According to the embodiment shown in FIG. 11, an additional enhancement layer (1152) is provided. According to other embodiments, the number of additional enhancement layers depends on the number of categories of the frame data and on the data already sampled and interleaved within the base layer. The data sampled and interleaved in an enhancement layer is selected such that, when combined with the data already present in the base layer, it yields an efficient representation and reconstruction of data of most categories. According to the embodiment shown in FIG. 11, which involves two categories of data, one enhancement layer (1152) is used to encode all of the original data. According to this embodiment, the base layer (1102) can carry half of the samples of each category, and the enhancement layer (1152) can provide the other, missing half of the samples of each data category.

According to a further embodiment of the present disclosure, the base layer compresses one third of the samples of one category, and the remaining two thirds are stored in the enhancement layer. The opposite is also possible. Similarly, as with the base layer, the data content of each category in the enhancement layer may differ from that of another data category. This can be achieved by using different types of filtering or a different number and arrangement of samples (e.g., quincunx versus row-based subsampling). According to this embodiment, the sampling operation derives the samples for enhancement layer processing, and the sampling operation can include filtering of these samples.

According to the embodiment shown in FIG. 11, the enhancement layer (1152) adopts the hybrid video coding model found in modern video codecs such as VC-1 and H.264/AVC. The input data are predicted either from neighboring samples in the same picture or frame (intra prediction) or from samples of past decoded frames that belong to the same layer and are buffered in a so-called reference picture buffer as motion-compensated prediction references (inter prediction). Inter-layer prediction is also possible if decoded information from lower-priority layers (such as the base layer) is available to the enhancement layer. One way to access such information is to use decoded pictures from the lower-priority layer as references for motion compensation. After prediction, the prediction residual (1154) is transformed (1156) and quantized (1158), and the quantized coefficients are then encoded using entropy coding (1160). The enhancement layer (1252) of the decoder shown in FIG. 12 reverses this process.

Unlike the base layer (1102) of FIG. 11, which has a single reference picture buffer (1110) containing past decoded pictures/frames, the enhancement layer (1152) maintains multiple internal reference picture buffers (1162), one for each data category. In the embodiment of FIG. 11, the reference pictures stored in these buffers are generated by a demultiplexer and RPU processor (1164), which processes the sum of the prediction residual and the predicted frame (obtained through intra or inter prediction).

The demultiplexer (1164) of FIG. 11 also performs upsampling and interpolation of the missing samples of each category. Each reference picture buffer (1162) contains only frames that belong to the same data category. The buffers (1162) store images or frames at a resolution higher than that of the samples input to the enhancement layer (1152), or optionally at the enhanced resolution. In addition, the resolutions used to store frames in each reference picture buffer may differ from one another: one buffer may store pictures at one resolution, while a second picture buffer may store pictures at another resolution. Before disparity compensation (i.e., motion compensation or intra prediction) (1168) is performed, the references selected from each picture buffer (1162) are downsampled by a sampler (1170) and multiplexed by a multiplexer (1172) to generate a single reference picture that is formatted in a frame-compatible arrangement. According to a further embodiment, the downsampling and multiplexing into the frame-compatible format can include more sophisticated operations, such as linear or nonlinear combinations of the two references into the final frame-compatible reference picture. According to a still further embodiment, the resolution of the frames in the internal buffers can match the enhanced resolution.
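The demultiplexing and upsampling that fills the per-category reference buffers can be sketched as follows. This is only an illustration under stated assumptions: a side-by-side arrangement, the left view sampled at even columns and the right view at odd columns, and a simple neighbor-averaging interpolator standing in for whatever filter an actual RPU would use. The function names and the `buffers` dictionary are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Frame = List[List[int]]


def demux_side_by_side(frame: Frame) -> Tuple[Frame, Frame]:
    """Split a side-by-side frame into its left and right halves."""
    half = len(frame[0]) // 2
    return [row[:half] for row in frame], [row[half:] for row in frame]


def upsample_row(samples: List[int], phase: int) -> List[int]:
    """Rebuild a full-width row from samples located at columns
    phase, phase + 2, ... Missing columns are the average of their
    horizontal neighbors (assumed filter, edge samples are repeated)."""
    width = 2 * len(samples)
    full: List[Optional[int]] = [None] * width
    full[phase::2] = samples
    for x in range(width):
        if full[x] is None:
            left = full[x - 1] if x > 0 else full[x + 1]
            right = full[x + 1] if x + 1 < width else full[x - 1]
            full[x] = (left + right) // 2
    return full


def upsample(frame: Frame, phase: int) -> Frame:
    return [upsample_row(row, phase) for row in frame]


# One reference picture buffer per data category (here: per view).
buffers: Dict[str, List[Frame]] = {"left": [], "right": []}

reconstructed_fc = [[10, 30, 21, 41]]  # prediction + residual, frame-compatible
left_half, right_half = demux_side_by_side(reconstructed_fc)
buffers["left"].append(upsample(left_half, phase=0))
buffers["right"].append(upsample(right_half, phase=1))
print(buffers)
```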

According to the embodiment shown in FIG. 11, inter prediction within the enhancement layer (1152) takes place after the reference pictures provided by the multiple internal (optionally enhanced-resolution) reference picture buffers have been sampled (1170) and multiplexed (1172). Inter prediction thus takes place in a "frame-compatible" domain, though not necessarily in the same domain as that of the base layer. According to a further embodiment for stereoscopic video, the base layer frame-compatible format can comprise the even columns of the left view and the odd columns of the right view, while at the enhancement layer the frame-compatible format can comprise the odd columns of the left view and the even columns of the right view. Similar arrangements are possible for other interleaving arrangements (such as top-and-bottom, side-by-side, etc.); the arrangement is selected such that combining the samples encoded in the frame-compatible base layer pictures with the samples encoded in one or more enhancement layers yields enhanced-resolution reconstructions of the data categories. According to another embodiment of the present disclosure, such techniques can be extended to any number of layers or views. In addition, the inter prediction process includes estimating a set of motion parameters for each enhancement layer, which are encoded and transmitted to the decoder.

According to the embodiment shown in FIG. 11, the enhancement layer reference picture buffers (1162) contain pictures that are not limited to demultiplexed and upsampled (1164) decoded pictures of the enhancement layer (1152). A base-layer-to-enhancement-layer reference processing unit (RPU)/pre-processor module (BL to EL RPU) (1166) takes as input frame-compatible decoded pictures from the reference picture buffer (1110) of the base layer (1102), and then demultiplexes and upsamples the frame data to estimate the higher-representation (optionally enhanced-resolution) frames belonging to the different data categories.

According to a further embodiment, the BL to EL RPU (1166) processing can comprise filtering, upscaling, interpolation of missing samples, and recovery or estimation of frequency content. Recovery or estimation of frequency content is used when, for example, the base layer encodes low frequencies and the enhancement layer encodes high frequencies. The images processed by the BL to EL RPU (1166) are then placed in the higher-representation (optionally enhanced-resolution) reference picture buffers (1162) of the enhancement layer (1152) and used as additional motion-compensated prediction references. The BL to EL RPU module (1166) at the encoder generates information on the prediction/upsampling process and communicates this information (the "RPU bitstream") (1174) to the identical BL to EL RPU (1254) located at the decoder shown in FIG. 12. In this way, the encoder prediction operation can be duplicated at the decoder. Interpolation and prediction using this RPU module can comprise the techniques disclosed in [Reference 6].

According to a further embodiment of the present disclosure, the internal buffers store frames at the enhanced resolution, and the decoder internally reconstructs the enhanced-resolution frames of each data category and stores them in the reference picture buffers. Consequently, this embodiment does not use the processing module of FIG. 10 to display the enhanced-resolution data. Instead, the enhanced-resolution reconstructed frames can be extracted and displayed directly from the reference picture buffers of the enhancement layer. According to another embodiment of the present disclosure, the enhanced resolutions of the categories are not equal. In this embodiment, the encoder and the decoder rescale the pictures in the buffers of the enhancement layer to some common enhanced resolution.

In a further embodiment, the frames encoded in the base layer can have the same size as the frames belonging to each category. In a further embodiment, the reference picture buffers in the enhancement layer contain frames at the resolution of the original frames (full resolution).

According to an embodiment of the present disclosure, the frame-compatible domain in which motion compensation is applied can be identical to that of the base layer. In a further embodiment, the base layer can be interleaved in a side-by-side format, and the enhancement layer can also encode frames interleaved in the same side-by-side format.

According to an embodiment of the present disclosure, the base layer frame-compatible format can comprise the even columns of a first view and the odd columns of a second view, while at the enhancement layer the frame-compatible format can comprise the odd columns of the first view and the even columns of the second view.
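The complementary column arrangement above means that the two layers together hold every column of both views. The following Python sketch shows the recombination under simplifying assumptions: side-by-side packing, no filtering, and lossless layers. The helper names and sample values are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]


def split_side_by_side(frame: Frame) -> Tuple[Frame, Frame]:
    """Return the left and right halves of a side-by-side frame."""
    half = len(frame[0]) // 2
    return [row[:half] for row in frame], [row[half:] for row in frame]


def interleave_columns(even_cols: Frame, odd_cols: Frame) -> Frame:
    """Weave even-position and odd-position columns into full-width rows."""
    out = []
    for e_row, o_row in zip(even_cols, odd_cols):
        row: List[int] = []
        for e, o in zip(e_row, o_row):
            row += [e, o]
        out.append(row)
    return out


# Base layer: even columns of view 1 | odd columns of view 2.
base_layer = [[10, 12, 21, 23]]
# Enhancement layer: odd columns of view 1 | even columns of view 2.
enh_layer = [[11, 13, 20, 22]]

bl_v1_even, bl_v2_odd = split_side_by_side(base_layer)
el_v1_odd, el_v2_even = split_side_by_side(enh_layer)

view1 = interleave_columns(bl_v1_even, el_v1_odd)
view2 = interleave_columns(el_v2_even, bl_v2_odd)
print(view1, view2)
```

With lossy coding and pre-filtering, the recombination would be approximate rather than exact, and additional processing may be applied.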

According to an embodiment of the present disclosure, the layers can encode frames at different spatial resolutions, in which case a system with spatial scalability can be built. According to a further embodiment, the codec system has a side-by-side frame-compatible base layer at 1280×720 and an enhancement layer that can reconstruct both views at 1920×1080. In this embodiment, the BL to EL RPU first demultiplexes the frame-compatible data into the separate categories and can then perform one of the following operations. In one scheme, the BL to EL RPU first interpolates the missing samples of each category and then rescales the resulting frame (1280×720) to the intended spatial resolution (1920×1080) before storing it in the corresponding reference picture buffer of the enhancement layer. In a second scheme, the available demultiplexed samples are rescaled from the lower resolution to a higher resolution, for example, from 640×720 to 960×1080. An additional interpolation operation then determines the missing columns and optionally also filters the existing samples to derive the full-resolution frame.
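The resolution bookkeeping of the two schemes can be checked with a short Python calculation. It only tracks frame dimensions and scale factors for the 1280×720 to 1920×1080 example; the helper names are hypothetical and no actual resampling is performed.

```python
from fractions import Fraction
from typing import Tuple

Size = Tuple[int, int]  # (width, height)


def per_view_size(frame: Size, views: int = 2) -> Size:
    """Size of one view's samples inside a side-by-side frame."""
    return frame[0] // views, frame[1]


def scale_factor(src: Size, dst: Size) -> Tuple[Fraction, Fraction]:
    """Horizontal and vertical rescaling ratios from src to dst."""
    return Fraction(dst[0], src[0]), Fraction(dst[1], src[1])


base_layer: Size = (1280, 720)
target: Size = (1920, 1080)

demuxed = per_view_size(base_layer)                # samples of one view

# Scheme 1: interpolate missing columns first, then rescale.
scheme1_interpolated = (demuxed[0] * 2, demuxed[1])
scheme1_scale = scale_factor(scheme1_interpolated, target)

# Scheme 2: rescale the available samples first, then interpolate columns.
scheme2_rescaled = (target[0] // 2, target[1])
scheme2_scale = scale_factor(demuxed, scheme2_rescaled)

print(demuxed, scheme1_interpolated, scheme2_rescaled, scheme1_scale, scheme2_scale)
```

Both orderings apply the same 3/2 ratio in each dimension; they differ only in whether the column interpolation happens before or after the rescaling.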

According to an embodiment of the present disclosure, the multiple reference picture buffers in the enhancement layer can also be controlled through memory management control operations (MMCO), such as those disclosed in [Reference 2]. MMCO operations control how reference pictures are added to and removed from the buffers. According to a further embodiment, MMCO operations are transmitted for the enhancement layer. In this embodiment, either the sets of MMCOs for each reference picture buffer are the same, or a single set of MMCO operations is signaled and applied to both reference picture buffers. As a result, the operation of the picture buffers remains synchronized. A still further embodiment can use similar approaches for reference picture list modification/reordering signaling, including but not limited to the methods disclosed in [Reference 2].
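The effect of applying a single signaled set of operations to both buffers can be modeled with a toy Python example. This is a deliberately simplified model: the `"add"` and `"remove"` operations and the `ReferenceBuffer` class are hypothetical stand-ins and do not reproduce the MMCO syntax of any standard.

```python
from collections import deque
from typing import Iterable, List, Tuple


class ReferenceBuffer:
    """Toy reference picture buffer holding frame numbers only."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.frames: deque = deque()

    def add(self, frame_num: int) -> None:
        if len(self.frames) == self.capacity:
            self.frames.popleft()  # sliding window: drop the oldest reference
        self.frames.append(frame_num)

    def remove(self, frame_num: int) -> None:
        if frame_num in self.frames:
            self.frames.remove(frame_num)


def apply_ops(buffers: List[ReferenceBuffer],
              ops: Iterable[Tuple[str, int]]) -> None:
    """Apply one signaled operation set identically to every buffer."""
    for op, frame_num in ops:
        for buf in buffers:
            if op == "add":
                buf.add(frame_num)
            elif op == "remove":
                buf.remove(frame_num)
            else:
                raise ValueError(f"unknown operation: {op}")


left_buf, right_buf = ReferenceBuffer(3), ReferenceBuffer(3)
signaled = [("add", 0), ("add", 1), ("add", 2), ("remove", 1), ("add", 3)]
apply_ops([left_buf, right_buf], signaled)
print(list(left_buf.frames), list(right_buf.frames))
```

Because every operation reaches both buffers, they always hold references for the same time instances, which is the synchronization property described above.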

The signaling information controls the generation of the lists of reference pictures in the buffers. These lists are then used in motion-compensated prediction. According to a further embodiment, the modification information is identical for the reference picture lists of each data category. In a still further embodiment, a single set of modification information is transmitted and applied to all of the lists in the enhancement layer. According to further embodiments of the present disclosure, similar approaches are used in codecs that utilize signaling to control the content of the reference picture buffers and signaling to control the initialization and modification of their reference picture lists.

According to an embodiment of the present disclosure, the demultiplexer and upsampler RPU, which derives the single-category frames at enhanced resolution before they are stored in the reference picture buffers, can be implemented as a reference processing unit as disclosed in [Reference 6].

According to a further embodiment of the present disclosure, the base layer can encode a representation with a first range of frequency content, while the additional enhancement layers can provide a second range of frequency content. Their outputs can be combined at the decoder to provide a better representation of the original data categories.

FIG. 13 shows a multi-layer resolution-scalable 3D stereoscopic video encoder according to an embodiment of the present disclosure, in which the enhancement layer maintains two reference picture buffers, each at enhanced resolution, and performs motion/disparity compensation at the enhanced resolution. FIG. 14 shows a decoder corresponding to the encoder of FIG. 13, according to an embodiment of the present disclosure.

The embodiment shown in FIGS. 13 and 14 is similar to the embodiment shown in FIGS. 11 and 12. Unlike that embodiment, however, in the embodiment shown in FIG. 13 the sampler (1354) and the multiplexer (1356) are placed after the disparity compensation modules (1358). There are in fact as many disparity compensation modules (1358) as there are data categories. According to the embodiment shown in FIG. 13 for stereoscopic video transmission, there are two disparity compensation modules (1358), one for each view. After disparity compensation is performed on each higher-representation reference, the resulting frames are passed to the sampler (1354), which performs a downsampling process that can be preceded or followed by filtering (in the case of two views, it retains half of the samples). The downsampled data from each data category are then fed to the multiplexer (1356), which generates a frame-compatible picture (1360). This frame-compatible picture (1360) is then used as the prediction to which the prediction residual (1362) is added within the enhancement layer hybrid video coding loop.

According to the embodiment shown in FIG. 13, the disparity compensation modules (1358) have more samples available (the higher-representation pictures) and can therefore produce better predictors. In addition, the disparity compensation modules can use spatially more accurate partitions in motion compensation. For example, when the side-by-side format is used, a 4×4 partition in a frame-compatible reference picture corresponds to an 8×4 partition in the full-resolution picture. Similarly, a 16×16 partition is in effect a 32×16 partition in the full-resolution picture. Therefore, in this embodiment, the disparity compensation modules (1358) can have larger and more accurate partitions.
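The partition correspondence follows directly from the 2:1 decimation of the packed dimension. A small Python helper, given here only as an illustration, reproduces the side-by-side figures from the text; the top-and-bottom case and the function name are assumptions added for symmetry.

```python
from typing import Tuple


def full_res_partition(width: int, height: int,
                       arrangement: str) -> Tuple[int, int]:
    """Map a partition size in a frame-compatible picture to the area it
    covers in the full-resolution picture of one view (2:1 decimation)."""
    if arrangement == "side_by_side":      # columns were halved
        return 2 * width, height
    if arrangement == "top_and_bottom":    # rows were halved (assumed case)
        return width, 2 * height
    raise ValueError(f"unsupported arrangement: {arrangement}")


print(full_res_partition(4, 4, "side_by_side"))
print(full_res_partition(16, 16, "side_by_side"))
```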

According to the embodiment shown in FIG. 13, disparity estimation and compensation can be performed multiple times (e.g., twice) at the enhancement layer, which increases system complexity. In addition, the benefit gained from the increased spatial accuracy of motion compensation depends on how the higher-representation reference pictures were upsampled from the frame-compatible picture obtained by adding the prediction residual to the frame-compatible prediction picture V_FC,PRED (1362) of FIG. 13. Furthermore, according to this embodiment, the enhancement layer compresses twice the amount of motion vector information. According to a further embodiment, the reference picture buffers adopt the enhanced resolution, and the final reconstruction of each data category is performed as part of generating the references stored in the reference picture buffers of the enhancement layer. Consequently, unlike the decoder shown in the figure, this embodiment does not further process the outputs of the base layer and the enhancement layer.

According to a further embodiment of the present disclosure, the multi-layer codec can take spatial scalability into account, similarly to the additional embodiments of the first method. As with the embodiment shown in FIGS. 11 and 12, a still further embodiment provides reference picture list modification and MMCO operations signaled to the decoder.

According to a further embodiment of the present disclosure, because the motion parameters used in the multiple disparity/motion estimation and compensation modules are sufficiently correlated, these motion parameters are selected so that the parameters of one module can be efficiently predicted from the parameters of another module. In a still further embodiment, the motion parameters are selected to be identical, and only one set of parameters is sent for each enhancement layer. In another embodiment, a set of parameters is signaled for each module. Motion parameter prediction can also use information from neighboring or collocated parameters signaled by higher-priority disparity estimation/compensation modules.
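One plausible way to exploit this correlation is to send the motion vectors of the higher-priority module in full and only the differences for the other module. The Python sketch below illustrates that idea with collocated prediction; it is an assumption about how such prediction could work, not the method of the disclosure, and all names and values are hypothetical.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]  # (dx, dy) per partition


def encode_deltas(primary: List[MotionVector],
                  secondary: List[MotionVector]) -> List[MotionVector]:
    """Predict each secondary vector from the collocated primary vector
    and keep only the prediction error."""
    return [(sx - px, sy - py)
            for (px, py), (sx, sy) in zip(primary, secondary)]


def decode_deltas(primary: List[MotionVector],
                  deltas: List[MotionVector]) -> List[MotionVector]:
    """Rebuild the secondary vectors from the primary ones and the deltas."""
    return [(px + dx, py + dy)
            for (px, py), (dx, dy) in zip(primary, deltas)]


mv_view0 = [(4, -2), (3, 0), (-1, 5)]   # higher-priority module
mv_view1 = [(4, -2), (3, 1), (-2, 5)]   # strongly correlated vectors

deltas = encode_deltas(mv_view0, mv_view1)
recovered = decode_deltas(mv_view0, deltas)
print(deltas, recovered)
```

When the two sets are identical, every delta is zero, which corresponds to the embodiment that sends only one set of parameters per enhancement layer.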

According to a further embodiment of the present disclosure, the frames encoded in the base layer can have the same size as the frames belonging to each category. According to another embodiment, the reference picture buffers in the enhancement layer contain frames at the resolution of the original frames (full resolution). According to a still further embodiment, the demultiplexer and upsampler that derive the single-category frames at enhanced resolution before they are stored in the reference picture buffers can be implemented as a reference processing unit as disclosed in [Reference 6]. In a still further embodiment, the base layer can encode a representation with a first range of frequency content, while the additional enhancement layers can provide a second range of frequency content. Their outputs can be combined at the decoder to provide a better representation of the original data categories.

FIG. 15 shows a multi-layer resolution-scalable video encoder according to an embodiment of the present disclosure, in which the base layer (1502) encodes a frame-compatible version of the data and two enhancement layers (1532, 1562) each encode one of the two enhanced-resolution data categories (one view each in the case of 3D stereoscopic video transmission). FIG. 16 shows the corresponding decoder according to an embodiment of the present disclosure.

According to the embodiment shown in FIG. 15, the structure of the base layer (1502) is the same as in the embodiments shown in FIGS. 11 through 14. The base layer (1502) can encode a frame-compatible version of the multiple data categories. In this embodiment, an enhancement layer (1532, 1562) is provided for each data category. According to a further embodiment for stereoscopic video transmission, each enhancement layer (1532, 1562) provides an enhanced-resolution reconstruction of one data category. According to this embodiment, each enhancement layer (1532, 1562) contains a single reference picture buffer (1534, 1564) and uses a structure very similar to that of the base layer (1502). In this embodiment, the enhancement layers (1532, 1562) directly receive the enhanced-resolution (e.g., full-resolution) frames of each category. By contrast, according to the embodiments shown in FIGS. 11 through 14, the input to the enhancement layer consists of a frame-compatible representation of all data categories.

According to the embodiment shown in FIG. 15, the reference picture buffer (1534, 1564) of each layer (1532, 1562) stores references that can be used for motion-compensated prediction (1536, 1566). These references include past decoded frames of the same layer. According to a further embodiment, additional references in the enhancement layers (1532, 1562) can be inserted from the base layer (1502), as is done in the MVC extension of H.264. In this embodiment, before being inserted, these references are processed by the RPU/pre-processor (1538, 1568) to derive processed references that correspond to the frames stored in the target reference picture buffer. According to a further embodiment for stereoscopic video transmission, the base layer frame-compatible picture is demultiplexed into samples belonging to the different categories. The samples are then upsampled within the RPU/pre-processor (1538, 1568) to the enhanced (e.g., full) resolution before being stored in the reference picture buffer (1534, 1564) of each enhancement layer (1532, 1562). According to this embodiment, the prediction, interpolation, and upsampling processes within the RPU/pre-processor (1538, 1568) can adopt the techniques disclosed in [Reference 6].

According to the embodiment shown in FIG. 15, separate RPUs (1538, 1568) can be implemented to produce the reference stored in each of the enhancement layers' reference picture buffers (1534, 1564). According to another embodiment, a single module can be provided to jointly optimize and perform the demultiplexing and upsampling of the base layer decoded frame-compatible picture into multiple full reference pictures, one for each enhancement layer.

According to another embodiment of the present disclosure, additional dependencies are provided for the enhancement layers apart from the dependency on the base layer. In this embodiment, an enhancement layer can depend on the parsing and decoding of another enhancement layer. With continued reference to FIG. 15, the decoding process of enhancement layer 1 (1562) can depend on enhancement layer 0 (1532) in addition to the base layer (1502). Pictures (1572) stored for display in the reference picture buffer (1534) of enhancement layer 0 (1532) are fed into an additional RPU/pre-processor module (1570). The additional RPU/pre-processor (1570) processes the fed reference inputs (1572) so that they resemble the format of enhancement layer 1 (1562). The processed results (1574) are then stored in the reference picture buffer (1564) of enhancement layer 1 (1562) and are available for motion-compensated prediction (1566). According to a further embodiment for stereoscopic video transmission, each enhancement layer encodes one of the views, and the RPU processes one view using motion and spatial processing in order to produce a reference picture that is closer to the other view. According to a still further embodiment, motion processing can include higher-order motion models, such as the affine and perspective motion models.

According to a further embodiment, as with the embodiments shown in FIGS. 11 through 14, the multi-layer codec can take spatial scalability into account. In this embodiment, the pre-processor modules that predict an enhancement layer from the base layer (e.g., 1538 and 1568 of FIG. 15) can also include rescaling to the target layer resolution. If the enhancement layers do not have the same spatial resolution, the pre-processor module that predicts one enhancement layer from a second enhancement layer (e.g., 1570 of FIG. 15) can also include rescaling.

FIG. 17 shows a multi-layer resolution-scalable 3D stereoscopic video encoder according to an embodiment of the present disclosure, in which the enhancement layer encodes residuals, maintains two reference picture buffers, each at enhanced resolution, and performs motion/disparity compensation at some reduced (frame-compatible) resolution. FIG. 18 shows the corresponding decoder according to an embodiment of the present disclosure.

According to the embodiment shown in FIG. 17, the base layer (1702) encodes a frame-compatible signal, which can be further improved (in terms of resolution or spatial frequency content, among others) when one or more enhancement layers (1752) are decoded and combined with the output of the base layer (1702). According to this embodiment, the enhancement layer (1752) encodes a filtered, sampled, and multiplexed residual (1754) [Reference 7], which is the result of subtracting a prediction (1756) of the original full-resolution data category frames from those frames. This prediction (1756) is produced by an RPU processor (1758), which takes as input decoded pictures (1760) from the frame-compatible base layer (1702) and outputs predictions (1756) of the original frame categories at the original (full) resolution. In a further embodiment, the RPU (1758) can use techniques such as those disclosed in [Reference 6], including filtering, interpolation, rescaling, etc. According to the embodiment shown in FIG. 17, the internal picture buffers (1762) of the enhancement layer (1752) do not receive processed references from the base layer buffer (1704) through an RPU.

在图18中所示的解码器处，类似的RPU(1854)将来自基本层(1802)的图片缓冲区(1804)的解码基本层图片作为输入，将其处理为原始的(全)分辨率以得到每个类别的全分辨率帧(1856)，然后将这些帧(1856)添加到已经在增强层参考图片缓冲区(1860)中解码的帧(1858)以产生每个数据类别的最终的重建帧(1862)。At the decoder shown in FIG. 18, a similar RPU (1854) takes as input the decoded base layer pictures from the picture buffer (1804) of the base layer (1802), processes them to the original (full) resolution to derive full-resolution frames (1856) for each category, and then adds these frames (1856) to the frames (1858) already decoded in the enhancement layer reference picture buffer (1860) to produce the final reconstructed frames (1862) for each data category.

根据图11和图12中所示实施例的与图11和图12中所示实施例以及图17和图18中所示实施例之间的差异不相冲突的所有的另外的实施例也适用于根据图17和图18中所示实施例的另外的实施例。根据另外的实施例，基本层的分辨率、增强层的分辨率以及增强层的内部参考图片缓冲区可以不同。All further embodiments of the embodiment shown in FIGS. 11 and 12 that do not conflict with the differences between that embodiment and the embodiment shown in FIGS. 17 and 18 also apply as further embodiments of the embodiment shown in FIGS. 17 and 18. According to further embodiments, the resolution of the base layer, the resolution of the enhancement layer, and the resolution of the enhancement layer's internal reference picture buffers may differ.

在根据图17和图18中所示实施例的另外的实施例中，与图13和图 14中所示的实施例类似，增强层提供了多个视差补偿模块，每个数据类别的每个参考图片缓冲区一个视差补偿模块。在此，根据图13和图14 中所示的实施例的所有适用的另外的实施例同样适用。In another embodiment according to the embodiment shown in Figures 17 and 18 , similar to the embodiment shown in Figures 13 and 14 , the enhancement layer provides multiple disparity compensation modules, one for each reference picture buffer of each data category. All applicable alternative embodiments according to the embodiment shown in Figures 13 and 14 also apply.

图19示出了根据本公开实施例的多层分辨率可伸缩视频编码器，其中，基本层对数据的帧兼容版本进行编码，并且两个增强层针对增强分辨率数据类别中的每个(3D立体视频传输的每个视点)对残差进行编码。图20示出了根据本公开的实施例的相应的解码器。Figure 19 shows a multi-layer resolution scalable video encoder according to an embodiment of the present disclosure, where the base layer encodes a frame-compatible version of the data and two enhancement layers encode the residual for each enhanced resolution data category (each viewpoint for 3D stereoscopic video transmission). Figure 20 shows a corresponding decoder according to an embodiment of the present disclosure.

根据图19中所示的实施例，与图17和图18中所示的实施例类似，增强层(1932，1962)采用残差编码。根据本实施例，每个增强层(1932，1962)对应于每个数据类别并且对残差(1934，1964)进行编码，该残差为减去原始的全分辨率数据类别帧的预测(1936，1966)的结果[参考文献7]。这个预测是针对每个增强层(1932，1962)使用RPU处理器(1938，1968)的结果，RPU处理器使来自帧兼容基本层(1902)的解码图片作为输入，并且以给定层的原始的(全)分辨率输出原始帧类别的预测(1936，1966)。根据另外的实施例，RPU(1938，1968)可以使用如[参考文献6]中所公开的那些技术，包括滤波、插值、重新缩放等。根据图19中所示的实施例，增强层(1932，1962)的内部图片缓冲区(1940，1970)不通过RPU从基本层缓冲区(1904)接收经处理的参考。According to the embodiment shown in FIG. 19, similar to the embodiment shown in FIGS. 17 and 18, the enhancement layers (1932, 1962) employ residual coding. According to this embodiment, each enhancement layer (1932, 1962) corresponds to one data category and encodes a residual (1934, 1964) that results from subtracting a prediction (1936, 1966) of the original full-resolution data category frame from that frame [Reference 7]. For each enhancement layer (1932, 1962), this prediction is produced by an RPU processor (1938, 1968), which takes as input the decoded pictures from the frame-compatible base layer (1902) and outputs a prediction (1936, 1966) of the original frame category at the original (full) resolution of the given layer. According to further embodiments, the RPU (1938, 1968) may use techniques such as those disclosed in [Reference 6], including filtering, interpolation, and rescaling. According to the embodiment shown in FIG. 19, the internal picture buffers (1940, 1970) of the enhancement layers (1932, 1962) do not receive processed references from the base layer buffer (1904) through the RPU.

在图20中所示的解码器处，对于每个增强层(2032，2062)，类似的 RPU(2034，2064)将来自基本层(2002)的图片缓冲区(2004)的解码基本层图片作为输入，将其处理成原始的(全)分辨率以得到给定的类别的全分辨率帧(2036，2066)，然后将这个帧(2036，2066)添加到已经在增强层参考图片缓冲区(2040，2070)中解码的帧(2038，2068)以产生给定的数据类别的最终的重建帧(2042，2072)。At the decoder shown in Figure 20, for each enhancement layer (2032, 2062), a similar RPU (2034, 2064) takes as input the decoded base layer picture from the picture buffer (2004) of the base layer (2002), processes it to the original (full) resolution to obtain a full-resolution frame (2036, 2066) of the given category, and then adds this frame (2036, 2066) to the frames (2038, 2068) already decoded in the enhancement layer reference picture buffer (2040, 2070) to produce the final reconstructed frame (2042, 2072) of the given data category.
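As a rough illustration of the per-layer reconstruction just described, the following Python sketch runs one hypothetical RPU per enhancement layer and adds the decoded residual to its output. The side-by-side base layer layout, the sample-repetition upsampling, and the names `rpu_view` and `reconstruct` are assumptions for illustration only. A real decoder would also clip the sum to the valid sample range.

```python
def rpu_view(packed, view):
    """Hypothetical per-layer RPU: extract one view from a side-by-side base
    layer picture (0 = left half, 1 = right half) and upsample it to full
    width by repeating every sample once."""
    half = len(packed[0]) // 2
    start = view * half
    return [[s for s in row[start:start + half] for _ in range(2)]
            for row in packed]

def reconstruct(prediction, decoded_residual):
    """Final frame = RPU prediction + residual decoded by the enhancement layer."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, decoded_residual)]

# Decoded base layer picture and one decoded residual frame per enhancement layer.
base_picture = [[10, 14, 11, 15], [20, 24, 21, 25]]
residuals = {
    0: [[0, 2, 0, 2], [0, 2, 0, 2]],   # enhancement layer for the first view
    1: [[0, 2, 0, 2], [0, 2, 0, 2]],   # enhancement layer for the second view
}

views = {v: reconstruct(rpu_view(base_picture, v), residuals[v]) for v in (0, 1)}
```

Because each layer only needs the base layer picture and its own residual, the two views can be reconstructed independently of each other in this sketch.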

根据图15和图16中所示实施例的与图15和图16中所示的实施例与图19和图20中所示的实施例之间的差异不相冲突的所有另外的实施例也适用于根据图19和图20中所示实施例的另外的实施例。根据另外的实施例，基本层的分辨率与增强层的分辨率以及增强层的内部参考图片缓冲区可以不同。而且，每个增强层的分辨率可以不同。All further embodiments of the embodiment shown in FIGS. 15 and 16 that do not conflict with the differences between that embodiment and the embodiment shown in FIGS. 19 and 20 also apply as further embodiments of the embodiment shown in FIGS. 19 and 20. According to further embodiments, the resolution of the base layer, the resolution of the enhancement layers, and the resolution of the enhancement layers' internal reference picture buffers may differ. Furthermore, the resolution may differ from one enhancement layer to another.

本公开所描述的方法及系统可以用硬件、软件、固件或其组合来实现。以块、模块或部件描述的特征可以一起实现(如，在如集成逻辑器件的逻辑器件中)或单独实现(如，单独连接的逻辑器件)。本公开的方法的软件部分可以包括计算机可读介质，其包括当被执行时至少部分地执行所描述的方法的指令。上述计算机可读介质可以包括如随机存取存储器(RAM)和/或只读存储器(ROM)。上述指令可以由处理器(如，数字信号处理器(DSP)、专用集成电路(ASIC)、或现场可编程门阵列(FPGA)) 来执行。The method and system described in the present disclosure can be implemented with hardware, software, firmware or a combination thereof. The features described in blocks, modules or components can be implemented together (e.g., in a logic device such as an integrated logic device) or separately (e.g., a separately connected logic device). The software portion of the method disclosed herein may include a computer-readable medium comprising instructions for at least partially executing the described method when executed. The above-mentioned computer-readable medium may include random access memory (RAM) and/or read-only memory (ROM). The above-mentioned instructions may be executed by a processor (e.g., a digital signal processor (DSP), an application-specific integrated circuit (ASIC) or a field programmable gate array (FPGA)).

因此，已经对本发明的实施例进行了描述，这些实施例涉及以下顺序地、直接地列举的示例实施例中的一个或更多个。Thus, embodiments of the invention have been described that relate to one or more of the example embodiments listed sequentially and directly below.

因此，本发明可以用本文中所描述的任何形式来实施，包括但不限于下面的描述本发明的某些部分的结构、特征和功能的列举的示例实施例 (EEE)：Thus, the present invention may be embodied in any form described herein, including but not limited to the following enumerated example embodiments (EEEs) which describe the structure, features, and functionality of certain portions of the present invention:

EEE1.一种用于多层帧兼容视频传输的编码方法，包括：EEE1. A coding method for multi-layer frame-compatible video transmission, comprising:

a)通过基本层对多个数据类别的图像或视频帧进行基本层处理，包括：a) performing base layer processing on images or video frames of multiple data categories through a base layer, including:

i)提供多个数据类别的图像或视频帧的基本层帧兼容表示；以及i) providing a base layer frame compatible representation of images or video frames of multiple data categories; and

b)通过一个或更多个增强层对多个数据类别的图像或视频帧进行增强层处理，包括：b) performing enhancement layer processing on the image or video frame of the plurality of data categories through one or more enhancement layers, comprising:

i)提供多个数据类别的图像或视频帧的增强层帧兼容表示；i) providing enhancement layer frame-compatible representations of images or video frames of multiple data categories;

ii)维持至少一个增强层参考图片缓冲区；ii) maintaining at least one enhancement layer reference picture buffer;

iii)针对对于基本层或不同的增强层的至少一个依赖关系进行参考处理；以及iii) performing reference processing for at least one dependency on a base layer or a different enhancement layer; and

iv)执行运动或视差补偿，iv) performing motion or disparity compensation,

其中，一个或更多个增强层中的每个对所有的多个数据类别进行处理。Each of the one or more enhancement layers processes all of the multiple data categories.

EEE2.根据列举的示例实施例1所述的编码方法，其中，多个数据类别包括用于立体图像或视频的多个视点。EEE2. The encoding method of enumerated example embodiment 1, wherein the plurality of data categories comprises a plurality of viewpoints for a stereoscopic image or video.

EEE3.根据列举的示例实施例1或2所述的编码方法，其中，提供多个数据类别的图像或视频帧的基本层帧兼容表示包括：EEE3. The encoding method of Enumerated Example Embodiment 1 or 2, wherein providing a base layer frame-compatible representation of an image or video frame of multiple data categories comprises:

将多个数据类别的图像或视频帧基本层采样和基本层复用为单个帧。base layer sampling and base layer multiplexing the images or video frames of the multiple data categories into a single frame.

EEE4.根据列举的示例实施例1至3中的任一项所述的编码方法，其中，基本层处理符合多个现有的视频编解码器之一。EEE4. An encoding method according to any one of the enumerated example embodiments 1 to 3, wherein the base layer processing complies with one of multiple existing video codecs.

EEE5.根据列举的示例实施例4所述的编码方法，其中，多个现有的视频编解码器包括H.264/AVC、VP8和VC-1。EEE5. The encoding method according to enumerated example embodiment 4, wherein the plurality of existing video codecs include H.264/AVC, VP8, and VC-1.

EEE6.根据列举的示例实施例3至5中的任一项所述的编码方法，其中，基本层处理的采样包括对该多个数据类别的图像或视频帧进行对称或非对称滤波。EEE6. An encoding method according to any one of the enumerated example embodiments 3 to 5, wherein the sampling of the base layer processing includes symmetrical or asymmetrical filtering of the images or video frames of the multiple data categories.

EEE7.根据列举的示例实施例3至6中的任一项所述的编码方法，其中，基本层采样包括水平采样、垂直采样或梅花形采样。EEE7. The encoding method according to any one of enumerated example embodiments 3 to 6, wherein the base layer sampling includes horizontal sampling, vertical sampling or quincunx sampling.

EEE8.根据列举的示例实施例3至7中的任一项所述的编码方法，其中，所述基本层复用使用棋盘交织布置、列交织布置、行交织布置、并排布置和上下布置中的任一种帧兼容打包布置。EEE8. An encoding method according to any one of the enumerated example embodiments 3 to 7, wherein the base layer multiplexing uses any one of the frame-compatible packing arrangements of checkerboard interleaving arrangement, column interleaving arrangement, row interleaving arrangement, side-by-side arrangement and top-bottom arrangement.
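The sampling patterns and packing arrangements named in the preceding embodiments can be sketched in Python as follows. The choice of which columns, rows, or quincunx positions each view contributes is an assumption; the text does not fix the sampling phase, and no filtering is applied before decimation here.

```python
def side_by_side(left, right):
    """Horizontal sampling; the half-width views are placed next to each other."""
    return [l[0::2] + r[0::2] for l, r in zip(left, right)]

def top_bottom(left, right):
    """Vertical sampling; the half-height views are stacked."""
    return left[0::2] + right[0::2]

def column_interleaved(left, right):
    """Even columns come from the left view, odd columns from the right view."""
    return [[(l if x % 2 == 0 else r)[x] for x in range(len(l))]
            for l, r in zip(left, right)]

def row_interleaved(left, right):
    """Even rows come from the left view, odd rows from the right view."""
    return [list((left if y % 2 == 0 else right)[y]) for y in range(len(left))]

def checkerboard(left, right):
    """Quincunx sampling: left view where (x + y) is even, right view elsewhere."""
    return [[(left if (x + y) % 2 == 0 else right)[y][x]
             for x in range(len(left[0]))]
            for y in range(len(left))]

# 4x4 test views whose samples simply name the view they came from.
left = [["L"] * 4 for _ in range(4)]
right = [["R"] * 4 for _ in range(4)]

for name, pack in [("side-by-side", side_by_side), ("top-bottom", top_bottom),
                   ("column", column_interleaved), ("row", row_interleaved),
                   ("checkerboard", checkerboard)]:
    print(name)
    for row in pack(left, right):
        print("  " + "".join(row))
```

Every arrangement yields a single frame with the same dimensions as one input view, which is what lets a codec that knows nothing about the packing handle the base layer.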

EEE9.根据列举的示例实施例2至8中的任一项所述的编码方法，其中，基本层处理还包括：对多个视点的图像或视频帧的等量或不等量的样本进行处理。EEE9. The encoding method according to any one of the enumerated example embodiments 2 to 8, wherein the base layer processing further comprises: processing equal or unequal amounts of samples of images or video frames of multiple viewpoints.

EEE10.根据列举的示例实施例2至9中的任一项所述的编码方法，其中，增强层处理采用混合视频编码模型，所述混合视频编码模型符合多个现有的编解码器中的任一个。EEE10. An encoding method according to any one of enumerated example embodiments 2 to 9, wherein enhancement layer processing adopts a hybrid video coding model that conforms to any one of multiple existing codecs.

EEE11.根据列举的示例实施例10所述的编码方法，其中，多个现有的编解码器包括VC-1和H.264/AVC。EEE11. The encoding method of enumerated example embodiment 10, wherein the plurality of existing codecs include VC-1 and H.264/AVC.

EEE12.根据列举的示例实施例2至11中的任一项所述的编码方法，其中，增强层处理还包括：EEE12. The encoding method of any one of Enumerated Example Embodiments 2 to 11, wherein enhancement layer processing further comprises:

通过根据相同的图像或帧中的样本或者根据来自相同的增强层中的之前的已解码帧的样本预测至少一个预测图像，来生成至少一个参考图像或视频帧，以及generating at least one reference picture or video frame by predicting at least one predicted picture from samples in the same picture or frame or from samples from a previously decoded frame in the same enhancement layer, and

将至少一个参考图像或视频帧存储在至少一个增强层参考图片缓冲区中。At least one reference image or video frame is stored in at least one enhancement layer reference picture buffer.

EEE13.根据列举的示例实施例12所述的编码方法，其中，生成至少一个参考图像或视频帧还包括：对至少一个预测残差图像与至少一个预测图像之和进行解复用和参考处理。EEE13. The encoding method according to enumerated example embodiment 12, wherein generating at least one reference image or video frame further comprises: demultiplexing and reference processing the sum of at least one prediction residual image and at least one prediction image.

EEE14.根据列举的示例实施例13所述的编码方法，其中，解复用还包括：对至少一个预测残差图像与至少一个预测图像之和进行上采样和插值。EEE14. The encoding method according to enumerated example embodiment 13, wherein the demultiplexing further comprises: upsampling and interpolating the sum of at least one prediction residual image and at least one prediction image.

EEE15.根据列举的示例实施例2至14中的任一项所述的编码方法，其中，至少一个增强层参考图片缓冲区之一以与至少一个增强层参考图片缓冲区中的其余缓冲区不同的分辨率来存储图像或视频帧。EEE15. An encoding method according to any one of enumerated example embodiments 2 to 14, wherein one of the at least one enhancement layer reference picture buffers stores images or video frames at a different resolution than the remaining buffers in the at least one enhancement layer reference picture buffer.

EEE16.根据列举的示例实施例1至15中的任一项所述的编码方法，其中，增强层处理还包括：EEE16. The encoding method of any one of Enumerated Example Embodiments 1 to 15, wherein enhancement layer processing further comprises:

针对多个数据类别中的每个，从至少一个增强层参考图片缓冲区中选择至少一个参考图像；selecting, for each of a plurality of data categories, at least one reference picture from at least one enhancement layer reference picture buffer;

将所选择的至少一个参考图像增强层采样和增强层复用为至少一个帧兼容图像；以及enhancement layer sampling and enhancement layer multiplexing the selected at least one reference picture into at least one frame-compatible picture; and

基于至少一个帧兼容图像执行视差补偿或帧间预测。Disparity compensation or inter-frame prediction is performed based on at least one frame-compatible image.
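The three steps of this embodiment (select a reference per data category, sample and multiplex the references into a frame-compatible picture, then predict from that picture) might look like the sketch below. The side-by-side packing, the exhaustive sum-of-absolute-differences (SAD) search, and all function names are assumptions for illustration; the embodiment does not prescribe a search method.

```python
def multiplex_reference(left_ref, right_ref):
    """Sample each full-resolution reference horizontally and pack side by side."""
    return [l[0::2] + r[0::2] for l, r in zip(left_ref, right_ref)]

def block(frame, x, y, w, h):
    """Extract the w-by-h block whose top-left corner is at (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def motion_search(reference, current, x, y, w, h, radius):
    """Exhaustive search for the displacement (dx, dy) that minimises SAD.
    Returns a tuple (cost, dx, dy), or None if no candidate fits the picture."""
    target = block(current, x, y, w, h)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = x + dx, y + dy
            if rx < 0 or ry < 0 or rx + w > len(reference[0]) or ry + h > len(reference):
                continue  # candidate block would fall outside the reference
            cost = sad(block(reference, rx, ry, w, h), target)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# One full-resolution reference per data category (view), 2 rows by 8 columns.
left_ref = [[0, 1, 10, 11, 20, 21, 30, 31], [100, 101, 110, 111, 120, 121, 130, 131]]
right_ref = [[5, 6, 15, 16, 25, 26, 35, 36], [105, 106, 115, 116, 125, 126, 135, 136]]
reference = multiplex_reference(left_ref, right_ref)

# Frame-compatible picture being coded; its block at x=1 matches the reference at x=0.
current = [[99, 0, 10, 99, 99, 99, 99, 99], [99, 100, 110, 99, 99, 99, 99, 99]]
best = motion_search(reference, current, x=1, y=0, w=2, h=2, radius=1)
```

Performing the search on the packed picture means compensation runs at the reduced, frame-compatible resolution even though the buffers hold full-resolution references.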

EEE17.根据列举的示例实施例16所述的编码方法，其中，EEE17. The encoding method according to enumerated example embodiment 16, wherein:

对所选择的至少一个参考图像的增强层采样包括水平采样、垂直采样或梅花形采样中的任一种；以及Sampling the enhancement layer of the selected at least one reference picture includes any one of horizontal sampling, vertical sampling, or quincunx sampling; and

对所选择的至少一个参考图像的增强层复用使用棋盘交织布置、列交织布置、行交织布置、并排布置和上下布置中的任一种帧兼容打包布置。The enhancement layer of the selected at least one reference image is multiplexed using any one of a checkerboard interleaving arrangement, a column interleaving arrangement, a row interleaving arrangement, a side-by-side arrangement, and a top-and-bottom arrangement for frame-compatible packing.

EEE18.根据列举的示例实施例16或17所述的编码方法，其中，增强层处理还包括：EEE18. The encoding method of Enumerated Example Embodiment 16 or 17, wherein the enhancement layer processing further comprises:

对来自基本层中的参考缓冲区的至少一个基本层已解码帧兼容图像进行基本层至增强层(BL至EL)处理；performing base layer to enhancement layer (BL to EL) processing on at least one base layer decoded frame-compatible picture from a reference buffer in the base layer;

将经BL至EL处理的至少一个基本层已解码帧兼容图像存储在至少一个增强层参考图片缓冲区中。The at least one base layer decoded frame-compatible picture processed by BL to EL is stored in at least one enhancement layer reference picture buffer.

EEE19.根据列举的示例实施例18所述的编码方法，其中，BL至EL处理包括对至少一个基本层已解码帧兼容图像进行参考处理、解复用和上采样。EEE19. The encoding method according to enumerated example embodiment 18, wherein BL to EL processing includes reference processing, demultiplexing and upsampling of at least one base layer decoded frame-compatible image.

EEE20.根据列举的示例实施例19所述的编码方法，其中，BL至EL处理还包括：对至少一个基本层已解码帧兼容图像进行滤波、上尺度和插值。EEE20. The encoding method according to enumerated example embodiment 19, wherein the BL to EL processing further comprises: filtering, upscaling, and interpolating at least one base layer decoded frame-compatible image.

EEE21.根据列举的示例实施例18至20中的任一项所述的编码方法，其中，增强层处理还包括：将与BL至EL处理有关的信息提供给相应的解码器。EEE21. The encoding method according to any one of enumerated example embodiments 18 to 20, wherein the enhancement layer processing further comprises: providing information related to the BL to EL processing to a corresponding decoder.

EEE22.根据列举的示例实施例1至21中的任一项所述的编码方法，其中，针对多个数据类别中的每个数据类别，至少一个增强层参考图片缓冲区以相同的或不同的增强分辨率存储图像。EEE22. An encoding method according to any one of enumerated example embodiments 1 to 21, wherein, for each data category of a plurality of data categories, at least one enhancement layer reference picture buffer stores images at the same or different enhancement resolutions.

EEE23.根据列举的示例实施例1至22中的任一项所述的编码方法，其中，至少一个增强层参考图片缓冲区以与原始输入图像相同的分辨率存储图像。EEE23. An encoding method according to any one of enumerated example embodiments 1 to 22, wherein at least one enhancement layer reference picture buffer stores images at the same resolution as the original input image.

EEE24.根据列举的示例实施例1至23中的任一项所述的编码方法，其中，基本层复用和增强层复用使用相同的帧兼容打包布置。EEE24. An encoding method according to any one of enumerated example embodiments 1 to 23, wherein base layer multiplexing and enhancement layer multiplexing use the same frame-compatible packing arrangement.

EEE25.根据列举的示例实施例18至24中任一项所述的编码方法，其中，基本层处理和增强层处理以不同的空间分辨率对图像进行处理。EEE25. An encoding method according to any one of enumerated example embodiments 18 to 24, wherein the base layer processing and the enhancement layer processing process the images at different spatial resolutions.

EEE26.根据列举的示例实施例25所述的编码方法，其中，BL至EL处理包括：EEE26. The encoding method of Enumerated Example Embodiment 25, wherein the BL to EL processing comprises:

对至少一个基本层解码帧兼容图像进行解复用；demultiplexing at least one base layer decoded frame compatible picture;

对经解复用的至少一个基本层已解码帧兼容图像进行插值；以及interpolating the demultiplexed at least one base layer decoded frame-compatible image; and

将经插值和解复用的至少一个基本层解码帧兼容图像重新缩放至目标空间分辨率。The interpolated and demultiplexed at least one base layer decoded frame-compatible image is rescaled to a target spatial resolution.
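The demultiplex, interpolate, rescale order of this embodiment can be sketched as a small Python pipeline. The side-by-side input layout, the linear interpolation of missing columns, and the nearest-neighbour rescaler are all assumptions chosen for brevity; an actual RPU could use any of the filters mentioned in the description.

```python
def demultiplex(packed):
    """Split a side-by-side frame-compatible picture into its two half-width views."""
    half = len(packed[0]) // 2
    return [row[:half] for row in packed], [row[half:] for row in packed]

def interpolate_columns(view):
    """Double the width: known samples stay at even columns, odd columns get the
    integer average of their horizontal neighbours (right edge replicated)."""
    out = []
    for row in view:
        new_row = []
        for i, s in enumerate(row):
            nxt = row[i + 1] if i + 1 < len(row) else s
            new_row += [s, (s + nxt) // 2]
        out.append(new_row)
    return out

def rescale(view, width, height):
    """Nearest-neighbour resampling to the target spatial resolution."""
    h, w = len(view), len(view[0])
    return [[view[y * h // height][x * w // width] for x in range(width)]
            for y in range(height)]

def bl_to_el(packed, width, height):
    """Hypothetical BL-to-EL pipeline: demultiplex, interpolate, then rescale."""
    return [rescale(interpolate_columns(v), width, height)
            for v in demultiplex(packed)]

packed = [[10, 14, 11, 15]]
same_size = bl_to_el(packed, width=4, height=1)   # target equals the interpolated size
larger = bl_to_el(packed, width=8, height=2)      # enhancement layer at a higher resolution
```

Swapping the last two stages of `bl_to_el` would give the alternative ordering of the following embodiment, in which rescaling precedes interpolation.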

EEE27.根据列举的示例实施例25所述的编码方法，其中，BL至EL处理包括：EEE27. The encoding method of Enumerated Example Embodiment 25, wherein the BL to EL processing comprises:

对至少一个基本层已解码帧兼容图像进行解复用；demultiplexing at least one base layer decoded frame-compatible picture;

将经解复用的至少一个基本层解码帧兼容图像重新缩放至不同的空间分辨率；以及rescaling the demultiplexed at least one base layer decoded frame-compatible picture to a different spatial resolution; and

将经解复用和重新缩放的至少一个基本层解码帧兼容图像插值至目标空间分辨率。The demultiplexed and rescaled at least one base layer decoded frame-compatible image is interpolated to a target spatial resolution.

EEE28.根据列举的示例实施例27所述的编码方法，还包括：将经解复用和重新缩放的至少一个基本层已解码帧兼容图像滤波至目标空间分辨率。EEE28. The encoding method according to enumerated example embodiment 27 further includes: filtering the demultiplexed and rescaled at least one base layer decoded frame-compatible image to a target spatial resolution.

EEE29.根据列举的示例实施例1至28中的任一项所述的编码方法，其中，增强层处理还包括：通过存储管理控制操作(MMCO)对至少一个增强层参考图片缓冲区进行控制。EEE29. The encoding method according to any one of the enumerated example embodiments 1 to 28, wherein the enhancement layer processing further comprises: controlling at least one enhancement layer reference picture buffer through a memory management control operation (MMCO).

EEE30.根据列举的示例实施例29所述的编码方法，其中，根据MMCO的一个或更多个集合，使至少一个增强层参考图片缓冲区中的每个同步。EEE30. The encoding method of enumerated example embodiment 29, wherein each of the at least one enhancement layer reference picture buffer is synchronized according to one or more sets of MMCOs.
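One way to picture this synchronization is to apply the same list of commands to every enhancement layer reference picture buffer, as in the Python sketch below. The command names `mark_unused` and `clear_all` are hypothetical simplifications and are not the H.264/AVC syntax; the `ReferenceBuffer` class and its sliding-window behaviour are likewise assumptions.

```python
class ReferenceBuffer:
    """Toy reference picture buffer holding (frame_num, picture) pairs, oldest first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = []

    def store(self, frame_num, picture):
        self.frames.append((frame_num, picture))
        if len(self.frames) > self.capacity:
            self.frames.pop(0)  # sliding window: drop the oldest reference

    def apply(self, command):
        """Apply one simplified, hypothetical memory management command."""
        op, arg = command
        if op == "mark_unused":
            self.frames = [(n, p) for n, p in self.frames if n != arg]
        elif op == "clear_all":
            self.frames = []
        else:
            raise ValueError(f"unknown command: {op}")

    def frame_nums(self):
        return [n for n, _ in self.frames]

def synchronize(buffers, commands):
    """Apply one command set to every buffer so their contents stay aligned."""
    for command in commands:
        for buf in buffers:
            buf.apply(command)

# One buffer per data category (for example, one per view).
left_buf, right_buf = ReferenceBuffer(capacity=4), ReferenceBuffer(capacity=4)
for n in range(3):
    left_buf.store(n, f"left picture {n}")
    right_buf.store(n, f"right picture {n}")

synchronize([left_buf, right_buf], [("mark_unused", 1)])
```

Keeping the buffers aligned this way means a reference index refers to the same time instant in every data category.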

EEE31.根据列举的示例实施例29或30所述的编码方法，其中，增强层处理还包括：将与MMCO有关的信息提供给相应的解码器。EEE31. The encoding method according to enumerated example embodiments 29 or 30, wherein the enhancement layer processing further comprises: providing information related to MMCO to a corresponding decoder.

EEE32.根据列举的示例实施例1至31中任一项所述的编码方法，其中，基本层处理对在第一频率范围内的图像内容进行编码，并且，增强层处理对在第二频率范围内的图像内容进行编码。EEE32. An encoding method according to any one of the enumerated example embodiments 1 to 31, wherein the base layer processing encodes the image content within a first frequency range, and the enhancement layer processing encodes the image content within a second frequency range.
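A minimal one-dimensional Python sketch of such a split into two frequency ranges follows. The [1, 2, 1]/4 low-pass kernel, the edge replication, and the function names are assumptions made for illustration; the embodiment does not specify how the two ranges are separated.

```python
def low_pass(row):
    """Smooth a row with the integer kernel [1, 2, 1] / 4, replicating the edges."""
    padded = [row[0]] + row + [row[-1]]
    return [(padded[i] + 2 * padded[i + 1] + padded[i + 2]) // 4
            for i in range(len(row))]

def split_bands(row):
    """Return (low, high): the low-pass content and the detail left over."""
    low = low_pass(row)
    high = [s - l for s, l in zip(row, low)]
    return low, high

row = [8, 8, 16, 8, 8]
low, high = split_bands(row)       # low -> base layer, high -> enhancement layer
restored = [l + h for l, h in zip(low, high)]
```

Because the high band is defined as the exact difference from the low band, adding the two layers back together restores the input despite the integer rounding in the filter.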

EEE33.根据列举的示例实施例1至15中任一项所述的编码方法，其中，增强层处理还包括：EEE33. The encoding method of any one of Enumerated Example Embodiments 1 to 15, wherein the enhancement layer processing further comprises:

通过基于至少一个参考图像执行视差补偿或帧间预测来获取多个数据类别中的每个的至少一个补偿图像；以及acquiring at least one compensated image for each of a plurality of data categories by performing disparity compensation or inter-frame prediction based on at least one reference image; and

将至少一个补偿图像增强层采样和增强层复用为至少一个帧兼容图像。enhancement layer sampling and enhancement layer multiplexing the at least one compensated image into at least one frame-compatible image.

EEE34.根据列举的示例实施例33所述的编码方法，其中，增强层处理还包括：通过存储管理控制操作(MMCO)对至少一个增强层参考图片缓冲区进行控制。EEE34. The encoding method according to enumerated example embodiment 33, wherein the enhancement layer processing further comprises: controlling at least one enhancement layer reference picture buffer through a memory management control operation (MMCO).

EEE35.根据列举的示例实施例34所述的编码方法，其中，增强层处理还包括：将与MMCO有关的信息提供给相应的解码器。EEE35. The encoding method according to enumerated example embodiment 34, wherein the enhancement layer processing further comprises: providing information related to MMCO to a corresponding decoder.

EEE36.根据列举的示例实施例33至35中的任一项所述的编码方法，其中，至少一个增强层参考图片缓冲区针对多个数据类别中的每个以相同的或不同的增强分辨率存储图像。EEE36. An encoding method according to any one of enumerated example embodiments 33 to 35, wherein at least one enhancement layer reference picture buffer stores images at the same or different enhanced resolutions for each of a plurality of data categories.

EEE37.根据列举的示例实施例33至36中的任一项所述的编码方法，其中，至少一个增强层参考图片缓冲区以与原始输入图像相同的分辨率存储图像。EEE37. An encoding method according to any one of enumerated example embodiments 33 to 36, wherein at least one enhancement layer reference picture buffer stores images at the same resolution as the original input image.

EEE38.根据列举的示例实施例33至37中的任一项所述的编码方法，其中，基本层处理对第一频率范围内的图像内容进行编码，并且增强层处理对第二频率范围内的图像内容进行编码。EEE38. An encoding method according to any one of enumerated example embodiments 33 to 37, wherein the base layer processing encodes image content within a first frequency range and the enhancement layer processing encodes image content within a second frequency range.

EEE39.一种用于多层帧兼容视频传输的编码方法，包括：EEE39. A coding method for multi-layer frame-compatible video transmission, comprising:

b)通过一个或更多个增强层对多个数据类别的图像或视频帧进行增强层处理，其中，多个数据类别中的每个在单独的增强层中被单独地处理，一个或更多个增强层中的每个包括：b) performing enhancement layer processing on images or video frames of the plurality of data categories via one or more enhancement layers, wherein each of the plurality of data categories is processed separately in a separate enhancement layer, each of the one or more enhancement layers comprising:

i)针对多个数据类别之一提供图像或视频的增强层表示；i) providing an enhancement layer representation of an image or video for one of a plurality of data categories;

ii)维持每个增强层中的增强层参考图片缓冲区；ii) maintaining an enhancement layer reference picture buffer in each enhancement layer;

iii)针对对于所述基本层或不同的增强层的至少一个依赖关系进行参考处理；以及iii) performing reference processing for at least one dependency on the base layer or a different enhancement layer; and

iv)执行运动或视差补偿。iv) performing motion or disparity compensation.

EEE40.根据列举的示例实施例39所述的编码方法，其中，多个数据类别包括用于立体图像或视频的多个视点。EEE40. The encoding method of enumerated example embodiment 39, wherein the plurality of data categories comprises a plurality of viewpoints for a stereoscopic image or video.

EEE41.根据列举的示例实施例39或40所述的编码方法，其中，提供多个数据类别的图像或视频帧的基本层帧兼容表示包括：EEE41. The encoding method of Enumerated Example Embodiment 39 or 40, wherein providing a base layer frame-compatible representation of an image or video frame of multiple data categories comprises:

EEE42.根据列举的示例实施例39至41中的任一项所述的编码方法，其中，基本层处理符合多个现有的视频编解码器之一。EEE42. An encoding method according to any one of enumerated example embodiments 39 to 41, wherein the base layer processing complies with one of multiple existing video codecs.

EEE43.根据列举的示例实施例42所述的编码方法，其中，多个现有的视频编解码器包括H.264/AVC、VP8和VC-1。EEE43. The encoding method of enumerated example embodiment 42, wherein the plurality of existing video codecs include H.264/AVC, VP8, and VC-1.

EEE44.根据列举的示例实施例41至43中的任一项所述的编码方法，其中，基本层处理的采样包括：对该多个数据类别的图像或视频帧进行对称或非对称滤波。EEE44. An encoding method according to any one of the enumerated example embodiments 41 to 43, wherein the sampling of the base layer processing includes: symmetrical or asymmetrical filtering of the images or video frames of the multiple data categories.

EEE45.根据列举的示例实施例41至44中的任一项所述的编码方法，其中，基本层采样包括水平采样、垂直采样或梅花形采样。EEE45. The encoding method according to any one of enumerated example embodiments 41 to 44, wherein the base layer sampling includes horizontal sampling, vertical sampling or quincunx sampling.

EEE46.根据列举的示例实施例41至45中的任一项所述的编码方法，其中，基本层复用使用棋盘交织布置、列交织布置、行交织布置、并排布置和上下布置中的任一种帧兼容打包布置。EEE46. An encoding method according to any one of the enumerated example embodiments 41 to 45, wherein the base layer multiplexing uses any one of the frame-compatible packing arrangements of checkerboard interleaving arrangement, column interleaving arrangement, row interleaving arrangement, side-by-side arrangement and top-bottom arrangement.

EEE47.根据列举的示例实施例40至46中的任一项所述的编码方法，其中，基本层处理还包括：对多个视点的图像或视频帧的等量或不等量的样本进行处理。EEE47. The encoding method according to any one of the enumerated example embodiments 40 to 46, wherein the base layer processing further comprises: processing equal or unequal amounts of samples of images or video frames of multiple viewpoints.

EEE48.根据列举的示例实施例40至47中的任一项所述的编码方法，其中，增强层处理采用混合视频编码模型，混合视频编码模型符合多个现有的编解码器中的任一个。EEE48. The encoding method according to any one of enumerated example embodiments 40 to 47, wherein enhancement layer processing adopts a hybrid video coding model that conforms to any one of a plurality of existing codecs.

EEE49.根据列举的示例实施例48所述的编码方法，其中，多个现有的编解码器包括VC-1和H.264/AVC。EEE49. The encoding method of enumerated example embodiment 48, wherein the plurality of existing codecs include VC-1 and H.264/AVC.

EEE50.根据列举的示例实施例39至49中的任一项所述的编码方法，其中，运动或视差补偿的执行基于存储在增强层参考图片缓冲区中的至少一个图像。EEE50. The encoding method of any one of enumerated example embodiments 39 to 49, wherein motion or disparity compensation is performed based on at least one image stored in an enhancement layer reference picture buffer.

EEE51.根据列举的示例实施例39至49中的任一项所述的编码方法，其中，增强层处理还包括：EEE51. The encoding method of any one of Enumerated Example Embodiments 39 to 49, wherein enhancement layer processing further comprises:

EEE52.根据列举的示例实施例51所述的编码方法，其中，BL至EL处理包括对至少一个基本层已解码帧兼容图像进行参考处理、解复用和上采样。EEE52. The encoding method of enumerated example embodiment 51, wherein BL to EL processing comprises reference processing, demultiplexing, and upsampling of at least one base layer decoded frame-compatible image.

EEE53.根据列举的示例实施例52所述的编码方法，其中，对一个或更多个增强层中的每个的BL至EL处理是由单个处理单元处理的，所述处理单元共同执行和优化对至少一个基本层已解码帧兼容图像的参考处理、解复用和上采样。EEE53. An encoding method according to enumerated example embodiment 52, wherein BL to EL processing for each of one or more enhancement layers is handled by a single processing unit that jointly performs and optimizes reference processing, demultiplexing and upsampling of at least one base layer decoded frame-compatible image.

EEE54.根据列举的示例实施例39至53中的任一项所述的编码方法，其中，增强层处理还包括：EEE54. The encoding method of any one of Enumerated Example Embodiments 39 to 53, wherein enhancement layer processing further comprises:

对存储在不同的增强层中的增强层参考图片缓冲区中的至少一个参考图像进行增强层至增强层(EL至EL)处理；performing enhancement layer to enhancement layer (EL to EL) processing on at least one reference picture stored in an enhancement layer reference picture buffer in a different enhancement layer;

将经EL至EL处理的至少一个增强层参考图像存储在该至少一个增强层参考图像的至少一个增强层参考图片缓冲区中。The at least one EL-to-EL processed enhancement layer reference picture is stored in at least one enhancement layer reference picture buffer of the at least one enhancement layer reference picture.

EEE55.根据列举的示例实施例51至54中的任一项所述的编码方法，其中，BL至EL处理还包括：将经BL至EL处理的至少一个基本层解码帧兼容图像重新缩放至目标空间分辨率。EEE55. An encoding method according to any one of enumerated example embodiments 51 to 54, wherein the BL to EL processing further comprises: rescaling at least one base layer decoded frame compatible image processed by the BL to EL to a target spatial resolution.

EEE56.根据列举的示例实施例54或55所述的编码方法，其中，EL至 EL处理还包括：将经EL至EL处理的至少一个增强层参考图像重新缩放至目标空间分辨率。EEE56. The encoding method according to enumerated example embodiments 54 or 55, wherein the EL to EL processing further comprises: rescaling at least one enhancement layer reference image processed by EL to EL to the target spatial resolution.

EEE57.根据列举的示例实施例1所述的编码方法，其中，提供多个数据类别的图像或视频帧的增强层帧兼容表示包括：EEE57. The encoding method of Enumerated Example Embodiment 1, wherein providing enhancement layer frame-compatible representations of image or video frames of multiple data categories comprises:

对多个数据类别的原始全分辨率图像的差分图像和至少一个基本层至增强层(BL至EL)预测进行滤波、采样和复用。Differential images of original full-resolution images and at least one base layer to enhancement layer (BL to EL) prediction of multiple data classes are filtered, sampled, and multiplexed.

EEE58.根据列举的示例实施例57所述的编码方法，其中，BL至EL预测通过对来自所述基本层的至少一个帧兼容已解码图片进行解复用和参考处理来获得。EEE58. The encoding method of enumerated example embodiment 57, wherein the BL to EL prediction is obtained by demultiplexing and reference processing at least one frame-compatible decoded picture from the base layer.

EEE59.根据列举的示例实施例58所述的编码方法，其中，所述解复用和参考处理还包括：滤波、插值或重新缩放。EEE59. The encoding method according to enumerated example embodiment 58, wherein the demultiplexing and reference processing further comprises: filtering, interpolation or rescaling.

EEE60.根据列举的示例实施例57所述的编码方法，其中，针对所述多个数据类别中的每个执行所述运动或视差补偿。EEE60. The encoding method of enumerated example embodiment 57, wherein the motion or disparity compensation is performed for each of the plurality of data categories.

EEE61.根据列举的示例实施例39所述的编码方法，其中，针对多个数据类别之一提供图像或视频的增强层表示包括：针对多个数据类别中的一个，获取至少一个原始的全分辨率图像的至少一个差分图像和至少一个基本层至增强层(BL至EL)预测。EEE61. An encoding method according to enumerated example embodiment 39, wherein providing an enhancement layer representation of an image or video for one of multiple data categories includes: obtaining at least one differential image and at least one base layer to enhancement layer (BL to EL) prediction of at least one original full-resolution image for one of the multiple data categories.

EEE62.根据列举的示例实施例61所述的编码方法，其中，所述至少一个BL至EL预测通过对来自基本层的至少一个帧兼容已解码图片进行解复用和参考处理来获得。EEE62. The encoding method of enumerated example embodiment 61, wherein the at least one BL to EL prediction is obtained by demultiplexing and reference processing at least one frame-compatible decoded picture from the base layer.

EEE63.根据列举的示例实施例62所述的编码方法，其中，解复用和参考处理还包括：滤波、插值或重新缩放。EEE63. The encoding method according to enumerated example embodiment 62, wherein demultiplexing and reference processing further comprises filtering, interpolation or rescaling.

EEE64.一种用于多层帧兼容视频传输的解码方法，包括：EEE64. A decoding method for multi-layer frame-compatible video transmission, comprising:

a)通过基本层对多个基本层比特流信号进行基本层处理，包括：a) performing base layer processing on a plurality of base layer bit stream signals through a base layer, comprising:

i)提供至少一个帧兼容基本层解码图像或视频帧；以及i) providing at least one frame-compatible base layer decoded picture or video frame; and

b)通过一个或更多个增强层对多个增强比特流信号进行增强层处理，包括：b) performing enhancement layer processing on the plurality of enhanced bitstream signals through one or more enhancement layers, comprising:

i)针对多个数据类别提供至少一个增强层已解码图像或视频帧；i) providing at least one enhancement layer decoded picture or video frame for a plurality of data categories;

iv)执行视差补偿，iv) performing disparity compensation,

其中，在相同的增强层中对所有的多个数据类别进行解码和处理。wherein all of the multiple data categories are decoded and processed in the same enhancement layer.

EEE65. The decoding method according to enumerated example embodiment 64, wherein the plurality of data categories comprises a plurality of viewpoints for a stereoscopic image or video.

EEE66. The decoding method according to enumerated example embodiment 64 or 65, wherein the base layer processing complies with one of a plurality of existing video codecs.

EEE67. The decoding method according to enumerated example embodiment 66, wherein the plurality of existing video codecs includes H.264/AVC, VP8, and VC-1.

EEE68. The decoding method according to any one of enumerated example embodiments 64 to 67, wherein the enhancement layer processing adopts a hybrid video coding model that conforms to any one of a plurality of existing codecs.

EEE69. The decoding method according to enumerated example embodiment 68, wherein the plurality of existing codecs includes VC-1 and H.264/AVC.

EEE70. The decoding method according to any one of enumerated example embodiments 64 to 69, wherein the enhancement layer processing further comprises:

generating at least one reference image or video frame by predicting at least one predicted image from samples in the same image or frame or from samples of previously decoded frames in the same enhancement layer; and

EEE71. The decoding method according to enumerated example embodiment 70, wherein generating the at least one reference image or video frame further comprises: demultiplexing and reference processing the sum of at least one image decoded from the plurality of enhancement bitstream signals and the at least one predicted image.

EEE72. The decoding method according to enumerated example embodiment 71, wherein the demultiplexing further comprises: upsampling and interpolating the sum of at least one prediction residual image and the at least one predicted image.
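
The sum referred to in embodiments 71 and 72 can be illustrated with a minimal sketch. The function name `reconstruct`, the 8-bit default sample range, and the clipping step are assumptions made for illustration only; the disclosure does not specify them. The demultiplexing, upsampling, and interpolation that follow the sum are not shown.

```python
def reconstruct(predicted, residual, bit_depth=8):
    """Add a decoded prediction residual to a predicted image, sample by sample.

    Both inputs are lists of rows of integer samples with identical dimensions.
    Clipping to the valid sample range is an assumption of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(predicted, residual)
    ]


if __name__ == "__main__":
    predicted = [[100, 250], [30, 40]]
    residual = [[-120, 10], [5, -5]]
    print(reconstruct(predicted, residual))
```

In a decoder of the kind described, the result of such a sum would then be handed to the demultiplexing and reference processing stage before being stored as a reference.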

EEE73. The decoding method according to any one of enumerated example embodiments 65 to 72, wherein one of the at least one enhancement layer reference picture buffers stores images or video frames at a resolution different from that of the remaining buffers of the at least one enhancement layer reference picture buffers.

EEE74. The decoding method according to any one of enumerated example embodiments 64 to 73, wherein the enhancement layer processing further comprises:

for each of the plurality of data categories, selecting at least one reference image from the at least one enhancement layer reference picture buffer;

EEE75. The decoding method according to enumerated example embodiment 73 or 74, wherein the enhancement layer processing further comprises:

EEE76. The decoding method according to enumerated example embodiment 75, wherein the BL to EL processing comprises reference processing, demultiplexing, and upsampling of at least one base layer decoded frame-compatible image.

EEE77. The decoding method according to enumerated example embodiment 76, wherein the BL to EL processing further comprises: filtering, upscaling, and interpolating the at least one base layer decoded frame-compatible image.
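
The demultiplexing and upsampling of embodiments 76 and 77 might be sketched as follows for one particular case. The side-by-side arrangement, the two-tap averaging interpolator, and all function names are assumptions chosen for illustration; the disclosure leaves the arrangement and the filters open.

```python
def demultiplex_side_by_side(frame):
    """Split a side-by-side frame-compatible image into two half-width views."""
    half = len(frame[0]) // 2
    left = [row[:half] for row in frame]
    right = [row[half:] for row in frame]
    return left, right


def interpolate_columns(view):
    """Double the width of a view.

    Each decoded sample is kept, and the missing sample to its right is
    estimated as the rounded average of its two horizontal neighbours
    (the last column is repeated at the picture edge).
    """
    upsampled = []
    for row in view:
        new_row = []
        for x, sample in enumerate(row):
            neighbour = row[x + 1] if x + 1 < len(row) else sample
            new_row.extend([sample, (sample + neighbour + 1) // 2])
        upsampled.append(new_row)
    return upsampled


def bl_to_el(frame):
    """Hypothetical BL to EL step: demultiplex, then upsample each view."""
    left, right = demultiplex_side_by_side(frame)
    return interpolate_columns(left), interpolate_columns(right)


if __name__ == "__main__":
    base_layer_frame = [[10, 30, 100, 200]]
    print(bl_to_el(base_layer_frame))
```

A practical reference processor would typically use longer interpolation filters, possibly signalled by the encoder as in embodiment 78; the averaging filter here only keeps the sketch short.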

EEE78. The decoding method according to any one of enumerated example embodiments 75 to 77, wherein the enhancement layer processing further comprises receiving information related to the BL to EL processing from a corresponding encoder.

EEE79. The decoding method according to any one of enumerated example embodiments 75 to 78, wherein the base layer processing and the enhancement layer processing process images at different spatial resolutions.

EEE80. The decoding method according to enumerated example embodiment 79, wherein the BL to EL processing comprises:

rescaling the interpolated and demultiplexed at least one base layer decoded frame-compatible image to a target spatial resolution.

EEE81. The decoding method according to enumerated example embodiment 79, wherein the BL to EL processing comprises:

rescaling the demultiplexed at least one base layer decoded frame-compatible image to a different spatial resolution; and

interpolating the demultiplexed and rescaled at least one base layer decoded frame-compatible image to a target spatial resolution.

EEE82. The decoding method according to enumerated example embodiment 81, further comprising: filtering the demultiplexed and rescaled at least one base layer decoded frame-compatible image to the target spatial resolution.

EEE83. The decoding method according to any one of enumerated example embodiments 64 to 82, wherein the enhancement layer processing further comprises: controlling the at least one enhancement layer reference picture buffer through memory management control operations (MMCO).

EEE84. The decoding method according to enumerated example embodiment 83, wherein each of the at least one enhancement layer reference picture buffers is synchronized according to one or more sets of MMCOs.

EEE85. The decoding method according to enumerated example embodiment 83 or 84, wherein the enhancement layer processing further comprises: receiving information related to the MMCO from a corresponding encoder.
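
The buffer synchronization of embodiments 83 to 85 could look roughly like the sketch below. The class `ReferencePictureBuffer`, the operation names `mark_unused` and `clear_all`, and the `(operation, picture id)` command format are hypothetical simplifications and do not reproduce the MMCO syntax of any codec.

```python
class ReferencePictureBuffer:
    """A toy enhancement layer reference picture buffer keyed by picture id."""

    def __init__(self, name):
        self.name = name
        self.pictures = {}

    def store(self, picture_id, picture):
        self.pictures[picture_id] = picture

    def apply(self, command):
        """Apply one simplified memory management command to this buffer."""
        operation, picture_id = command
        if operation == "mark_unused":
            # Drop one picture; ignore ids that are not present.
            self.pictures.pop(picture_id, None)
        elif operation == "clear_all":
            self.pictures.clear()
        else:
            raise ValueError(f"unknown operation: {operation}")


def apply_mmco_set(buffers, commands):
    """Apply the same command set to every buffer so that they stay in step."""
    for command in commands:
        for buffer in buffers:
            buffer.apply(command)


if __name__ == "__main__":
    buffers = [ReferencePictureBuffer("view0"), ReferencePictureBuffer("view1")]
    for buffer in buffers:
        for picture_id in range(3):
            buffer.store(picture_id, f"{buffer.name}-picture-{picture_id}")
    apply_mmco_set(buffers, [("mark_unused", 1)])
    print([sorted(buffer.pictures) for buffer in buffers])
```

Applying one received command set to all buffers is only one way to keep them synchronized; embodiment 84 also allows more than one set of MMCOs.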

EEE86. The decoding method according to any one of enumerated example embodiments 64 to 67, wherein the enhancement layer processing further comprises:

enhancement layer sampling and enhancement layer multiplexing the at least one compensated image into at least one frame-compatible image.

EEE87. The decoding method according to enumerated example embodiment 86, wherein the enhancement layer processing further comprises: controlling the at least one enhancement layer reference picture buffer through memory management control operations (MMCO).

EEE88. The decoding method according to enumerated example embodiment 87, wherein the enhancement layer processing further comprises receiving information related to the MMCO from a corresponding encoder.

EEE89. A decoding method for multi-layer frame-compatible video transmission, comprising:

a) performing base layer processing, through a base layer, on a plurality of base layer bitstream signals passing through the base layer, comprising:

b) performing enhancement layer processing, through one or more enhancement layers, on a plurality of enhancement bitstream signals of a plurality of data categories passing through the one or more enhancement layers, wherein each of the plurality of data categories is processed separately in a separate enhancement layer, each of the one or more enhancement layers comprising:

i) providing at least one enhancement layer decoded image or video frame for one of the plurality of data categories;

iii) performing reference processing on at least one dependency on the base layer or a different enhancement layer; and

iv) performing disparity compensation, wherein all of the plurality of data categories are decoded and processed in the same enhancement layer.

EEE90. The decoding method according to enumerated example embodiment 89, wherein the plurality of data categories comprises a plurality of viewpoints for a stereoscopic image or video.

EEE91. The decoding method according to enumerated example embodiment 89 or 90, wherein the base layer processing complies with one of a plurality of existing video codecs.

EEE92. The decoding method according to enumerated example embodiment 91, wherein the plurality of existing video codecs includes H.264/AVC, VP8, and VC-1.

EEE93. The decoding method according to any one of enumerated example embodiments 89 to 92, wherein the coding layer processing adopts a hybrid video coding model that conforms to any one of a plurality of existing codecs.

EEE94. The decoding method according to enumerated example embodiment 93, wherein the plurality of existing codecs includes VC-1 and H.264/AVC.

EEE95. The decoding method according to any one of enumerated example embodiments 89 to 94, wherein the motion or disparity compensation is performed based on at least one image stored in an enhancement layer reference picture buffer.
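
Compensation from a stored reference image, as in embodiment 95, can be sketched for a single block with an integer displacement vector. The function name, the integer-sample vector, and the edge clamping are assumptions of this sketch; sub-sample interpolation and weighted prediction are omitted.

```python
def compensate_block(reference, x, y, width, height, vector):
    """Fetch a prediction block from a reference image.

    `reference` is a list of rows of samples taken from a reference picture
    buffer. `(x, y)` is the top-left corner of the current block, and
    `vector` is an integer (dx, dy) motion or disparity displacement.
    Coordinates that fall outside the image are clamped to the nearest edge.
    """
    dx, dy = vector
    rows, cols = len(reference), len(reference[0])
    block = []
    for j in range(height):
        source_y = min(max(y + j + dy, 0), rows - 1)
        row = []
        for i in range(width):
            source_x = min(max(x + i + dx, 0), cols - 1)
            row.append(reference[source_y][source_x])
        block.append(row)
    return block


if __name__ == "__main__":
    reference = [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
    ]
    print(compensate_block(reference, 1, 0, 2, 2, (1, 1)))
```

The same routine serves for motion compensation (reference from an earlier frame of the same view) and disparity compensation (reference from another view); only the origin of the reference image differs.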

EEE96. The decoding method according to any one of enumerated example embodiments 89 to 95, wherein the enhancement layer processing further comprises:

EEE97. The decoding method according to enumerated example embodiment 96, wherein the BL to EL processing comprises reference processing, demultiplexing, and upsampling of at least one base layer decoded frame-compatible image.

EEE98. The decoding method according to enumerated example embodiment 97, wherein the BL to EL processing for each of the one or more enhancement layers is handled by a single processing unit that jointly performs and optimizes the reference processing, demultiplexing, and upsampling of the at least one base layer decoded frame-compatible image.

EEE99. The decoding method according to any one of enumerated example embodiments 89 to 98, wherein the enhancement layer processing further comprises:

EEE100. The decoding method according to any one of enumerated example embodiments 89 to 99, wherein the BL to EL processing further comprises: rescaling the BL to EL processed at least one base layer decoded frame-compatible image to a target spatial resolution.

EEE101. The decoding method according to enumerated example embodiment 99 or 100, wherein the EL to EL processing further comprises: rescaling the EL to EL processed at least one enhancement layer reference image to a target spatial resolution.

EEE102. The decoding method according to enumerated example embodiment 89, wherein providing at least one enhancement layer decoded image or video frame for a plurality of data categories comprises:

adding a reference image stored in at least one enhancement layer reference picture buffer to at least one base layer to enhancement layer (BL to EL) prediction.

EEE103. The decoding method according to enumerated example embodiment 102, wherein the at least one BL to EL prediction is obtained by demultiplexing and reference processing at least one frame-compatible decoded picture stored in a reference picture buffer of the base layer.

EEE104. The decoding method according to enumerated example embodiment 102, wherein disparity compensation is performed for each of the plurality of data categories.

EEE105. The decoding method according to enumerated example embodiment 89, wherein providing at least one enhancement layer decoded image or video frame for one of the plurality of data categories comprises: adding a reference image stored in at least one enhancement layer reference picture buffer to at least one base layer to enhancement layer (BL to EL) prediction.

EEE106. The decoding method according to enumerated example embodiment 105, wherein the at least one BL to EL prediction is obtained by demultiplexing and reference processing at least one frame-compatible decoded picture stored in a reference picture buffer of the base layer.

EEE107. The decoding method according to enumerated example embodiment 106, wherein the demultiplexing and reference processing further comprise: filtering, interpolation, or rescaling.

EEE108. An encoder for encoding at least one image or video frame according to the method of one or more of enumerated example embodiments 1 to 63 and 117.

EEE109. An apparatus for encoding at least one image or video frame according to the method of one or more of enumerated example embodiments 1 to 63 and 117.

EEE110. A system for encoding at least one image or video frame according to the method of one or more of enumerated example embodiments 1 to 63 and 117.

EEE111. A decoder for decoding at least one image or video frame according to the method of one or more of enumerated example embodiments 64 to 107 and 118.

EEE112. An apparatus for decoding at least one image or video frame according to the method of one or more of enumerated example embodiments 64 to 107 and 118.

EEE113. A system for decoding at least one image or video frame according to the method of one or more of enumerated example embodiments 64 to 107 and 118.

EEE114. A computer-readable medium comprising a set of instructions that causes a computer to perform the method of one or more of enumerated example embodiments 1 to 107, 117, and 118.

EEE115. Use of the method of one or more of enumerated example embodiments 1 to 63 and 117 for encoding at least one image or video frame.

EEE116. Use of the method of one or more of enumerated example embodiments 64 to 107 and 118 for decoding at least one image or video frame.

EEE117. The encoding method according to any one of enumerated example embodiments 1 to 38, wherein the at least one enhancement layer reference picture buffer comprises at least two enhancement layer reference buffers.

EEE118. The decoding method according to any one of enumerated example embodiments 64 to 88, wherein the at least one enhancement layer reference picture buffer comprises at least two enhancement layer reference buffers.

All patents and publications mentioned in this specification may be indicative of the level of skill of those of ordinary skill in the art to which this disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been individually incorporated by reference in its entirety.

The examples set forth above are provided to give those of ordinary skill in the art a complete disclosure and description of how to make and use the embodiments of the present disclosure for multi-layer frame-compatible video transmission, and are not intended to limit the scope of what the inventors regard as their disclosure. Modifications of the above-described modes for carrying out the present disclosure may be used by those of ordinary skill in the art, and such modifications are intended to be within the scope of the appended claims.

It should be understood that the present disclosure is not limited to particular methods or systems, which can of course vary. It should also be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. The term “plurality” includes two or more referents unless the content clearly dictates otherwise. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which this disclosure pertains.

A number of embodiments of the present disclosure have been described. However, it should be understood that various modifications can be made without departing from the spirit and scope of the present disclosure. Accordingly, other embodiments are within the scope of the appended claims.

References

[1] D. C. Hutchison, "Introducing DLP 3-D TV", http://www.dlp.com/downloads/Introducing DLP 3D HDTV Whitepaper.pdf

[2] Advanced video coding for generic audiovisual services, http://www.itu.int/rec/recommendation.asp?type=folders&lang=e&parent=T-REC-H.264, March 2010.

[3] SMPTE 421M, "VC-1 Compressed Video Bitstream Format and Decoding Process", April 2006.

[4] A. Tourapis, P. Pahalawatta, A. Leontaris, K. Stec, and W. Husak, "Encoding and Decoding Architecture for Format Compatible 3D Video Delivery," U.S. Provisional Patent Application No. 61/223,027, July 2009.

[5] A. Leontaris, A. Tourapis, and P. Pahalawatta, "Enhancement Methods for Sampled and Multiplexed Image and Video Data," U.S. Provisional Patent Application No. 61/365,743, July 2010.

[6] A. Tourapis, A. Leontaris, P. Pahalawatta, and K. Stec, "Directed Interpolation/Postprocessing methods for video encoded data," U.S. Provisional Patent Application No. 61/170,995, April 2009.

[7] P. Pahalawatta, A. Tourapis, and W. Husak, "Systems and Methods for Multi-Layered Image and Video Delivery Using Reference Processing Signals", U.S. Provisional Patent Application No. 61/362,661, July 2010.

Claims

1. A decoding method for multi-layer frame-compatible video transmission, comprising:

a) performing base layer processing on a plurality of base layer bitstream signals through a base layer, comprising:

i) providing at least one frame-compatible base layer decoded image or video frame; and

b) performing enhancement layer processing on a plurality of enhancement bitstream signals through one or more enhancement layers, comprising:

ii) providing at least one enhancement layer decoded image or video frame for a plurality of viewpoints;

iii) performing reference processing on at least one frame-compatible base layer decoded image or video frame from the base layer or at least one decoded image or video frame from a different enhancement layer; and

iv) performing disparity compensation,

wherein all of the plurality of viewpoints are decoded and processed in the same enhancement layer,

wherein the enhancement layer processing further comprises:

generating at least one reference image or video frame by predicting at least one predicted image from samples in the same image or frame or from samples of previously decoded frames in the same enhancement layer; and

storing the at least one reference image or video frame in at least one enhancement layer reference picture buffer.

2. The method of claim 1, wherein the plurality of viewpoints comprises a plurality of viewpoints for a stereoscopic image or video.

3. The method of claim 1, wherein providing at least one frame-compatible base layer decoded image or video frame from the plurality of viewpoints comprises:

base layer sampling, wherein the base layer sampling comprises horizontal sampling, vertical sampling, or quincunx sampling; and base layer multiplexing of the image or video frames of the plurality of viewpoints into a single frame.

4. The method of claim 3, wherein the sampling of the base layer processing comprises: performing symmetric or asymmetric filtering on the image or video frames of the plurality of viewpoints.

5. The method of claim 3, wherein the base layer multiplexing uses any one of the following frame-compatible packing arrangements: checkerboard interleaving, column interleaving, row interleaving, side-by-side arrangement, and top-bottom arrangement.
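
The sampling and packing arrangements named in claims 3 and 5 can be illustrated with a small sketch for two views. The function names, the choice of which columns or rows are kept, and the omission of the pre-filtering mentioned in claim 4 are assumptions made for illustration only.

```python
def pack_side_by_side(left, right):
    """Horizontally decimate each view by two, then place them side by side."""
    return [l_row[0::2] + r_row[0::2] for l_row, r_row in zip(left, right)]


def pack_top_bottom(left, right):
    """Vertically decimate each view by two, then stack left above right."""
    return [row[:] for row in left[0::2]] + [row[:] for row in right[0::2]]


def pack_row_interleaved(left, right):
    """Take even rows from the left view and odd rows from the right view."""
    return [(left[y] if y % 2 == 0 else right[y])[:] for y in range(len(left))]


def pack_checkerboard(left, right):
    """Quincunx-sample both views and interleave them in a checkerboard."""
    return [
        [left[y][x] if (x + y) % 2 == 0 else right[y][x] for x in range(len(left[0]))]
        for y in range(len(left))
    ]


if __name__ == "__main__":
    left = [[1, 2, 3, 4], [5, 6, 7, 8]]
    right = [[11, 12, 13, 14], [15, 16, 17, 18]]
    print(pack_side_by_side(left, right))
    print(pack_top_bottom(left, right))
    print(pack_row_interleaved(left, right))
    print(pack_checkerboard(left, right))
```

Each arrangement keeps the packed frame at the size of a single view, which is what allows a legacy single-layer decoder to handle the base layer.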

6. The method of claim 2, wherein the base layer processing further comprises: processing equal or unequal numbers of samples of the image or video frames of the plurality of viewpoints.

7. The method of claim 1, wherein generating the at least one reference image or video frame further comprises: demultiplexing and reference processing the sum of at least one prediction residual image and the at least one predicted image.

8. The method of claim 7, wherein the demultiplexing further comprises: upsampling and interpolating the sum of the at least one prediction residual image and the at least one predicted image.

9. The method of claim 8, wherein one of the at least one enhancement layer reference picture buffers stores images or video frames at a resolution different from that of the remaining buffers of the at least one enhancement layer reference picture buffers.

10. The method of claim 1, wherein the enhancement layer processing further comprises:

for each of the plurality of viewpoints, selecting at least one reference image from the at least one enhancement layer reference picture buffer;

enhancement layer sampling and enhancement layer multiplexing the selected at least one reference image into at least one frame-compatible image; and

performing disparity compensation or inter-frame prediction based on the at least one frame-compatible image.

11. The method of claim 1, wherein the enhancement layer processing further comprises:

performing base layer to enhancement layer processing on at least one base layer decoded frame-compatible image from a reference buffer in the base layer; and

storing the base layer to enhancement layer processed at least one base layer decoded frame-compatible image in the at least one enhancement layer reference picture buffer.

12. The method of claim 11, wherein the base layer to enhancement layer processing comprises one or more of the following:

performing reference processing, demultiplexing, and upsampling on the at least one base layer decoded frame-compatible image; or

filtering, upscaling, and interpolating the at least one base layer decoded frame-compatible image.