HK1235587B - Method and apparatus for backward-compatible coding and decoding for video signals

Info

Publication number: HK1235587B
Application number: HK17108908.5A
Authority: HK
Inventors: 苏冠铭; R·阿特肯斯; 陈倩
Original assignee: 杜比实验室特许公司
Priority date: 2013-01-02
Filing date: 2015-10-22
Publication date: 2021-02-05

Description

Method and apparatus for backward compatible encoding and decoding of video signals

This application is a divisional of Chinese invention patent application No. 201380069054.7, filed on December 4, 2013, and entitled "Backward-compatible coding for ultra-high definition video signals with enhanced dynamic range".

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 61/748,411, filed on January 2, 2013; U.S. Provisional Application No. 61/821,173, filed on May 8, 2013; and U.S. Provisional Patent Application No. 61/882,773, filed on September 26, 2013, all of which are incorporated herein by reference in their entirety.

Technical Field

The present invention relates generally to images. More particularly, embodiments of the present invention relate to backward-compatible encoding and decoding of high-definition signals with enhanced dynamic range.

Background Art

Audio and video compression is a key component in the development, storage, distribution, and consumption of multimedia content. The choice of a compression method involves trade-offs among coding efficiency, coding complexity, and delay. As the ratio of processing power to computing cost increases, more complex compression techniques that allow for more efficient compression can be developed. As an example, in video compression, the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) has continued to improve upon the original MPEG-1 video standard by releasing the MPEG-2, MPEG-4 (Part 2), and H.264/AVC (or MPEG-4, Part 10) coding standards.

Despite the compression efficiency and success of H.264, a new generation of video compression technology, known as High Efficiency Video Coding (HEVC), is now under development. HEVC is expected to provide improved compression capability over the existing H.264 (also known as AVC) standard. A draft of HEVC is available in B. Bross, W.-J. Han, G. J. Sullivan, J.-R. Ohm, and T. Wiegand, "High efficiency video coding (HEVC) text specification draft 8", ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC-J1003, July 2012, which is incorporated herein by reference in its entirety. The existing H.264 standard is published as "Advanced Video Coding for generic audio-visual services", ITU-T Rec. H.264 and ISO/IEC 14496-10, which is incorporated herein by reference in its entirety.

Video signals may be characterized by multiple parameters, such as bit depth, color space, color gamut, and resolution. Modern televisions and video playback devices (e.g., Blu-ray players) support a variety of resolutions, including standard definition (e.g., 720×480i) and high definition (HD) (e.g., 1920×1080p). Ultra high definition (UHD) is a next-generation resolution format with at least 3,840×2,160 resolution (referred to as 4K UHD) and options to go as high as 7,680×4,320 (referred to as 8K UHD). Ultra high definition may also be referred to as Ultra HD, UHDTV, or super high-vision. As used herein, UHD denotes any resolution higher than HD resolution.

Another aspect of a video signal's characteristics is its dynamic range. Dynamic range (DR) is a range of intensity (e.g., luminance, luma) in an image, e.g., from darkest darks to brightest brights. As used herein, the term "dynamic range" (DR) may relate to a capability of the human psychovisual system (HVS) to perceive a range of intensity (e.g., luminance, luma) in an image, e.g., from darkest darks to brightest brights. In this sense, DR relates to a "scene-referred" intensity. DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. In this sense, DR relates to a "display-referred" intensity. Unless a particular sense is explicitly specified to have particular significance at any point in the description herein, it should be inferred that the term may be used in either sense, e.g., interchangeably.

As used herein, the term high dynamic range (HDR) relates to a DR breadth that spans some 14-15 orders of magnitude of the human visual system (HVS). For example, well-adapted humans with essentially normal vision (e.g., in one or more of a statistical, biometric, or ophthalmological sense) have an intensity range that spans about 15 orders of magnitude. Adapted humans may perceive dim light sources of as few as a mere handful of photons. Yet, these same humans may perceive the near painfully brilliant intensity of the noonday sun in desert, sea, or snow (or even glance into the sun, however briefly, to prevent damage). This span, though, is available to "adapted" humans, e.g., those whose HVS has a time period in which to reset and adjust.

In contrast, the DR over which a human may simultaneously perceive an extensive breadth in intensity range may be somewhat truncated in relation to HDR. As used herein, the terms "enhanced dynamic range" (EDR), "visual dynamic range", or "variable dynamic range" (VDR) may individually or interchangeably relate to the DR that is simultaneously perceivable by an HVS. As used herein, EDR may relate to a DR that spans 5-6 orders of magnitude. Thus, while perhaps somewhat narrower in relation to true scene-referred HDR, EDR nonetheless represents a wide DR breadth. As used herein, the term "simultaneous dynamic range" may relate to EDR.

To support backward compatibility with legacy playback devices as well as new HDR or UHD display technologies, multiple layers may be used to deliver UHD and HDR (or EDR) video data from an upstream device to downstream devices. Given such a multi-layer stream, legacy decoders may use the base layer to reconstruct an HD SDR version of the content. Advanced decoders may use both the base layer and the enhancement layer to reconstruct a UHD EDR version of the content and render it on more capable displays. As appreciated by the inventors here, improved techniques for the coding of UHD EDR video are desirable.

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements and in which:

FIG. 1 depicts an example implementation of a UHD EDR encoding system according to an embodiment of the present invention;

FIG. 2 depicts an example implementation of a UHD EDR decoding system according to an embodiment of the present invention;

FIG. 3 depicts a variation of the system depicted in FIG. 1 according to an embodiment of the present invention, in which the base layer comprises an interlaced signal;

FIG. 4 depicts a variation of the decoding system of FIG. 2 according to an embodiment of the present invention, in which the base layer comprises an interlaced video signal;

FIG. 5 depicts an example implementation of a non-linear quantizer for a residual signal in an enhancement layer according to an embodiment of the present invention;

FIG. 6A depicts an adaptive pre-quantization process for residual pixels according to an embodiment of the present invention; and

FIG. 6B depicts an adaptive process for setting the lower or upper input boundaries of a non-linear quantizer for residual signals according to an embodiment of the present invention.

DETAILED DESCRIPTION

Backward-compatible coding of ultra-high definition signals with enhanced dynamic range is described herein. Given an input video signal that may be represented by two signals, one signal at ultra-high definition (UHD) resolution and high or enhanced dynamic range (EDR), and the other signal at UHD (or lower) resolution and standard dynamic range (SDR), the two signals are coded in a backward-compatible layered stream, which allows legacy decoders to extract an HD standard dynamic range (SDR) signal and new decoders to extract the UHD EDR signal.

In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily obscuring the present invention.

Overview

Example embodiments described herein relate to backward-compatible coding and decoding of ultra-high definition signals with enhanced dynamic range. Given an input video signal represented by two signals, one signal at ultra-high definition (UHD) resolution and high or enhanced dynamic range (EDR), and the other signal at UHD (or lower) resolution and standard dynamic range (SDR), the two signals are coded in a backward-compatible layered stream, which allows legacy decoders to extract an HD standard dynamic range (SDR) signal and new decoders to extract the UHD EDR signal. In response to a base layer HD SDR signal, a predicted signal is generated using separate luma and chroma prediction models. In the luma predictor, luma pixel values of the predicted signal are computed based only on luma pixel values of the base layer, while in the chroma predictor, chroma pixel values of the predicted signal are computed based on both the luma and the chroma pixel values of the base layer. A residual signal is computed based on the input UHD EDR signal and the predicted signal. The base layer signal and the residual signal are coded separately to form a coded bitstream.

In another embodiment, a receiver demultiplexes a received layered bitstream to generate a coded base layer (BL) stream at HD resolution and standard dynamic range (SDR) and a coded enhancement layer (EL) stream at UHD resolution and enhanced dynamic range (EDR). The coded BL stream is decoded using a BL decoder to generate a decoded BL signal at HD resolution and standard dynamic range. In response to the decoded BL signal, a predicted EDR signal is generated, wherein pixel values of the luma component of the predicted signal are predicted based only on luma pixel values of the decoded BL signal, and pixel values of at least one chroma component of the predicted signal are predicted based on both the luma and chroma values of the decoded BL signal. The coded EL stream is decoded using an EL decoder to generate a decoded residual signal. In response to the decoded residual signal and the predicted signal, an output UHD EDR signal may also be generated.

In another embodiment, residual signals in the enhancement layer are adaptively pre-processed before being quantized by a non-linear quantizer. In one embodiment, residual pixel values are pre-quantized to zero if the standard deviation of the pixels surrounding them is lower than a threshold.
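
The pre-quantization rule above can be illustrated with a minimal Python sketch. The square window (which here includes the pixel itself), the `radius` parameter, and the function name are assumptions made for illustration; this section of the text does not specify the neighborhood shape or the threshold value.

```python
from statistics import pstdev


def prequantize(residual, threshold, radius=1):
    """Zero residual pixels whose local neighborhood is nearly flat.

    residual: 2-D list of residual pixel values.
    threshold, radius: illustrative parameters, not values from the text.
    """
    height, width = len(residual), len(residual[0])
    out = [row[:] for row in residual]
    for y in range(height):
        for x in range(width):
            # Gather the square window around (y, x), clipped at the borders.
            window = [
                residual[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # Low local standard deviation: treat the pixel as negligible.
            if pstdev(window) < threshold:
                out[y][x] = 0
    return out
```

Statistics are taken from the unmodified input rather than from the partially zeroed output, so the result does not depend on scan order.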

In another embodiment, the input range of the non-linear quantizer is restricted according to a measure of the pixel connectivity of residual pixels with very large or very small pixel values.

In another embodiment, parameters of the non-linear quantizer are set based on the extreme values of the residual pixels across a sequence of consecutive frames in a scene.
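
As a small sketch of the scene-level statistic mentioned above, the following Python function collects the smallest and largest residual values over all frames of one scene. How these extremes are then mapped to quantizer parameters is not stated in this section, so the sketch stops at the statistic; the function name is hypothetical.

```python
def scene_residual_extremes(frames):
    """Return (minimum, maximum) residual value over all frames of a scene.

    frames: iterable of 2-D lists of residual pixel values.
    """
    frames = list(frames)  # allow generators; we scan the data twice
    lowest = min(min(min(row) for row in frame) for frame in frames)
    highest = max(max(max(row) for row in frame) for frame in frames)
    return lowest, highest
```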

Encoder for Ultra-High Definition EDR Signals

Existing display and playback devices, such as HDTVs, set-top boxes, or Blu-ray players, typically support signals of up to 1080p HD resolution (e.g., 1920×1080 at 60 frames per second). For consumer applications, such signals are now typically compressed using a bit depth of 8 bits per pixel per color component in a luma-chroma color format in which the chroma components typically have a lower resolution than the luma component (e.g., the YCbCr or YUV 4:2:0 color format). Because of the 8-bit depth and the corresponding low dynamic range, such signals are typically referred to as signals with standard dynamic range (SDR).

As new television standards, such as Ultra High Definition (UHD), are being developed, it may be desirable to encode signals with enhanced resolution and/or enhanced dynamic range in a format that both legacy HDTV decoders and newer UHD decoders can process.

FIG. 1 depicts an embodiment of an example implementation of a system supporting backward-compatible coding of UHD signals with enhanced dynamic range (EDR). The encoder comprises a base layer (BL) encoder (130) and an enhancement layer (EL) encoder (160). In an embodiment, BL encoder 130 is a legacy encoder, such as an MPEG-2 or H.264 encoder, and EL encoder 160 is a new-standard encoder, such as an HEVC encoder. To support legacy BL decoders, BL encoder 130 is typically an 8-bit encoder; however, EL encoder 160 may support input streams with a higher bit depth, such as 10 bits, as specified by the H.264 and HEVC standards. Nevertheless, this system is applicable to any combination of known or future encoders, whether they are standards-based or proprietary.

As depicted in FIG. 1, an input signal, such as a movie or a television broadcast, may be represented by two signals: a UHD EDR input (102) and a UHD SDR input (104). For example, the UHD EDR signal (102) may be a 4K (e.g., 3840×2160) resolution signal captured by an HDR camera and color-graded for an EDR display. The same signal may also be color-graded on a 4K SDR display to generate a corresponding 4K SDR signal 104. Alternatively, the SDR signal 104 may be generated by applying to the EDR signal any tone-mapping or display-management technique known in the art. Without loss of generality, both of these input signals may typically be represented in the RGB color space using a 16-bit or equivalent (e.g., floating-point) bit-depth representation. As used herein, the term N-bit signal denotes an image or video signal with one or more color components (e.g., RGB or YCbCr), wherein each pixel in any one of these color components (e.g., Y) is represented by an N-bit pixel value. Given an N-bit representation, each such pixel may take values between 0 and 2^N - 1. For example, in an 8-bit representation, each pixel can take values between 0 and 255 for each color component.
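
The N-bit range can be checked with a short Python sketch; the function name is hypothetical.

```python
def code_value_range(bit_depth):
    """Return the (minimum, maximum) code values of an N-bit pixel component."""
    return 0, (1 << bit_depth) - 1  # 2^N - 1, computed with a bit shift


print(code_value_range(8))   # 8-bit SDR component
print(code_value_range(10))  # 10-bit component
```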

In an embodiment, the UHD SDR signal 104 may be down-sampled into an HD SDR signal (e.g., 1080p), which is then color-converted to a color format suitable for encoding with a legacy 8-bit encoder (e.g., the 8-bit YCbCr 4:2:0 color format). Such a conversion may comprise color transformations (such as RGB to YCbCr conversion 115-C) and chroma sub-sampling (e.g., 4:4:4 to 4:2:0 conversion 120-C). Thus, HD SDR signal 128 represents a backward-compatible signal representation of the original UHD EDR signal 102. Signal 128 may be encoded by BL encoder 130 to generate a backward-compatible coded bitstream 132. BL encoder 130 may compress or encode HD SDR signal 128 using any known or future video compression algorithm, such as MPEG-2, MPEG-4 Part 2, H.264, HEVC, VP8, and the like.
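
The color conversion and chroma sub-sampling steps can be sketched in Python as follows. The text does not name a color matrix or a down-sampling filter, so the BT.709 luma weights, the normalized [0, 1] RGB input, and the 2×2 averaging filter are all assumptions chosen for illustration; quantization to 8-bit code values is omitted.

```python
def rgb_to_ycbcr_bt709(r, g, b):
    """Convert normalized RGB in [0, 1] to Y in [0, 1] and Cb, Cr in [-0.5, 0.5].

    BT.709 weights are an assumption; the text only says "RGB to YCbCr".
    """
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    cb = (b - y) / 1.8556
    cr = (r - y) / 1.5748
    return y, cb, cr


def subsample_420(plane):
    """Halve a chroma plane in both dimensions by averaging 2x2 blocks.

    Even plane dimensions are assumed; the box filter is illustrative.
    """
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]
```

In a 4:2:0 pipeline the luma plane keeps its full resolution, while `subsample_420` would be applied to the Cb and Cr planes only.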

Given UHD EDR signal 102, down-sampling (110-A) and color conversion processes (115-B and 120-B) may convert UHD EDR signal 102 into a reference predicted HD EDR signal 124. In a preferred embodiment, the down-sampling and color conversion processes (110-A, 115-B, and 120-B) in this stage (e.g., the selected filters and color space) should be identical, or as close as possible, to the down-sampling and color conversion processes (110-B, 115-C, and 120-C) used to generate the HD SDR signal 128 in the base layer.

Following the UHD EDR to HD EDR transformation, the output HD EDR signal 124 is separated into luma (Y 124-Y) and chroma (CbCr 124-C) components, which are applied to determine the prediction coefficients for luma predictor 145 and chroma predictor 140.

Given HD SDR signal 128, BL encoder 130 generates not only the coded BL bitstream 132 but also a BL signal 126, which represents HD SDR signal 128 as it will be decoded by a corresponding BL decoder. In some embodiments, signal 126 may be generated by a separate BL decoder (not shown) following BL encoder 130. In some other embodiments, signal 126 may be generated from the feedback loop used to perform motion compensation in BL encoder 130. As depicted in FIG. 1, the HD SDR signal 126 may also be separated into its luma (Y 126-Y) and chroma (CbCr 126-C) components, which are applied to luma predictor 145 and chroma predictor 140 to predict HD EDR signal 147.

In an embodiment, luma predictor 145 may comprise a polynomial predictor that predicts the luma component of HD EDR signal 147 based on the luma pixel values of the base layer HD SDR signal 126-Y. In such a predictor, a luma pixel component may be predicted without taking into consideration any pixel values in any of the other color components of the signal. For example, let g_i denote the luma pixel values of the BL HD SDR signal (126-Y); then, without loss of generality, a third-order polynomial predictor may be expressed as:

where a_k, b_k, and c_k are the predictor coefficients. In an embodiment, the predictor coefficients may be determined by any error-minimization technique known in the art, such as minimizing the mean square error (e.g., ||s_i - ŝ_i||^2) between the predicted value (ŝ_i) and the luma pixel values (s_i) in the reference HD EDR signal (124-Y).
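
The predictor equation itself does not survive in this text. The Python sketch below therefore assumes a form that is consistent with the indexed coefficients named above, ŝ_i = Σ_k (a_k·g_{i-k} + b_k·g_{i-k}^2 + c_k·g_{i-k}^3), and fits the coefficients by ordinary least squares. The assumed form, the number of taps (`taps`), the border clamping, and all function names are hypothetical and are not taken from the source. Luma values are assumed to be normalized to [0, 1].

```python
def features(g, i, taps):
    """Return [g, g^2, g^3] for pixels i, i-1, ..., i-taps+1 (clamped at 0)."""
    row = []
    for k in range(taps):
        v = g[max(i - k, 0)]
        row.extend([v, v * v, v ** 3])
    return row


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [a[r][:] + [b[r]] for r in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back-substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_luma_predictor(g, s, taps=1):
    """Least-squares coefficients [a_0, b_0, c_0, a_1, ...] minimizing
    sum_i (s_i - s_hat_i)^2 via the normal equations."""
    rows = [features(g, i, taps) for i in range(len(g))]
    n = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(n)] for p in range(n)]
    atb = [sum(r[p] * t for r, t in zip(rows, s)) for p in range(n)]
    return solve(ata, atb)


def predict_luma(g, coeffs, taps=1):
    """Apply fitted coefficients to base-layer luma values g."""
    return [
        sum(c * f for c, f in zip(coeffs, features(g, i, taps)))
        for i in range(len(g))
    ]
```

Here `g` would hold base-layer luma samples (126-Y) and `s` the co-located reference samples (124-Y). The sketch does not guard against a singular system, which can occur when the base-layer samples take too few distinct values.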

In an embodiment, chroma predictor 140 may also be a polynomial predictor similar to the one described above; however, in a preferred embodiment, chroma predictor 140 comprises a multi-color-channel, multiple-regression (MMR) predictor, such as the one described by G-M Su et al. in PCT application Ser. No. PCT/US2012/033605 (published as WO 2012/142471), "Multiple color channel multiple regression predictor", filed on April 13, 2012, which is incorporated herein by reference in its entirety. An MMR predictor predicts the chroma components of the HD EDR signal using information from both the luma and the chroma pixel values in the HD EDR reference signal 124 and the base layer HD SDR signal 126. The prediction coefficients in the MMR model may also be determined using mean-square-error minimization techniques, by minimizing the MSE between the predicted chroma values and the luma and chroma pixel values of the reference HD EDR signal 124.
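
To illustrate how a predictor can draw on both luma and chroma, the Python sketch below builds a first-order feature vector with cross-channel products and applies one coefficient vector per chroma output. The exact set of terms is an assumption made for illustration; the MMR model actually intended is the one defined in the referenced PCT application, and the function names here are hypothetical.

```python
def mmr_features(y, cb, cr):
    """First-order terms plus cross-channel products for one pixel.

    y is the luma value down-sampled to chroma resolution; the choice of
    terms is an illustrative assumption, not the referenced model.
    """
    return [1.0, y, cb, cr, y * cb, y * cr, cb * cr, y * cb * cr]


def predict_chroma(y, cb, cr, coeffs_cb, coeffs_cr):
    """Predict EDR (Cb, Cr) from base-layer (Y, Cb, Cr) with given coefficients."""
    f = mmr_features(y, cb, cr)
    pred_cb = sum(c * v for c, v in zip(coeffs_cb, f))
    pred_cr = sum(c * v for c, v in zip(coeffs_cr, f))
    return pred_cb, pred_cr
```

Because the model is linear in its coefficients, they could be fitted by the same least-squares approach used for a polynomial luma predictor.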

Because both the HD SDR signal 126 and the reference HD EDR signal 124 are in a YCbCr 4:2:0 format, where the spatial resolution of the luma component is double the spatial resolution of each of the chroma components, the luma components of both of these signals are down-sampled (135-A and 135-B) before being applied to chroma predictor 140. In a preferred embodiment, the filters used in luma down-sampling 135-A and 135-B are the same as the chroma down-sampling filters used in the 4:4:4 to 4:2:0 processing (120). Luma and chroma prediction coefficients may be updated at a variety of time intervals of interest, such as per scene, per group of pictures, or per frame. Prediction filter coefficients may be communicated to a decoder by a variety of methods, such as embedding their values in the bitstream as auxiliary data or metadata.

Given the predicted HD EDR signal 147, up-sampler 150 generates a UHD EDR signal 152, which is used to generate the residual signal 167. Because UHD EDR signal 152 is in the preferred coding format (e.g., YCbCr 4:2:0), additional color transformation (115-A) and chroma down-sampling (120-A) steps may be needed to convert the original UHD EDR signal 102 from its original format (e.g., RGB) to UHD EDR signal 122 in the preferred coding format. Signals 122 and 152 are subtracted to create the EL residual signal 167.
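
The up-sample-and-subtract step can be sketched in Python for a single plane. Nearest-neighbor up-sampling and the factor of 2 are assumptions made for illustration; the text does not specify the filter used by up-sampler 150.

```python
def upsample_nearest(plane, factor=2):
    """Repeat every pixel `factor` times horizontally and vertically."""
    return [
        [v for v in row for _ in range(factor)]
        for row in plane
        for _ in range(factor)
    ]


def compute_residual(original_uhd, predicted_hd, factor=2):
    """Residual = original UHD plane minus the up-sampled HD prediction."""
    predicted_uhd = upsample_nearest(predicted_hd, factor)
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original_uhd, predicted_uhd)
    ]
```

In the terms of FIG. 1, `original_uhd` stands in for one plane of signal 122, `predicted_hd` for the matching plane of signal 147, and the result for residual 167.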

In an embodiment, the color transformation (115-A) and chroma sub-sampling process (120-A) are identical, or as close as possible, to the color transformation (115-B and 115-C) and chroma sub-sampling processes (120-B and 120-C) used to generate the BL-coded signal 128 and the prediction signal 124.

In an embodiment, before the EL signal 167 is coded by EL encoder 160, the signal may be processed by a non-linear quantizer (NLQ) 155. Examples of suitable non-linear quantizers may be found in PCT patent application Ser. No. PCT/US2012/034747 (published as WO 2012/148883), "Non-linear VDR Residual Quantizer", filed on April 24, 2012, which is incorporated herein by reference in its entirety. The output of NLQ 155 may be compressed using EL encoder 160 to generate a coded EL bitstream 162, which may be transmitted to suitable decoders. Furthermore, in some embodiments, the residual (167) may also be spatially down-sampled by a down-sampling module (not shown). Such down-sampling (e.g., by a factor of 2 or 4 in both dimensions) improves coding efficiency, especially at very low bit rates. Down-sampling may be performed either before or after the non-linear quantizer (155).

EL encoder 160 may be any suitable encoder, such as those described by the MPEG-2, MPEG-4, H.264, and HEVC specifications, and the like. In an embodiment, the BL-coded bitstream 132, the EL-coded bitstream 162, and metadata related to the coding process (e.g., the predictor parameters or look-up tables) may be multiplexed into a single bitstream (not shown).

As depicted in FIG. 1, in a preferred embodiment, down-sampling (110-A or 110-B) is applied before the color format transformations (115-B and 120-B, or 115-C and 120-C); however, in some embodiments, the down-sampling may be performed after the color transformations. For example, in one embodiment, the input to 110-A may be received directly from the UHD EDR YCbCr signal 122, thereby eliminating the need for color transform processing 115-B and 120-B to generate the HD EDR reference signal 124. Similarly, down-sampling 110-B could be performed after the color conversion step 120-C.

In some embodiments, the baseline HD SDR signal 128 may already be available to encoder 100 in the correct resolution and color format. In such a case, the down-sampling (110-B) and color transformation steps (115-C and 120-C) may be bypassed.

In some embodiments, the UHD EDR signal 102 may be available at a precision lower or higher than 16 bits; however, its precision is expected to be higher than 8 bits (e.g., 10 bits or 12 bits). Similarly, the UHD SDR signal 104 may already be available at a precision lower than 16 bits (e.g., 8 bits or 10 bits).

Decoder for Ultra-High Definition EDR Signals

FIG. 2 depicts an embodiment of an example implementation of a system supporting backward-compatible decoding of UHD signals with enhanced dynamic range (EDR). In response to a coded signal transmitted by an encoder (e.g., 100), decoder 200 receives and demultiplexes a coded bitstream that comprises at least two coded sub-streams: a coded BL stream 132 and a coded EL stream 162.

The coded BL stream 132 comprises an HD SDR signal (217), which can be decoded using BL decoder 215. In an embodiment, BL decoder 215 matches BL encoder 130. For example, for backward compatibility with existing broadcast and Blu-ray standards, BL decoder 215 may comply with one or more of the MPEG-2 or H.264 coding specifications. Following BL decoding 215, an HD SDR decoder may apply additional color transforms (270) to the decoded HD SDR signal 217 to translate the incoming signal from a color format suitable for compression (e.g., YCbCr 4:2:0) to a color format suitable for display (e.g., RGB 4:4:4). Receivers with enhanced resolution and/or EDR display capabilities may combine information from both the BL and EL bitstreams (132 and 162) to generate a UHD signal with enhanced dynamic range (e.g., 232), as depicted in FIG. 2.

Following BL decoding 215, the decoded signal 217 is divided into its luma (217-Y) and chroma (217-C) components. The luma component (217-Y) is processed by a luma predictor 240 to generate luma estimates for the HD EDR signal 255. The luma and chroma components are also processed by a chroma predictor 250 to generate chroma estimates for the HD EDR signal 255. In an embodiment, before the chroma predictor processes the luma signal 217-Y, that signal is sub-sampled by a downsampler 245 so that it matches the resolution of the chroma components. The luma and chroma predictors (240 and 250) match the luma and chroma predictors (145 and 140) in encoder 100. Thus, the luma predictor 240 may be a polynomial predictor, while the chroma predictor may be an MMR predictor. In an embodiment, the characteristics and filter parameters of these predictors may be determined using metadata embedded in the received coded bitstream. Following the luma and chroma prediction steps (240 and 250), the predicted HD EDR signal 255 is upsampled (260) to generate a UHD EDR signal 265.
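To illustrate the prediction path described above, the following is a minimal sketch of a polynomial luma predictor and of a 2x downsampler of the kind placed ahead of the chroma predictor. The function names, the example coefficients, and the 2x2 averaging filter are assumptions made for brevity; in a real decoder the predictor parameters would be read from the bitstream metadata.

```python
def predict_luma(s, coeffs):
    """Evaluate a0 + a1*s + a2*s^2 + ... for a normalized SDR luma value s.

    coeffs = [a0, a1, a2, ...] are hypothetical values; the text only says
    they are signalled as metadata.
    """
    result = 0.0
    for a in reversed(coeffs):  # Horner's rule
        result = result * s + a
    return result


def downsample_2x(plane):
    """Average each 2x2 block (an assumed filter) so that the luma plane
    matches the resolution of 4:2:0 chroma."""
    rows, cols = len(plane) // 2, len(plane[0]) // 2
    return [
        [
            (plane[2 * r][2 * c] + plane[2 * r][2 * c + 1]
             + plane[2 * r + 1][2 * c] + plane[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

A second-order predictor is used here only as an example; the text does not fix the polynomial order.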

Given the coded bitstream 162, the EL decoder 210 decodes it to generate a UHD EDR residual signal 212. The EL decoder 210 matches the EL encoder 160. If the encoder 100 applied a non-linear quantizer 155 to the residual 167, the non-linear quantization process is reversed by applying a non-linear de-quantizer (NLDQ) 220, which generates a de-quantized residual 222. If the encoder (100) applied spatial downsampling to the residual (167), a spatial upsampler (not shown) before or after the NLDQ (220) may upsample the decoded residual (e.g., 212 or 222) to its proper spatial resolution. By adding (225) the residual 222 to the UHD EDR estimate 265, decoder 200 may generate a UHD EDR signal 227 that matches the resolution and color format (e.g., 4:2:0 YCbCr) of the UHD EDR signal 122 transmitted by the encoder. Depending on the target application, a set of color transforms (230) may transform the UHD EDR signal 232 into a format suitable for display or other processing. In an embodiment, given a YCbCr 4:2:0 signal 227, the color transforms 230 may comprise a 4:2:0 to 4:4:4 chroma upsampling step followed by a YCbCr to RGB color transformation step.
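The two-step color transform mentioned above (chroma upsampling followed by YCbCr to RGB conversion) can be sketched as follows. This is only an illustration under stated assumptions: the text names neither the upsampling filter nor the conversion matrix, so nearest-neighbor replication and the BT.709 luma weights are used here as stand-ins, with Y in [0, 1] and Cb, Cr centered on zero.

```python
KR, KB = 0.2126, 0.0722   # BT.709 luma weights (assumed; not given in the text)
KG = 1.0 - KR - KB


def upsample_chroma_2x(plane):
    """4:2:0 -> 4:4:4 by replicating each chroma sample into a 2x2 block."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def ycbcr_to_rgb(y, cb, cr):
    """Convert one normalized YCbCr sample to RGB."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG   # solve Y = KR*R + KG*G + KB*B for G
    return r, g, b
```

A production decoder would normally use a longer interpolation filter for the chroma planes; replication is chosen only to keep the sketch short.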

Coding and Decoding of Mixed Progressive and Interlaced Formats

Despite the increased adoption of progressive video signals (e.g., 720p or 1080p), broadcasting of interlaced video signals (e.g., 1080i) is still quite common. In another embodiment, FIG. 3 depicts another example of a UHD EDR encoding system (300), which supports layer encoding using a mix of progressive and interlaced formats. In this example, the BL signal (332) is coded in an interlaced format (e.g., 1080i or 2160i), while the EL signal (162) is coded in a progressive format (e.g., 2160p).

Encoding system (300) shares most of the functionality of encoding system (100); hence, this section discusses only the key differences between the two systems. As depicted in FIG. 3, in the base layer processing, the SDR signal (104) is color-converted to a color format (e.g., 4:2:0 YCbCr) suitable for encoding with the BL encoder (130). In an example embodiment, the output (332) of the BL encoder (130) may comprise an interlaced SDR signal. The interlacer (320-A) may apply any interlacing and downsampling technique known in the art to convert the progressive input (128) into an interlaced signal at the desired coding resolution of the base layer signal (332) (e.g., 1080i).

Compared to system (100), in the enhancement layer, the processing components (110-A), (115-B), and (120-B) of system (100) may all be replaced by an interlacer (320-B). Interlacer (320-B) may apply any interlacing and downsampling technique known in the art to convert the progressive input (122) into an interlaced signal (124) that matches the resolution of the interlaced signal (126). In a preferred embodiment, the downsampling and interlacing functions of (320-A) and (320-B) should be identical, or as close as possible to each other, to reduce color artifacts and improve overall image coding quality.

The luma and chroma predictors (145 and 140) in system (300) remain the same as those in system (100); however, they now operate on the separate fields of their inputs, because signals (124) and (126) are now interlaced signals.
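Operating on separate fields can be pictured as splitting each interlaced frame into its even and odd rows, running the unchanged predictor on each field, and re-interleaving the results. The sketch below illustrates that idea only; the helper names and the `predict_field` callback are hypothetical and stand in for the luma or chroma predictor.

```python
def split_fields(frame):
    """Return (top_field, bottom_field): even rows and odd rows of a frame."""
    return frame[0::2], frame[1::2]


def merge_fields(top, bottom):
    """Re-interleave two fields into a frame (top field first)."""
    frame = []
    for i in range(len(top)):
        frame.append(top[i])
        if i < len(bottom):
            frame.append(bottom[i])
    return frame


def predict_per_field(frame, predict_field):
    """Apply the same predictor independently to each field of the frame."""
    top, bottom = split_fields(frame)
    return merge_fields(predict_field(top), predict_field(bottom))
```

Keeping the fields separate prevents a spatial predictor from mixing rows captured at different time instants.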

The de-interlacer (350) also has a dual function: it de-interlaces the predicted HD EDR signal (347) and upsamples it to match the resolution of the UHD EDR signal (122), thus generating a predicted UHD EDR signal (152) with the same resolution and format as signal (122). The processing of the residual (167) in system (300) remains the same as described for system (100).

In some embodiments, the SDR signal (104) may already be in an interlaced format, in which case the interlacer (320-A) may be replaced by a downsampler. If the input signal (104) is already interlaced and at the proper resolution, the interlacer (320-A) may be eliminated.

In an embodiment, the input signals (102) and (104) may both be HD-resolution signals (e.g., 1080p). The output of system (300) may then comprise a coded interlaced HD (e.g., 1080i) base layer signal (332) and a coded progressive HD (e.g., 1080p) residual (162).

In an embodiment, the BL signal (332) and the residual (162) may both be at the same resolution but in mixed formats. For example, the BL signal (332) may be coded at 2160i while the EL signal (162) is coded at 2160p.

FIG. 4 depicts an example implementation of a decoder system (400) for decoding signals generated by the mixed-format encoder (300). System (400) is almost identical to decoder system (200), except for the following differences: (a) the decoded BL signal (417) is now an interlaced video signal, (b) the luma and chroma predictors (240 and 250) operate on the fields of the interlaced signals (417) and (247), and (c) the predicted HD EDR signal (455) is an interlaced signal.

The de-interlacer (460) matches in functionality the de-interlacer (350) in system (300); hence, it de-interlaces and upsamples the interlaced HD EDR signal (455) so that its output, the UHD EDR signal (465), has the same resolution and format as the decoded error residual signal (222).

As noted earlier, system (300) may also include a spatial downsampling module (not shown) in the EL path, either before or after the non-linear quantizer (155). In that case, in decoder (400), a spatial upsampler before or after the NLDQ (220) may be used to restore the decoded residual (212) to its proper spatial resolution.

Luminance-Range-Driven Adaptive Upsampling

As depicted in FIG. 1, following the luma and chroma prediction steps (140, 145), the predicted HD EDR signal (147) is upsampled (150) by a factor of two to generate the predicted UHD EDR signal (152). Similar processing is performed in decoder (200), where, following the luma and chroma prediction steps (240, 250), the predicted HD EDR signal (255) is upsampled (260) by a factor of two to generate the predicted UHD EDR signal (265). Upsamplers (150) and (260) may use any upsampling technique known in the art; however, improved image quality may be achieved by employing the luminance-range-driven adaptive upsampling techniques described in this section.

It has been observed that the prediction error (167) between the original EDR signal (122) and its predicted value (152) may vary depending on the luminance values in the corresponding SDR signal (104). That is, residuals (167) in bright or highlight areas of the image exhibit different characteristics than residuals in dark or mid-tone areas. In an embodiment, the luminance range of the SDR input may be divided into two or more luminance sub-ranges. An adaptive upsampling filtering method may apply different upsampling filters to different pixels of the EDR predicted image, where each filter is selected according to the luminance sub-range of the corresponding pixel in the SDR image. The thresholds that identify each of these luminance sub-ranges, together with the identity of the filters used and/or the filter coefficients themselves, may be communicated from the encoder (100) to the decoder (200) via metadata or other ancillary data, so that both the encoder and the decoder apply the same upsampling filters for improved image quality.

Let the luma pixel values of the HD EDR signal (147) be predicted from the luma values of the output of the BL encoder (130), that is, the SDR signal s_ij (126-Y). Let th(i) (i = 0, ..., N) denote a set of thresholds that divide the luma range of the pixels s_ij (0 ≤ s_ij ≤ 1) into N luma ranges of interest (N ≥ 1) (e.g., for N = 3, into blacks, mid-tones, and highlights). Let H_i denote the set of filter coefficients of the i-th (i = 1, ..., N) upsampling filter used in step (150) or (260) for the i-th luma range of interest, and consider a function of s_ij or of its local neighbors. Then, in an embodiment, upsampling filtering (e.g., 150 or 260) may be performed according to Algorithm 1, expressed in pseudocode:

Algorithm 1 - Luminance-range-driven upsampling process

In some embodiments, H_i may represent the filter coefficients of a 2-D non-separable filter. In some other embodiments, H_i may represent the coefficients of a 2-D separable upsampling filter, comprising, without limitation, coefficients for a horizontal and a vertical upsampling filter. The filter coefficients H_i may be pre-computed and stored in memory, or they may be computed adaptively according to some image quality criterion. For example, in an embodiment, the filter coefficients H_i may be computed so that the mean-square error between the output of the up-scaling filter, the predicted UHD EDR signal (152), and the input UHD EDR signal (122) is minimized.

In some embodiments, this function may simply return a single pixel value of interest (e.g., s_ij or s_ij-1), while in some other embodiments it may return a local average or some other function (e.g., a median, a minimum, or a maximum) of one or more pixels surrounding s_ij.
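The range-driven filter selection described above can be sketched as follows. This is a deliberately simplified, hypothetical illustration rather than the pseudocode of Algorithm 1: it upsamples a single row by two with 2-tap filters, keeps the original samples at even positions, and uses the co-located SDR luma value itself as the selection function. All names and the example filter taps are assumptions.

```python
def select_range(value, thresholds):
    """Return i in 1..N such that th(i-1) <= value < th(i).

    thresholds = [th(0), ..., th(N)]; the last range also takes value == th(N).
    """
    n = len(thresholds) - 1
    for i in range(1, n):
        if value < thresholds[i]:
            return i
    return n


def upsample_row_2x(edr_row, sdr_row, thresholds, filters):
    """2x horizontal upsampling where the interpolation taps depend on SDR luma.

    filters maps a range index i to a pair (a, b); each new sample is
    a * current + b * next, with the row edge clamped.
    """
    out = []
    last = len(edr_row) - 1
    for j, value in enumerate(edr_row):
        a, b = filters[select_range(sdr_row[j], thresholds)]
        nxt = edr_row[min(j + 1, last)]
        out.append(value)               # existing sample
        out.append(a * value + b * nxt)  # interpolated sample
    return out
```

With two ranges, for instance, dark pixels could use a smoothing filter such as (0.5, 0.5) and bright pixels a sample-repeating filter such as (1.0, 0.0); a real system would use longer 2-D filters whose coefficients are signalled to the decoder.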

In an embodiment, the th(i) thresholds may be determined based on image statistics of the input signal (e.g., the average value of blacks, mid-tones, or highlights). These statistics may be computed per pixel region, per frame, or per scene (e.g., a group of sequential pictures with similar luminance characteristics). In some embodiments, th(i) may be determined iteratively as part of the filter design process. For example, consider the case where the filter coefficients H_i are computed according to some optimization criterion (e.g., minimizing the mean-square error (MSE) of signal (167)). Then, in an embodiment, Algorithm 2 describes in pseudocode an example method for determining a new threshold (th*), given two bounding thresholds (t_low and t_high) and a threshold search step (step):

Algorithm 2 - Threshold determination for two luminance sub-ranges (N = 2)

In the description above, t_low and t_high denote the boundary values of interest within which a threshold may be searched. For example, in an embodiment, t_low = min(s_ij) = 0 and t_high = max(s_ij) = 1 (where 1 denotes the normalized maximum possible value) cover the full range of possible luminance values; in other embodiments, however, the range of boundary values may be much smaller. For example, computing the thresholds for an input frame at time t may take into account the threshold values computed earlier (say, at time t-1), so that the search covers only a smaller range centered on the previous threshold (e.g., (th(i)-C, th(i)+C), where C is a constant).
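A threshold search of this kind can be sketched as an exhaustive scan between t_low and t_high. The sketch below is not the pseudocode of Algorithm 2; it is a toy version under stated assumptions, in which each "filter" is reduced to a single least-squares gain per sub-range so that the example stays self-contained. The names `fit_gain` and `search_threshold` and the sample layout are hypothetical.

```python
def fit_gain(pairs):
    """Least-squares gain g minimizing sum((target - g * pred)^2)."""
    num = sum(p * t for p, t in pairs)
    den = sum(p * p for p, t in pairs)
    return num / den if den > 0 else 1.0


def search_threshold(samples, t_low, t_high, step):
    """Scan candidate thresholds and keep the one with the lowest total error.

    samples: list of (sdr_luma, predicted_value, target_value) triples.
    For each candidate, one gain is fitted below the threshold and one at or
    above it (a stand-in for designing the two upsampling filters).
    """
    best_th, best_cost = None, float("inf")
    n_steps = int(round((t_high - t_low) / step))
    for k in range(n_steps + 1):
        th = t_low + k * step
        low = [(p, t) for s, p, t in samples if s < th]
        high = [(p, t) for s, p, t in samples if s >= th]
        cost = 0.0
        for group in (low, high):
            g = fit_gain(group)
            cost += sum((t - g * p) ** 2 for p, t in group)
        if cost < best_cost:
            best_th, best_cost = th, cost
    return best_th, best_cost
```

Restricting the scan to a window around the previous frame's threshold, as described above, only changes the t_low and t_high arguments.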

Given Algorithm 2, in some embodiments a similar approach may be used to subdivide the luminance range of a picture frame into additional partitions using additional thresholds. In an example embodiment, the following algorithm (Algorithm 3) may be used to subdivide a given luminance range (A, B) into two or three luminance sub-ranges.

Algorithm 3 - Threshold determination for three luminance sub-ranges (N = 3)

The thresholds computed by Algorithms 2 and 3 may be applied to Algorithm 1 in both the encoder (100) and the decoder (200). In an embodiment, the computed thresholds may be transmitted from the encoder (100) to the decoder (200) using metadata.

As described earlier, the de-interlacers (350) and (460) may combine both de-interlacing and upsampling functionality. Those skilled in the field of image processing will appreciate that the luminance-range-driven methods discussed herein for the improved design of the upsamplers (150) and (260) may also be applied to the design of the upsamplers in the de-interlacers (350) and (460).

Adaptive Residual Processing

As depicted in FIG. 1 and FIG. 3, in the enhancement layer (EL), the residual signal (167) may be processed by a non-linear quantizer (NLQ) (155) before being compressed by the EL encoder (160) to generate the EL stream (162). Without loss of generality, FIG. 5 depicts an example input-output relationship for the NLQ (155) according to an embodiment of the present invention.

As depicted in FIG. 5, let (-X_max, X_max) denote the range of pixel values of the residual pixels x (167) to be coded in a frame or frame region of interest. Let Level denote the number of available codewords on each side of the quantizer (e.g., Level = 128 for x ≥ 0). Then, given a positive threshold T, let

SL = Level / (X_max - T).

Then, given an input residual x, after clipping x to the range (-X_max, X_max), the quantization operation of FIG. 5 may be expressed as:

where Q(x) denotes the quantized output, SL denotes the slope of Q(x) within (T, X_max), and M denotes an offset value representing the output codeword when the residual x = 0. The threshold T is a relatively small number, and in some embodiments T = 0.
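A quantizer with the properties just listed (dead zone of width T around zero, offset M at x = 0, linear slope over (T, X_max)) can be sketched as below. This is one plausible form consistent with those definitions, not necessarily the exact expression of FIG. 5: the slope is taken as Level / (X_max - T), and the symmetric round-half-up convention is an assumption.

```python
import math


def nlq_quantize(x, x_max, t, m, level):
    """Map a residual x to a codeword around the offset m.

    x_max, t, m and level play the roles of X_max, T, M and Level in the text.
    Rounding is symmetric about zero (an assumed convention).
    """
    x = max(-x_max, min(x_max, x))     # clip to (-X_max, X_max)
    if abs(x) <= t:                    # dead zone: small residuals map to M
        return m
    slope = level / (x_max - t)        # codewords per unit of residual
    steps = math.floor(slope * (abs(x) - t) + 0.5)
    return m + steps if x > 0 else m - steps
```

For example, with X_max = 1, T = 0, M = 128 and Level = 127, residuals of 0.5 and -0.5 map to codewords 192 and 64 respectively.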

The parameters T, M, X_max, and SL may be defined separately for each color component of the residual signal x and may be communicated to a receiver using metadata. In some embodiments, one or more of the NLQ quantization parameters may also be defined for a whole frame, for one or more partitions or sub-regions of a frame, or for a group of frames (e.g., a scene).

Given such a quantizer, on a receiver (e.g., (200)), the de-quantization process (e.g., NLDQ (220)) may be expressed as:

where R_cmp denotes the received (decoded) residual (or EL signal (212)). The de-quantized output (222) may also be bounded, for example, to lie within a given range.
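On the receiver side, a matching de-quantizer maps each codeword back to a residual value. The sketch below is an assumption-laden illustration rather than the NLDQ expression of the source: it reconstructs each codeword at T plus a whole number of quantization steps, where the step is (X_max - T) / Level, and it maps the offset codeword M back to zero.

```python
def nlq_dequantize(code, x_max, t, m, level):
    """Map a decoded codeword back to a residual value.

    code plays the role of R_cmp; x_max, t, m and level are the X_max, T, M
    and Level parameters received as metadata. The reconstruction point
    (T plus k steps) is an assumed convention.
    """
    if code == m:
        return 0.0
    k = abs(code - m)                       # steps away from the offset
    magnitude = t + (k * (x_max - t)) / level
    return magnitude if code > m else -magnitude
```

With a round-to-nearest quantizer of the same step size, the reconstruction error for residuals outside the dead zone stays within about half a step.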

Experimental results have indicated that proper pre-processing of the residual data (167), combined with adaptive setting of the parameters of the NLQ (155), may yield more efficient coding of the EL stream, resulting in reduced coding artifacts and better overall image quality. Three residual pre-processing algorithms are described next.

Residual Pre-Quantization Using a Standard Deviation Metric

Improper quantization and coding of the residual signal (167), especially when the EL stream is coded at a relatively low bit rate (e.g., 0.5 Mbit/s), may result in blocky artifacts in the decoded signal (232). In an embodiment, these artifacts may be reduced by adaptively pre-quantizing certain residual values that are perceived to lie in relatively "smooth" areas. An example of such a process according to an embodiment of the present invention is depicted in FIG. 6A, where, without limitation, the smoothness of a rectangular pixel region surrounding each residual pixel is measured by computing the standard deviation of the pixels in that region.

Let r_fi denote the i-th residual pixel of the f-th frame. Let this pixel be at the center of a W_σ × W_σ pixel area (e.g., W_σ = 15), denoted as n_fi. Then, in step (602), the standard deviation σ_fi for this pixel may be determined as

σ_fi = sqrt( (1 / W_σ^2) Σ_{j ∈ n_fi} (r_fj - μ_fi)^2 ),

where

μ_fi = (1 / W_σ^2) Σ_{j ∈ n_fi} r_fj.

Given a threshold T_σ, in step (606), if σ_fi < T_σ, the residual pixel r_fi may be set to a predetermined value (e.g., zero). The threshold T_σ may be fixed or, in a preferred embodiment, may be determined adaptively according to the residual frame characteristics and the overall bit rate requirements. For example, let P_f denote the total number of pixels in the f-th frame, and let σ_fi denote the standard deviation values computed in step (602). In step (604), T_σ may be determined as follows:

(a) Sort the σ_fi values in descending order to generate a sorted list;

(b) T_σ is then the (k*P_f)-th value in the sorted list, where k is defined in the range 0.0 to 1.0. For example, for k = 0.25 and a 1920×1080 frame, T_σ corresponds to the 518,400-th standard deviation value in the sorted list.
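Steps (602) to (606) can be sketched as follows: compute a local standard deviation per residual pixel, pick T_σ from the descending sorted list, and reset the pixels whose deviation falls below it. This is a minimal sketch under stated assumptions: the frame is a list of rows, window positions beyond the frame edge are clamped to the nearest pixel, and the (k*P_f)-th value is read with 1-based counting. The function names are hypothetical.

```python
import math


def local_std(frame, row, col, w):
    """Standard deviation of the w x w window centered on (row, col)."""
    half = w // 2
    rows, cols = len(frame), len(frame[0])
    values = [
        frame[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
    ]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def prequantize_smooth(frame, w=15, k=0.25, fill=0.0):
    """Set residuals in smooth areas (sigma < T_sigma) to a fixed value."""
    rows, cols = len(frame), len(frame[0])
    sigma = [[local_std(frame, r, c, w) for c in range(cols)]
             for r in range(rows)]
    ordered = sorted((s for line in sigma for s in line), reverse=True)
    index = min(len(ordered) - 1, max(0, int(k * len(ordered)) - 1))
    t_sigma = ordered[index]          # the (k * P_f)-th largest deviation
    out = [[fill if sigma[r][c] < t_sigma else frame[r][c]
            for c in range(cols)] for r in range(rows)]
    return out, t_sigma
```

For full-size frames a sliding-window or integral-image formulation would be far faster; the direct form is used here only for readability.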

Alternative methods of computing smoothness may also include computing the mean or variance of the W_σ × W_σ pixels, computing a metric based on an edge map of the area surrounding each pixel, or using any other smoothness detection and determination algorithm known in the art.

Residual Tail-End Boundaries Adjustment

Let X_f+ denote the maximum positive residual value in frame f, and let X_f- denote the absolute value of the minimum negative residual value in frame f. Then,

X_f+ = max_i {r_fi},

and

X_f- = |min_i {r_fi}|.

As depicted in FIG. 5, the input boundaries of the quantizer could be determined in terms of X_f+ and X_f- (e.g., X_max = max{X_f+, X_f-}); however, experimental results have shown that residual values have a bell-like distribution, and there are typically very few pixels in each frame that are close to either X_f+ or X_f-. As noted earlier, for the quantizer depicted in FIG. 5, the quantization step is proportional to X_max / Level. For a fixed number of codewords (e.g., the value of Level), the distortion due to quantization is directly proportional to the value of X_max; hence, smaller X_max values are preferred. In an embodiment, instead of determining X_max from X_f+ or X_f-, a new, smaller range [Th_f-, Th_f+] is determined. Before the NLQ (155) is applied, the residual pixel values are restricted (or clipped) to lie within the new range [Th_f-, Th_f+], where, for frame f, Th_f+ denotes a boundary for positive residuals and Th_f- denotes a boundary for negative residuals. That is,

r_fi = clip3(r_fi, Th_f-, Th_f+),

where the clip3() function denotes that residual pixel values larger than Th_f+ are clipped to Th_f+, and residual pixel values smaller than Th_f- are clipped to Th_f-.
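The clipping operation can be written directly. In this minimal sketch the lower boundary is passed as a signed value, which is an assumption about how Th_f- is represented; the helper name `clip_residuals` is hypothetical.

```python
def clip3(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def clip_residuals(residuals, th_neg, th_pos):
    """Apply clip3 to every residual; th_neg is a signed lower boundary."""
    return [clip3(r, th_neg, th_pos) for r in residuals]
```

For example, `clip_residuals([-5, -1, 0, 2, 9], -3, 4)` leaves the middle three values unchanged and pulls the two outliers in to -3 and 4.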

While a smaller input range for the NLQ process yields smaller quantization errors, unrestricted clipping of the residual signal may produce noticeable artifacts; hence, the selection of the new range needs to be adapted to the characteristics of the residual signal. In an embodiment, these two thresholds are determined adaptively based on the observed connectivity (or sparseness) of the residual pixel values. That is, isolated residual pixels with very large values may be clipped with minimal effect on overall quality, whereas the pixel values of connected residual pixels should be properly coded. An example implementation of such a boundary determination process according to an embodiment of the present invention is depicted in FIG. 6B as process (650).

Process (650) computes a threshold Th such that residual pixel values equal to or larger than Th are considered sparsely connected and can therefore be clipped. Process (650) may be used to compute either of the Th_f- or Th_f+ boundaries, depending on the input residual values. For example, to determine Th_f+ = Th, the process considers only positive residual pixel values, e.g., in the range (0, X_f+), where X_f+ is the maximum positive residual value in frame f:

R_fi = r_fi if r_fi > 0, and R_fi = 0 otherwise.

To determine Th_f- = Th, the process considers only the absolute values of negative residual pixel values, e.g., in the range (0, X_f-), where X_f- is the absolute value of the minimum negative residual value in frame f:

R_fi = |r_fi| if r_fi < 0, and R_fi = 0 otherwise.

In step (610), the process begins by setting an initial value for the threshold Th. Given the original boundaries of r_fi (e.g., Th_L = 0 and Th_H equal to the largest residual magnitude under consideration, X_f+ or X_f-), in an example embodiment the initial threshold may be set to the middle of the known range, e.g.:

Th = (Th_H + Th_L) / 2.

Given the threshold Th, in step (612), a binary map M_f is generated, where the elements of the binary map are computed as:

m_fi = (R_fi ≥ Th),

M_f(i) = m_fi.

Given M_f, in step (614), the connectivity of each binary pixel can be determined. For example, in MATLAB, neighbor connectivity (e.g., a 4-pixel or 8-pixel connected neighborhood) can be computed using the function bwconncomp. Let NC_f(i) denote the number of neighbors of each pixel in the binary image M_f. In step (618), the threshold Th may be adjusted so that no pixels are clipped if their connectivity exceeds a predetermined connectivity threshold T_∝ (e.g., T_∝ = 5 pixels). For example, if the maximum pixel connectivity across all pixels exceeds the predetermined connectivity threshold T_∝, the threshold Th may be increased; otherwise, it may be decreased, for example by means of a binary search.

To reduce computational complexity, in an embodiment the process may include a convergence test step (620). For example, the convergence step (620) may compute the difference between the previous (or old) threshold and the new threshold. If the difference is larger than a predetermined convergence threshold, the process continues again from step (612) with the new threshold. Otherwise, it terminates and outputs the final boundary to be used (e.g., Th_f+ = Th).
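The loop of steps (610) to (620) can be sketched as a binary search over the threshold. The sketch below is an illustration under several assumptions, not the patent's pseudocode: connectivity is measured as the number of set pixels in the 8-neighborhood of each set pixel (a simpler stand-in for a connected-component analysis), the input holds the non-negative magnitudes R_fi, and the names, tolerance, and iteration cap are hypothetical.

```python
def max_neighbor_count(binary):
    """Largest number of set 8-neighbors over all set pixels of the map."""
    rows, cols = len(binary), len(binary[0])
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c]:
                continue
            count = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols \
                            and binary[rr][cc]:
                        count += 1
            best = max(best, count)
    return best


def find_clip_boundary(values, th_low, th_high, t_conn=5, tol=0.01,
                       max_iter=64):
    """Binary search for Th: raise it while pixels >= Th are well connected."""
    th = (th_high + th_low) / 2.0                            # step (610)
    for _ in range(max_iter):
        binary = [[v >= th for v in row] for row in values]  # step (612)
        if max_neighbor_count(binary) > t_conn:              # steps (614)-(618)
            th_low = th      # connected pixels would be clipped: increase Th
        else:
            th_high = th     # only sparse pixels above Th: decrease Th
        new_th = (th_high + th_low) / 2.0
        if abs(new_th - th) <= tol:                          # step (620)
            return new_th
        th = new_th
    return th
```

On a frame containing a solid block of moderate residuals and one isolated large residual, the search settles near the block's value, so only the isolated outlier is clipped appreciably.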

Scene-Based Non-Linear Quantization

As discussed earlier, in some embodiments the non-linear quantizer (155) may be expressed in terms of the following parameters: X_max, an offset (e.g., M), and Level (see also the discussion related to FIG. 5). In some embodiments, it may be beneficial to determine these parameters in terms of the residual pixel characteristics in a sequence of frames (e.g., a scene).

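Given the maximum positive residual value X_f+ and the absolute value of the minimum negative residual value X_f- of each frame f in a sequence of F frames, let

X+ = max{X_f+ | f = 1, ..., F}, and X- = max{X_f- | f = 1, ..., F}.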

Then, the parameters of the non-linear quantizer may be set for the whole scene as:

Level = max{(2^EL_bitdepth - 1) - Offset, Offset},

and

X_MAX = (1 + Δ) max{X-, X+},

where EL_bitdepth denotes the bit depth of the EL encoder (160) (e.g., EL_bitdepth = 8) and Δ denotes a small positive value (e.g., Δ = 0.1). In an embodiment, for the chroma components, the number of quantization levels may be determined using:
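The scene-level parameter setting can be sketched as below. The Level and X_MAX expressions follow the text; the Offset expression does not appear in this text, so the proportional split of the codeword range between negative and positive residuals used here is an assumption, as are the function name and the handling of an all-zero scene.

```python
def scene_nlq_params(frame_pos_max, frame_neg_max, el_bitdepth=8, delta=0.1):
    """Derive (Offset, Level, X_MAX) for a scene from per-frame extremes.

    frame_pos_max[f]: largest positive residual of frame f.
    frame_neg_max[f]: absolute value of the most negative residual of frame f.
    """
    x_pos = max(frame_pos_max)
    x_neg = max(frame_neg_max)
    full = 2 ** el_bitdepth - 1
    if x_pos + x_neg > 0:
        # Assumed rule: share codewords in proportion to the negative range.
        offset = full * x_neg / (x_neg + x_pos)
    else:
        offset = full / 2.0
    level = max(full - offset, offset)
    x_max = (1.0 + delta) * max(x_neg, x_pos)
    return offset, level, x_max
```

In a codec the offset would normally be rounded to an integer codeword before being signalled; the sketch keeps it as a real number to make the arithmetic visible.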

In another embodiment, the per-frame maximum positive residual value and the absolute value of the per-frame minimum negative residual value may also be replaced by the corresponding Th_f+ and Th_f- values computed earlier.

Example Computer System Implementation

Embodiments of the present invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA) or another configurable or programmable logic device (PLD), a discrete-time or digital signal processor (DSP), an application-specific IC (ASIC), and/or an apparatus that includes one or more of such systems, devices, or components. The computer and/or IC may perform, control, or execute instructions relating to the coding of UHD EDR signals, such as those described herein. The computer and/or IC may compute any of a variety of parameters or values that relate to the coding of UHD EDR signals as described herein. The encoding and decoding embodiments may be implemented in hardware, software, firmware, and various combinations thereof.

Certain implementations of the invention comprise computer processors that execute software instructions which cause the processors to perform a method of the invention. For example, one or more processors in a display, an encoder, a set-top box, a transcoder, or the like may implement the methods related to the coding of UHD EDR signals described above by executing software instructions in a program memory accessible to the processors. The invention may also be provided in the form of a program product. The program product may comprise any medium that carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of the invention. Program products according to the invention may be in any of a wide variety of forms. The program product may comprise, for example, physical media such as magnetic data storage media (including floppy diskettes and hard disk drives), optical data storage media (including CD-ROMs and DVDs), and electronic data storage media (including ROMs, flash RAM, and the like). The computer-readable signals on the program product may optionally be compressed or encrypted.

Where a component (e.g., a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a "means") should be interpreted as including, as equivalents of that component, any component that performs the function of the described component (i.e., that is functionally equivalent), including components that are not structurally equivalent to the disclosed structure but that perform the function in the illustrated example embodiments of the invention.

Equivalents, Extensions, Alternatives, and Miscellaneous

Example embodiments that relate to backward-compatible encoding and decoding of UHD EDR signals are thus described. In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and of what the applicants intend to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage, or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

1. A method for decoding a layered stream using a decoder including a processor, the method comprising:

receiving an encoded bitstream comprising an encoded enhancement layer (EL) stream having a first spatial resolution and a first dynamic range, and an encoded base layer (BL) stream having a second spatial resolution and a second dynamic range, wherein the first dynamic range is higher than the second dynamic range;

decoding the encoded BL stream using a BL decoder to generate a decoded BL signal;

generating, in response to the decoded BL signal, a prediction signal having the first dynamic range, wherein luminance pixel values of the prediction signal are predicted based only on luminance pixel values of the decoded BL signal, and chrominance pixel values of at least one chrominance component of the prediction signal are predicted based on both the luminance pixel values and the chrominance pixel values of the decoded BL signal;

decoding the encoded EL stream using an EL decoder to generate a first decoded residual signal;

processing the first decoded residual signal with a nonlinear dequantizer to generate a second decoded residual signal; and

generating, in response to the second decoded residual signal and the prediction signal, an output signal having the first spatial resolution and the first dynamic range.
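The data flow recited in claim 1 can be illustrated with a minimal sketch. It assumes the BL and EL streams have already been decoded into per-pixel (Y, Cb, Cr) samples at the same resolution. The function names, the polynomial luma predictor, and the linear cross-channel chroma predictor are hypothetical choices made for illustration only; the claim does not prescribe these particular forms. The dequantizer is passed in as a callable.

```python
from typing import Callable, List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # (Y, Cb, Cr) samples at one pixel position


def predict_luma(y: float, coeffs: Sequence[float]) -> float:
    """Predict a higher-dynamic-range luma value from BL luma only.

    The polynomial form (coeffs[k] * y**k) is an assumption for illustration.
    """
    return sum(c * y ** k for k, c in enumerate(coeffs))


def predict_chroma(pixel: Pixel, weights: Sequence[float]) -> float:
    """Predict one chroma value from BL luma and BL chroma together.

    The linear form (bias + weighted sum) is an assumption for illustration.
    """
    y, cb, cr = pixel
    bias, w_y, w_cb, w_cr = weights
    return bias + w_y * y + w_cb * cb + w_cr * cr


def reconstruct(
    bl: List[Pixel],
    residual: List[Tuple[float, float, float]],
    dequantize: Callable[[float], float],
    luma_coeffs: Sequence[float],
    cb_weights: Sequence[float],
    cr_weights: Sequence[float],
) -> List[Pixel]:
    """Combine the prediction from decoded BL pixels with the dequantized EL residual."""
    out: List[Pixel] = []
    for px, res in zip(bl, residual):
        pred = (
            predict_luma(px[0], luma_coeffs),  # luma: from BL luma only
            predict_chroma(px, cb_weights),    # chroma: from BL luma and chroma
            predict_chroma(px, cr_weights),
        )
        # Output sample = prediction + nonlinearly dequantized residual.
        out.append(tuple(p + dequantize(r) for p, r in zip(pred, res)))
    return out
```

Keeping the luma predictor independent of chroma while letting the chroma predictor see all three channels mirrors the asymmetry stated in the claim.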

2. The method of claim 1, wherein the first dynamic range is a high or enhanced dynamic range, and the second dynamic range is a standard dynamic range.

3. The method of claim 1, wherein the first spatial resolution is the same as the second spatial resolution.

4. The method of claim 1, wherein the first spatial resolution is higher than the second spatial resolution, and generating the output signal further comprises:

upsampling the prediction signal to generate an upsampled prediction signal having the first spatial resolution; and

generating, in response to the second decoded residual signal and the upsampled prediction signal, an output signal having the first spatial resolution and the first dynamic range.

5. The method of claim 4, wherein the first spatial resolution is an ultra-high definition (UHD) spatial resolution and the second spatial resolution is a high definition (HD) spatial resolution.

6. The method of claim 4, wherein generating the upsampled prediction signal further comprises:

receiving, from the encoder, luminance thresholds for dividing a luminance range into luminance sub-ranges;

receiving, from the encoder, upscaling filter information for each of the luminance sub-ranges;

for one or more pixels in the prediction signal, applying the luminance thresholds and a selection criterion to determine, for one or more corresponding pixels in the decoded BL signal, a luminance sub-range among the luminance sub-ranges; and

applying the upscaling filter corresponding to the determined luminance sub-range to the one or more pixels in the prediction signal to generate corresponding pixel values of the upsampled prediction signal.
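The luminance-dependent filter selection of claim 6 can be sketched as follows for a single row and a 2x upsampling factor. Everything specific here is an assumption made for illustration: the selection criterion (the luma of the co-located decoded BL pixel), the two-tap filters, the border handling, and the names `select_subrange` and `upsample_row_2x`.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def select_subrange(luma: float, thresholds: Sequence[float]) -> int:
    """Return the index of the luminance sub-range that contains `luma`.

    `thresholds` must be sorted ascending; N thresholds define N + 1 sub-ranges.
    """
    return bisect_right(thresholds, luma)


def upsample_row_2x(
    pred: Sequence[float],
    bl_luma: Sequence[float],
    thresholds: Sequence[float],
    filters: Sequence[Tuple[float, float]],
) -> List[float]:
    """Upsample one row of the prediction signal by 2, one filter per sub-range.

    Assumed selection criterion: the luma of the co-located decoded BL pixel
    picks the sub-range, and that sub-range's two taps interpolate the new
    half-sample position.
    """
    out: List[float] = []
    n = len(pred)
    for i in range(n):
        taps = filters[select_subrange(bl_luma[i], thresholds)]
        right = pred[min(i + 1, n - 1)]  # replicate the last sample at the border
        out.append(pred[i])                              # existing sample position
        out.append(taps[0] * pred[i] + taps[1] * right)  # interpolated position
    return out
```

For example, with a single threshold of 0.5, a smoothing filter could be signalled for dark pixels and a sample-repeating filter for bright pixels, so that the interpolation behaviour follows the brightness of the base layer.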

7. The method of claim 1, wherein the decoded BL signal comprises an interlaced signal, and generating the output signal further comprises:

upsampling and deinterlacing the prediction signal to generate a progressive upsampled prediction signal having the first spatial resolution; and

generating, in response to the second decoded residual signal and the progressive upsampled prediction signal, an output signal having the first spatial resolution and the first dynamic range.

8. The method of claim 1, wherein generating the second decoded residual signal comprises calculating:

[equation missing from the source text]

where R_cmp denotes the first decoded residual, the result of the calculation denotes the second decoded residual, and SL, M, and T denote dequantization parameters transmitted from the encoder to the decoder in the encoded bitstream.

9. The method of claim 8, wherein values of the second decoded residual signal are further bounded by a minimum residual value and a maximum residual value.
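The equation of claim 8 does not appear in this text, so the sketch below should not be read as the claimed formula. It shows one plausible dead-zone linear dequantizer that uses parameters named as in the claim (SL as a slope, M as an offset, T as a threshold), followed by the clipping of claim 9. The roles assigned to SL, M, and T, the formula itself, and the names `dequantize`, `r_min`, and `r_max` are assumptions.

```python
def sign(x: float) -> float:
    """Return -1.0 for negative x and 1.0 otherwise."""
    return -1.0 if x < 0 else 1.0


def dequantize(
    r_cmp: float,
    sl: float,
    m: float,
    t: float,
    r_min: float,
    r_max: float,
) -> float:
    """Map a decoded (quantized) residual back to a residual value.

    Assumed form: zero at the offset M, otherwise a line of slope SL shifted
    by half a quantization step, plus a dead-zone threshold T, with the sign
    taken from (r_cmp - M).
    """
    d = r_cmp - m
    if d == 0:
        value = 0.0
    else:
        value = sl * (d - 0.5 * sign(d)) + t * sign(d)
    # Claim 9: bound the result by a minimum and a maximum residual value.
    return max(r_min, min(r_max, value))
```

With SL = 2, M = 128, and T = 1, a decoded residual of 130 maps to 4.0 and a decoded residual of 126 maps to -4.0, so the mapping is symmetric about the offset.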

10. An apparatus for decoding a layered stream, comprising a processor and configured to perform the method according to any one of claims 1 to 9.

11. A non-transitory computer-readable storage medium storing computer-executable instructions for performing the method according to any one of claims 1 to 9 using one or more processors.

12. An apparatus for decoding a layered stream, comprising means for performing the method according to any one of claims 1 to 9.