HK1238458A1 - Multiple color channel multiple regression predictor - Google Patents
Multiple color channel multiple regression predictor
- Publication number: HK1238458A1 (application HK17112086.1A)
- Authority: HK (Hong Kong)
- Prior art keywords: image, mmr, model, prediction, pixel
Description
This application is a divisional of the PCT application with international application number PCT/US2012/033605, entitled "Multiple color channel multiple regression predictor" and filed on 13/4/2012, which entered the Chinese national phase on 10/11/2013 under national application number 201280018070.9.
Cross Reference to Related Applications
This application claims the benefit of U.S. provisional patent application No.61/475,359, filed on 14/4/2011, the entire contents of which are incorporated herein by reference.
This application is also related to co-pending U.S. provisional patent application No. 61/475,372, filed on 14/4/2011, which is incorporated herein by reference in its entirety.
Technical Field
The present invention generally relates to images. More particularly, embodiments of the invention relate to multi-color channel, multiple regression predictors between high dynamic range images and standard dynamic range images.
Background
The term "dynamic range" (DR) as used herein may relate to the capability of the human visual system (HVS) to perceive a range of intensity (e.g., luminance, luma) in an image, for example, from the darkest darks to the brightest brights. In this sense, DR relates to a "scene-referred" intensity. DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. In this sense, DR relates to a "display-referred" intensity. Unless a particular sense is explicitly specified to have particular significance at any point in the description herein, it should be inferred that the term may be used in either sense, e.g., interchangeably.
The term high dynamic range (HDR) as used herein relates to a DR breadth that spans some 14-15 orders of magnitude of the human visual system (HVS). For example, well-adapted humans with essentially normal vision (e.g., in one or more of a statistical, biometric, or ophthalmological sense) have an intensity range that spans about 15 orders of magnitude. Adapted humans may perceive dim light sources of as few as a handful of photons. Yet these same humans may perceive the nearly painful glare of the noonday sun in a desert, at sea, or on snow (or even glance at the sun, however briefly, to prevent injury). This span, though, is available only to "adapted" humans, e.g., those whose HVS has had a period of time in which to reset and adjust.
In contrast, the DR over which a human may simultaneously perceive an extensive breadth in intensity range may be somewhat truncated in relation to HDR. The terms "visual dynamic range" or "variable dynamic range" (VDR) as used herein may individually or interchangeably relate to the DR that is simultaneously perceivable by the HVS. VDR as used herein may relate to a DR that spans 5-6 orders of magnitude. Thus, while perhaps somewhat narrower than true scene-referred HDR, VDR nonetheless represents a wide DR breadth. The term "simultaneous dynamic range" as used herein may relate to VDR.
Until fairly recently, displays have had a significantly narrower DR than HDR or VDR. Television (TV) and computer monitor devices that use conventional cathode ray tubes (CRT), liquid crystal displays (LCD) with constant fluorescent white backlighting, or plasma screen technology are limited to about three orders of magnitude in their DR rendering capability. Such conventional displays thus typify a low dynamic range (LDR), also referred to as a standard dynamic range (SDR), in relation to VDR and HDR.
However, advances in display technology allow more modern display designs to render image and video content with significant improvements in various quality characteristics over the same content as rendered on less modern displays. For example, more modern display devices may be capable of rendering high definition (HD) content and/or content that may be scaled according to various display capabilities, e.g., using an image scaler. Moreover, some more modern displays are capable of rendering content with a DR that is higher than the SDR of conventional displays.
For example, some modern LCD displays have a backlight unit (BLU) that comprises an array of light emitting diodes (LEDs). The LEDs of the BLU array may be modulated separately from the modulation of the polarization states of the active LCD elements. This dual modulation approach is extensible (e.g., to N modulation layers, where N comprises an integer greater than two), such as with controllable intervening layers between the BLU array and the LCD screen elements. Their LED-array-based BLUs and dual (or N-) modulation effectively increase the display-referred DR of LCD monitors that have such features.
Such "HDR displays", as they are often called (although in practice their capabilities may more closely approximate the range of VDR), and the DR extension of which they are capable represent a significant advance over conventional SDR displays in the ability to display images, video content, and other visual information. The color gamut that such an HDR display can render may also significantly exceed that of most conventional displays, even to the point of rendering a wide color gamut (WCG). Scene-referred HDR or VDR and WCG image content, such as may be generated by "next generation" movie and TV cameras, can now be displayed more faithfully and effectively with such displays (hereinafter referred to as "HDR displays").
As with scalable video coding and HDTV technologies, extending image DR typically involves a bifurcated approach. For example, scene-referred HDR content captured with a modern HDR-capable camera may be used to generate an SDR version of the content, which may be displayed on conventional SDR displays. In one approach, generating the SDR version from the captured VDR version may involve applying a global tone mapping operator (TMO) to intensity (e.g., luminance, luma) related pixel values in the HDR content. In a second approach (as described in international patent application No. PCT/US2011/048861 filed on 23/8/2011, which is incorporated by reference for all purposes), generating an SDR image may involve applying an invertible operator (or predictor) to the VDR data. To conserve bandwidth, or for other considerations, transmitting the actual captured VDR content may not be the best approach.
Thus, an inverse tone mapping operator (iTMO), inverted in relation to the original TMO, or an inverse operator in relation to the original predictor, may be applied to the generated SDR content version, which allows a version of the VDR content to be predicted. The predicted VDR content version may be compared to the originally captured HDR content. For example, subtracting the predicted VDR version from the original VDR version may produce a residual image. An encoder may send the generated SDR content as a base layer (BL), and package the generated SDR content version, any residual image, and the iTMO or other predictors as an enhancement layer (EL) or as metadata.
Sending the EL and metadata (with its SDR content, residual image, and predictors) in a bitstream typically consumes less bandwidth than sending both the HDR and SDR content directly in the bitstream. A compatible decoder that receives the bitstream sent by the encoder can decode the SDR and render it on a conventional display. However, a compatible decoder may also use the residual image, the iTMO predictors, or the metadata to compute from them a predicted version of the HDR content for use on more capable displays. It is an object of the present invention to provide new methods for generating predictors that allow efficient coding, transmission, and decoding of VDR data using corresponding SDR data.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Accordingly, unless otherwise indicated herein, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.
Drawings
Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
FIG. 1 depicts an example data flow of a VDR-SDR system according to an embodiment of the invention;
FIG. 2 depicts an example VDR encoding system, according to an embodiment of the invention;
FIG. 3 depicts the input and output interfaces of a multivariate regression predictor according to an embodiment of the invention;
FIG. 4 depicts an example multivariate regression prediction process, according to an embodiment of the invention;
FIG. 5 depicts an example process for determining a model of a multivariate regression predictor, in accordance with an embodiment of the invention;
FIG. 6 depicts an example image decoder with predictors operating in accordance with an embodiment of the present invention.
Detailed Description
Inter-color image prediction based on multivariate multiple regression modeling is described herein. Given a pair of corresponding VDR and SDR images, i.e., images that represent the same scene but at different levels of dynamic range, this section describes methods that allow an encoder to approximate the VDR image in terms of the SDR image and a multivariate multiple regression (MMR) predictor. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily obscuring the present invention.
SUMMARY
Example embodiments described herein relate to coding images with high dynamic range. Embodiments create MMR predictors that allow a VDR image to be expressed in relation to its corresponding SDR representation.
Example VDR-SDR System
Fig. 1 depicts an example data flow in a VDR-SDR system 100 according to an embodiment of the invention. An HDR image or video sequence is captured using HDR camera 110. Following capture, the captured image or video is processed by a mastering process to create a target VDR image 125. The mastering process may incorporate a variety of processing steps, such as editing, primary and secondary color correction, color transformation, and noise filtering. The VDR output 125 of this process represents the director's intent on how the captured image will be displayed on a target VDR display.
The mastering process may also output a corresponding SDR image 145, representing the director's intent on how the captured image will be displayed on a legacy SDR display. The SDR output 145 may be provided directly from the mastering circuit 120, or it may be generated by a separate VDR-to-SDR converter 140.
In this example embodiment, the VDR 125 and SDR 145 signals are input into encoder 130. The purpose of the encoder 130 is to create an encoded bitstream that reduces the bandwidth required to transmit the VDR and SDR signals and also allows the corresponding decoder 150 to decode and render either the SDR signal or the VDR signal. In an example implementation, the encoder 130 may be a layered encoder, such as one of those defined by the MPEG-2 and H.264 coding standards, which represents its output as a base layer, an optional enhancement layer, and metadata. The term "metadata" as used herein relates to any auxiliary information that is transmitted as part of the encoded bitstream and assists the decoder in rendering the decoded image. Such metadata may include (but is not limited to) the following: color space or gamut information, dynamic range information, tone mapping information, or MMR predictor operators, such as those described herein.
At the receiver, the decoder 150 uses the received encoded bitstream and metadata to render either an SDR image or a VDR image according to the capabilities of the target display. For example, an SDR display may only use the base layer and metadata to render an SDR image. In contrast, a VDR display may use information and metadata from all input layers to render a VDR signal.
Fig. 2 shows in more detail an example implementation of the encoder 130 incorporating the methods of the present invention. In Fig. 2, SDR' denotes an enhanced SDR signal. SDR video today is 8-bit, 4:2:0, ITU Rec. 709 data. SDR' may have the same color space (primaries and white point) as SDR, but may use high precision, say 12 bits per pixel, with all color components at full spatial resolution (e.g., 4:4:4 RGB). As shown in Fig. 2, SDR can easily be derived from an SDR' signal using a set of forward transforms, which may include quantization (say, from 12 bits per pixel to 8 bits per pixel), color transformation (say, from RGB to YUV), and color subsampling (say, from 4:4:4 to 4:2:0). The SDR output of converter 210 is applied to compression system 220. Depending on the application, the compression system 220 may be lossy (such as H.264 or MPEG-2) or lossless. The output of the compression system 220 may be transmitted as a base layer 225. To reduce drift between the encoded signal and the decoded signal, it is not uncommon for the encoder 130 to follow the compression process 220 with a corresponding decompression process 230 and inverse transforms 240 corresponding to the forward transforms 210. Thus, predictor 250 may have the following inputs: VDR input 205, and either SDR' signal 245 (which corresponds to the SDR' signal as it will be received by a corresponding decoder) or input SDR' 207. Using the input VDR and SDR' data, the predictor 250 creates a signal 257 that represents an approximation or estimate of the input VDR 205. The adder 260 subtracts the predicted VDR 257 from the original VDR 205 to form an output residual signal 265. Subsequently (not shown), the residual 265 may also be coded by another lossy or lossless encoder and may be transmitted to the decoder as an enhancement layer.
Predictor 250 may also provide prediction parameters for use in the prediction process as metadata 255. Since the prediction parameters may change during the encoding process, e.g., frame-by-frame or scene-by-scene, these metadata may be transmitted to the decoder as part of the data that also includes the base and enhancement layers.
Since VDR 205 and SDR' 207 both represent the same scene, but are targeted at different displays with different characteristics (such as dynamic range and color gamut), the two signals are expected to be closely correlated. In an example embodiment of the present invention, a new multivariate, multiple regression (MMR) predictor 250 is developed that allows the input VDR signal to be predicted using its corresponding SDR' signal and a multivariate MMR operator.
Example predictive models
FIG. 3 illustrates the input and output interfaces of an MMR predictor 300 in accordance with an example implementation of the invention. According to FIG. 3, predictor 330 receives input vectors v 310 and s 320, which represent VDR image data and SDR image data, respectively, and outputs the vector $\hat{v}$ 340, which represents the predicted value of input v.
Example notation and nomenclature
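The three color components of the i-th pixel in the SDR image 320 are denoted as
$$s_i = [\, s_{i1} \;\; s_{i2} \;\; s_{i3} \,]. \tag{1}$$
The three color components of the i-th pixel in the VDR input 310 are denoted as
$$v_i = [\, v_{i1} \;\; v_{i2} \;\; v_{i3} \,]. \tag{2}$$
The predicted three color components of the i-th pixel in the predicted VDR 340 are denoted as
$$\hat{v}_i = [\, \hat{v}_{i1} \;\; \hat{v}_{i2} \;\; \hat{v}_{i3} \,]. \tag{3}$$
The total number of pixels in one color component is denoted as p.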
In equations (1-3), color pixels may be in RGB, YUV, YCbCr, XYZ, or any other color representation. While equations (1-3) assume a three-color representation for each pixel in an image or video frame, as also shown later, the methods described herein can be readily extended to image and video representations with more than three color components per pixel, or to image representations where one of the inputs has pixels with a different number of color components than the other input.
First order model (MMR-1)
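Using a multivariate multiple regression (MMR) model, the first order prediction model can be expressed as
$$\hat{v}_i = s_i \tilde{M}^{(1)} + n, \tag{4}$$
where $\tilde{M}^{(1)}$ is a 3 × 3 matrix and n is a 1 × 3 vector, defined as
$$\tilde{M}^{(1)} = \begin{bmatrix} m_{11}^{(1)} & m_{12}^{(1)} & m_{13}^{(1)} \\ m_{21}^{(1)} & m_{22}^{(1)} & m_{23}^{(1)} \\ m_{31}^{(1)} & m_{32}^{(1)} & m_{33}^{(1)} \end{bmatrix} \quad \text{and} \quad n = [\, n_{11} \;\; n_{12} \;\; n_{13} \,]. \tag{5}$$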
It should be noted that this is a multi-color channel prediction model. In $\hat{v}_i$ of equation (4), each color component is expressed as a linear combination of all color components in the input. In other words, unlike single-channel color predictors, where each color channel of each output pixel is processed on its own and independently of the others, this model takes into account all color components of a pixel and thus exploits inter-color correlation and redundancy.
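Equation (4) can be simplified by using a single matrix-based expression:
$$\hat{v}_i = s'_i M^{(1)}, \tag{6}$$
where
$$M^{(1)} = \begin{bmatrix} n_{11} & n_{12} & n_{13} \\ m_{11}^{(1)} & m_{12}^{(1)} & m_{13}^{(1)} \\ m_{21}^{(1)} & m_{22}^{(1)} & m_{23}^{(1)} \\ m_{31}^{(1)} & m_{32}^{(1)} & m_{33}^{(1)} \end{bmatrix} \quad \text{and} \quad s'_i = [\, 1 \;\; s_{i1} \;\; s_{i2} \;\; s_{i3} \,]. \tag{7}$$
By collecting all p pixels of a frame (or other suitable slice or partition of the input) together, one obtains the following matrix expression:
$$\hat{V} = S' M^{(1)}, \tag{8}$$
where
$$S' = \begin{bmatrix} s'_0 \\ s'_1 \\ \vdots \\ s'_{p-1} \end{bmatrix} \quad \text{and} \quad \hat{V} = \begin{bmatrix} \hat{v}_0 \\ \hat{v}_1 \\ \vdots \\ \hat{v}_{p-1} \end{bmatrix} \tag{9}$$
represent the input and predicted output data, $S'$ is a p × 4 data matrix, $\hat{V}$ is a p × 3 matrix, and $M^{(1)}$ is a 4 × 3 matrix. As used herein, $M^{(1)}$ is interchangeably referred to as a multivariate operator or a prediction matrix.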
Based on the linear system of equation (8), this MMR system can be formulated as two different problems: (a) a least squares problem, or (b) a total least squares problem; both can be solved using known numerical methods. For example, using a least squares approach, the problem of solving for M can be formulated as minimizing the residual or prediction mean square error, or
$$\min_{M^{(1)}} \lVert V - \hat{V} \rVert^{2}, \tag{10}$$
where V is a p × 3 matrix formed using the corresponding VDR input data.
Given equations (8) and (10), the optimal solution for $M^{(1)}$ is given by
$$M^{(1)} = (S'^{T} S')^{-1} S'^{T} V, \tag{11}$$
where $S'^{T}$ denotes the transpose of $S'$, and $S'^{T} S'$ is a 4 × 4 matrix.
If $S'$ has full column rank, e.g.,
$$\operatorname{rank}(S') = 4 \le p,$$
then $M^{(1)}$ can also be solved using a variety of alternative numerical techniques, including SVD, QR, or LU decompositions.
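The least squares solution of equation (11) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses NumPy, synthetic noise-free data, and the hypothetical helper names `mmr1_design` and `fit_mmr1`. It relies on `numpy.linalg.lstsq` (an SVD-based solver) rather than forming the inverse explicitly, and checks that both routes agree when S' has full column rank.

```python
import numpy as np

def mmr1_design(s):
    """Build S' (p x 4): a column of ones followed by the three SDR components."""
    s = np.asarray(s, dtype=np.float64)
    return np.hstack([np.ones((s.shape[0], 1)), s])

def fit_mmr1(s, v):
    """Least-squares estimate of the 4 x 3 prediction matrix in V_hat = S' M."""
    S = mmr1_design(s)
    # lstsq minimizes ||V - S M||^2, the objective of equation (10)
    M, *_ = np.linalg.lstsq(S, np.asarray(v, dtype=np.float64), rcond=None)
    return M

rng = np.random.default_rng(0)
s = rng.random((1000, 3))        # synthetic SDR pixels (illustrative data only)
M_true = rng.random((4, 3))      # hypothetical ground-truth operator
v = mmr1_design(s) @ M_true      # noise-free synthetic VDR pixels

M_est = fit_mmr1(s, v)

# Closed form of equation (11), written as a linear solve instead of an inverse
S = mmr1_design(s)
M_normal = np.linalg.solve(S.T @ S, S.T @ v)
```

Solving the normal equations directly is cheap because S'ᵀS' is only 4 × 4, while the decomposition-based solver tends to be more robust when the color channels are nearly collinear.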
Second order model (MMR-2)
Equation (4) represents a first order MMR prediction model. It is also contemplated to employ higher order prediction as described next.
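The second order MMR prediction model can be expressed as
$$\hat{v}_i = s_i^{2} \tilde{M}^{(2)} + s_i \tilde{M}^{(1)} + n, \tag{12}$$
where $\tilde{M}^{(2)}$ is a 3 × 3 matrix,
$$\tilde{M}^{(2)} = \begin{bmatrix} m_{11}^{(2)} & m_{12}^{(2)} & m_{13}^{(2)} \\ m_{21}^{(2)} & m_{22}^{(2)} & m_{23}^{(2)} \\ m_{31}^{(2)} & m_{32}^{(2)} & m_{33}^{(2)} \end{bmatrix} \quad \text{and} \quad s_i^{2} = [\, s_{i1}^{2} \;\; s_{i2}^{2} \;\; s_{i3}^{2} \,]. \tag{13}$$
Equation (12) can be simplified by using a single prediction matrix,
$$\hat{v}_i = s_i^{(2)} M^{(2)}, \tag{14}$$
where
$$M^{(2)} = \begin{bmatrix} M^{(1)} \\ \tilde{M}^{(2)} \end{bmatrix} \tag{15}$$
and
$$s_i^{(2)} = [\, 1 \;\; s_i \;\; s_i^{2} \,]. \tag{16}$$
By collecting all p pixels together, the following matrix expression can be defined:
$$\hat{V} = S^{(2)} M^{(2)}, \tag{17}$$
where
$$\hat{V} = \begin{bmatrix} \hat{v}_0 \\ \hat{v}_1 \\ \vdots \\ \hat{v}_{p-1} \end{bmatrix} \quad \text{and} \quad S^{(2)} = \begin{bmatrix} s_0^{(2)} \\ s_1^{(2)} \\ \vdots \\ s_{p-1}^{(2)} \end{bmatrix}. \tag{18}$$
Equation (14) can be solved using the same optimization and solutions described in the previous section. The optimal solution for $M^{(2)}$ in the least squares problem is
$$M^{(2)} = (S^{(2)T} S^{(2)})^{-1} S^{(2)T} V, \tag{19}$$
where $S^{(2)T} S^{(2)}$ is now a 7 × 7 matrix.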
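As an illustration of the second order model, the sketch below builds the p × 7 data matrix whose rows are [1, s_i, s_i²] and fits it by least squares. It is a hedged example: the helper names `mmr2_design` and `fit_least_squares` and the synthetic quadratic mapping are assumptions made for demonstration, not part of the original text.

```python
import numpy as np

def mmr2_design(s):
    """Rows are [1, s_i1, s_i2, s_i3, s_i1^2, s_i2^2, s_i3^2], giving a p x 7 matrix."""
    s = np.asarray(s, dtype=np.float64)
    ones = np.ones((s.shape[0], 1))
    return np.hstack([ones, s, s ** 2])

def fit_least_squares(S, v):
    """Solve min ||V - S M||^2 for the prediction matrix M."""
    M, *_ = np.linalg.lstsq(S, np.asarray(v, dtype=np.float64), rcond=None)
    return M

rng = np.random.default_rng(1)
s = rng.random((500, 3))
# Hypothetical per-channel quadratic mapping, exactly representable by MMR-2
v = np.column_stack([s[:, 0] ** 2, s[:, 1], 0.5 + s[:, 2] ** 2])

S2 = mmr2_design(s)
M2 = fit_least_squares(S2, v)   # 7 x 3 prediction matrix
v_hat = S2 @ M2                 # predicted VDR pixels
```

Because the squared terms are added per channel only, this model still cannot represent products between different channels; the cross-multiplication variants address that.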
Third or higher order MMR models can also be constructed in a similar manner.
First order model with cross multiplication (MMR-1C)
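In an alternative MMR model, the first order prediction model of equation (4) can be enhanced to include cross-multiplications between the color components of each pixel, as follows:
$$\hat{v}_i = \mathit{sc}_i \tilde{C}^{(1)} + s_i \tilde{M}^{(1)} + n, \tag{20}$$
where $\tilde{M}^{(1)}$ is a 3 × 3 matrix and n is a 1 × 3 vector, both as defined in equation (5),
$$\tilde{C}^{(1)} = \begin{bmatrix} mc_{11}^{(1)} & mc_{12}^{(1)} & mc_{13}^{(1)} \\ mc_{21}^{(1)} & mc_{22}^{(1)} & mc_{23}^{(1)} \\ mc_{31}^{(1)} & mc_{32}^{(1)} & mc_{33}^{(1)} \\ mc_{41}^{(1)} & mc_{42}^{(1)} & mc_{43}^{(1)} \end{bmatrix} \quad \text{and} \quad \mathit{sc}_i = [\, s_{i1} \cdot s_{i2} \;\; s_{i1} \cdot s_{i3} \;\; s_{i2} \cdot s_{i3} \;\; s_{i1} \cdot s_{i2} \cdot s_{i3} \,]. \tag{21}$$
Following the same approach as before, the MMR-1C model of equation (20) can be simplified by using a single prediction matrix MC, as follows:
$$\hat{v}_i = \mathit{sc}_i^{(1)} \mathit{MC}^{(1)}, \tag{22}$$
where
$$\mathit{MC}^{(1)} = \begin{bmatrix} n \\ \tilde{M}^{(1)} \\ \tilde{C}^{(1)} \end{bmatrix} \tag{23}$$
and
$$\mathit{sc}_i^{(1)} = [\, 1 \;\; s_i \;\; \mathit{sc}_i \,]. \tag{24}$$
By collecting all p pixels together, a simplified matrix expression can be derived, as follows:
$$\hat{V} = \mathit{SC} \cdot \mathit{MC}^{(1)}, \tag{25}$$
where
$$\hat{V} = \begin{bmatrix} \hat{v}_0 \\ \hat{v}_1 \\ \vdots \\ \hat{v}_{p-1} \end{bmatrix} \quad \text{and} \quad \mathit{SC} = \begin{bmatrix} \mathit{sc}_0^{(1)} \\ \mathit{sc}_1^{(1)} \\ \vdots \\ \mathit{sc}_{p-1}^{(1)} \end{bmatrix}. \tag{26}$$
SC is a p × (1+7) matrix, and equation (25) can be solved using the same least squares solution described previously.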
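To show what the cross-multiplication terms buy, the following sketch compares the fit error of a plain first order model with that of MMR-1C on synthetic data in which one output channel is a product of two input channels. The data, the helper names (`mmr1_design`, `mmr1c_design`, `mse_of_fit`), and the choice of mean square error as the metric are assumptions made for illustration.

```python
import numpy as np

def mmr1_design(s):
    """p x 4 data matrix with rows [1, s_i]."""
    return np.hstack([np.ones((s.shape[0], 1)), s])

def mmr1c_design(s):
    """p x 8 data matrix with rows [1, s_i, sc_i], sc_i holding the four cross products."""
    s1, s2, s3 = s[:, 0], s[:, 1], s[:, 2]
    sc = np.column_stack([s1 * s2, s1 * s3, s2 * s3, s1 * s2 * s3])
    return np.hstack([np.ones((s.shape[0], 1)), s, sc])

def mse_of_fit(S, v):
    """Mean square prediction error after a least-squares fit of V against S."""
    M, *_ = np.linalg.lstsq(S, v, rcond=None)
    return float(np.mean((v - S @ M) ** 2))

rng = np.random.default_rng(2)
s = rng.random((2000, 3))
# Hypothetical mapping with inter-channel products in two output channels
v = np.column_stack([s[:, 0] * s[:, 1], s[:, 1] * s[:, 2], s[:, 0]])

mse_1 = mse_of_fit(mmr1_design(s), v)     # first order model leaves a visible error
mse_1c = mse_of_fit(mmr1c_design(s), v)   # cross terms capture the products
```

On this data the first order model cannot reproduce the products, whereas the MMR-1C data matrix contains them as columns, so its residual is at the level of floating-point round-off.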
Second order model with cross multiplication (MMR-2C)
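The first order MMR-1C model can be extended to include second order data as well. For example,
$$\hat{v}_i = \mathit{sc}_i^{2} \tilde{C}^{(2)} + s_i^{2} \tilde{M}^{(2)} + \mathit{sc}_i \tilde{C}^{(1)} + s_i \tilde{M}^{(1)} + n, \tag{27}$$
where
$$\tilde{C}^{(2)} = \begin{bmatrix} mc_{11}^{(2)} & mc_{12}^{(2)} & mc_{13}^{(2)} \\ mc_{21}^{(2)} & mc_{22}^{(2)} & mc_{23}^{(2)} \\ mc_{31}^{(2)} & mc_{32}^{(2)} & mc_{33}^{(2)} \\ mc_{41}^{(2)} & mc_{42}^{(2)} & mc_{43}^{(2)} \end{bmatrix} \tag{28}$$
and
$$\mathit{sc}_i^{2} = [\, s_{i1}^{2} \cdot s_{i2}^{2} \;\; s_{i1}^{2} \cdot s_{i3}^{2} \;\; s_{i2}^{2} \cdot s_{i3}^{2} \;\; s_{i1}^{2} \cdot s_{i2}^{2} \cdot s_{i3}^{2} \,], \tag{29}$$
and the remaining components of equation (27) are the same as those previously defined in equations (5-26).
As before, equation (27) can be simplified by using a single prediction matrix $\mathit{MC}^{(2)}$,
$$\hat{v}_i = \mathit{sc}_i^{(2)} \mathit{MC}^{(2)}, \tag{30}$$
where
$$\mathit{MC}^{(2)} = \begin{bmatrix} n \\ \tilde{M}^{(1)} \\ \tilde{C}^{(1)} \\ \tilde{M}^{(2)} \\ \tilde{C}^{(2)} \end{bmatrix} \quad \text{and} \quad \mathit{sc}_i^{(2)} = [\, 1 \;\; s_i \;\; \mathit{sc}_i \;\; s_i^{2} \;\; \mathit{sc}_i^{2} \,]. \tag{31}$$
By collecting all p pixels together, one obtains the simplified matrix expression
$$\hat{V} = \mathit{SC}^{(2)} \cdot \mathit{MC}^{(2)}, \tag{32}$$
where
$$\hat{V} = \begin{bmatrix} \hat{v}_0 \\ \hat{v}_1 \\ \vdots \\ \hat{v}_{p-1} \end{bmatrix} \quad \text{and} \quad \mathit{SC}^{(2)} = \begin{bmatrix} \mathit{sc}_0^{(2)} \\ \mathit{sc}_1^{(2)} \\ \vdots \\ \mathit{sc}_{p-1}^{(2)} \end{bmatrix}, \tag{33}$$
and $\mathit{SC}^{(2)}$ is a p × (1 + 2·7) matrix, to which the same least squares solution as described before can be applied.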
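The growth of the data matrix with model order can be sketched with one generic builder. The function name `mmr_c_design` and the block ordering [1, s, sc, s², sc², ...] are assumptions for illustration; any consistent ordering works as long as the encoder and decoder agree on it.

```python
import numpy as np

def mmr_c_design(s, order):
    """Data matrix for an order-`order` MMR model with cross-multiplication terms.

    Each order k contributes the 3 element-wise powers s^k and the 4 cross
    products raised to the k-th power, so the result has 1 + 7 * order columns.
    """
    s = np.asarray(s, dtype=np.float64)
    s1, s2, s3 = s[:, 0], s[:, 1], s[:, 2]
    sc = np.column_stack([s1 * s2, s1 * s3, s2 * s3, s1 * s2 * s3])
    blocks = [np.ones((s.shape[0], 1))]
    for k in range(1, order + 1):
        blocks.append(s ** k)    # s_i, s_i^2, ...
        blocks.append(sc ** k)   # sc_i, sc_i^2, ...
    return np.hstack(blocks)

pixel = np.array([[1.0, 2.0, 3.0]])   # a single illustrative pixel
row_1c = mmr_c_design(pixel, 1)       # 8 columns  (MMR-1C)
row_2c = mmr_c_design(pixel, 2)       # 15 columns (MMR-2C)
```

Since the normal-equation matrix grows to (1 + 7K) × (1 + 7K) for order K, the per-frame solve stays small even for third order models; the dominant cost is accumulating the products over all p pixels.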
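Third or higher order models with cross-multiplication parameters can be constructed in a similar manner. Alternatively, as described in Chapter 5.4.3 of "Digital Color Imaging Handbook", CRC Press, 2002, edited by Gaurav Sharma, a K-th order representation of the MMR cross-multiplication model can also be described using the following formulation:
$$\hat{v}_1 = \sum_{x=0}^{K} \sum_{y=0}^{K} \sum_{z=0}^{K} m_{1,x,y,z}\, s_1^{x} s_2^{y} s_3^{z}, \tag{34}$$
$$\hat{v}_2 = \sum_{x=0}^{K} \sum_{y=0}^{K} \sum_{z=0}^{K} m_{2,x,y,z}\, s_1^{x} s_2^{y} s_3^{z}, \tag{35}$$
and
$$\hat{v}_3 = \sum_{x=0}^{K} \sum_{y=0}^{K} \sum_{z=0}^{K} m_{3,x,y,z}\, s_1^{x} s_2^{y} s_3^{z}, \tag{36}$$
where K denotes the highest order of the MMR predictor.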
Spatial-extension-based MMR (MMR-C-S)
In all of the MMR models described so far, the value of a predicted pixel $\hat{v}_i$ depends only on the corresponding, co-located input value $s_i$. MMR-based prediction may also benefit from taking into account data from neighboring pixels. This approach corresponds to integrating into the MMR model any linear-type processing of the inputs in the spatial domain, such as FIR-type filtering.
If all eight possible neighboring pixels in an image are considered, this approach can add up to eight more first order variables per color component to the prediction matrix M. In practice, however, it is usually adequate to add only the prediction variables corresponding to the two horizontally and the two vertically neighboring pixels, and to ignore the diagonal neighbors. This adds up to four variables per color component to the prediction matrix, namely those corresponding to the top, left, bottom, and right pixels. Similarly, parameters corresponding to higher orders of the neighboring pixel values can also be added.
To reduce the complexity and computational requirements of such MMR spatial models, one may consider adding spatial extensions to the existing models for only a single color component, such as the luminance component (as in a luma-chroma representation) or the green component (as in an RGB representation). For example, assuming that spatial-based pixel prediction is added for the green color component only, then, following equations (34-36), a general expression for predicting a green output pixel value would be
$$\hat{v}_g(i,j) = \sum_{x=0}^{K} \sum_{y=0}^{K} \sum_{z=0}^{K} m_{g,x,y,z}\, s_r^{x}(i,j)\, s_g^{y}(i,j)\, s_b^{z}(i,j) + \sum_{(x,y) \in \{(-1,0),\,(1,0),\,(0,-1),\,(0,1)\}} sm_{g,x,y}\, s_g(i+x,\, j+y). \tag{37}$$
First order model with spatial extension (MMR-1-S)
As another example implementation, one may consider again the first order MMR model (MMR-1) of equation (4), now enhanced to incorporate spatial extensions in one or more of the color components. For example, when applied to the four neighboring pixels of each pixel in the first color component,
$$\hat{v}_i = \mathit{sd}_i \tilde{D}^{(1)} + s_i \tilde{M}^{(1)} + n, \tag{38}$$
where $\tilde{M}^{(1)}$ is a 3 × 3 matrix and n is a 1 × 3 vector, both as defined in equation (5),
$$\tilde{D}^{(1)} = \begin{bmatrix} md_{11}^{(1)} & 0 & 0 \\ md_{21}^{(1)} & 0 & 0 \\ md_{31}^{(1)} & 0 & 0 \\ md_{41}^{(1)} & 0 & 0 \end{bmatrix} \quad \text{and} \quad \mathit{sd}_i = [\, s_{(i-1)1} \;\; s_{(i+1)1} \;\; s_{(i-m)1} \;\; s_{(i+m)1} \,], \tag{39}$$
where m in equation (39) denotes the number of columns in an input frame with m columns and n rows, or m × n = p total pixels. Equation (39) can easily be extended to apply these methods to other color components and to alternative neighboring pixel configurations.
Following the same approach as before, equation (38) can easily be formulated as a system of linear equations
$$\hat{V} = \mathit{SD} \cdot \mathit{MD}^{(1)}, \tag{40}$$
which can be solved as described previously.
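The neighbor indexing used by the spatial extension (pixels i−1, i+1, i−m, and i+m in raster order for a frame with m columns) can be sketched as below. The text does not say how frame borders are treated, so replicating the edge pixel is an assumption of this example, as are the helper names `spatial_neighbors` and `mmr1s_design` and the column order [1, s_i, sd_i].

```python
import numpy as np

def spatial_neighbors(channel):
    """Return a (p, 4) array of [left, right, up, down] neighbors in raster order.

    For pixel index i = row * m + col these correspond to i-1, i+1, i-m, i+m.
    Border handling (edge replication) is an assumption of this sketch.
    """
    padded = np.pad(channel, 1, mode="edge")
    left = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    up = padded[:-2, 1:-1]
    down = padded[2:, 1:-1]
    return np.stack([left, right, up, down], axis=-1).reshape(-1, 4)

def mmr1s_design(img):
    """Data matrix with rows [1, s_i, sd_i] for an (n_rows, m_cols, 3) image.

    Only the first color component contributes spatial neighbors.
    """
    n_rows, m_cols, _ = img.shape
    s = img.reshape(-1, 3).astype(np.float64)
    sd = spatial_neighbors(img[:, :, 0].astype(np.float64))
    return np.hstack([np.ones((n_rows * m_cols, 1)), s, sd])

img = np.zeros((3, 4, 3))
img[:, :, 0] = np.arange(12).reshape(3, 4)   # first channel holds the raster index
SD = mmr1s_design(img)                        # 12 x 8 data matrix
```

With the first channel set to the raster index, row 5 of `SD` ends in the values 4, 6, 1, 9, which are exactly i−1, i+1, i−m, and i+m for i = 5 and m = 4.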
Application of a VDR signal with more than three primary colors
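All of the proposed MMR prediction models can easily be extended to signal spaces with more than three primary colors. As an example, one may consider the case where the SDR signal has three primary colors, say RGB, but the VDR signal is defined in the P6 color space, which has six primary colors. In this case, equations (1-3) can be rewritten as
$$s_i = [\, s_{i1} \;\; s_{i2} \;\; s_{i3} \,], \tag{41}$$
$$v_i = [\, v_{i1} \;\; v_{i2} \;\; v_{i3} \;\; v_{i4} \;\; v_{i5} \;\; v_{i6} \,], \tag{42}$$
and
$$\hat{v}_i = [\, \hat{v}_{i1} \;\; \hat{v}_{i2} \;\; \hat{v}_{i3} \;\; \hat{v}_{i4} \;\; \hat{v}_{i5} \;\; \hat{v}_{i6} \,]. \tag{43}$$
As before, the number of pixels in one color component is denoted as p. Considering now the first order MMR prediction model (MMR-1) of equation (4),
$$\hat{v}_i = s_i \tilde{M}^{(1)} + n, \tag{44}$$
$\tilde{M}^{(1)}$ is now a 3 × 6 matrix and n is a 1 × 6 vector, given by
$$\tilde{M}^{(1)} = \begin{bmatrix} m_{11}^{(1)} & m_{12}^{(1)} & m_{13}^{(1)} & m_{14}^{(1)} & m_{15}^{(1)} & m_{16}^{(1)} \\ m_{21}^{(1)} & m_{22}^{(1)} & m_{23}^{(1)} & m_{24}^{(1)} & m_{25}^{(1)} & m_{26}^{(1)} \\ m_{31}^{(1)} & m_{32}^{(1)} & m_{33}^{(1)} & m_{34}^{(1)} & m_{35}^{(1)} & m_{36}^{(1)} \end{bmatrix} \tag{45}$$
and
$$n = [\, n_{11} \;\; n_{12} \;\; n_{13} \;\; n_{14} \;\; n_{15} \;\; n_{16} \,]. \tag{46}$$
Equation (44) can be expressed using a single prediction matrix $M^{(1)}$ as
$$\hat{v}_i = s'_i M^{(1)}, \tag{47}$$
where
$$M^{(1)} = \begin{bmatrix} n_{11} & n_{12} & n_{13} & n_{14} & n_{15} & n_{16} \\ m_{11}^{(1)} & m_{12}^{(1)} & m_{13}^{(1)} & m_{14}^{(1)} & m_{15}^{(1)} & m_{16}^{(1)} \\ m_{21}^{(1)} & m_{22}^{(1)} & m_{23}^{(1)} & m_{24}^{(1)} & m_{25}^{(1)} & m_{26}^{(1)} \\ m_{31}^{(1)} & m_{32}^{(1)} & m_{33}^{(1)} & m_{34}^{(1)} & m_{35}^{(1)} & m_{36}^{(1)} \end{bmatrix} \quad \text{and} \quad s'_i = [\, 1 \;\; s_{i1} \;\; s_{i2} \;\; s_{i3} \,]. \tag{48}$$
By collecting all p pixels together, this prediction problem can be described as
$$\hat{V} = S' M^{(1)}, \tag{49}$$
where
$$\hat{V} = \begin{bmatrix} \hat{v}_0 \\ \hat{v}_1 \\ \vdots \\ \hat{v}_{p-1} \end{bmatrix}$$
is a p × 6 matrix,
$$S' = \begin{bmatrix} s'_0 \\ s'_1 \\ \vdots \\ s'_{p-1} \end{bmatrix}$$
is a p × 4 matrix, and $M^{(1)}$ is a 4 × 6 matrix.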
Higher order MMR prediction models can also be extended in a similar manner and a solution to the prediction matrix can be obtained via the methods described previously.
Example processing of Multi-channel, multiple regression prediction
FIG. 4 illustrates an example process of multi-channel multivariate regression prediction according to an example implementation of the invention.
The process begins at step 410, where a predictor (such as predictor 250) receives the input VDR and SDR signals. Given the two inputs, the predictor decides in step 420 which MMR model to select. As described previously, the predictor can select among a variety of MMR models, including (but not limited to): a first order model (MMR-1), a second order model (MMR-2), a third or higher order model, a first order model with cross multiplication (MMR-1C), a second order model with cross multiplication (MMR-2C), a third order model with cross multiplication (MMR-3C), a third or higher order model with cross multiplication, or any of the above with the addition of a spatial extension.
The selection of the MMR model can be made using a variety of methods that take into account a number of criteria, including: prior knowledge about SDR and VDR inputs, available computational and memory resources, and target coding efficiency. FIG. 5 illustrates an example implementation of step 420 based on a requirement that the residual be less than a predetermined threshold.
As described previously, any MMR model can be represented as a set of linear equations of the form
$$\hat{V} = S M, \tag{50}$$
where M is a prediction matrix.
At step 430, M can be solved using a variety of numerical methods. For example, under the constraint of minimizing the mean square of the residual between V and its estimate $\hat{V}$,
$$M = (S^{T} S)^{-1} S^{T} V. \tag{51}$$
Finally, at step 440, using equation (50), the predictor outputs $\hat{V}$ and M.
FIG. 5 illustrates an example process 420 for selecting an MMR model during prediction. In step 510, the predictor 250 may begin with an initial MMR model, such as the one used in a previous frame or scene, e.g., a second order model (MMR-2), or the simplest possible model, such as MMR-1. After solving for M, the predictor computes in step 520 the prediction error between the input V and its predicted value. In step 530, if the prediction error is below a given threshold, the predictor selects the existing model and stops the selection process (540); otherwise, in step 550, it checks whether to use a more complex model. For example, if the current model is MMR-2, the predictor may decide to use MMR-2-C or MMR-2-C-S. As described previously, this decision may depend on a variety of criteria, including the value of the prediction error, the processing power requirements, and the target coding efficiency. If it is feasible to use a more complex model, a new model is selected in step 560 and the process returns to step 520. Otherwise, the predictor uses the existing model (540).
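The selection loop of FIG. 5 can be sketched as a walk over candidate models of increasing complexity that stops once the prediction error falls below a threshold. The candidate ordering, the use of mean square error as the error measure, and the names `design`, `CANDIDATES`, and `select_model` are assumptions of this example; the text leaves these choices open.

```python
import numpy as np

def design(s, order, cross):
    """Data matrix [1, s, (sc), s^2, (sc^2), ...] for the requested model."""
    s1, s2, s3 = s[:, 0], s[:, 1], s[:, 2]
    sc = np.column_stack([s1 * s2, s1 * s3, s2 * s3, s1 * s2 * s3])
    blocks = [np.ones((s.shape[0], 1))]
    for k in range(1, order + 1):
        blocks.append(s ** k)
        if cross:
            blocks.append(sc ** k)
    return np.hstack(blocks)

# Candidates from simplest to most complex (this ordering is an assumption)
CANDIDATES = [("MMR-1", 1, False), ("MMR-2", 2, False),
              ("MMR-1C", 1, True), ("MMR-2C", 2, True)]

def select_model(s, v, threshold, start=0):
    """Try candidates from index `start`; stop at the first one whose mean
    square error is below `threshold`, else keep the last candidate tried."""
    for name, order, cross in CANDIDATES[start:]:
        S = design(s, order, cross)                  # step 560: choose model
        M, *_ = np.linalg.lstsq(S, v, rcond=None)    # solve for M
        mse = float(np.mean((v - S @ M) ** 2))       # step 520: prediction error
        if mse < threshold:                          # step 530: good enough?
            break
    return name, M, mse

rng = np.random.default_rng(4)
s = rng.random((2000, 3))
v = np.column_stack([s[:, 0] * s[:, 1], s[:, 1], s[:, 2]])  # needs a cross term
name, M, mse = select_model(s, v, threshold=1e-6)
```

Passing `start` lets a caller resume from the model used in the previous frame or scene, which mirrors the option described for step 510.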
The prediction process 400 may be repeated at a variety of intervals, as needed to maintain coding efficiency while using the available computing resources. For example, when coding video signals, process 400 may be repeated on a per predefined video slice size basis, for each frame, for a group of frames, or whenever the prediction residual exceeds a particular threshold.
The prediction process 400 can also use all available input pixels or a sub-sample of those pixels. In one example implementation, pixels from only every k-th pixel row and every k-th pixel column of the input data may be used, where k is an integer equal to or greater than 2. In another example implementation, one may decide to skip input pixels that are below a certain clipping threshold (for example, very close to zero) or above a certain saturation threshold (for example, for n-bit data, pixel values that are very close to $2^{n} - 1$). In another implementation, a combination of such sub-sampling and thresholding techniques may be used to reduce the pixel sample size and to accommodate the computational constraints of a particular implementation.
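A combined sub-sampling and thresholding step could look like the sketch below. Applying the thresholds to the SDR input, dropping a pixel when any of its channels is near the limits, and the `margin` parameter are assumptions of this example; the function name `select_training_pixels` is hypothetical.

```python
import numpy as np

def select_training_pixels(sdr, vdr, k=2, bit_depth=8, margin=2):
    """Pick the pixel pairs used to solve for the prediction matrix.

    sdr: (rows, cols, 3) integer code values; vdr: (rows, cols, C) values.
    Keeps every k-th row and column, then drops pixels whose SDR value is
    within `margin` codes of 0 or of 2**bit_depth - 1 in any channel.
    """
    s = sdr[::k, ::k].reshape(-1, sdr.shape[2])
    v = vdr[::k, ::k].reshape(-1, vdr.shape[2])
    low = margin
    high = (2 ** bit_depth - 1) - margin
    keep = np.all((s >= low) & (s <= high), axis=1)
    return s[keep], v[keep]

sdr = np.full((4, 4, 3), 100, dtype=np.int64)
sdr[0, 0, 0] = 0       # clipped pixel, lands on the sampling grid
sdr[2, 2, 0] = 255     # saturated pixel, lands on the sampling grid
vdr = np.arange(48, dtype=np.float64).reshape(4, 4, 3)

s_train, v_train = select_training_pixels(sdr, vdr, k=2)
```

In this toy frame the 2 × 2 sampling grid yields four pixels, of which the clipped and the saturated one are discarded, leaving two pairs for the regression.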
Image decoding
Embodiments of the present invention may be implemented either on an image encoder or on an image decoder. Fig. 6 shows an example implementation of the decoder 150 according to an embodiment of the present invention. The decoding system 600 receives an encoded bitstream that may combine a base layer 690, an optional enhancement layer (or residual) 665, and metadata 645, which are extracted following decompression 630 and various inverse transforms 640. For example, in a VDR-SDR system, the base layer 690 may represent the SDR representation of the coded signal, and the metadata 645 may include information about the MMR prediction model used in the encoder predictor 250 and the corresponding prediction parameters. In one example implementation, when the encoder uses an MMR predictor in accordance with the methods of the present invention, the metadata may include the identification of the model used (e.g., MMR-1, MMR-2C, etc.) and all matrix coefficients associated with that specific model. Given the base layer 690 and the color MMR-related parameters extracted from the metadata 645, the predictor 650 can compute the predicted $\hat{V}$ 680 using any of the corresponding equations described herein. For example, if the identified model is MMR-2C, $\hat{V}$ 680 can be computed using equation (32). If there is no residual, or the residual is negligible, the predicted value 680 can be output directly as the final VDR image. Otherwise, in adder 660, the output of the predictor (680) is added to the residual 665 to output VDR signal 670.
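The decoder-side reconstruction can be sketched as a matrix product followed by an optional residual addition. The example assumes the MMR-1 model, a losslessly transmitted residual, and synthetic data; `decode_vdr` and `mmr1_design` are hypothetical names, and a real decoder would pick the data matrix builder that matches the model identified in the metadata.

```python
import numpy as np

def mmr1_design(s):
    """p x 4 data matrix with rows [1, s_i]."""
    return np.hstack([np.ones((s.shape[0], 1)), s])

def decode_vdr(sdr_pixels, M, residual=None):
    """Predict VDR pixels from base-layer pixels; add the residual if one was sent."""
    v_hat = mmr1_design(sdr_pixels) @ M
    return v_hat if residual is None else v_hat + residual

# Encoder side, shown only to produce inputs for the round trip
rng = np.random.default_rng(3)
s = rng.random((256, 3))                                    # decoded SDR pixels
v = 4.0 * s ** 1.5 + 0.01 * rng.standard_normal((256, 3))   # hypothetical VDR data
S = mmr1_design(s)
M, *_ = np.linalg.lstsq(S, v, rcond=None)   # prediction matrix sent as metadata
residual = v - S @ M                        # enhancement-layer content

v_rec = decode_vdr(s, M, residual)          # full reconstruction
v_pred_only = decode_vdr(s, M)              # prediction without enhancement layer
```

When the residual is coded with a lossy codec, `v_rec` would only approximate the original, and the quality of the prediction determines how many bits the enhancement layer needs.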
Example computer System implementation
Embodiments of the invention may be implemented by a computer system, a system configured with electronic circuits and components, an Integrated Circuit (IC) device such as a microcontroller, a Field Programmable Gate Array (FPGA) or another configurable or Programmable Logic Device (PLD), a discrete time or Digital Signal Processor (DSP), an Application Specific IC (ASIC), and/or an apparatus including one or more such systems, devices, or components. The computer and/or IC may execute, control, or execute instructions related to MMR-based prediction, such as those described herein. The computer and/or IC may calculate any of a variety of parameters or values related to MMR prediction as described herein. Image and video dynamic range extension embodiments may be implemented in hardware, software, firmware, and various combinations thereof.
Particular implementations of the invention include a computer processor executing software instructions that cause the processor to perform the methods of the invention. For example, one or more processors in a display, encoder, set-top box, transcoder, etc. may implement the MMR-based prediction method described above by executing software instructions in a program memory accessible to the processors. The invention may also be provided in the form of a program product. The program product may comprise any medium carrying a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to perform the method of the invention. The program product according to the invention may be in any of a wide variety of forms. The program product may include, for example, physical media such as magnetic data storage media including floppy disks and hard disk drives; optical data storage media including CD ROMs and DVDs; electronic data storage media including ROMs and flash RAM; and the like. The computer-readable signals on the program product may optionally be compressed or encrypted.
Unless otherwise indicated, reference to a component (e.g., a software module, processor, assembly, device, circuit, etc.) above, including a reference to a "means," should be interpreted as including as equivalents of that component any component which performs the function of the described component (e.g., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated exemplary embodiments of the invention.
Equivalents, Extensions, Alternatives and Miscellaneous
Example embodiments that relate to applying MMR prediction in the coding of VDR and SDR images are thus described. In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what the invention is, and of what the applicants intend the invention to be, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
First set of supplementary notes:
1. A method, comprising:
receiving a first image and a second image, wherein the second image has a different dynamic range than the first image;
selecting a multi-channel, multiple regression (MMR) prediction model from a family of MMR models;
solving the prediction parameters of the selected MMR model;
computing an output image representing a predicted value of the first image using the second image and prediction parameters of the MMR model;
and outputting the prediction parameters of the MMR model and the output image.
2. The method according to supplementary note 1, wherein the first image comprises a VDR image and the second image comprises an SDR image.
3. The method of supplementary note 1, wherein the MMR model is at least one of a first order MMR model, a second order MMR model, a third order MMR model, a first order MMR model with cross multiplication, a second order MMR model with cross multiplication, or a third order MMR model with cross multiplication.
4. The method according to supplementary note 3, wherein any one of the MMR models further includes prediction parameters relating to neighboring pixels.
5. The method according to supplementary note 4, wherein the considered adjacent pixels include a left adjacent pixel, a right adjacent pixel, an upper adjacent pixel, and a lower adjacent pixel.
6. The method according to supplementary note 2, wherein pixels in the VDR image have more color components than pixels in the SDR image.
7. The method of supplementary note 1, wherein solving the prediction parameters of the selected MMR model further comprises applying a numerical method that minimizes a mean square error between the first image and the output image.
8. The method of supplementary note 1, wherein selecting the MMR prediction model from the family of MMR models further comprises an iterative selection process comprising:
(a) selecting and applying an initial MMR model;
(b) calculating a residual error between the first image and the output image;
(c) selecting an existing MMR model if the residual error is less than a threshold and no other MMR models are available; otherwise, selecting a new MMR model different from the previous model; and returning to step (b).
9. An image decoding method, comprising:
receiving a first image having a first dynamic range;
receiving metadata, wherein the metadata defines an MMR prediction model and corresponding prediction parameters of the MMR prediction model;
applying the first image and the prediction parameters to the MMR prediction model to compute an output image representing a predicted value of a second image, wherein the second image has a different dynamic range than the dynamic range of the first image.
10. The method of supplementary note 9, wherein the MMR model is at least one of a first order MMR model, a second order MMR model, a third order MMR model, a first order MMR model with cross multiplication, a second order MMR model with cross multiplication, or a third order MMR model with cross multiplication.
11. The method according to supplementary note 10, wherein any one of the MMR models further includes prediction parameters relating to neighboring pixels.
12. The method of supplementary note 9, wherein the first image comprises an SDR image and the second image comprises a VDR image.
13. An apparatus comprising a processor and configured to perform any of the methods described in supplementary notes 1-12.
14. A computer-readable storage medium storing computer-executable instructions for performing the method according to any one of supplementary notes 1 to 12.
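Supplementary notes 7 and 8 above (solving the prediction parameters by minimizing the mean square error, and iterating over candidate models) can be illustrated with a minimal Python sketch. It is a sketch under stated assumptions rather than the method of the specification: the least-squares problem is solved through the normal equations with plain Gauss-Jordan elimination, candidates are tried in the order given, and the lowest-error model is returned when no candidate meets the threshold. All function names are hypothetical, and the regressor ordering is an assumption.

```python
def features(s, order, cross):
    """Regressors for one pixel: constant, then s^k (and sc^k if cross), k=1..order."""
    s1, s2, s3 = s
    sc = [s1 * s2, s1 * s3, s2 * s3, s1 * s2 * s3]  # cross products sc_i
    row = [1.0]
    for k in range(1, order + 1):
        row += [x ** k for x in s]
        if cross:
            row += [x ** k for x in sc]
    return row


def solve(a, b):
    """Solve a x = b (b has several columns) by Gauss-Jordan with partial pivoting.
    Assumes a is non-singular; no regularization is applied."""
    n = len(a)
    m = [ra[:] + rb[:] for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [[m[r][n + k] / m[r][r] for k in range(len(b[0]))] for r in range(n)]


def fit(sdr, vdr, order, cross):
    """Least-squares coefficients via the normal equations (X^T X) M = X^T V."""
    x = [features(s, order, cross) for s in sdr]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xtv = [[sum(r[i] * v[k] for r, v in zip(x, vdr)) for k in range(3)]
           for i in range(p)]
    return solve(xtx, xtv)


def predict(sdr, coeffs, order, cross):
    """Apply fitted coefficients to every pixel."""
    out = []
    for s in sdr:
        feats = features(s, order, cross)
        out.append([sum(f * c[k] for f, c in zip(feats, coeffs)) for k in range(3)])
    return out


def mse(a, b):
    """Mean square error over all pixels and color components."""
    return sum((x - y) ** 2 for u, v in zip(a, b) for x, y in zip(u, v)) / (3 * len(a))


def select_model(sdr, vdr, candidates, threshold):
    """Try candidates in order; stop once the error falls below the threshold.
    If none qualifies, the lowest-error candidate is returned (an assumption)."""
    best = None
    for order, cross in candidates:
        coeffs = fit(sdr, vdr, order, cross)
        err = mse(vdr, predict(sdr, coeffs, order, cross))
        if best is None or err < best[0]:
            best = (err, order, cross, coeffs)
        if err < threshold:
            break
    return best
```

For example, `select_model(sdr, vdr, [(1, False), (1, True), (2, True)], 1e-6)` tries a first order model, then a first order model with cross multiplication, then a second order model with cross multiplication. The normal-equation solve fails on degenerate input (for instance, too few distinct pixels), so a production implementation would more likely rely on a numerically robust least-squares routine.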
Second set of supplementary notes:
1. A method, comprising:
providing a plurality of multi-channel, multiple regression (MMR) prediction models, each MMR prediction model adapted to approximate an image having a first dynamic range according to
an image having a second dynamic range, and
prediction parameters of the respective MMR prediction models obtained by applying inter-color image prediction;
receiving a first image and a second image, wherein the second image has a different dynamic range than the first image;
selecting a multi-channel, multiple regression (MMR) prediction model from the plurality of MMR models;
determining a value of a prediction parameter of the selected MMR model;
computing an output image approximating the first image based on the second image and the determined values of the prediction parameters applied to the selected MMR prediction model;
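outputting the determined values of the prediction parameters and the calculated output image, wherein the plurality of MMR models comprises a first order multi-channel, multiple regression prediction model incorporating cross multiplication between the color components of each pixel according to the following formula:
$\hat{v}_i = sc_i \tilde{C}^{(1)} + s_i \tilde{M}^{(1)} + n$,
wherein $\hat{v}_i = [\hat{v}_{i1}\ \hat{v}_{i2}\ \hat{v}_{i3}]$ represents the predicted three color components of the ith pixel of the first image,
$s_i = [s_{i1}\ s_{i2}\ s_{i3}]$ represents the three color components of the ith pixel of the second image,
$\tilde{M}^{(1)}$ is a 3 × 3 matrix and $n$ is a 1 × 3 vector according to the following formulas:
$\tilde{M}^{(1)} = \begin{bmatrix} m^{(1)}_{11} & m^{(1)}_{12} & m^{(1)}_{13} \\ m^{(1)}_{21} & m^{(1)}_{22} & m^{(1)}_{23} \\ m^{(1)}_{31} & m^{(1)}_{32} & m^{(1)}_{33} \end{bmatrix}$ and $n = [n_{11}\ n_{12}\ n_{13}]$,
$sc_i = [s_{i1} \cdot s_{i2}\ \ s_{i1} \cdot s_{i3}\ \ s_{i2} \cdot s_{i3}\ \ s_{i1} \cdot s_{i2} \cdot s_{i3}]$, and
$\tilde{C}^{(1)} = \begin{bmatrix} mc^{(1)}_{11} & mc^{(1)}_{12} & mc^{(1)}_{13} \\ mc^{(1)}_{21} & mc^{(1)}_{22} & mc^{(1)}_{23} \\ mc^{(1)}_{31} & mc^{(1)}_{32} & mc^{(1)}_{33} \\ mc^{(1)}_{41} & mc^{(1)}_{42} & mc^{(1)}_{43} \end{bmatrix}$, and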
wherein the prediction parameters of the first order multi-channel, multi-regression prediction model are numerically obtained by minimizing a mean square error between the first image and the output image.
2. The method according to supplementary note 1, wherein the first image comprises a VDR image and the second image comprises an SDR image.
3. The method of supplementary note 1, wherein the selected MMR prediction model is at least one of a first order MMR model, a second order MMR model, a third order MMR model, a first order MMR model with cross multiplication, a second order MMR model with cross multiplication, or a third order MMR model with cross multiplication.
4. The method according to supplementary note 3, wherein any one of the MMR models further includes prediction parameters relating to neighboring pixels.
5. The method according to supplementary note 4, wherein the adjacent pixels include a left adjacent pixel, a right adjacent pixel, an upper adjacent pixel, and a lower adjacent pixel.
6. The method according to supplementary note 2, wherein pixels in the VDR image have more color components than pixels in the SDR image.
7. The method of supplementary note 1, wherein selecting the MMR predictive model from the plurality of MMR predictive models further comprises an iterative selection process comprising:
(a) selecting and applying an initial MMR prediction model;
(b) calculating a residual error between the first image and the output image;
(c) selecting the initial MMR model if the residual error is less than an error threshold and no other MMR prediction model can be selected; otherwise, selecting a new MMR prediction model from the plurality of MMR prediction models, the new MMR prediction model being different from the previously selected MMR prediction model; and returning to step (b).
8. An image decoding method, comprising:
receiving a first image having a first dynamic range;
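receiving metadata, wherein the metadata comprises
a multiple regression (MMR) prediction model adapted to approximate a second image having a second dynamic range according to
the first image, and
prediction parameters of the MMR prediction model obtained by applying inter-color image prediction, the metadata further comprising previously determined values of the prediction parameters; and
applying the first image and the previously determined values of the prediction parameters to the MMR prediction model to compute an output image for approximating the second image, wherein the second dynamic range is different from the first dynamic range, wherein the MMR prediction model is a first order multi-channel, multiple regression prediction model incorporating cross multiplication between the color components of each pixel according to the following formula:
$\hat{v}_i = sc_i \tilde{C}^{(1)} + s_i \tilde{M}^{(1)} + n$,
wherein $\hat{v}_i = [\hat{v}_{i1}\ \hat{v}_{i2}\ \hat{v}_{i3}]$ represents the predicted three color components of the ith pixel of the second image,
$s_i = [s_{i1}\ s_{i2}\ s_{i3}]$ represents the three color components of the ith pixel of the first image,
$\tilde{M}^{(1)}$ is a 3 × 3 matrix and $n$ is a 1 × 3 vector according to the following formulas:
$\tilde{M}^{(1)} = \begin{bmatrix} m^{(1)}_{11} & m^{(1)}_{12} & m^{(1)}_{13} \\ m^{(1)}_{21} & m^{(1)}_{22} & m^{(1)}_{23} \\ m^{(1)}_{31} & m^{(1)}_{32} & m^{(1)}_{33} \end{bmatrix}$ and $n = [n_{11}\ n_{12}\ n_{13}]$,
$sc_i = [s_{i1} \cdot s_{i2}\ \ s_{i1} \cdot s_{i3}\ \ s_{i2} \cdot s_{i3}\ \ s_{i1} \cdot s_{i2} \cdot s_{i3}]$, and
$\tilde{C}^{(1)} = \begin{bmatrix} mc^{(1)}_{11} & mc^{(1)}_{12} & mc^{(1)}_{13} \\ mc^{(1)}_{21} & mc^{(1)}_{22} & mc^{(1)}_{23} \\ mc^{(1)}_{31} & mc^{(1)}_{32} & mc^{(1)}_{33} \\ mc^{(1)}_{41} & mc^{(1)}_{42} & mc^{(1)}_{43} \end{bmatrix}$.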
9. The method according to supplementary note 8, wherein the first order MMR prediction model is extended to a second order MMR prediction model or a third order MMR prediction model with pixel cross multiplication.
10. The method according to supplementary note 8 or 9, wherein the MMR prediction model further comprises prediction parameters relating to neighboring pixels.
11. The method of supplementary note 8, wherein the first image comprises an SDR image and the second image comprises a VDR image.
12. An apparatus comprising a processor and configured to perform any of the methods described in supplementary notes 1-11.
13. A computer-readable storage medium storing computer-executable instructions for performing the method according to any one of supplementary notes 1 to 11.
14. A method, comprising:
providing a plurality of multi-channel, multiple regression (MMR) prediction models, each MMR prediction model adapted to approximate an image having a first dynamic range according to
an image having a second dynamic range, and
prediction parameters of the respective MMR prediction models obtained by applying inter-color image prediction;
receiving a first image and a second image, wherein the second image has a different dynamic range than the first image;
selecting a multi-channel, multiple regression (MMR) prediction model from the plurality of MMR models;
determining a value of a prediction parameter of the selected MMR model;
computing an output image approximating the first image based on the second image and the determined values of the prediction parameters applied to the selected MMR prediction model;
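outputting the determined values of the prediction parameters and the calculated output image, wherein the plurality of MMR models comprises a second-order multi-channel, multiple regression prediction model according to the following formula:
$\hat{v}_i = s_i^2 \tilde{M}^{(2)} + s_i \tilde{M}^{(1)} + n$,
wherein $\hat{v}_i = [\hat{v}_{i1}\ \hat{v}_{i2}\ \hat{v}_{i3}]$ represents the predicted three color components of the ith pixel of the first image,
$s_i = [s_{i1}\ s_{i2}\ s_{i3}]$ represents the three color components of the ith pixel of the second image,
$\tilde{M}^{(1)}$ and $\tilde{M}^{(2)}$ are 3 × 3 matrices and $n$ is a 1 × 3 vector according to the following formulas:
$\tilde{M}^{(1)} = \begin{bmatrix} m^{(1)}_{11} & m^{(1)}_{12} & m^{(1)}_{13} \\ m^{(1)}_{21} & m^{(1)}_{22} & m^{(1)}_{23} \\ m^{(1)}_{31} & m^{(1)}_{32} & m^{(1)}_{33} \end{bmatrix}$, $\tilde{M}^{(2)} = \begin{bmatrix} m^{(2)}_{11} & m^{(2)}_{12} & m^{(2)}_{13} \\ m^{(2)}_{21} & m^{(2)}_{22} & m^{(2)}_{23} \\ m^{(2)}_{31} & m^{(2)}_{32} & m^{(2)}_{33} \end{bmatrix}$,
$n = [n_{11}\ n_{12}\ n_{13}]$,
and $s_i^2 = [s_{i1}^2\ s_{i2}^2\ s_{i3}^2]$, and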
wherein the prediction parameters of the second-order multi-channel, multi-regression prediction model are obtained numerically by minimizing a mean square error between the first image and the output image.
15. The method of supplementary note 14, wherein any one of the MMR models includes prediction parameters relating to neighboring pixels.
16. The method according to supplementary note 15, wherein the adjacent pixels include a left adjacent pixel, a right adjacent pixel, an upper adjacent pixel, and a lower adjacent pixel.
17. The method of supplementary note 14, wherein selecting the MMR predictive model from the plurality of MMR predictive models further comprises an iterative selection process comprising:
(a) selecting and applying an initial MMR prediction model;
(b) calculating a residual error between the first image and the output image;
(c) selecting an initial MMR model if the residual error is less than an error threshold and no other MMR model can be selected; otherwise, selecting a new MMR prediction model from the plurality of MMR prediction models, the new MMR prediction model being different from the previously selected MMR prediction model; and returning to step (b).
18. An image decoding method, comprising:
receiving a first image having a first dynamic range;
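receiving metadata, wherein the metadata comprises
a multiple regression (MMR) prediction model adapted to approximate a second image having a second dynamic range according to
the first image, and
prediction parameters of the MMR prediction model obtained by applying inter-color image prediction, the metadata further comprising previously determined values of the prediction parameters; and
applying the first image and previously determined values of the prediction parameters to the MMR prediction model to compute an output image for approximating the second image, wherein the second dynamic range is different from the first dynamic range, wherein the MMR prediction model is a second order multi-channel, multiple regression prediction model according to the following formula:
$\hat{v}_i = s_i^2 \tilde{M}^{(2)} + s_i \tilde{M}^{(1)} + n$,
wherein $\hat{v}_i = [\hat{v}_{i1}\ \hat{v}_{i2}\ \hat{v}_{i3}]$ represents the predicted three color components of the ith pixel of the second image,
$s_i = [s_{i1}\ s_{i2}\ s_{i3}]$ represents the three color components of the ith pixel of the first image,
$\tilde{M}^{(1)}$ and $\tilde{M}^{(2)}$ are 3 × 3 matrices and $n$ is a 1 × 3 vector according to the following formulas:
$\tilde{M}^{(1)} = \begin{bmatrix} m^{(1)}_{11} & m^{(1)}_{12} & m^{(1)}_{13} \\ m^{(1)}_{21} & m^{(1)}_{22} & m^{(1)}_{23} \\ m^{(1)}_{31} & m^{(1)}_{32} & m^{(1)}_{33} \end{bmatrix}$, $\tilde{M}^{(2)} = \begin{bmatrix} m^{(2)}_{11} & m^{(2)}_{12} & m^{(2)}_{13} \\ m^{(2)}_{21} & m^{(2)}_{22} & m^{(2)}_{23} \\ m^{(2)}_{31} & m^{(2)}_{32} & m^{(2)}_{33} \end{bmatrix}$,
$n = [n_{11}\ n_{12}\ n_{13}]$,
and $s_i^2 = [s_{i1}^2\ s_{i2}^2\ s_{i3}^2]$, and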
wherein the prediction parameters of the second-order multi-channel, multi-regression prediction model are obtained numerically by minimizing a mean square error between the first image and the output image.
19. The method of supplementary note 18, wherein the second order MMR prediction model is extended to either a second order MMR prediction model or a third order MMR prediction model with pixel cross multiplication.
20. The method according to supplementary note 18 or 19, wherein the MMR prediction model further comprises prediction parameters relating to neighboring pixels.
21. An apparatus comprising a processor and configured to perform any one of the methods described in supplementary notes 14 to 17.
22. A computer-readable storage medium storing computer-executable instructions for performing the method according to any one of supplementary notes 14 to 17.
Claims (15)
1. A method of approximating, using a processor, an image having a first dynamic range from an image having a second dynamic range, the method comprising:
receiving a first image and a second image, wherein the second image has a different dynamic range than the first image;
selecting an MMR model from one or more multi-channel, multi-regression MMR prediction models;
determining a value of a prediction parameter of the selected MMR model;
computing an output image that approximates the first image based on the second image and the determined values of the prediction parameters of the selected MMR prediction model, wherein the pixel value of at least one color component in the output image is computed based on the pixel values of at least two color components in the second image; and
outputting the determined values of the prediction parameters and the calculated output image;
wherein selecting the MMR model from the one or more MMR prediction models further comprises an iterative selection process comprising:
(a) selecting and applying an initial MMR prediction model;
(b) calculating a residual error between the first image and the output image;
(c) selecting the initial MMR model if the residual error is less than an error threshold and no other MMR prediction model can be selected; otherwise, selecting a new MMR prediction model from a plurality of MMR prediction models, the new MMR prediction model being different from a previously selected MMR prediction model; and returning to step (b).
2. The method as recited in claim 1, wherein the first image comprises a Visual Dynamic Range (VDR) image and the second image comprises a Standard Dynamic Range (SDR) image.
3. The method of claim 1, wherein the selected MMR prediction model is at least one of a first order MMR model, a second order MMR model, a third order MMR model, a first order MMR model with cross multiplication, a second order MMR model with cross multiplication, or a third order MMR model with cross multiplication.
4. A method according to claim 3, wherein any one of the MMR models used to predict pixels of the output image further comprises prediction parameters relating to neighbouring pixels of the corresponding pixel in the second image.
5. The method of claim 4, wherein the neighboring pixels comprise a left neighboring pixel, a right neighboring pixel, a top neighboring pixel, and a bottom neighboring pixel of the respective pixel in the second image.
6. The method of claim 2, wherein the pixels in the first image have more color components than the pixels in the second image.
7. The method of claim 1, wherein determining values of prediction parameters of the selected MMR prediction model further comprises applying a numerical method that minimizes a mean square error between the first image and the output image.
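8. The method of claim 3, wherein the second order MMR model comprises a prediction model according to the following formula:
$\hat{v}_i = s_i^2 \tilde{M}^{(2)} + s_i \tilde{M}^{(1)} + n$,
wherein $\hat{v}_i = [\hat{v}_{i1}\ \hat{v}_{i2}\ \hat{v}_{i3}]$ represents the predicted three color components of the ith pixel of the first image,
$s_i = [s_{i1}\ s_{i2}\ s_{i3}]$ represents the three color components of the ith pixel of the second image,
$s_i^2 = [s_{i1}^2\ s_{i2}^2\ s_{i3}^2]$ represents the squared values of the three color components of the ith pixel of the second image, $\tilde{M}^{(1)}$ and $\tilde{M}^{(2)}$ are 3 × 3 prediction parameter matrices, and $n$ is a 1 × 3 prediction parameter vector, and
wherein the second order MMR model further comprises cross multiplication according to the following formula:
$\hat{v}_i = sc_i^2 \tilde{C}^{(2)} + s_i^2 \tilde{M}^{(2)} + sc_i \tilde{C}^{(1)} + s_i \tilde{M}^{(1)} + n$,
wherein $sc_i = [s_{i1} \cdot s_{i2}\ \ s_{i1} \cdot s_{i3}\ \ s_{i2} \cdot s_{i3}\ \ s_{i1} \cdot s_{i2} \cdot s_{i3}]$,
$sc_i^2 = [s_{i1}^2 \cdot s_{i2}^2\ \ s_{i1}^2 \cdot s_{i3}^2\ \ s_{i2}^2 \cdot s_{i3}^2\ \ s_{i1}^2 \cdot s_{i2}^2 \cdot s_{i3}^2]$, and $\tilde{C}^{(1)}$ and $\tilde{C}^{(2)}$ are 4 × 3 prediction parameter matrices.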
9. An image decoding method, comprising:
receiving a first image having a first dynamic range;
receiving metadata, wherein the metadata comprises a multiple regression MMR prediction model adapted to approximate a second image having a second dynamic range from the first image and prediction parameters of the MMR prediction model, the metadata further comprising previously determined values of the prediction parameters, and
applying the first image and the previously determined values of the prediction parameters to the MMR prediction model to compute an output image for approximating the second image, wherein the second dynamic range is different from the first dynamic range, and wherein a pixel value of at least one color component in the output image is computed based on pixel values of at least two color components in the first image,
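wherein the MMR model is a second order MMR model comprising a prediction model according to the following formula:
$\hat{v}_i = s_i^2 \tilde{M}^{(2)} + s_i \tilde{M}^{(1)} + n$,
wherein $\hat{v}_i = [\hat{v}_{i1}\ \hat{v}_{i2}\ \hat{v}_{i3}]$ represents the predicted three color components of the ith pixel of the second image,
$s_i = [s_{i1}\ s_{i2}\ s_{i3}]$ represents the three color components of the ith pixel of the first image,
$s_i^2 = [s_{i1}^2\ s_{i2}^2\ s_{i3}^2]$ represents the squared values of the three color components of the ith pixel of the first image, $\tilde{M}^{(1)}$ and $\tilde{M}^{(2)}$ are 3 × 3 prediction parameter matrices, and $n$ is a 1 × 3 prediction parameter vector, and
wherein the second order MMR model further comprises cross multiplication according to the following formula:
$\hat{v}_i = sc_i^2 \tilde{C}^{(2)} + s_i^2 \tilde{M}^{(2)} + sc_i \tilde{C}^{(1)} + s_i \tilde{M}^{(1)} + n$,
wherein $sc_i = [s_{i1} \cdot s_{i2}\ \ s_{i1} \cdot s_{i3}\ \ s_{i2} \cdot s_{i3}\ \ s_{i1} \cdot s_{i2} \cdot s_{i3}]$,
$sc_i^2 = [s_{i1}^2 \cdot s_{i2}^2\ \ s_{i1}^2 \cdot s_{i3}^2\ \ s_{i2}^2 \cdot s_{i3}^2\ \ s_{i1}^2 \cdot s_{i2}^2 \cdot s_{i3}^2]$, and $\tilde{C}^{(1)}$ and $\tilde{C}^{(2)}$ are 4 × 3 prediction parameter matrices.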
10. The method of claim 9, wherein the first image comprises an SDR image and the second image comprises a VDR image.
11. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for performing the method of claim 1 using one or more processors.
12. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for performing the method of claim 9 using one or more processors.
13. A method of approximating, using a processor, an image having a first dynamic range from an image having a second dynamic range, the method comprising:
receiving a first image and a second image, wherein the second image has a lower dynamic range than the first image;
selecting an MMR model from one or more multi-channel, multi-regression MMR prediction models;
determining a value of a prediction parameter of the selected MMR model;
computing an output image that approximates the first image based on the second image and the determined values of the prediction parameters of the selected MMR prediction model, wherein the pixel value of at least one color component in the output image is computed based on the pixel values of at least two color components in the second image; and
outputting the determined values of the prediction parameters and the calculated output image;
wherein selecting the MMR model from the one or more MMR prediction models further comprises an iterative selection process comprising:
(a) selecting and applying an initial MMR prediction model;
(b) calculating a residual error between the first image and the output image;
(c) selecting the initial MMR model if the residual error is less than an error threshold and no other MMR prediction model can be selected; otherwise, selecting a new MMR prediction model from a plurality of MMR prediction models, the new MMR prediction model being different from a previously selected MMR prediction model; and returning to step (b).
14. A method of approximating, using a processor, an image having a first dynamic range from an image having a second dynamic range, the method comprising:
receiving a first image and a second image, wherein the second image has a lower dynamic range than the first image;
selecting an MMR model from one or more multi-channel, multi-regression MMR prediction models;
determining a value of a prediction parameter of the selected MMR model;
computing an output image that approximates the first image based on the second image and the determined values of the prediction parameters of the selected MMR prediction model, wherein the pixel value of at least one color component in the output image is computed based on the pixel values of at least two color components in the second image; and
outputting the determined values of the prediction parameters and the calculated output image;
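wherein the MMR model is a second order MMR model comprising a prediction model according to the following formula:
$\hat{v}_i = s_i^2 \tilde{M}^{(2)} + s_i \tilde{M}^{(1)} + n$,
wherein $\hat{v}_i = [\hat{v}_{i1}\ \hat{v}_{i2}\ \hat{v}_{i3}]$ represents the predicted three color components of the ith pixel of the first image,
$s_i = [s_{i1}\ s_{i2}\ s_{i3}]$ represents the three color components of the ith pixel of the second image,
$s_i^2 = [s_{i1}^2\ s_{i2}^2\ s_{i3}^2]$ represents the squared values of the three color components of the ith pixel of the second image, $\tilde{M}^{(1)}$ and $\tilde{M}^{(2)}$ are 3 × 3 prediction parameter matrices, and $n$ is a 1 × 3 prediction parameter vector, and
wherein the second order MMR model further comprises cross multiplication according to the following formula:
$\hat{v}_i = sc_i^2 \tilde{C}^{(2)} + s_i^2 \tilde{M}^{(2)} + sc_i \tilde{C}^{(1)} + s_i \tilde{M}^{(1)} + n$,
wherein $sc_i = [s_{i1} \cdot s_{i2}\ \ s_{i1} \cdot s_{i3}\ \ s_{i2} \cdot s_{i3}\ \ s_{i1} \cdot s_{i2} \cdot s_{i3}]$,
$sc_i^2 = [s_{i1}^2 \cdot s_{i2}^2\ \ s_{i1}^2 \cdot s_{i3}^2\ \ s_{i2}^2 \cdot s_{i3}^2\ \ s_{i1}^2 \cdot s_{i2}^2 \cdot s_{i3}^2]$, and $\tilde{C}^{(1)}$ and $\tilde{C}^{(2)}$ are 4 × 3 prediction parameter matrices.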
15. An image decoding method, comprising:
receiving a first image having a first dynamic range;
receiving metadata, wherein the metadata comprises a multiple regression MMR prediction model adapted to approximate a second image having a second dynamic range from the first image and prediction parameters of the MMR prediction model, the metadata further comprising previously determined values of the prediction parameters based on the first and second images, and
applying the first image and the previously determined values of the prediction parameters to the MMR prediction model to compute an output image for approximating the second image, wherein the second dynamic range is higher than the first dynamic range, and wherein a pixel value of at least one color component in the output image is computed based on pixel values of at least two color components in the first image,
wherein the MMR model is a second order MMR model comprising a prediction model according to the following formula:
$\hat{v}_i = s_i^2 \tilde{M}^{(2)} + s_i \tilde{M}^{(1)} + n$,
wherein $\hat{v}_i = [\hat{v}_{i1}\ \hat{v}_{i2}\ \hat{v}_{i3}]$ represents the predicted three color components of the ith pixel of the second image,
$s_i = [s_{i1}\ s_{i2}\ s_{i3}]$ represents the three color components of the ith pixel of the first image,
$s_i^2 = [s_{i1}^2\ s_{i2}^2\ s_{i3}^2]$ represents the squared values of the three color components of the ith pixel of the first image, $\tilde{M}^{(1)}$ and $\tilde{M}^{(2)}$ are 3 × 3 prediction parameter matrices, and $n$ is a 1 × 3 prediction parameter vector, and
wherein the second order MMR model further comprises cross multiplication according to the following formula:
$\hat{v}_i = sc_i^2 \tilde{C}^{(2)} + s_i^2 \tilde{M}^{(2)} + sc_i \tilde{C}^{(1)} + s_i \tilde{M}^{(1)} + n$,
wherein $sc_i = [s_{i1} \cdot s_{i2}\ \ s_{i1} \cdot s_{i3}\ \ s_{i2} \cdot s_{i3}\ \ s_{i1} \cdot s_{i2} \cdot s_{i3}]$,
$sc_i^2 = [s_{i1}^2 \cdot s_{i2}^2\ \ s_{i1}^2 \cdot s_{i3}^2\ \ s_{i2}^2 \cdot s_{i3}^2\ \ s_{i1}^2 \cdot s_{i2}^2 \cdot s_{i3}^2]$, and $\tilde{C}^{(1)}$ and $\tilde{C}^{(2)}$ are 4 × 3 prediction parameter matrices.
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US61/475,359 | 2011-04-14 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1238458A1 (en) | 2018-04-27 |
| HK1238458B (en) | 2019-07-19 |