WO2024009962A1 - CCLM prediction unit, video decoding device, and video encoding device
- Publication number
- WO2024009962A1 (PCT/JP2023/024662)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- prediction
- cclm
- unit
- cclm prediction
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- Embodiments of the present invention relate to a CCLM prediction unit, a video decoding device, and a video encoding device.
- This application claims priority based on Japanese Patent Application No. 2022-107511 filed in Japan on July 4, 2022, the contents of which are incorporated herein.
- In order to efficiently transmit or record moving images, a video encoding device that generates encoded data by encoding a moving image and a video decoding device that generates a decoded image by decoding the encoded data are used.
- Specific video encoding methods include, for example, H.264/AVC, HEVC (High-Efficiency Video Coding), and Versatile Video Coding (VVC).
- In such video encoding methods, the images (pictures) that make up a video are managed in a hierarchical structure consisting of slices obtained by dividing a picture, coding tree units (CTU: Coding Tree Unit) obtained by dividing a slice, coding units (CU: Coding Unit) obtained by dividing a coding tree unit, and transform units (TU: Transform Unit) obtained by dividing a coding unit, and are encoded/decoded for each CU.
- A predicted image is usually generated based on a locally decoded image obtained by encoding/decoding the input image, and the prediction error (sometimes referred to as a "difference image" or "residual image") obtained by subtracting the predicted image from the input image (original image) is encoded.
- Non-patent document 1 can be cited as a recent technique for video encoding and decoding.
- CCLM Cross-component linear model
- CCCM Convolutional cross-component model
- linear prediction parameters are derived using multiple decoded images adjacent to the target block, and the color difference of the target block is predicted from the linear prediction model (CCLM model).
- AHG12 Convolutional cross-component model (CCCM) for intra prediction
- JVET Joint Video Exploration Team
- CCCM processing uses multiple polynomial linear models, so the amount of calculation for parameter derivation is extremely large. Furthermore, since a linear model with seven parameters is derived, i.e., one for the target pixel, four for adjacent pixels, one for a nonlinear element of the target pixel, and one for a bias, the amount of calculation is necessarily large.
- A CCLM prediction unit that generates a predicted image of a chrominance image using a luminance image, comprising:
- a CCLM prediction parameter derivation unit that derives CCLM prediction parameters consisting of a first weight, a second weight, and a first offset value, using a reference pixel of a reference image adjacent to a target block and a pixel adjacent to the reference pixel; and
- a CCLM prediction filter unit that generates a chrominance predicted image using two luminance pixels, namely a target pixel of the target block and a pixel adjacent to the target pixel, together with the CCLM prediction parameters,
- wherein the CCLM prediction filter unit derives the pixel value of a predicted pixel as the sum of the product of the target pixel and the first weight, the product of the adjacent pixel of the target pixel and the second weight, and the first offset value.
- the position of the adjacent pixel in the reference image and the target image is the pixel (x+1, y) to the right of the target pixel (x, y).
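- As an illustration of the three-parameter prediction described above, the following is a minimal sketch and not the implementation of this publication. It assumes a floating-point least-squares fit for the parameter derivation (the passage above does not specify the derivation method), luminance samples already at chrominance resolution, and clamping of x+1 at the right block edge. The names `solve3`, `derive_cclm_params`, and `cclm_filter` are hypothetical.

```python
def solve3(A, v):
    """Solve the 3x3 linear system A x = v with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(A)
    if abs(d) < 1e-9:
        raise ValueError("degenerate reference samples")
    solution = []
    for k in range(3):
        m = [row[:] for row in A]
        for i in range(3):
            m[i][k] = v[i]          # replace column k with the right-hand side
        solution.append(det(m) / d)
    return solution


def derive_cclm_params(ref_luma, ref_luma_adj, ref_chroma):
    """Fit chroma ~ w1 * luma + w2 * adjacent_luma + b on the reference samples.

    Assumption: ordinary least squares via the normal equations.
    """
    A = [[0.0] * 3 for _ in range(3)]
    v = [0.0] * 3
    for l0, l1, c in zip(ref_luma, ref_luma_adj, ref_chroma):
        f = (float(l0), float(l1), 1.0)   # feature vector: pixel, neighbour, bias
        for i in range(3):
            v[i] += f[i] * c
            for j in range(3):
                A[i][j] += f[i] * f[j]
    w1, w2, b = solve3(A, v)
    return w1, w2, b


def cclm_filter(luma, w1, w2, b):
    """Predict chroma from luma[y][x] and its right neighbour luma[y][x + 1]."""
    height, width = len(luma), len(luma[0])
    pred = []
    for y in range(height):
        row = []
        for x in range(width):
            xr = min(x + 1, width - 1)    # assumption: clamp at the block edge
            row.append(w1 * luma[y][x] + w2 * luma[y][xr] + b)
        pred.append(row)
    return pred
```

- Whatever derivation method is used, the filter evaluates only two multiplications and two additions per predicted pixel, which is the source of the reduction relative to a seven-parameter model.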
- A video decoding device comprising the CCLM prediction unit according to claim 1 and a parameter decoding unit that decodes, from encoded data, an index indicating the position of an adjacent pixel, wherein the number of CCLM prediction parameters is fixed regardless of the index, the CCLM prediction parameter derivation unit derives the CCLM prediction parameters by switching the position of the pixel adjacent to the reference pixel according to the index, and the CCLM prediction filter unit derives the predicted image by switching the adjacent pixel according to the index.
- A video encoding device comprising the CCLM prediction unit according to claim 1 and a parameter encoding unit that encodes an index indicating the position of an adjacent pixel derived from an image, wherein the number of CCLM prediction parameters is fixed regardless of the index, the CCLM prediction parameter derivation unit derives the CCLM prediction parameters by switching the position of the pixel adjacent to the reference pixel according to the index, and the CCLM prediction filter unit derives the predicted image by switching the adjacent pixel according to the index.
- The parameter decoding unit decodes the index from the sequence header, slice header, or CTU header of the encoded data and derives, from the encoded data, a flag indicating whether to perform CCLM prediction, and the CCLM prediction filter unit derives the predicted image of the target block.
- A CCLM prediction unit that generates a predicted image of a chrominance image using a luminance image, comprising a CCLM prediction parameter derivation unit that derives CCLM prediction parameters and a CCLM prediction filter unit that generates the predicted image, wherein the CCLM prediction unit includes a linear prediction unit that derives two parameters, a multiplication coefficient and a bias coefficient, and a linear prediction unit that derives three or more parameters.
- The CCLM prediction unit supports a multi-model mode, which classifies luminance signals into groups based on their magnitude and derives multiple sets of CCLM prediction parameters according to the classification, and a single-model mode, which derives one set of CCLM prediction parameters. In the multi-model case, three or more parameters are not derived.
- The CCLM prediction unit decodes a syntax element indicating whether to classify into two groups, and if the syntax element indicates that one group is used, further decodes a syntax element indicating whether or not three or more parameters are to be derived.
- Alternatively, the CCLM prediction unit decodes a syntax element indicating whether or not three or more parameters are to be derived, and if the syntax element indicates two parameters, further decodes a syntax element indicating whether or not to classify into two groups.
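- The two signalling orders described above can be sketched as follows. This is an illustrative sketch only: the flag names `multi_model_flag` and `multi_param_flag`, the one-bit coding of each flag, and the helper `make_reader` are assumptions and do not come from the syntax tables of this publication.

```python
def make_reader(bits):
    """Return a function that yields the given flag values one at a time."""
    it = iter(bits)
    return lambda: next(it)


def decode_group_flag_first(read_flag):
    """Order 1: decode the two-group flag first.

    The three-or-more-parameter flag is read only when one group is used.
    """
    multi_model_flag = read_flag()     # hypothetical: 1 = classify into two groups
    multi_param_flag = 0
    if not multi_model_flag:
        multi_param_flag = read_flag() # hypothetical: 1 = three or more parameters
    return multi_model_flag, multi_param_flag


def decode_param_flag_first(read_flag):
    """Order 2: decode the three-or-more-parameter flag first.

    The two-group flag is read only when two parameters are used.
    """
    multi_param_flag = read_flag()
    multi_model_flag = 0
    if not multi_param_flag:
        multi_model_flag = read_flag()
    return multi_model_flag, multi_param_flag
```

- In both orders the second flag is skipped whenever the first one is set, so the combination of two groups with three or more parameters is never signalled.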
- derivation of linear prediction parameters is simplified in CCCM prediction.
- FIG. 1 is a schematic diagram showing the configuration of an image transmission system according to the present embodiment.
- FIG. 2 is a diagram showing the hierarchical structure of data in an encoded stream.
- FIG. 3 is a schematic diagram showing the types (mode numbers) of intra prediction modes.
- FIG. 4 is a schematic diagram showing the configuration of a moving image decoding device.
- FIG. 5 is a schematic diagram showing the configuration of an intra prediction parameter derivation unit.
- FIG. 6 is a diagram showing the reference area used for intra prediction.
- FIG. 2 is a diagram showing the configuration of an intra predicted image generation section.
- (a) is a block diagram showing an example of the configuration of a CCLM prediction unit according to an embodiment of the present invention, and
- (b) is a diagram showing a method for deriving IntraPredModeC.
- FIG. 1 is a block diagram showing the configuration of a video encoding device. It is a schematic diagram showing the composition of an intra prediction parameter derivation part.
- FIG. 3 is a diagram showing the positions of reference pixels according to an embodiment of the present invention.
- FIG. 2 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- FIG. 2 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- FIG. 2 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- FIG. 2 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- 3 is a flowchart showing the operation of a CCLM prediction unit according to an embodiment of the present invention.
- 3 is a flowchart showing the operation of a CCLM prediction unit according to an embodiment of the present invention.
- FIG. 1 is a schematic diagram showing the configuration of an image transmission system 1 according to the present embodiment.
- the image transmission system 1 is a system that transmits an encoded stream obtained by encoding an image to be encoded, decodes the transmitted encoded stream, and displays the image.
- The image transmission system 1 includes a video encoding device (image encoding device) 11, a network 21, a video decoding device (image decoding device) 31, and a video display device (image display device) 41.
- An image T is input to the video encoding device 11.
- the network 21 transmits the encoded stream Te generated by the video encoding device 11 to the video decoding device 31.
- the network 21 is the Internet, a wide area network (WAN), a local area network (LAN), or a combination thereof.
- The network 21 is not necessarily a bidirectional communication network, but may be a unidirectional communication network that transmits broadcast waves such as terrestrial digital broadcasting and satellite broadcasting. Further, the network 21 may be replaced by a storage medium on which the encoded stream Te is recorded, such as a DVD (Digital Versatile Disc: registered trademark) or a BD (Blu-ray Disc: registered trademark).
- the video decoding device 31 decodes each encoded stream Te transmitted by the network 21, and generates one or more decoded images Td.
- the moving image display device 41 displays all or part of one or more decoded images Td generated by the moving image decoding device 31.
- The moving image display device 41 includes a display device such as a liquid crystal display or an organic EL (Electro-luminescence) display. Display forms include stationary, mobile, and HMD types. Further, when the video decoding device 31 has high processing capacity, a high quality image is displayed, and when it has only a lower processing capacity, an image that does not require high processing capacity or display capacity is displayed.
- x?y:z is a ternary operator that takes y when x is true (other than 0) and z when x is false (0).
- Abs(a) is a function that returns the absolute value of a.
- Int(a) is a function that returns the integer value of a.
- floor(a) is a function that returns the largest integer less than or equal to a.
- ceil(a) is a function that returns the smallest integer greater than or equal to a.
- a/d represents the division of a by d (with the fractional part discarded).
- a^b represents a raised to the power of b.
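- For readers who want to check the notation, the operators listed above can be mirrored in a few lines of Python. This is a sketch under two assumptions that the text does not settle: `Int(a)` and `a/d` are taken to truncate toward zero (as integer conversion and division do in C), which differs from `floor` for negative values. The function names `ternary`, `div`, and `power` are hypothetical.

```python
import math


def ternary(x, y, z):
    """x ? y : z, where any nonzero x counts as true."""
    return y if x != 0 else z


def Abs(a):
    """Absolute value of a."""
    return abs(a)


def Int(a):
    """Integer value of a (assumption: truncation toward zero)."""
    return int(a)


def floor(a):
    """Largest integer less than or equal to a."""
    return math.floor(a)


def ceil(a):
    """Smallest integer greater than or equal to a."""
    return math.ceil(a)


def div(a, d):
    """a / d with the fractional part discarded (assumption: toward zero)."""
    q = abs(a) // abs(d)
    return q if (a < 0) == (d < 0) else -q


def power(a, b):
    """a raised to the power of b."""
    return a ** b
```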
- FIG. 2 is a diagram showing the hierarchical structure of data in the encoded stream Te.
- the encoded stream Te exemplarily includes a sequence and a plurality of pictures that constitute the sequence.
- (a) to (f) in FIG. 2 respectively show an encoded video sequence that defines the sequence SEQ, an encoded picture that defines the picture PICT, an encoded slice that defines the slice S, encoded slice data that defines the slice data, a coding tree unit included in the encoded slice data, and a coding unit included in the coding tree unit.
- In the encoded video sequence, a set of data that the moving image decoding device 31 refers to in order to decode the sequence SEQ to be processed is defined.
- The sequence SEQ includes a video parameter set VPS (Video Parameter Set), a sequence parameter set SPS (Sequence Parameter Set), a picture parameter set PPS (Picture Parameter Set), a picture PICT, and supplemental enhancement information SEI (Supplemental Enhancement Information).
- The video parameter set VPS defines a set of encoding parameters common to multiple video images and a set of encoding parameters related to the multiple layers and individual layers included in a video image.
- The sequence parameter set SPS defines a set of encoding parameters that the video decoding device 31 refers to in order to decode the target sequence. For example, the width and height of the picture are defined. Note that a plurality of SPSs may exist. In that case, one of the multiple SPSs is selected from the PPS.
- the picture parameter set PPS defines a set of encoding parameters that the video decoding device 31 refers to in order to decode each picture in the target sequence. For example, it includes a reference value (pic_init_qp_minus26) for the quantization width used for picture decoding. Note that multiple PPSs may exist. In that case, one of the plurality of PPSs is selected from each picture in the target sequence.
- In the encoded picture, a set of data that the video decoding device 31 refers to in order to decode the picture PICT to be processed is defined. As shown in FIG. 2(b), the picture PICT includes slices 0 to NS-1 (NS is the total number of slices included in the picture PICT).
- In the encoded slice, a set of data that the video decoding device 31 refers to in order to decode the slice S to be processed is defined.
- a slice includes a slice header and slice data, as shown in FIG. 2(c).
- the slice header includes a group of encoding parameters that the video decoding device 31 refers to in order to determine the decoding method for the target slice.
- Slice type designation information (slice_type) that designates the slice type is an example of an encoding parameter included in the slice header.
- Slice types that can be specified by the slice type designation information include (1) an I slice that uses only intra prediction during encoding, (2) a P slice that uses unidirectional prediction or intra prediction during encoding, (3) Examples include B slices that use unidirectional prediction, bidirectional prediction, or intra prediction during encoding.
- inter prediction is not limited to uni-prediction or bi-prediction, and a predicted image may be generated using more reference pictures.
- Hereinafter, P and B slices refer to slices that include blocks for which inter prediction can be used.
- the slice header may include a reference to the picture parameter set PPS (pic_parameter_set_id).
- the encoded slice data defines a set of data that the video decoding device 31 refers to in order to decode the slice data to be processed.
- the slice data includes CTUs, as shown in FIG. 2(d).
- a CTU is a block of fixed size (for example, 64x64) that constitutes a slice, and is also called a largest coding unit (LCU).
- A CTU is divided into coding units CU, which are the basic units of the encoding process, by recursive quad tree partitioning (QT (Quad Tree) partitioning), binary tree partitioning (BT (Binary Tree) partitioning), or ternary tree partitioning (TT (Ternary Tree) partitioning).
- the combination of BT partitioning and TT partitioning is called multi-tree partitioning (MT (Multi Tree) partitioning).
- a tree-structured node obtained by recursive quadtree partitioning is called a coding node.
- An intermediate node of a quadtree, a binary tree, or a ternary tree is a coding node, and the CTU itself is defined as the topmost coding node.
- The CT (coding tree) includes, as CT information, a division flag indicating whether to perform division.
- The CU size may be, for example, 64x64 pixels, 64x32 pixels, 32x64 pixels, 32x32 pixels, 64x16 pixels, 16x64 pixels, 32x16 pixels, 16x32 pixels, 16x16 pixels, 64x8 pixels, or 8x64 pixels.
- the CU includes a CU header CUH, prediction parameters, transformation parameters, quantized transformation coefficients, and the like.
- the prediction mode etc. are defined in the CU header.
- Prediction processing may be performed on a CU basis or on a sub-CU basis by further dividing a CU. If the sizes of the CU and sub-CU are equal, there is one sub-CU in the CU. If the CU is larger than the sub-CU size, the CU is divided into sub-CUs. For example, if the CU is 8x8 and the sub-CU is 4x4, the CU is divided into four sub-CUs, two horizontally and two vertically.
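- The sub-CU division described above amounts to tiling the CU with blocks of the sub-CU size. A minimal sketch follows; the function name `split_into_sub_cus` and the `(x, y, width, height)` tuple layout are hypothetical, and the CU dimensions are assumed to be integer multiples of the sub-CU dimensions.

```python
def split_into_sub_cus(cu_w, cu_h, sub_w, sub_h):
    """Return the sub-CUs of a CU as (x, y, width, height) tuples in raster order."""
    if cu_w % sub_w != 0 or cu_h % sub_h != 0:
        raise ValueError("CU size must be a multiple of the sub-CU size")
    return [(x, y, sub_w, sub_h)
            for y in range(0, cu_h, sub_h)
            for x in range(0, cu_w, sub_w)]


# An 8x8 CU with 4x4 sub-CUs gives two columns and two rows.
sub_cus = split_into_sub_cus(8, 8, 4, 4)
```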
- Intra prediction is prediction within the same picture, and inter prediction refers to prediction processing performed between mutually different pictures (for example, between display times).
- the quantized transform coefficients may be entropy encoded in units of subblocks such as 4x4.
- A predicted image is derived from the prediction parameters associated with a block.
- the prediction parameters include intra prediction and inter prediction parameters.
- the intra prediction parameters are composed of a luminance prediction mode IntraPredModeY and a color difference prediction mode IntraPredModeC.
- FIG. 3 is a schematic diagram showing types (mode numbers) of intra prediction modes. As shown in the figure, there are, for example, 67 types (0 to 66) of intra prediction modes. For example, planar prediction (0), DC prediction (1), Angular prediction (2 to 66). Furthermore, CCLM modes (81 to 83) may be added for color difference.
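- The mode numbering given above can be expressed as a small lookup. This sketch uses only the ranges stated in the text; the function name `classify_intra_mode` and the returned labels are hypothetical.

```python
def classify_intra_mode(mode):
    """Map an intra prediction mode number to its category."""
    if mode == 0:
        return "planar"
    if mode == 1:
        return "dc"
    if 2 <= mode <= 66:
        return "angular"
    if 81 <= mode <= 83:
        return "cclm"      # chrominance only
    raise ValueError(f"mode {mode} is not covered by the ranges in the text")
```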
- Examples of syntax elements for deriving intra prediction parameters include intra_luma_mpm_flag, mpm_idx, mpm_remainder, etc.
- intra_luma_mpm_flag is a flag indicating whether the luminance prediction mode IntraPredModeY of the target block matches an MPM (Most Probable Mode).
- MPM is a prediction mode included in the MPM candidate list mpmCandList[].
- the MPM candidate list is a list that stores candidates that are estimated to have a high probability of being applied to the target block based on the intra prediction modes of adjacent blocks and a predetermined intra prediction mode.
- If intra_luma_mpm_flag is 1, the luminance prediction mode IntraPredModeY of the target block is derived using the MPM candidate list and the index mpm_idx.
- IntraPredModeY = mpmCandList[mpm_idx]
- (REM) If intra_luma_mpm_flag is 0, the luminance prediction mode IntraPredModeY is derived using mpm_remainder. Specifically, an intra prediction mode is selected from among the remaining modes RemIntraPredMode after excluding the intra prediction modes included in the MPM candidate list from all intra prediction modes.
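- The two branches of this derivation can be sketched as follows. This is a simplified illustration of the description above, not the normative procedure of any standard: it assumes 67 luminance modes (0 to 66) and that mpm_remainder indexes the remaining modes in ascending order. The function name `derive_intra_pred_mode_y` and the example candidate list are hypothetical.

```python
NUM_INTRA_MODES = 67  # modes 0 to 66, as in the mode numbering above


def derive_intra_pred_mode_y(intra_luma_mpm_flag, mpm_cand_list,
                             mpm_idx=None, mpm_remainder=None):
    """Derive IntraPredModeY from the MPM syntax elements."""
    if intra_luma_mpm_flag:
        # MPM case: the mode is read directly from the candidate list.
        return mpm_cand_list[mpm_idx]
    # Non-MPM case: exclude the MPM candidates from all modes, then index
    # the remaining modes (assumption: ascending order) with mpm_remainder.
    remaining = [m for m in range(NUM_INTRA_MODES) if m not in mpm_cand_list]
    return remaining[mpm_remainder]


# Hypothetical candidate list for illustration only.
example_mpm = [0, 1, 50, 18, 46, 54]
mode_from_mpm = derive_intra_pred_mode_y(1, example_mpm, mpm_idx=2)
mode_from_rem = derive_intra_pred_mode_y(0, example_mpm, mpm_remainder=16)
```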
- The video decoding device 31 includes an entropy decoding section 301, a parameter decoding section (predicted image decoding device) 302, a loop filter 305, a reference picture memory 306, a prediction parameter memory 307, a predicted image generation section 308, an inverse quantization/inverse transformation section 311, an addition section 312, and a prediction parameter derivation section 320. Note that there is also a configuration in which the loop filter 305 is not included in the video decoding device 31, in accordance with the video encoding device 11 described later.
- the parameter decoding unit 302 further includes a header decoding unit 3020, a CT information decoding unit 3021, and a CU decoding unit 3022 (prediction mode decoding unit), and the CU decoding unit 3022 further includes a TU decoding unit 3024. These may be collectively called a decoding module.
- the header decoding unit 3020 decodes parameter set information such as VPS, SPS, and PPS, and a slice header (slice information) from encoded data.
- CT information decoding section 3021 decodes CT from encoded data.
- CU decoding section 3022 decodes CU from encoded data.
- the TU decoding unit 3024 decodes QP update information (quantization correction value) and quantization prediction error (residual_coding) from encoded data when a prediction error is included in the TU.
- the prediction parameter derivation unit 320 is configured to include an inter prediction parameter derivation unit 303 and an intra prediction parameter derivation unit 304.
- the predicted image generation unit 308 is configured to include an inter predicted image generation unit 309 and an intra predicted image generation unit 310.
- In the following, an example in which CTUs and CUs are used as processing units is described, but the processing is not limited to this example, and processing may be performed in sub-CU units.
- Alternatively, CTU and CU may be read as blocks and sub-CUs as sub-blocks, and processing may be performed in units of blocks or sub-blocks.
- the entropy decoding unit 301 performs entropy decoding on the encoded stream Te input from the outside, and separates and decodes individual codes (syntax elements).
- the separated codes include prediction information for generating a predicted image, prediction errors for generating a difference image, and the like.
- Entropy decoding section 301 outputs the separated codes to parameter decoding section 302.
- Intra prediction parameter deriving section 304 derives an intra prediction parameter, for example, an intra prediction mode IntraPredMode, based on the code input from entropy decoding section 301 and referring to the prediction parameters stored in prediction parameter memory 307.
- the intra prediction parameter derivation unit 304 outputs the derived intra prediction parameters to the predicted image generation unit 308 and stores them in the prediction parameter memory 307.
- the intra prediction parameter deriving unit 304 may derive different intra prediction modes for luminance and color difference.
- FIG. 5 is a schematic diagram showing the configuration of the intra prediction parameter derivation unit 304 of the prediction parameter derivation unit 320.
- the intra prediction parameter derivation unit 304 is configured to include a parameter decoding control unit 3041, a luminance intra prediction parameter derivation unit 3042, and a chrominance intra prediction parameter derivation unit 3043.
- the parameter decoding control unit 3041 instructs the entropy decoding unit 301 to decode the syntax element, and receives the syntax element from the entropy decoding unit 301. If intra_luma_mpm_flag is 1, the parameter decoding control unit 3041 outputs mpm_idx to the MPM parameter deriving unit 30422 in the luminance intra prediction parameter deriving unit 3042. Further, when intra_luma_mpm_flag is 0, the parameter decoding control unit 3041 outputs mpm_remainder to the non-MPM parameter deriving unit 30423 of the luminance intra prediction parameter deriving unit 3042. Further, the parameter decoding control unit 3041 outputs the chrominance intra prediction parameter intra_chroma_pred_mode to the chrominance intra prediction parameter deriving unit 3043.
- the luminance intra prediction parameter derivation unit 3042 is configured to include an MPM candidate list derivation unit 30421, an MPM parameter derivation unit 30422, and a non-MPM parameter derivation unit 30423 (derivation unit).
- The MPM parameter derivation unit 30422 refers to mpm_idx and the MPM candidate list mpmCandList[] derived by the MPM candidate list derivation unit 30421, derives the luminance prediction mode IntraPredModeY, and outputs it to the intra predicted image generation unit 310.
- the non-MPM parameter derivation unit 30423 derives IntraPredModeY from the MPM candidate list mpmCandList[] and mpm_remainder, and outputs it to the intra predicted image generation unit 310.
- the chrominance intra prediction parameter derivation unit 3043 derives the chrominance prediction mode IntraPredModeC from intra_chroma_pred_mode, and outputs it to the intra prediction image generation unit 310.
- the loop filter 305 is a filter provided in the encoding loop, and is a filter that removes block distortion and ringing distortion and improves image quality.
- the loop filter 305 applies filters such as a deblocking filter, a sample adaptive offset (SAO), and an adaptive loop filter (ALF) to the decoded image of the CU generated by the addition unit 312.
- the reference picture memory 306 stores the decoded image of the CU generated by the addition unit 312 in a predetermined position for each target picture and target CU.
- the prediction parameter memory 307 stores prediction parameters in a predetermined position for each CTU or CU to be decoded. Specifically, the prediction parameter memory 307 stores parameters decoded by the parameter decoding unit 302, parameters derived by the prediction parameter derivation unit 320, and the like.
- The predicted image generation unit 308 receives the parameters and the like derived by the prediction parameter derivation unit 320.
- the predicted image generation unit 308 also reads a reference picture from the reference picture memory 306.
- the predicted image generation unit 308 generates a predicted image of a block or subblock using the prediction parameter and the read reference picture (reference picture block) in the prediction mode indicated by the prediction mode predMode.
- the reference picture block is a set of pixels on a reference picture (usually called a block because it is rectangular), and is an area to be referenced to generate a predicted image.
- the intra prediction image generation unit 310 performs intra prediction using the intra prediction parameters input from the intra prediction parameter deriving unit 304 and the reference pixels read from the reference picture memory 306.
- the intra predicted image generation unit 310 reads adjacent blocks on the target picture within a predetermined range from the target block from the reference picture memory 306.
- the predetermined range is adjacent blocks to the left, upper left, upper, and upper right of the target block, and the reference area differs depending on the intra prediction mode.
- the intra predicted image generation unit 310 generates a predicted image of the target block by referring to the read decoded pixel value and the prediction mode indicated by IntraPredMode.
- the intra predicted image generation unit 310 outputs the generated predicted image of the block to the addition unit 312.
- a decoded peripheral area adjacent to (close to) the prediction target block is set as the reference area R. Then, a predicted image is generated by extrapolating pixels on the reference area R in a specific direction.
- The reference area R is an L-shaped area (for example, the area indicated by the pixels marked with hatched circles in FIG. 6(a)).
- The intra predicted image generation unit 310 includes a prediction target block setting unit 3101, an unfiltered reference image setting unit 3102 (first reference image setting unit), a filtered reference image setting unit 3103 (second reference image setting unit), a prediction unit 3104 (intra prediction unit), and a predicted image correction unit 3105 (predicted image correction unit, filter switching unit, weighting coefficient changing unit).
- The intra prediction unit 3104 generates a tentative predicted image (pre-correction predicted image) of the prediction target block and outputs it to the predicted image correction unit 3105.
- the predicted image correction unit 3105 corrects the temporary predicted image according to the intra prediction mode, generates a predicted image (corrected predicted image), and outputs it.
- Each unit included in the intra predicted image generation unit 310 will be described below.
- the prediction target block setting unit 3101 sets the target CU as the prediction target block, and outputs information regarding the prediction target block (prediction target block information).
- the prediction target block information includes at least the size, position, and index indicating whether the prediction target block is based on luminance or color difference.
- the unfiltered reference image setting unit 3102 sets the adjacent surrounding area of the prediction target block as the reference area R based on the size and position of the prediction target block. Subsequently, each pixel value (unfiltered reference image, boundary pixel) in the reference area R is set to each decoded pixel value at the corresponding position on the reference picture memory 306.
- The row r[x][-1] of decoded pixels adjacent to the upper side of the prediction target block and the column r[-1][y] of decoded pixels adjacent to the left side of the prediction target block, shown in FIG. 6(a), are the unfiltered reference image.
- The filtered reference image setting unit 3103 applies a reference pixel filter (first filter) to the unfiltered reference image according to the intra prediction mode, and derives the filtered reference image s[x][y] at each position (x, y) on the reference area R. Specifically, a low-pass filter is applied to the unfiltered reference image at position (x, y) and its surroundings, and a filtered reference image (FIG. 6(b)) is derived. Note that the low-pass filter need not be applied for all intra prediction modes; it may be applied for only some of them.
- The filter applied to the unfiltered reference image on the reference area R in the filtered reference image setting unit 3103 is referred to as a "reference pixel filter (first filter)," whereas the filter that corrects the tentative predicted image in the predicted image correction unit 3105 is called a "boundary filter (second filter)."
- The intra prediction unit 3104 generates a temporary predicted image (tentative predicted pixel values, pre-correction predicted image) of the prediction target block based on the intra prediction mode, the unfiltered reference image, and the filtered reference pixel values, and outputs it to the predicted image correction unit 3105.
- the prediction unit 3104 internally includes a Planar prediction unit 31041, a DC prediction unit 31042, an Angular prediction unit 31043, and a CCLM prediction unit (predicted image generation device) 31044.
- the prediction unit 3104 selects a specific prediction unit according to the intra prediction mode and inputs the unfiltered reference image and the filtered reference image.
- the relationship between the intra prediction mode and the corresponding prediction unit is as follows.
- The Planar prediction unit 31041 generates a temporary predicted image q[x][y] by linearly adding a plurality of filtered reference images s[x][y] according to the distance between the prediction target pixel position and the reference pixel position, and outputs it to the predicted image correction unit 3105.
- The DC prediction unit 31042 derives a DC predicted value corresponding to the average value of the filtered reference image s[x][y], and outputs a temporary predicted image q[x][y] whose pixel values are the DC predicted value.
- The Angular prediction unit 31043 generates a temporary predicted image q[x][y] using the filtered reference image s[x][y] in the prediction direction (reference direction) indicated by the intra prediction mode, and outputs it to the predicted image correction unit 3105.
- the CCLM prediction unit 31044 predicts the chrominance pixel value based on the luminance pixel value. Specifically, this method uses a linear model to generate a predicted image of a color difference image (Cb, Cr) based on a decoded luminance image. CCLM prediction will be described later.
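As an illustration of the simplest of the prediction units listed above, the following is a minimal sketch of DC prediction. It assumes that the DC value is the rounded integer mean of the filtered reference pixels directly above and to the left of the block; the exact set of reference pixels and the rounding rule are assumptions, and the function name `dc_predict` is hypothetical.

```python
def dc_predict(s_top, s_left, width, height):
    """Fill a width x height block with the mean of the reference pixels.

    s_top  : filtered reference pixels above the block (s[x][-1], x = 0..width-1)
    s_left : filtered reference pixels left of the block (s[-1][y], y = 0..height-1)
    """
    refs = list(s_top[:width]) + list(s_left[:height])
    # Integer mean with round-to-nearest (assumed rounding rule).
    dc = (sum(refs) + len(refs) // 2) // len(refs)
    # Every pixel of the temporary predicted image q takes the DC value.
    return [[dc] * width for _ in range(height)]


# Example: a 4x4 block with four reference pixels above and four to the left.
q = dc_predict([100, 102, 104, 106], [98, 100, 102, 104], 4, 4)
```

The result `q` plays the role of the temporary predicted image q[x][y] that is passed on to the predicted image correction unit 3105.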
- the predicted image correction unit 3105 corrects the tentative predicted image output from the prediction unit 3104 according to the intra prediction mode. Specifically, the predicted image correction unit 3105 performs weighted addition (weighted average) of the unfiltered reference image and the temporary predicted image for each pixel of the temporary predicted image, depending on the distance between the reference region R and the target predicted pixel. By doing so, a predicted image (corrected predicted image) Pred obtained by correcting the provisional predicted image is derived. Note that in some intra prediction modes (for example, planar prediction, DC prediction, etc.), the predicted image correction unit 3105 may not correct the temporary predicted image, and the output of the prediction unit 3104 may be used as the predicted image.
- the inverse quantization/inverse transform unit 311 inversely quantizes the quantized transform coefficients input from the entropy decoding unit 301 to obtain transform coefficients.
- These quantized transform coefficients are obtained in the encoding process by applying a frequency transform such as DCT (Discrete Cosine Transform) or DST (Discrete Sine Transform) to the prediction error and quantizing the result.
- the inverse quantization/inverse transform unit 311 performs inverse frequency transform such as inverse DCT and inverse DST on the obtained transform coefficients to calculate a prediction error.
- the inverse quantization/inverse transformation section 311 outputs the prediction error to the addition section 312.
- the addition unit 312 adds the predicted image of the block input from the predicted image generation unit 308 and the prediction error input from the inverse quantization/inverse transformation unit 311 for each pixel to generate a decoded image of the block.
- the adding unit 312 stores the decoded image of the block in the reference picture memory 306 and also outputs it to the loop filter 305.
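The per-pixel addition performed by the addition unit 312 can be sketched as follows. Clipping the sum to the valid sample range and the `bit_depth` parameter are assumptions added for the illustration; the text above only states that the predicted image and the prediction error are added for each pixel.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add the prediction error to the predicted image, pixel by pixel."""
    max_val = (1 << bit_depth) - 1
    return [
        # Clip to [0, max_val] so the result stays a valid sample (assumption).
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


decoded = reconstruct_block([[120, 130], [250, 10]], [[3, -4], [10, -20]])
```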
- FIG. 11 is a block diagram showing the configuration of the video encoding device 11 according to this embodiment.
- The video encoding device 11 includes a predicted image generation unit 101, a subtraction unit 102, a transform/quantization unit 103, an inverse quantization/inverse transform unit 105, an addition unit 106, a loop filter 107, a prediction parameter memory (prediction parameter storage unit, frame memory) 108, a reference picture memory (reference image storage unit, frame memory) 109, an encoding parameter determination unit 110, a parameter encoding unit 111, a prediction parameter derivation unit 120, and an entropy encoding unit 104.
- the predicted image generation unit 101 generates a predicted image for each CU, which is a region obtained by dividing each picture of the image T.
- the predicted image generation unit 101 operates in the same manner as the predicted image generation unit 308 already described, and the explanation will be omitted.
- the subtraction unit 102 subtracts the pixel value of the predicted image of the block input from the predicted image generation unit 101 from the pixel value of the image T to generate a prediction error.
- The subtraction unit 102 outputs the prediction error to the transform/quantization unit 103.
- The transform/quantization unit 103 calculates transform coefficients by applying a frequency transform to the prediction error input from the subtraction unit 102, and derives quantized transform coefficients by quantization. The transform/quantization unit 103 outputs the quantized transform coefficients to the entropy encoding unit 104 and the inverse quantization/inverse transform unit 105.
- the inverse quantization/inverse transformation unit 105 is the same as the inverse quantization/inverse transformation unit 311 (FIG. 4) in the moving image decoding device 31, and the description thereof will be omitted.
- the calculated prediction error is output to addition section 106.
- The entropy encoding unit 104 receives the quantized transform coefficients from the transform/quantization unit 103 and the encoding parameters from the parameter encoding unit 111.
- the encoding parameters include, for example, codes for intra prediction mode parameters (intra_luma_mpm_flag, mpm_idx, mpm_remainder), prediction mode predMode, and the like.
- the entropy encoding unit 104 entropy encodes the division information, encoding parameters, quantization transform coefficients, etc., generates an encoded stream Te, and outputs the encoded stream Te.
- the parameter encoding unit 111 includes a header encoding unit 1110, a CT information encoding unit 1111, and a CU encoding unit 1112 (prediction mode encoding unit), which are not shown.
- CU encoding section 1112 further includes a TU encoding section 1114.
- the prediction parameter derivation unit 120 derives inter prediction parameters and intra prediction parameters from the inter prediction parameter derivation unit 112 and the intra prediction parameter derivation unit 113.
- the derived inter prediction parameters and intra prediction parameters are output to parameter encoding section 111.
- The intra prediction parameter derivation unit 113 derives a format for encoding (for example, mpm_idx, mpm_remainder, etc.) from the intra prediction mode IntraPredMode input from the encoding parameter determination unit 110.
- Intra prediction parameter derivation unit 113 includes a part of the same configuration as the configuration by which intra prediction parameter derivation unit 304 derives intra prediction parameters.
- FIG. 12 is a schematic diagram showing the configuration of the intra prediction parameter derivation unit 113 of the prediction parameter derivation unit 120.
- the intra prediction parameter derivation unit 113 is configured to include a parameter encoding control unit 1131, a luminance intra prediction parameter derivation unit 1132, and a chrominance intra prediction parameter derivation unit 1133.
- the luminance prediction mode IntraPredModeY and the color difference prediction mode IntraPredModeC are input to the parameter encoding control unit 1131 from the encoding parameter determination unit 110.
- The parameter encoding control unit 1131 refers to the MPM candidate list mpmCandList[] of the MPM candidate list deriving unit 30421 and determines intra_luma_mpm_flag. Then, intra_luma_mpm_flag and IntraPredModeY are output to the luminance intra prediction parameter deriving unit 1132. Furthermore, IntraPredModeC is output to the color difference intra prediction parameter deriving unit 1133.
- The luminance intra prediction parameter deriving unit 1132 includes an MPM candidate list deriving unit 30421 (candidate list deriving unit), an MPM parameter deriving unit 11322, and a non-MPM parameter deriving unit 11323 (encoding unit, deriving unit).
- the MPM candidate list derivation unit 30421 refers to the intra prediction modes of adjacent blocks stored in the prediction parameter memory 108 and derives the MPM candidate list mpmCandList[].
- When intra_luma_mpm_flag is 1, the MPM parameter deriving unit 11322 derives mpm_idx from IntraPredModeY and mpmCandList[], and outputs it to the entropy encoding unit 104.
- When intra_luma_mpm_flag is 0, the non-MPM parameter deriving unit 11323 derives mpm_remainder from IntraPredModeY and mpmCandList[], and outputs it to the entropy encoding unit 104.
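The following is a minimal sketch of how the three syntax values could be derived on the encoder side. It folds the flag decision of the parameter encoding control unit 1131 and the two deriving units into one hypothetical function, `derive_mpm_params`. The remainder rule used here (subtract the number of MPM candidates smaller than the mode) is the approach commonly used in HEVC/VVC-style codecs and is an assumption, since the text above does not spell out the derivation.

```python
def derive_mpm_params(intra_pred_mode_y, mpm_cand_list):
    """Map a luma intra mode to (flag, mpm_idx) or (flag, mpm_remainder)."""
    if intra_pred_mode_y in mpm_cand_list:
        # The mode is one of the most probable modes: signal its list position.
        return {"intra_luma_mpm_flag": 1,
                "mpm_idx": mpm_cand_list.index(intra_pred_mode_y)}
    # Otherwise signal the mode's rank among the non-MPM modes (assumed rule):
    # every distinct candidate below the mode shifts the rank down by one.
    smaller = len({c for c in mpm_cand_list if c < intra_pred_mode_y})
    return {"intra_luma_mpm_flag": 0,
            "mpm_remainder": intra_pred_mode_y - smaller}


# Hypothetical candidate list, for illustration only.
mpm_cand_list = [0, 50, 18, 1, 49, 51]
in_list = derive_mpm_params(18, mpm_cand_list)
not_in_list = derive_mpm_params(20, mpm_cand_list)
```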
- the chrominance intra prediction parameter derivation unit 1133 derives intra_chroma_pred_mode from IntraPredModeY and IntraPredModeC and outputs it.
- the addition unit 106 adds the pixel value of the predicted image of the block input from the predicted image generation unit 101 and the prediction error input from the inverse quantization/inverse transformation unit 105 for each pixel to generate a decoded image. Adding unit 106 stores the generated decoded image in reference picture memory 109.
- The loop filter 107 applies a deblocking filter, SAO (sample adaptive offset), and ALF (adaptive loop filter) to the decoded image generated by the addition unit 106.
- the loop filter 107 does not necessarily need to include the above three types of filters, and may have a configuration including only a deblocking filter, for example.
- The prediction parameter memory 108 stores the prediction parameters generated by the prediction parameter derivation unit 120 at predetermined positions for each target picture and CU.
- the transform coefficients etc. generated by the transform/quantization unit 103 may be stored.
- the reference picture memory 109 stores the decoded image generated by the loop filter 107 in a predetermined position for each target picture and CU.
- the encoding parameter determination unit 110 selects one set from among the multiple sets of encoding parameters.
- the encoding parameter is the above-mentioned QT, BT or TT division information, a prediction parameter, or a parameter to be encoded that is generated in relation to these.
- Predicted image generation section 101 generates a predicted image using these encoding parameters.
- the encoding parameter determining unit 110 calculates an RD cost value indicating the amount of information and encoding error for each of the plurality of sets.
- The RD cost value is, for example, the sum of the code amount and the value obtained by multiplying the squared error by a coefficient λ.
- Encoding parameter determining section 110 selects a set of encoding parameters that minimizes the calculated cost value. Thereby, entropy encoding section 104 outputs the selected set of encoding parameters as encoded stream Te.
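The selection rule described above (cost = code amount + λ × squared error, choose the minimum) can be sketched as follows. The dictionary keys `bits` and `sse` and the candidate names are hypothetical and exist only for this illustration.

```python
def rd_cost(code_amount, squared_error, lam):
    """RD cost: code amount plus lambda times the squared error."""
    return code_amount + lam * squared_error


def select_parameters(candidates, lam):
    """Return the candidate parameter set with the smallest RD cost."""
    return min(candidates, key=lambda c: rd_cost(c["bits"], c["sse"], lam))


candidates = [
    {"name": "A", "bits": 100, "sse": 50},
    {"name": "B", "bits": 60, "sse": 80},
]
best = select_parameters(candidates, lam=0.5)
```

A larger λ weights the squared error more heavily, so the same candidates can yield a different winner at a different λ.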
- the encoding parameter determining unit 110 outputs the determined encoding parameters to the parameter encoding unit 111, the prediction parameter deriving unit 120, and the predicted image generating unit 101.
- Some of the video encoding device 11 and video decoding device 31 in the embodiment described above, such as the entropy decoding unit 301, the parameter decoding unit 302, the loop filter 305, the predicted image generation unit 308, the inverse quantization/inverse transform unit 311, the addition unit 312, the prediction parameter derivation unit 320, the predicted image generation unit 101, the subtraction unit 102, the transform/quantization unit 103, the entropy encoding unit 104, the inverse quantization/inverse transform unit 105, the loop filter 107, the encoding parameter determining unit 110, the parameter encoding unit 111, and the prediction parameter deriving unit 120, may be realized by a computer.
- a program for realizing this control function may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed.
- The "computer system" here refers to a computer system built into either the video encoding device 11 or the video decoding device 31, and includes an OS and hardware such as peripheral devices.
- The term "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into computer systems.
- Furthermore, the "computer-readable recording medium" may also include a medium that dynamically holds a program for a short period of time, such as a communication line used when transmitting a program via a network such as the Internet or via a communication line such as a telephone line. In that case, it may also include a medium that holds the program for a certain period of time, such as volatile memory inside a computer system serving as a server or a client. Further, the above-mentioned program may realize only a part of the above-mentioned functions, or may realize them in combination with a program already recorded in the computer system.
- part or all of the video encoding device 11 and video decoding device 31 in the embodiments described above may be realized as an integrated circuit such as an LSI (Large Scale Integration).
- Each functional block of the moving image encoding device 11 and the moving image decoding device 31 may be made into a processor individually, or some or all of them may be integrated into a processor.
- the method of circuit integration is not limited to LSI, but may be implemented using a dedicated circuit or a general-purpose processor. Further, if an integrated circuit technology that replaces LSI emerges due to advances in semiconductor technology, an integrated circuit based on this technology may be used.
- the above-described video encoding device 11 and video decoding device 31 can be used by being installed in various devices that transmit, receive, record, and reproduce video images.
- the moving image may be a natural moving image captured by a camera or the like, or an artificial moving image (including CG and GUI) generated by a computer or the like.
- FIG. 10 is a diagram showing an overview of luminance and color difference prediction.
- In luminance/color difference prediction, the color difference is linearly predicted from the luminance.
- (a) shows a case where one prediction model is used for the target block, and one CCLM prediction parameter is derived for the target block.
- (b) shows a case where a plurality of prediction models are used for the target block, and two or more (here, two) CCLM prediction parameters are derived for the target block.
- MMLM: Multi Mode Linear Model
- CCLM: linear prediction using two parameters consisting of one weighting coefficient a and one offset coefficient b (bias)
- CCCM: Convolutional cross-component model
- a, b, ak and b are called CCLM prediction parameters and are derived using adjacent images of the target block.
- a shift value may be derived in addition to the weight and bias value in the CCLM prediction parameters.
- the number of CCLM prediction parameters in this specification does not include shift values. In other words, it is defined as follows.
- Linear prediction with a, b, and shiftA as parameters is 2-parameter linear prediction.
- Linear prediction with a0, a1, b, and shiftA as parameters is 3-parameter linear prediction. This is because, in the amount of calculation involved in deriving the CCLM prediction parameters and in the linear prediction of CCLM prediction, the derivation of shiftA and the shift by shiftA are negligible, so shiftA is not included in the number of parameters.
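The 2-parameter linear prediction defined above, and its use with two models as in case (b), can be sketched as follows. The form ((a × luma) >> shiftA) + b is inferred by analogy with the 3-parameter expressions in this document. Selecting the model by comparing the luminance value with a threshold is an assumption, as are the names `predict_2param`, `predict_multi_model`, and `threshold`.

```python
def predict_2param(luma, a, b, shift_a):
    """2-parameter linear prediction: weight a, offset b; shift_a is not counted."""
    return ((a * luma) >> shift_a) + b


def predict_multi_model(luma, models, threshold, shift_a):
    """Pick one of two (a, b) pairs by the luminance value, then predict.

    models    : [(a_0, b_0), (a_1, b_1)], one pair per group
    threshold : luminance value separating the two groups (assumed rule)
    """
    a, b = models[0] if luma <= threshold else models[1]
    return predict_2param(luma, a, b, shift_a)


models = [(3, 10), (5, -20)]
dark = predict_multi_model(100, models, threshold=128, shift_a=2)
bright = predict_multi_model(200, models, threshold=128, shift_a=2)
```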
- the following prediction may be used as luminance/chrominance prediction.
- the intra prediction parameter derivation unit 304 refers to the luminance prediction modes IntraPredModeY, intra_chroma_pred_mode, and the table in FIG. 8(b) when deriving the above-mentioned color difference prediction mode IntraPredModeC.
- the figure shows how to derive IntraPredModeC.
- intra_chroma_pred_mode is 0 to 3 and 4
- intra prediction parameter derivation unit 304 derives IntraPredModeC depending on the value of IntraPredModeY. For example, if intra_chroma_pred_mode is 0 and IntraPredModeY is 0, IntraPredModeC is 66.
- IntraPredModeC is 1. Note that the values of IntraPredModeY and IntraPredModeC represent the intra prediction mode in FIG. 3.
- the CCLM prediction unit 31044 includes a downsampling unit 310441, a CCLM prediction parameter derivation unit (parameter derivation unit) 310442, and a CCLM prediction filter unit 310443.
- FIG. 13 is a diagram showing the positional relationship between a target pixel and adjacent pixels according to this embodiment.
- The CCLM prediction filter unit 310443 of this configuration performs CCLM prediction using the luminance target pixel refSamples[x*SubWidthC][y*SubHeightC] corresponding to the color difference pixel position (x, y) to be predicted and its adjacent pixel refSamples[x*SubWidthC+dX][y*SubHeightC+dY], and generates a predicted image predSamples[x][y] at the color difference pixel position (x, y).
- the figure shows the following four examples.
- SubWidthC and SubHeightC are sampling ratios of luminance pixels to chrominance pixels.
- (a) Target pixel (x, y) and right adjacent pixel (x+1, y): predSamples[x][y] = ((a0*refSamples[x*SubWidthC][y*SubHeightC] + a1*refSamples[x*SubWidthC+1][y*SubHeightC]) >> shiftA) + b
- (b) Target pixel (x, y) and lower adjacent pixel (x, y+1): predSamples[x][y] = ((a0*refSamples[x*SubWidthC][y*SubHeightC] + a1*refSamples[x*SubWidthC][y*SubHeightC+1]) >> shiftA) + b
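The two expressions above differ only in the offset (dX, dY) of the adjacent luminance pixel, so they can be sketched with one function. This is an illustration under stated assumptions: `ref_samples` is indexed as `ref_samples[x][y]` to follow the notation of the text, the default sampling ratios of 2 correspond to 4:2:0 content, and no handling of pixels outside the array is shown.

```python
def cclm_filter_3param(ref_samples, x, y, a0, a1, b, shift_a,
                       dx=1, dy=0, sub_width_c=2, sub_height_c=2):
    """Predict the chroma sample at (x, y) from two luminance samples."""
    lx, ly = x * sub_width_c, y * sub_height_c   # co-located luminance position
    target = ref_samples[lx][ly]
    neighbour = ref_samples[lx + dx][ly + dy]    # (1, 0): right, (0, 1): below
    return ((a0 * target + a1 * neighbour) >> shift_a) + b


# Toy 4x4 luminance array where ref_samples[i][j] = 10 * i + j.
ref_samples = [[10 * i + j for j in range(4)] for i in range(4)]
pred_a = cclm_filter_3param(ref_samples, 1, 0, a0=2, a1=2, b=5, shift_a=2)
pred_b = cclm_filter_3param(ref_samples, 1, 0, a0=2, a1=2, b=5, shift_a=2, dx=0, dy=1)
```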
- The CCLM prediction parameter deriving unit 310442 uses a reference pixel pRefY(x, y) included in the reference area of an adjacent block (for example, to the left of, above, or to the upper right of the target block) and its adjacent pixel pRefY(x+dX, y+dY) to derive the following temporary reference arrays refX[][] and refY[].
- Although the bias term is the last term here, it may be the first term.
- the CCLM prediction parameter derivation unit 310442 derives the following matrix sumXX and vector sumXY from the reference images pRefY and pRefC.
- sumXX[i][j] = Σ refX[i][cnt]*refX[j][cnt]
- sumXY[i] = Σ refX[i][cnt]*refY[cnt]
- Here, Σ represents the sum over cnt. Note that sumXX and sumXY may be derived directly from pRefY and pRefC without using refX[][] and refY[].
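The accumulation of sumXX and sumXY can be sketched as follows. The second step, obtaining (a0, a1, b) by solving the linear system sumXX · p = sumXY, is an assumption: it is the usual least-squares interpretation of these sums, but the text above only states that the sums are derived. The layout of refX (luminance row, adjacent-luminance row, constant row for the bias as the last term) is likewise assumed, and exact fractions are used instead of the integer arithmetic a real codec would need.

```python
from fractions import Fraction


def accumulate(ref_x, ref_y):
    """Build sumXX[i][j] and sumXY[i] as sums over the sample index cnt."""
    n, count = len(ref_x), len(ref_y)
    sum_xx = [[sum(ref_x[i][c] * ref_x[j][c] for c in range(count))
               for j in range(n)] for i in range(n)]
    sum_xy = [sum(ref_x[i][c] * ref_y[c] for c in range(count)) for i in range(n)]
    return sum_xx, sum_xy


def solve(sum_xx, sum_xy):
    """Solve sum_xx * p = sum_xy by Gauss-Jordan elimination (assumed step)."""
    n = len(sum_xy)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(sum_xx, sum_xy)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("sumXX is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Toy reference data: chroma was generated as 2*luma + 1*neighbour + 5.
luma = [10, 20, 30, 40]
neighbour = [12, 18, 33, 41]
ref_x = [luma, neighbour, [1, 1, 1, 1]]      # bias term last
ref_y = [37, 63, 98, 126]
a0, a1, b = solve(*accumulate(ref_x, ref_y))
```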
- The configuration includes a CCLM prediction filter unit that generates a chrominance predicted image using the luminance target pixel corresponding to the chrominance pixel of the target block, its neighboring pixels, and the CCLM prediction parameters. The intra prediction parameter derivation unit 304 decodes cclm_mode_flag, which indicates whether CCLM prediction, which predicts chrominance from luminance, is performed.
- When cclm_mode_flag is a value indicating that CCLM prediction is to be performed (here, 1), a flag cccm_mode_flag indicating whether to perform luminance color difference prediction using two luminance pixels is decoded.
- When cclm_mode_flag is a value indicating that CCLM prediction is to be performed (here, 1), the CCLM prediction unit generates a predicted image of the chrominance image using the luminance image.
- The CCLM prediction parameter derivation unit 310442 derives CCLM prediction parameters consisting of a first weight a0, a second weight a1, and a first offset value b, using the reference pixel pRef[x][y] of the luminance reference area adjacent to the target block and the adjacent pixel pRef[x+dX][y+dY] of that reference pixel. Furthermore, the CCLM prediction filter unit 310443 derives the pixel value of the color difference predicted pixel predSamples from the sum of the product of the luminance reference pixel refSamples[x][y] and the first weight a0, the product of the luminance adjacent pixel refSamples[x+dX][y+dY] and the second weight a1, and the first offset value b.
- the position of the adjacent pixel in the reference image and the target image is the pixel (x+1, y) to the right of the target pixel (x, y).
- When cclm_mode_flag is a value indicating that CCLM prediction is to be performed (here, 1), the intra prediction parameter deriving unit 304 further decodes an index cclm_nei_idx indicating the position of a pixel adjacent to the target pixel. The adjacent pixel is selected using cclm_nei_idx.
- CCLM prediction parameters are derived using adjacent pixels at relative positions (dX, dY) to the reference pixel.
- the number of options is not limited to two, but may be four on the top, bottom, left, and right of the reference pixel.
- The CCLM prediction parameter derivation unit keeps the number of CCLM prediction parameters fixed regardless of the index and derives the CCLM prediction parameters by switching the position of the pixel adjacent to the reference pixel according to the index, and the CCLM prediction filter unit derives a predicted image by switching the adjacent pixel according to the decoded index.
- FIG. 14 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- the intra prediction parameter deriving unit 304 decodes cclm_mode_flag indicating whether to perform CCLM prediction that predicts color difference from luminance.
- When cclm_mode_flag is a value indicating that CCLM prediction is to be performed (1 in this case), the flag mmlm_mode_flag, which indicates whether CCLM prediction is performed by deriving multiple CCLM prediction parameters, is decoded.
- a flag cccm_mode_flag indicating whether to perform luminance color difference prediction (CCCM mode) using multiple luminance pixels is decoded. If cccm_mode_flag does not appear (in the case of MMLM mode), cccm_mode_flag is derived as a value (here, 0) indicating that multiple target images are not used.
- In CCCM mode, the number of parameters required for the filter that generates predicted pixels increases.
- the intra prediction parameter deriving unit 304 may decode the index cclm_ref_idx indicating the position of the reference pixel.
- Figure (b) shows the relationship between IntraPredModeC, each flag, and index.
- When cclm_ref_idx is 0, the reference pixel is located in the area above and to the left of the target block.
- When cclm_ref_idx is 1, the reference pixel is located in the left area of the target block.
- When cclm_ref_idx is 2, the reference pixel is located in the area above the target block.
- In MMLM mode, the flag (cccm_mode_flag) indicating whether to predict a chrominance pixel using multiple luminance pixels is inferred to be 0 without being decoded.
- the MMLM mode and the CCCM mode can be made exclusive. Therefore, complicated processing such as deriving a plurality of CCLM prediction parameters for each of a plurality of models can be avoided, and the effect of reducing complexity can be achieved while maintaining performance.
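The exclusive signalling described for this configuration can be sketched as a small parser. Entropy decoding is abstracted behind a hypothetical `read_flag` callable that returns the next decoded flag, and the flag order (cclm_mode_flag, then mmlm_mode_flag, then cccm_mode_flag only when MMLM is off) is read from the description above rather than from the figure, so it should be treated as an assumption. Decoding of cclm_ref_idx is omitted.

```python
def parse_cclm_syntax(read_flag):
    """Decode the CCLM-related flags so that MMLM and CCCM never coexist."""
    syntax = {"cclm_mode_flag": read_flag(),
              "mmlm_mode_flag": 0,
              "cccm_mode_flag": 0}
    if syntax["cclm_mode_flag"] == 1:
        syntax["mmlm_mode_flag"] = read_flag()
        if syntax["mmlm_mode_flag"] == 0:
            # cccm_mode_flag is present only when multi-model is not used.
            syntax["cccm_mode_flag"] = read_flag()
        # Otherwise cccm_mode_flag is not decoded and keeps the inferred value 0.
    return syntax


flags = iter([1, 1, 1])
parsed = parse_cclm_syntax(lambda: next(flags))
```

Because cccm_mode_flag is read only in the branch where mmlm_mode_flag is 0, no bitstream can select both modes for the same block.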
- FIG. 15 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- FIG. 2 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- the intra prediction parameter deriving unit 304 decodes cclm_mode_flag indicating whether to perform CCLM prediction that predicts color difference from luminance.
- When cclm_mode_flag is a value indicating that CCLM prediction is to be performed (here, 1), a flag cccm_mode_flag indicating whether to perform filtering of luminance and color difference prediction using a plurality of luminance pixels is decoded.
- the intra prediction parameter deriving unit 304 may decode the index cclm_ref_idx indicating the position of the reference pixel.
- Figure (b) shows the relationship between IntraPredModeC, each flag, and index.
- When cclm_ref_idx is 0, the reference pixel is located in the area above and to the left of the target block.
- When cclm_ref_idx is 1, the reference pixel is located in the left area of the target block.
- When cclm_ref_idx is 2, the reference pixel is located in the area above the target block.
- The flag indicating multi-model is inferred to be 0 without being decoded.
- the MMLM mode and the CCCM mode can be made exclusive. Therefore, it is possible to avoid complicated processing such as deriving CCLM prediction parameters for CCCM mode with a large number of parameters in each of a plurality of models, thereby achieving an effect of reducing complexity while maintaining performance.
- FIG. 16 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- the intra prediction parameter deriving unit 304 decodes cclm_mode_flag indicating whether to perform CCLM prediction that predicts color difference from luminance.
- When cclm_mode_flag is a value indicating that CCLM prediction is to be performed (here, 1), an index cclm_mode_idx indicating whether to perform MMLM prediction and the position of the reference pixels for CCLM prediction is decoded.
- MMLM prediction is CCLM prediction using multiple models (CCCM prediction parameters). In CCCM prediction, filtering is performed using multiple luminance reference pixels to predict color difference pixels.
- When cclm_mode_idx is 0, 2, or 3, CCLM prediction is not in MMLM mode, and the reference pixels are located in the upper and left regions, the left region, and the upper region of the target block, respectively.
- When cclm_mode_idx is 1, 4, or 5, CCLM prediction is in MMLM mode, and the reference pixels are located in the upper and left regions, the left region, and the upper region of the target block, respectively.
- cccm_mode_flag is decoded.
- cccm_mode_flag is a flag indicating whether to filter the luminance/chrominance predicted pixel using a plurality of luminance pixels. If cccm_mode_flag does not appear, cccm_mode_flag is derived as a value (0) indicating that multiple target images are not used.
- IsMMLM is 1 (TRUE) if cclm_mode_idx is INTRA_LT_MMLM (for example, 1), INTRA_L_MMLM (for example, 4), or INTRA_T_MMLM (for example, 5), and is 0 (FALSE) otherwise.
- Here, "+" may instead be a logical sum (logical OR).
- Figure (b) shows the relationship between IntraPredModeC, each flag, and index.
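The relationship between cclm_mode_idx, the MMLM decision, and the reference region stated in the text for this configuration can be written as a lookup table. The region labels and helper names below are hypothetical, and the rule that cccm_mode_flag is decoded only when IsMMLM is 0 follows the exclusivity described here; the figure itself is not reproduced in this text.

```python
# cclm_mode_idx -> (IsMMLM, reference region), as listed in the description.
CCLM_MODE_TABLE = {
    0: (0, "top_and_left"),
    1: (1, "top_and_left"),   # INTRA_LT_MMLM
    2: (0, "left"),
    3: (0, "top"),
    4: (1, "left"),           # INTRA_L_MMLM
    5: (1, "top"),            # INTRA_T_MMLM
}


def is_mmlm(cclm_mode_idx):
    """IsMMLM: 1 for the three MMLM indices, 0 otherwise."""
    return CCLM_MODE_TABLE[cclm_mode_idx][0]


def reference_region(cclm_mode_idx):
    """Region of the target block's neighbourhood holding the reference pixels."""
    return CCLM_MODE_TABLE[cclm_mode_idx][1]


def cccm_flag_is_decoded(cclm_mode_idx):
    """cccm_mode_flag is read only outside MMLM mode; otherwise it is inferred 0."""
    return is_mmlm(cclm_mode_idx) == 0


mmlm_flags = [is_mmlm(idx) for idx in range(6)]
```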
- The flag indicating whether or not to use multiple reference pixels in a filter for predicting color difference is inferred to be 0 without being decoded.
- MMLM mode and CCCM mode are exclusive. Therefore, in each of a plurality of models, complicated processing such as deriving a plurality of CCLM prediction parameters in the CCCM mode can be avoided, and the effect of reducing complexity can be achieved while maintaining performance.
- FIG. 17 is a diagram showing a syntax configuration according to an embodiment of the present invention.
- the intra prediction parameter deriving unit 304 decodes cclm_mode_flag indicating whether to perform CCLM prediction that predicts color difference from luminance.
- When cclm_mode_flag is a value indicating that CCLM prediction is to be performed (here, 1), cccm_mode_flag indicating whether to perform filtering of luminance/chrominance prediction of chrominance pixels using a plurality of luminance pixels is decoded.
- MMLM mode is a mode that performs CCLM prediction using multiple models (CCCM prediction parameters).
- The value of cclm_mode_idx is as described in <Example 3 of exclusive configuration>.
- TB Truncated Binary
- Figure (b) shows the relationship between IntraPredModeC, each flag, and index.
- The flag indicating whether to use CCCM mode is inferred to be 0 without being decoded.
- the MMLM mode and the CCCM mode can be made exclusive. Therefore, in each of a plurality of models, complicated processing such as deriving a plurality of CCLM prediction parameters in the CCCM mode can be avoided, and the effect of reducing complexity can be achieved while maintaining performance.
- cclm_nei_idx described in <3-parameter Configuration Example 1> may be further signaled.
- cclm_nei_idx is an index indicating the position of a pixel adjacent to the target pixel.
- FIG. 18 is a flowchart showing the operation of the CCLM prediction unit according to an embodiment of the present invention.
- the intra prediction parameter deriving unit 304 decodes cclm_mode_flag from the encoded data.
- S3502 When using CCLM prediction, transition to S3503. In other cases, the intra prediction unit 3104 performs prediction other than CCLM prediction.
- S3503 The intra prediction parameter deriving unit 304 derives information regarding the type of CCLM prediction from the CU information of the encoded data.
- mmlm_mode_flag and cclm_mode_idx in FIGS. 14 to 17 are decoded to determine whether the target block is in MMLM mode, CCCM mode, or other mode, or to derive the reference position, the position of adjacent pixels, etc.
- S3504 When the information regarding the type of CCLM prediction indicates that multi-model is not used, the process moves to S3507 and CCCM prediction with three or more parameters is used. Conversely, when multi-model is used, the process moves to S3506 and two-parameter CCLM prediction is used. (S3506) The intra prediction unit 3104 does not perform CCCM prediction.
- luminance/chrominance prediction using CCLM prediction parameters consisting of three or more parameters is not performed, but two parameters, a weighting coefficient and a bias, for luminance pixels are derived, and luminance/chrominance prediction is performed using the two parameters.
- prediction is performed using the formula (MMLM-1).
- the intra prediction unit 3104 performs CCCM prediction.
- CCLM prediction parameters consisting of three or more parameters are derived, and luminance/chrominance prediction is performed using the CCLM prediction parameters having three or more parameters.
- prediction is performed using the formulas (CCCM-1) and (CCCM-2).
- the CCLM prediction unit generates a predicted image of a color difference image using a luminance image, and is capable of classifying into groups according to luminance pixel values and deriving a plurality of CCLM prediction parameters for each group.
- the CCLM prediction unit includes a CCLM prediction parameter derivation unit 310442 and a CCLM prediction filter unit 310443 that generates a color difference prediction image using a luminance reference image and the CCLM prediction parameters, and the CCLM prediction parameter derivation unit changes the number of parameters for CCLM prediction depending on whether to divide into two or more groups.
- the CCLM prediction parameter derivation unit 310442 derives CCLM prediction parameters using the number of parameters of two-parameter CCLM prediction when classifying into two or more groups according to the pixel value of the luminance image; when not classifying into groups, it derives them using the number of parameters of three-parameter CCLM prediction.
- the video decoding apparatus includes a parameter decoding unit 302 that decodes a CCLM flag indicating whether to perform 3-parameter CCLM prediction and a CCLM flag indicating whether to perform 2-parameter CCLM prediction from encoded data, and a CCLM prediction unit. If the CCLM flag is 1, one CCLM prediction parameter is derived, and in other cases, two or more CCLM prediction parameters are derived.
- FIG. 19 is a flowchart showing the operation of the CCLM prediction unit according to an embodiment of the present invention.
- S3501 to S3503: these have already been explained using FIG. 18, so the explanation is omitted.
- S3504 When the information regarding the type of CCLM prediction indicates that multi-model is used, the process moves to S3506 and CCCM prediction is not used. Conversely, if multi-model is not used, the process transitions to S3505.
- S3505 If the size of the target block is smaller than a predetermined size, for example if cbWidth*cbHeight<TH, the process moves to S3506 and uses two-parameter multi-model prediction without using CCCM prediction. If the size of the target block is greater than or equal to the predetermined size, the process moves to S3507 and uses CCCM prediction to perform CCLM prediction using three or more parameters.
- The above may be derived using cccm_mode_flag as follows.
- the target block and adjacent blocks of the luminance image are represented by pY[][] and pRefY[][].
- the target block has a width bW and a height bH.
- the CCLM prediction unit 31044 (unfiltered reference image setting unit 3102) derives CCLM prediction parameters using the luminance adjacent image pRefY[][] in FIGS. 9(a) to (c) and the chrominance adjacent image pRefC[][] in FIG. 9(e) as the reference region. The CCLM prediction unit 31044 derives a predicted color difference image using the luminance target image pY[][].
- when IntraPredModeC is INTRA_LT_CCLM, INTRA_LT_MMLM, or INTRA_LT_CCCM_SINGLE, the CCLM prediction unit 31044 derives CCLM prediction parameters using the pixel values of the upper and left neighboring blocks of the target block, as shown in (a).
- when IntraPredModeC is 82 (INTRA_L_CCLM, INTRA_L_MMLM, INTRA_L_CCCM_SINGLE), the CCLM prediction parameters are derived using the pixel values of the left adjacent block, as shown in (b).
- when IntraPredModeC is 83 (INTRA_T_CCLM, INTRA_T_MMLM, INTRA_T_CCCM_SINGLE), the CCLM prediction parameters are derived using the pixel values of the upper adjacent block, as shown in (c).
- the size of each region may be as follows. In (a), the region above the target block has width bW and height refH (refH>1), and the region to the left of the target block has height bH and width refW (refW>1). In (b), the height is 2*bH and the width is refW. In (c), the width is 2*bW and the height is refH.
- refW and refH may be set to a value larger than 1 according to the number of taps of the downsampling filter.
- the target block and adjacent blocks of the color difference image (Cb, Cr) are represented by pC[][] and pRefC[][].
- the target block has a width bWC and a height bHC.
- FIG. 8 is a block diagram showing an example of the configuration of the CCLM prediction unit 31044.
- the CCLM prediction unit 31044 includes a downsampling unit 310441, a CCLM prediction parameter derivation unit (parameter derivation unit) 310442, and a CCLM prediction filter unit 310443.
- the downsampling unit 310441 downsamples pRefY[][] and pY[][] in order to match the size of the color difference image.
- when the color difference format is 4:2:0, the horizontal and vertical pixel counts of pRefY[][] and pY[][] are sampled at 2:1, and the results are stored in pRefDsY[][] and pDsY[][] in FIG. 9(d).
- bW/2 and bH/2 are equal to bWC and bHC, respectively.
- when the color difference format is 4:2:2, the horizontal pixel count of pRefY[][] and pY[][] is sampled at 2:1, and the results are stored in pRefDsY[][] and pDsY[][].
- when the color difference format is 4:4:4, sampling is not performed and pRefY[][] and pY[][] are stored in pRefDsY[][] and pDsY[][].
- An example of sampling is shown by the formula below.
- pDsY[x][y] = (pY[2*x-1][2*y]+pY[2*x-1][2*y+1]+2*pY[2*x][2*y]+2*pY[2*x][2*y+1]+pY[2*x+1][2*y]+pY[2*x+1][2*y+1]+4)>>3
- pRefDsY[x][y] = (pRefY[2*x-1][2*y]+pRefY[2*x-1][2*y+1]+2*pRefY[2*x][2*y]+2*pRefY[2*x][2*y+1]+pRefY[2*x+1][2*y]+pRefY[2*x+1][2*y+1]+4)>>3
- predSamples[x][y] = ((a*refSamples[x][y])>>shiftA)+b (CCLM-1)
- refSamples is pDsY in FIG. 9(d)
- (a,b) are the CCLM prediction parameters derived by the CCLM prediction parameter derivation unit 310442
- predSamples[][] is the color difference predicted image (pC in FIG. 9(e)).
- (a, b) are derived for Cb and Cr, respectively.
- the CCLM prediction filter unit 310443 uses the reference image refSamples[x][y] and its adjacent pixels refSamples[x+dX][y+dY] as input signals, and uses the CCLM prediction parameters (a, b) to output the predicted image predSamples[x][y].
- (dX,dY) is (-1,0),(1,0),(0,-1),(0,1) etc.
- predSamples[x][y] = ((a0*refSamples[x][y]+Σak*refSamples[x+dXk][y+dYk])>>shiftA)+b (CCCM-1)
- the weighting coefficient ak of adjacent pixels may be a plurality of parameters.
- predSamples[x][y] = ((a0*refSamples[x][y]+a1*refSamples[x+1][y]+a2*refSamples[x][y+1])>>shiftA)+b (CCCM-1)
- the luminance image before downsampling may be used.
- predSamples may be derived as in the following formula.
- predSamples[x][y] = ((a0*refSamples[x*SubWidthC][y*SubHeightC]+Σak*refSamples[x*SubWidthC+dXk][y*SubHeightC+dYk])>>shiftA)+b (CCCM-2)
- SubWidthC and SubHeightC may be derived depending on the syntax element sps_chroma_format_idc in the encoded data.
- the luminance signal may be classified according to its magnitude, multiple CCLM prediction parameters derived according to the classification, and a predicted image derived according to these prediction parameters. For example, using a certain threshold value thVal, pixels are classified into modelId according to the magnitude of refSamples. Then, filter processing is performed using the CCLM prediction parameters a[modelId] and b[modelId] determined by modelId.
- predSamples[x][y] = ((a0[modelId]*refSamples[x][y]+Σak[modelId]*refSamples[x+1][y])>>shiftA)+b (CCCM-1)
- Note that this configuration is not used in the exclusive configuration described above.
- the CCLM prediction filter section 310443 includes a linear prediction section 310444.
- the linear prediction unit 310444 uses refSamples[][] as an input signal and outputs predSamples[][] using CCLM prediction parameters (a, b).
- the linear prediction unit 310444 derives the color difference Cb or Cr from the luminance Y using the following formula using the CCLM prediction parameters (a, b), and uses this to output predSamples[][].
- the CCLM prediction parameter derivation unit 310442 derives CCLM prediction parameters using the downsampled luminance adjacent block pRefY (pRefDsY[][] in FIG. 9(d)) and the chrominance adjacent block pRefC[][] (pRefC[][] in FIG. 9(e)) as input signals.
- the CCLM prediction parameter deriving section 310442 outputs the derived CCLM prediction parameters (a, b) to the CCLM prediction filter section 310443.
- the CCLM prediction parameter derivation unit 310442 derives, from the set of (brightness value Y, color difference value C) pairs of the adjacent blocks, the point (x1, y1) at which the brightness value Y is maximum (Y_MAX) and the point (x2, y2) at which it is minimum (Y_MIN). Next, the pixel values at (x1, y1) and (x2, y2) on pRefC corresponding to (x1, y1) and (x2, y2) on pRefDsY are set to C_MAX (or C_Y_MAX) and C_MIN (or C_Y_MIN), respectively. Then, as shown in FIG.
- CCLM prediction parameters (a, b) of this straight line can be derived using the following formula.
- the CCLM prediction parameter derivation unit 310442 derives the CCLM prediction parameters a, b, and shiftA from the luminance difference value diff and the chrominance difference value diffC, using the reciprocal table divSigTable.
- the CCLM prediction parameter derivation unit 310442 may derive the CCLM prediction parameters (a0, a1, ..., aN-2, b).
- the CCLM prediction parameter derivation unit 310442 derives the following temporary reference arrays refX[][] and refY[] from the reference images pRefY and pRefC.
- the CCLM prediction parameter derivation unit 310442 derives the following matrix sumXX and vector sumXY from the reference images pRefY and pRefC.
- sumXX[i][j] = ΣrefX[i][cnt]*refX[j][cnt]
- sumXY[i] = ΣrefX[i][cnt]*refY[cnt]
- Σ is the sum over cnt. Note that sumXX and sumXY may also be derived by accumulating directly from pRefY and pRefC without using refX[][] and refY[].
- inverse(X) is the inverse matrix of X.
- in the case of MMLM, multiple CCLM prediction parameters are derived for the target block.
- a straight line connecting (Y_MAX, C_MAX) and (Y_MIN, C_MIN) is found on the graph with Y and C on the x and y axes, respectively.
- a straight line connecting (Y_MAX, C_MAX) and (Y_MIN, C_MIN) is determined for each model.
- the reference area is classified according to the brightness value and the modelId is derived.
- thVal is a threshold for classification, and the average value of the brightness values of the target block (or the average value of down-sampled brightness values, or the average value of brightness values sampled from the reference area) may be used.
- each block of the video decoding device 31 and the video encoding device 11 described above may be realized in hardware by a logic circuit formed on an integrated circuit (IC chip), or may be realized in software using a CPU (Central Processing Unit).
- each of the above devices includes a CPU that executes instructions of programs that implement each function, a ROM (Read Only Memory) that stores the above programs, a RAM (Random Access Memory) that expands the above programs, and the above programs and various other components. It is equipped with a storage device (recording medium) such as a memory for storing data.
- the purpose of the embodiment of the present invention can also be achieved by supplying each of the above devices with a computer-readable recording medium on which the program code (executable program, intermediate code program, source program) of the control program for each device, which is software that realizes the above-described functions, is recorded, and having the computer (or CPU or MPU) read and execute the program code recorded on the recording medium.
- Examples of the above-mentioned recording media include tapes such as magnetic tape and cassette tape; disks including magnetic disks such as floppy (registered trademark) disks and hard disks, and optical discs such as CD-ROM (Compact Disc Read-Only Memory), MO disc (Magneto-Optical disc), MD (Mini Disc), DVD (Digital Versatile Disc: registered trademark), CD-R (CD Recordable), and Blu-ray Disc (registered trademark); cards such as IC cards (including memory cards) and optical cards; semiconductor memories such as mask ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable and Programmable Read-Only Memory: registered trademark), and flash ROM; and logic circuits such as PLD (Programmable Logic Device) and FPGA (Field Programmable Gate Array).
- each of the above devices may be configured to be connectable to a communication network, and the program code may be supplied via the communication network.
- This communication network is not particularly limited as long as it can transmit program codes. Examples include the Internet, intranet, extranet, LAN (Local Area Network), ISDN (Integrated Services Digital Network), VAN (Value-Added Network), CATV (Community Antenna Television/Cable Television) communication network, virtual private network (Virtual Private Network), telephone line network, mobile communication network, satellite communication network, etc.
- the transmission medium constituting this communication network may be any medium that can transmit program codes, and is not limited to a specific configuration or type.
- for example, wired media such as IEEE (Institute of Electrical and Electronic Engineers) 1394, USB, power line carrier, cable TV line, telephone line, and ADSL (Asymmetric Digital Subscriber Line) line can be used, as well as wireless media such as infrared (IrDA (Infrared Data Association), remote control), BlueTooth (registered trademark), IEEE802.11 wireless, HDR (High Data Rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance: registered trademark), mobile phone networks, satellite lines, and digital terrestrial broadcasting networks.
- embodiments of the present invention may also be implemented in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied in electronic transmission.
- 31 Image decoding device; 301 Entropy decoding unit; 302 Parameter decoding unit; 303 Inter prediction parameter derivation unit; 304 Intra prediction parameter derivation unit; 308 Predicted image generation unit; 309 Inter predicted image generation unit; 310 Intra predicted image generation unit; 3104 Prediction unit (intra prediction unit); 31044 CCLM prediction unit (predicted image generation device); 310441 Downsampling unit; 310442 CCLM prediction parameter derivation unit (parameter derivation unit); 310443 CCLM prediction filter unit; 311 Inverse quantization/inverse transform unit; 312 Addition unit; 320 Prediction parameter derivation unit; 11 Image encoding device; 101 Predicted image generation unit; 102 Subtraction unit; 103 Transform/quantization unit; 104 Entropy encoding unit; 105 Inverse quantization/inverse transform unit; 107 Loop filter; 110 Encoding parameter determination unit; 111 Parameter encoding unit; 112 Inter prediction parameter derivation unit; 113 Intra prediction parameter derivation unit; 120 Prediction parameter derivation unit
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Color Television Systems (AREA)
Abstract
Description
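Embodiments of the present invention will be described below with reference to the drawings.
The operators used in this specification are described below.
|= is the OR assignment operator, and || denotes logical OR.
Prior to a detailed description of the video encoding device 11 and the video decoding device 31 according to this embodiment, the data structure of the coded stream Te generated by the video encoding device 11 and decoded by the video decoding device 31 will be described.
FIGS. 2(a) to 2(f) respectively show the coded video sequence that defines the sequence SEQ, the coded picture that defines the picture PICT, the coded slice that defines the slice S, the coded slice data that defines the slice data, the coding tree unit included in the coded slice data, and the coding unit included in the coding tree unit.
The coded video sequence defines a set of data that the video decoding device 31 refers to in order to decode the sequence SEQ to be processed. As shown in FIG. 2(a), the sequence SEQ includes a video parameter set (Video Parameter Set), a sequence parameter set SPS (Sequence Parameter Set), a picture parameter set PPS (Picture Parameter Set), pictures PICT, and supplemental enhancement information SEI (Supplemental Enhancement Information).
The coded picture defines a set of data that the video decoding device 31 refers to in order to decode the picture PICT to be processed. As shown in FIG. 2(b), the picture PICT includes slice 0 to slice NS-1 (NS is the total number of slices included in the picture PICT).
The coded slice defines a set of data that the video decoding device 31 refers to in order to decode the slice S to be processed. As shown in FIG. 2(c), a slice includes a slice header and slice data.
The coded slice data defines a set of data that the video decoding device 31 refers to in order to decode the slice data to be processed. As shown in FIG. 2(d), the slice data includes CTUs. A CTU is a block of fixed size (for example 64x64) that makes up a slice, and is also called the largest coding unit (LCU: Largest Coding Unit).
FIG. 2(e) defines a set of data that the video decoding device 31 refers to in order to decode the CTU to be processed. A CTU is divided into coding units CU, the basic units of the coding process, by recursive quadtree splitting (QT (Quad Tree) splitting), binary tree splitting (BT (Binary Tree) splitting), or ternary tree splitting (TT (Ternary Tree) splitting). BT splitting and TT splitting are together called multi-tree splitting (MT (Multi Tree) splitting). A node of the tree structure obtained by recursive quadtree splitting is called a coding node (Coding Node). Intermediate nodes of the quadtree, binary tree, and ternary tree are coding nodes, and the CTU itself is also defined as the top-level coding node.
As shown in FIG. 2(f), a set of data that the video decoding device 31 refers to in order to decode the coding unit to be processed is defined. Specifically, a CU consists of a CU header CUH, prediction parameters, transform parameters, quantized transform coefficients, and so on. The CU header specifies the prediction mode and the like.
A predicted image is derived from the prediction parameters associated with a block. The prediction parameters comprise those for intra prediction and those for inter prediction.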
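intra_luma_mpm_flag is a flag indicating whether the luminance prediction mode IntraPredModeY of the target block matches an MPM (Most Probable Mode). An MPM is a prediction mode included in the MPM candidate list mpmCandList[]. The MPM candidate list stores candidates that are estimated, from the intra prediction modes of adjacent blocks and predetermined intra prediction modes, to have a high probability of being applied to the target block. When intra_luma_mpm_flag is 1, the luminance prediction mode IntraPredModeY of the target block is derived using the MPM candidate list and the index mpm_idx.
(REM)
When intra_luma_mpm_flag is 0, the luminance prediction mode IntraPredModeY is derived using mpm_remainder. Specifically, the intra prediction mode is selected from the remaining modes RemIntraPredMode, obtained by removing the intra prediction modes included in the MPM candidate list from the full set of intra prediction modes.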
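The configuration of the video decoding device 31 (FIG. 4) according to this embodiment will be described.
Based on the codes input from the entropy decoding unit 301, the intra prediction parameter derivation unit 304 derives intra prediction parameters, for example the intra prediction mode IntraPredMode, by referring to the prediction parameters stored in the prediction parameter memory 307. The intra prediction parameter derivation unit 304 outputs the derived intra prediction parameters to the predicted image generation unit 308 and also stores them in the prediction parameter memory 307. The intra prediction parameter derivation unit 304 may derive different intra prediction modes for luminance and color difference.
The predicted image generation unit 308 generates a predicted image of a block or sub-block in the prediction mode indicated by predMode, using the prediction parameters and the reference picture (reference picture block) that has been read. Here, a reference picture block is a set of pixels on a reference picture (called a block because it is usually rectangular) and is the area referred to in order to generate a predicted image.
When the prediction mode predMode indicates an intra prediction mode, the intra predicted image generation unit 310 performs intra prediction using the intra prediction parameters input from the intra prediction parameter derivation unit 304 and the reference pixels read from the reference picture memory 306.
Next, the configuration of the intra predicted image generation unit 310 will be described in detail with reference to FIG. 7. The intra predicted image generation unit 310 includes a prediction target block setting unit 3101, an unfiltered reference image setting unit 3102 (first reference image setting unit), a filtered reference image setting unit 3103 (second reference image setting unit), a prediction unit 3104 (intra prediction unit 3104), and a predicted image correction unit 3105 (predicted image correction unit, filter switching unit, weighting coefficient changing unit).
The prediction target block setting unit 3101 sets the target CU as the prediction target block and outputs information on the prediction target block (prediction target block information). The prediction target block information includes at least the size and position of the prediction target block and an index indicating whether it is luminance or color difference.
The unfiltered reference image setting unit 3102 sets the area adjacent to and surrounding the prediction target block as the reference area R, based on the size and position of the prediction target block. It then sets each pixel value in the reference area R (unfiltered reference image, boundary pixels) to the decoded pixel value at the corresponding position in the reference picture memory 306. The line r[x][-1] of decoded pixels adjacent to the upper side of the prediction target block and the column r[-1][y] of decoded pixels adjacent to its left side, shown in FIG. 6(a), are the unfiltered reference image.
The filtered reference image setting unit 3103 applies a reference pixel filter (first filter) to the unfiltered reference image according to the intra prediction mode and derives the filtered reference image s[x][y] at each position (x,y) in the reference area R. Specifically, a low-pass filter is applied to the unfiltered reference image at position (x,y) and its surroundings to derive the filtered reference image (FIG. 6(b)). The low-pass filter need not be applied for all intra prediction modes; it may be applied for only some of them. The filter that the filtered reference image setting unit 3103 applies to the unfiltered reference image in the reference area R is called the "reference pixel filter (first filter)", whereas the filter with which the predicted image correction unit 3105, described later, corrects the provisional predicted image is called the "boundary filter (second filter)".
The intra prediction unit 3104 generates a provisional predicted image (provisional predicted pixel values, pre-correction predicted image) of the prediction target block based on the intra prediction mode, the unfiltered reference image, and the filtered reference pixel values, and outputs it to the predicted image correction unit 3105. The prediction unit 3104 internally includes a Planar prediction unit 31041, a DC prediction unit 31042, an Angular prediction unit 31043, and a CCLM prediction unit (predicted image generation device) 31044. The prediction unit 3104 selects a specific prediction unit according to the intra prediction mode and inputs the unfiltered reference image and the filtered reference image to it. The relationship between intra prediction modes and the corresponding prediction units is as follows.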
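・Planar prediction: Planar prediction unit 31041
・DC prediction: DC prediction unit 31042
・Angular prediction: Angular prediction unit 31043
・CCLM prediction: CCLM prediction unit 31044
(Planar prediction)
The Planar prediction unit 31041 generates the provisional predicted image q[x][y] by linearly adding multiple filtered reference images s[x][y] according to the distance between the prediction target pixel position and the reference pixel position, and outputs it to the predicted image correction unit 3105.
The DC prediction unit 31042 derives a DC prediction value corresponding to the average of the filtered reference image s[x][y] and outputs the provisional predicted image q[x][y] whose pixel values are the DC prediction value.
The Angular prediction unit 31043 generates the provisional predicted image q[x][y] using the filtered reference image s[x][y] in the prediction direction (reference direction) indicated by the intra prediction mode, and outputs it to the predicted image correction unit 3105.
The CCLM prediction unit 31044 predicts color difference pixel values based on luminance pixel values. Specifically, it generates a predicted image of the color difference image (Cb, Cr) from the decoded luminance image using a linear model. CCLM prediction is described later.
The predicted image correction unit 3105 corrects the provisional predicted image output from the prediction unit 3104 according to the intra prediction mode. Specifically, for each pixel of the provisional predicted image, the predicted image correction unit 3105 derives the predicted image (corrected predicted image) Pred by weighted addition (weighted averaging) of the unfiltered reference image and the provisional predicted image according to the distance between the reference area R and the target predicted pixel. For some intra prediction modes (for example, Planar prediction and DC prediction), the predicted image correction unit 3105 need not correct the provisional predicted image, and the output of the prediction unit 3104 may be used as the predicted image.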
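The addition unit 312 stores the decoded image of the block in the reference picture memory 306 and also outputs it to the loop filter 305.
Next, the configuration of the video encoding device 11 according to this embodiment will be described. FIG. 11 is a block diagram showing the configuration of the video encoding device 11 according to this embodiment. The video encoding device 11 includes a predicted image generation unit 101, a subtraction unit 102, a transform/quantization unit 103, an inverse quantization/inverse transform unit 105, an addition unit 106, a loop filter 107, a prediction parameter memory (prediction parameter storage unit, frame memory) 108, a reference picture memory (reference image storage unit, frame memory) 109, an encoding parameter determination unit 110, a parameter encoding unit 111, a prediction parameter derivation unit 120, and an entropy encoding unit 104.
The intra prediction parameter encoding unit 113 derives the format for encoding (for example mpm_idx, mpm_remainder, etc.) from the intra prediction mode IntraPredMode input from the encoding parameter determination unit 110. The intra prediction parameter encoding unit 113 includes a configuration that is partly identical to the configuration by which the intra prediction parameter derivation unit 304 derives intra prediction parameters.
The video encoding device 11 and the video decoding device 31 described above can be mounted on and used in various devices that transmit, receive, record, and play back video. The video may be natural video captured by a camera or the like, or artificial video (including CG and GUI) generated by a computer or the like.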
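Next, CCLM prediction will be described with reference to FIGS. 8 to 23.
However, the number of CCLM prediction parameters in this specification does not include the shift value. That is, the definitions are as follows.
Linear prediction with a0, a1, b, and shiftA as parameters is called three-parameter linear prediction.
This is because, in the computation required to derive the CCLM prediction parameters and in the computation of linear prediction in CCLM prediction, the derivation of shiftA and the shift by shiftA are negligible, so shiftA is not included in the parameter count.
INTRA_L_CCLM(82) left reference, 1 model, 2 parameters
INTRA_T_CCLM(83) top reference, 1 model, 2 parameters
INTRA_LT_MMLM(84) left and top reference, 2 models, 2 parameters
INTRA_L_MMLM(85) left reference, 2 models, 2 parameters
INTRA_T_MMLM(86) top reference, 2 models, 2 parameters
INTRA_LT_CCCM_SINGLE(87) left and top reference, 1 model, 3 parameters
INTRA_L_CCCM_SINGLE(88) left reference, 1 model, 3 parameters
INTRA_T_CCCM_SINGLE(89) top reference, 1 model, 3 parameters
INTRA_LT_MMLM_CCCM(90) left and top reference, 2 models, 3 parameters
INTRA_L_MMLM_CCCM(91) left reference, 2 models, 3 parameters
INTRA_T_MMLM_CCCM(92) top reference, 2 models, 3 parameters
The values in parentheses are the corresponding IntraPredModeC values, but the values are not limited to these. A configuration that uses only some of the above predictions rather than all of them is also possible. In particular, the exclusive configuration described later does not use the multi-model, multi-neighbor parameters (the 2-model, 3-parameter entries above) indicated by INTRA_{LT,L,T}_MMLM_CCCM.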
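The mode list above can be held as a small lookup table. The following is a minimal sketch, not the patent's implementation: the names `CclmMode`, `CCLM_MODES`, and `allowed_in_exclusive_configuration` are hypothetical, and the table assumes that the `_CCCM_SINGLE` modes use one model, as their name suggests.

```python
from typing import NamedTuple


class CclmMode(NamedTuple):
    name: str
    reference: str  # "LT" (left and top), "L" (left) or "T" (top)
    models: int     # number of luminance/color difference models
    params: int     # number of CCLM prediction parameters (shiftA excluded)


# IntraPredModeC value -> mode description (values as listed in the text).
CCLM_MODES = {
    82: CclmMode("INTRA_L_CCLM", "L", 1, 2),
    83: CclmMode("INTRA_T_CCLM", "T", 1, 2),
    84: CclmMode("INTRA_LT_MMLM", "LT", 2, 2),
    85: CclmMode("INTRA_L_MMLM", "L", 2, 2),
    86: CclmMode("INTRA_T_MMLM", "T", 2, 2),
    87: CclmMode("INTRA_LT_CCCM_SINGLE", "LT", 1, 3),  # 1 model: assumption
    88: CclmMode("INTRA_L_CCCM_SINGLE", "L", 1, 3),    # 1 model: assumption
    89: CclmMode("INTRA_T_CCCM_SINGLE", "T", 1, 3),    # 1 model: assumption
    90: CclmMode("INTRA_LT_MMLM_CCCM", "LT", 2, 3),
    91: CclmMode("INTRA_L_MMLM_CCCM", "L", 2, 3),
    92: CclmMode("INTRA_T_MMLM_CCCM", "T", 2, 3),
}


def allowed_in_exclusive_configuration(intra_pred_mode_c: int) -> bool:
    """The exclusive configuration excludes multi-model AND multi-parameter modes."""
    mode = CCLM_MODES[intra_pred_mode_c]
    return not (mode.models > 1 and mode.params > 2)
```

With this table, only modes 90 to 92 are rejected by `allowed_in_exclusive_configuration`.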
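FIG. 13 shows the positional relationship between the target pixel and adjacent pixels according to this embodiment. The CCLM prediction filter unit 310443 of this configuration performs CCLM prediction using the luminance target pixel refSamples[x*SubWidthC][y*SubHeightC] corresponding to the color difference pixel position (x,y) to be predicted and its adjacent pixel refSamples[x*SubWidthC+dX][y*SubHeightC+dY], and generates the predicted image predSamples[x][y] at color difference pixel position (x,y). The figure shows the following four examples. SubWidthC and SubHeightC are the sampling ratios of luminance pixels to color difference pixels.
(a) Target pixel (x,y) and right adjacent pixel (x+1, y)
predSamples[x][y] = ((a0*refSamples[x*SubWidthC][y*SubHeightC]+a1*refSamples[x*SubWidthC+1][y*SubHeightC])>>shiftA)+b
(b) Target pixel (x,y) and lower adjacent pixel (x, y+1)
predSamples[x][y] = ((a0*refSamples[x*SubWidthC][y*SubHeightC]+a1*refSamples[x*SubWidthC][y*SubHeightC+1])>>shiftA)+b
(c) Target pixel (x,y) and left adjacent pixel (x-1, y)
predSamples[x][y] = ((a0*refSamples[x*SubWidthC][y*SubHeightC]+a1*refSamples[x*SubWidthC-1][y*SubHeightC])>>shiftA)+b
(d) Target pixel (x,y) and upper adjacent pixel (x, y-1)
predSamples[x][y] = ((a0*refSamples[x*SubWidthC][y*SubHeightC]+a1*refSamples[x*SubWidthC][y*SubHeightC-1])>>shiftA)+b
In simulation, configuration (a) most frequently produced the most accurate predicted image. Therefore, the CCLM prediction filter unit 310443 of this configuration derives CCLM prediction parameters and performs CCLM filtering at least with configuration (a), which has the adjacent pixel position (dX,dY)=(1,0).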
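The four configurations (a) to (d) differ only in the neighbor offset, so they can be sketched with one function. This is an illustrative sketch under stated assumptions: the function name, the `config` argument, and the `[x][y]` list-of-lists layout are hypothetical, integer parameters are assumed, and no boundary handling is performed.

```python
# Neighbor offsets (dX, dY) for configurations (a) to (d) of FIG. 13.
NEIGHBOR_OFFSETS = {"a": (1, 0), "b": (0, 1), "c": (-1, 0), "d": (0, -1)}


def cccm_two_tap(ref_samples, x, y, a0, a1, b, shift_a,
                 config="a", sub_width_c=2, sub_height_c=2):
    """Predict one color difference sample from a luminance pixel and one neighbor.

    ref_samples is indexed as ref_samples[x][y] on the luminance grid.
    The caller must ensure the neighbor position lies inside ref_samples.
    """
    d_x, d_y = NEIGHBOR_OFFSETS[config]
    lx, ly = x * sub_width_c, y * sub_height_c   # co-located luminance position
    acc = a0 * ref_samples[lx][ly] + a1 * ref_samples[lx + d_x][ly + d_y]
    return (acc >> shift_a) + b


# Usage: a 4x4 luminance block where ref[x][y] = 10*x + y.
ref = [[10 * x + y for y in range(4)] for x in range(4)]
right = cccm_two_tap(ref, 1, 0, a0=8, a1=8, b=2, shift_a=4, config="a")
left = cccm_two_tap(ref, 1, 0, a0=8, a1=8, b=2, shift_a=4, config="c")
```

Here the weights 8 with shiftA=4 average the two luminance pixels, which makes the effect of the neighbor choice easy to see.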
for ((x,y) in reference area) {
refX[0][cnt]=pRefY[x][y]
refX[1][cnt]=pRefY[x+dX][y+dY]
refX[2][cnt]=1
refY[cnt] = pRefC[x/SubWidthC][y/SubHeightC]
cnt=cnt+1
}
Here (x,y) and (dX,dY) are luminance coordinates, with x=-3..-1, y=0..cbHeight-1 and x=0..cbWidth-1, y=-3..-1, and (dX,dY) is one of (1,0), (0,1), (-1,0), (0,-1).
for (i=0; i<3; i++) {
for (j=0; j<3; j++) {
sumXX[i][j] = ΣrefX[i][cnt]*refX[j][cnt]
sumXY[i] = ΣrefX[i][cnt]*refY[cnt]
}
}
Here Σ denotes the sum over cnt. Note that sumXX and sumXY may also be derived directly from pRefY and pRefC without using refX[][] and refY[].
for ((x,y) in reference area) {
sumXX[0][0] = sumXX[0][0] + pRefY[x][y]*pRefY[x][y]
sumXX[0][1] = sumXX[0][1] + pRefY[x][y]*pRefY[x+dX][y+dY]
sumXX[1][1] = sumXX[1][1] + pRefY[x+dX][y+dY]*pRefY[x+dX][y+dY]
sumXX[0][2] = sumXX[0][2] + pRefY[x][y]
sumXX[1][2] = sumXX[1][2] + pRefY[x+dX][y+dY]
sumXX[2][2] = sumXX[2][2] + 1
sumXY[0] = sumXY[0] + pRefY[x][y]*pRefC[x/SubWidthC][y/SubHeightC]
sumXY[1] = sumXY[1] + pRefY[x+dX][y+dY]*pRefC[x/SubWidthC][y/SubHeightC]
sumXY[2] = sumXY[2] + pRefC[x/SubWidthC][y/SubHeightC]
}
sumXX[1][0] = sumXX[0][1]
sumXX[2][0] = sumXX[0][2]
sumXX[2][1] = sumXX[1][2]
Furthermore, a regularization term is added to the diagonal components.
sumXX[i][i] = sumXX[i][i] + (1<<(bitDepth-1))
The CCLM prediction parameter derivation unit 310442 derives cparam[k], k=0..2, by a linear operation equivalent to cparam=sumXY*inverse(sumXX). Here inverse(X) is the inverse matrix of X.
a1 = cparam[1]
b = cparam[2]
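The accumulation and solve steps above amount to a regularized least-squares fit. The sketch below illustrates this in floating point under stated assumptions: the function names are hypothetical, the input is a list of (luminance, neighbor luminance, color difference) triples already gathered from the reference area, and the linear system sumXX * cparam = sumXY is solved by Gauss-Jordan elimination rather than by the fixed-point procedure an actual codec would use.

```python
def solve_linear(a, b):
    """Solve a * c = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def derive_three_params(samples, reg=0.0):
    """Return [a0, a1, b] minimizing the squared error over the samples.

    samples: iterable of (luma, neighbor_luma, color_difference) triples.
    reg: value added to the diagonal of sumXX (the text uses 1<<(bitDepth-1)
         in its integer domain; the float default of 0 here is an assumption).
    """
    sum_xx = [[0.0] * 3 for _ in range(3)]
    sum_xy = [0.0] * 3
    for luma, neighbor, chroma in samples:
        ref_x = (luma, neighbor, 1.0)        # refX[0..2][cnt]
        for i in range(3):
            sum_xy[i] += ref_x[i] * chroma   # refY[cnt] = chroma
            for j in range(3):
                sum_xx[i][j] += ref_x[i] * ref_x[j]
    for i in range(3):
        sum_xx[i][i] += reg
    return solve_linear(sum_xx, sum_xy)
```

Because sumXX is symmetric, solving sumXX * cparam = sumXY gives the same result as cparam = sumXY * inverse(sumXX). A nonzero `reg` keeps the system solvable when the reference samples are degenerate, for example in a flat luminance area.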
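A CCLM prediction filter unit is provided that generates a color difference predicted image using the luminance target pixel corresponding to a color difference pixel of the target block, its adjacent pixel, and the CCLM prediction parameters. The intra prediction parameter derivation unit 304 decodes cclm_mode_flag, which indicates whether to perform CCLM prediction that predicts color difference from luminance. When cclm_mode_flag has the value indicating that CCLM prediction is performed (here, 1), the flag cccm_mode_flag, which indicates whether to perform luminance/color difference prediction using two luminance pixels, is decoded. When cclm_mode_flag has the value indicating that CCLM prediction is performed (here, 1), the CCLM prediction unit generates a predicted image of the color difference image using the luminance image. At this time, the CCLM prediction parameter derivation unit 310442 uses the reference pixel pRef[x][y] in the luminance reference area adjacent to the target block and the adjacent pixel pRef[x+dX][y+dY] of that reference pixel to derive CCLM prediction parameters consisting of a first weight a0, a second weight a1, and a first offset value b. Furthermore, the CCLM prediction filter unit 310443 derives the pixel value of the color difference predicted pixel predSamples from the sum of the product of the luminance reference pixel refSamples[x][y] and the first weight a0, the product of the luminance adjacent pixel refSamples[x+dX][y+dY] and the second weight a1, and the first offset value b.
When cclm_mode_flag has the value indicating that CCLM prediction is performed (here, 1), the intra prediction parameter derivation unit 304 further decodes the index cclm_nei_idx, which indicates the position of the adjacent pixel relative to the target pixel. The adjacent pixel is selected by cclm_nei_idx.
When cclm_nei_idx is 0, (dX,dY) = (1,0)
When cclm_nei_idx is 1, (dX,dY) = (0,1)
That is, the CCLM prediction parameters are derived using the adjacent pixel at the relative position (dX,dY) from the reference pixel. The CCLM prediction filter unit 310443 predicts the color difference pixel using the luminance reference pixel and the adjacent luminance pixel at the relative position (dX,dY). The positions (dX,dY) always include the right of the reference pixel, (dX,dY)=(1,0).
When cclm_nei_idx is 0, (dX,dY) = (1,0)
When cclm_nei_idx is 1, (dX,dY) = (0,1)
When cclm_nei_idx is 2, (dX,dY) = (-1,0)
When cclm_nei_idx is 3, (dX,dY) = (0,-1)
According to the above, by using only the reference pixel, the adjacent pixel, and the bias, a suitable predicted image can be generated while the amount of computation is reduced.
The CCLM prediction parameter derivation unit fixes the number of CCLM prediction parameters regardless of the index and derives the CCLM prediction parameters by switching the position of the adjacent pixel of the reference pixel according to the index, and the CCLM filter unit derives the predicted image by switching the adjacent pixel according to the decoded index.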
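FIG. 14 shows a syntax configuration according to an embodiment of the present invention. As shown in (a) of the figure, the intra prediction parameter derivation unit 304 decodes cclm_mode_flag, which indicates whether to perform CCLM prediction that predicts color difference from luminance. When cclm_mode_flag has the value indicating that CCLM prediction is performed (here, 1), the flag mmlm_mode_flag, which indicates whether to derive multiple models (CCCM prediction parameters) and perform CCLM prediction, is decoded. When mmlm_mode_flag has the value indicating that multiple models are not used (here, 0) (mmlm_mode_flag==0), the flag cccm_mode_flag, which indicates whether to perform luminance/color difference prediction using multiple luminance pixels (CCCM mode), is decoded. When cccm_mode_flag is not present (in MMLM mode), cccm_mode_flag is derived as the value indicating that multiple pixels are not used (here, 0). In CCCM mode, the number of parameters required by the filter that generates predicted pixels is larger.
Here, "-" indicates that the syntax element cccm_mode_flag is not decoded; in the case of "-", cccm_mode_flag=0 is inferred.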
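The decoding order of FIG. 14, including the inference of an absent cccm_mode_flag, can be sketched as follows. This is only an illustration: `make_flag_reader` and `parse_cclm_flags` are hypothetical helpers, and the flags are read from a plain list of bits, whereas an actual decoder would obtain them from its entropy decoder.

```python
from typing import Callable, Dict, Iterable


def make_flag_reader(bits: Iterable[int]) -> Callable[[], int]:
    """Return a function that yields the next flag each time it is called."""
    it = iter(bits)
    return lambda: next(it)


def parse_cclm_flags(read_flag: Callable[[], int]) -> Dict[str, int]:
    """Decode cclm_mode_flag, mmlm_mode_flag, cccm_mode_flag in FIG. 14 order.

    Flags that are not decoded are inferred to be 0, so MMLM mode and
    CCCM mode can never both be enabled.
    """
    flags = {"cclm_mode_flag": read_flag(), "mmlm_mode_flag": 0, "cccm_mode_flag": 0}
    if flags["cclm_mode_flag"] == 1:
        flags["mmlm_mode_flag"] = read_flag()
        if flags["mmlm_mode_flag"] == 0:
            flags["cccm_mode_flag"] = read_flag()
        # else: cccm_mode_flag is not present and stays inferred as 0
    return flags


# Usage: CCLM on, MMLM off, CCCM on.
decoded = parse_cclm_flags(make_flag_reader([1, 0, 1]))
```

Swapping the order of the two inner reads gives the configuration in which cccm_mode_flag is decoded first and mmlm_mode_flag is inferred.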
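FIG. 15 shows a syntax configuration according to an embodiment of the present invention. As shown in (a) of the figure, the intra prediction parameter derivation unit 304 decodes cclm_mode_flag, which indicates whether to perform CCLM prediction that predicts color difference from luminance. When cclm_mode_flag has the value indicating that CCLM prediction is performed (here, 1), the flag cccm_mode_flag, which indicates whether to perform filtering for luminance/color difference prediction using multiple luminance pixels, is decoded. When multiple reference pixels are not used (cccm_mode_flag==0), the flag mmlm_mode_flag, which indicates whether to derive multiple models (CCCM prediction parameters) and perform CCLM prediction, is decoded. When mmlm_mode_flag is not present, mmlm_mode_flag is derived as 0, indicating that multi-model is not used.
FIG. 16 shows a syntax configuration according to an embodiment of the present invention. As shown in (a) of the figure, the intra prediction parameter derivation unit 304 decodes cclm_mode_flag, which indicates whether to perform CCLM prediction that predicts color difference from luminance. When cclm_mode_flag has the value indicating that CCLM prediction is performed (here, 1), the index cclm_mode_idx, which indicates whether MMLM prediction is performed and which reference pixels are used for CCLM prediction, is decoded. MMLM prediction is CCLM prediction using multiple models (CCCM prediction parameters). In CCCM prediction, multiple luminance reference pixels are filtered in order to predict a color difference pixel. When cclm_mode_idx is 0, 2, or 3, CCLM prediction is not in MMLM mode, and the reference pixels are located in the top and left areas, the left area, and the top area of the target block, respectively. When cclm_mode_idx is 1, 4, or 5, CCLM prediction is in MMLM mode, and the reference pixels are located in the top and left areas, the left area, and the top area of the target block, respectively.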
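The meaning of cclm_mode_idx described for FIG. 16 can be written as a lookup table. This is a small sketch: the names `CCLM_MODE_IDX` and `interpret_cclm_mode_idx` and the strings "LT", "L", "T" are hypothetical labels for the top-and-left, left, and top reference areas.

```python
# cclm_mode_idx -> (is_mmlm, reference area), following the description of FIG. 16.
CCLM_MODE_IDX = {
    0: (False, "LT"),
    1: (True, "LT"),
    2: (False, "L"),
    3: (False, "T"),
    4: (True, "L"),
    5: (True, "T"),
}


def interpret_cclm_mode_idx(cclm_mode_idx: int):
    """Return (is_mmlm, reference_area) for a decoded cclm_mode_idx."""
    return CCLM_MODE_IDX[cclm_mode_idx]
```

For example, `interpret_cclm_mode_idx(4)` reports MMLM mode with left-area reference pixels.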
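FIG. 17 shows a syntax configuration according to an embodiment of the present invention. As shown in (a), the intra prediction parameter derivation unit 304 decodes cclm_mode_flag, which indicates whether to perform CCLM prediction that predicts color difference from luminance. When cclm_mode_flag has the value indicating that CCLM prediction is performed (here, 1), cccm_mode_flag, which indicates whether to perform filtering for luminance/color difference prediction of color difference pixels using multiple luminance pixels, is decoded. Furthermore, the index cclm_mode_idx, which indicates whether the mode is MMLM mode and the reference pixel position for CCLM prediction, is decoded. MMLM mode is a mode that performs CCLM prediction using multiple models (CCCM prediction parameters). The value of cclm_mode_idx is as described in <Example 3 of exclusive configuration>.
cclm_nei_idx
}
<Operation example of the exclusive configuration>
FIG. 18 is a flowchart showing the operation of the CCLM prediction unit according to an embodiment of the present invention.
(S3501) The intra prediction parameter derivation unit 304 decodes cclm_mode_flag from the coded data.
(S3502) When CCLM prediction is used, the process transitions to S3503. Otherwise, the intra prediction unit 3104 performs prediction other than CCLM prediction.
(S3503) The intra prediction parameter derivation unit 304 derives information on the type of CCLM prediction from the CU information of the coded data. For example, mmlm_mode_flag and cclm_mode_idx of FIGS. 14 to 17 are decoded to derive whether the target block is in MMLM mode, CCCM mode, or another mode, as well as the reference position, the position of the adjacent pixel, and so on.
(S3504) When the information on the type of CCLM prediction indicates that multi-model is not used, the process transitions to S3507 and uses CCCM prediction with three or more parameters. Conversely, when multi-model is used, the process transitions to S3506 and uses two-parameter CCLM prediction.
(S3506) The intra prediction unit 3104 does not perform CCCM prediction. That is, it does not perform luminance/color difference prediction using CCLM prediction parameters consisting of three or more parameters; instead it derives two parameters, a weighting coefficient for the luminance pixel and a bias, and performs luminance/color difference prediction using the two parameters. For example, prediction is performed using formula (MMLM-1).
(S3507) The intra prediction unit 3104 performs CCCM prediction. It derives CCLM prediction parameters consisting of three or more parameters and performs luminance/color difference prediction using them. For example, prediction is performed using formulas (CCCM-1) and (CCCM-2).
(S3501) to (S3503) These have already been explained using FIG. 18, so the explanation is omitted.
(S3504) When the information on the type of CCLM prediction indicates that multi-model is used, the process transitions to S3506 and CCCM prediction is not used. Conversely, when multi-model is not used, the process transitions to S3505.
(S3505) When the size of the target block is smaller than a predetermined size, for example when cbWidth*cbHeight<TH, the process transitions to S3506 and uses two-parameter multi-model prediction without using CCCM prediction. When the size of the target block is greater than or equal to the predetermined size, the process transitions to S3507 and uses CCCM prediction to perform CCLM prediction with three or more parameters. The above may be derived using cccm_mode_flag as follows.
(S3506) to (S3507) These have already been explained using FIG. 18, so the explanation is omitted.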
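The branch structure of steps S3504 to S3507 can be condensed into one decision function. The sketch below is an assumption-laden illustration: the function name, the returned strings, and the optional `th` argument (which enables the block-size check of S3505 only when given) are all hypothetical.

```python
from typing import Optional


def select_cclm_filter(mmlm_mode: bool, cb_width: int, cb_height: int,
                       th: Optional[int] = None) -> str:
    """Choose between two-parameter prediction (S3506) and CCCM prediction (S3507)."""
    if mmlm_mode:
        return "two_parameter"            # S3504 -> S3506: multi-model, no CCCM
    if th is not None and cb_width * cb_height < th:
        return "two_parameter"            # S3505 -> S3506: block too small for CCCM
    return "cccm"                         # S3507: three or more parameters
```

Calling the function without `th` reproduces the flow without the size check, while passing a threshold adds the size check of S3505.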
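The CCLM prediction unit 31044 will be described with reference to FIG. 8. FIG. 8 is a block diagram showing an example of the configuration of the CCLM prediction unit 31044. The CCLM prediction unit 31044 includes a downsampling unit 310441, a CCLM prediction parameter derivation unit (parameter derivation unit) 310442, and a CCLM prediction filter unit 310443.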
pRefDsY[x][y] = (pRefY[2*x-1][2*y]+pRefY[2*x-1][2*y+1]+2*pRefY[2*x][2*y]+2*pRefY[2*x][2*y+1]+pRefY[2*x+1][2*y]+pRefY[2*x+1][2*y+1]+4)>>3
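The six-tap downsampling formula above can be sketched as follows for the 4:2:0 case. The function name and the `[x][y]` list-of-lists layout are hypothetical, and clamping coordinates at the picture edge is an assumption made only so the sketch is self-contained; the text does not specify the boundary handling.

```python
def downsample_luma_420(p, x, y):
    """Return the downsampled luminance value at color difference position (x, y).

    p is indexed as p[x][y]. The weights 1,1,2,2,1,1 sum to 8, so the
    result is normalized with a rounding offset of 4 and a shift by 3.
    """
    def at(px, py):
        # Assumption: clamp out-of-range coordinates to the nearest valid sample.
        px = min(max(px, 0), len(p) - 1)
        py = min(max(py, 0), len(p[0]) - 1)
        return p[px][py]

    total = (at(2 * x - 1, 2 * y) + at(2 * x - 1, 2 * y + 1)
             + 2 * at(2 * x, 2 * y) + 2 * at(2 * x, 2 * y + 1)
             + at(2 * x + 1, 2 * y) + at(2 * x + 1, 2 * y + 1) + 4)
    return total >> 3


# Usage: a flat block stays flat, and a horizontal ramp keeps its center value.
flat = [[100] * 4 for _ in range(4)]
ramp = [[x] * 4 for x in range(4)]       # ramp[x][y] = x
flat_value = downsample_luma_420(flat, 1, 1)
ramp_value = downsample_luma_420(ramp, 1, 0)
```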
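When cccm_mode_flag==0, the CCLM prediction filter unit 310443 takes a single reference pixel refSamples[x][y] as the input signal and outputs the predicted image predSamples[x][y] using the CCLM prediction parameters (a,b).
Here, refSamples is pDsY in FIG. 9(d), (a,b) are the CCLM prediction parameters derived by the CCLM prediction parameter derivation unit 310442, and predSamples[][] is the color difference predicted image (pC in FIG. 9(e)). Note that (a,b) are derived separately for Cb and Cr. shiftA is the normalization shift indicating the precision of the value a; if the slope with fractional precision is af, then a = af << shiftA. For example, shiftA=16.
When cccm_mode_flag==1, the CCLM prediction filter unit 310443 takes the reference image refSamples[x][y] and its adjacent pixels refSamples[x+dX][y+dY] as input signals and outputs the predicted image predSamples[x][y] using the CCLM prediction parameters (a,b). (dX,dY) is, for example, (-1,0), (1,0), (0,-1), (0,1), etc.
predSamples[x][y] =((a0*refSamples[x][y]+Σak*refSamples[x+dXk][y+dYk])>>shiftA)+b (CCCM-1)
Here, Σ is the sum over k; for example, k=1 and (dX1,dY1) may be any of (-1,0), (1,0), (0,-1), (0,1).
In generating the color difference predicted image predSamples, the luminance image after downsampling may be used when cccm_mode_flag==0, and the luminance image before downsampling may be used when cccm_mode_flag==1. For example, when cccm_mode_flag==1, predSamples may be derived as in the following formula.
SubWidthC and SubHeightC may be derived as follows according to the syntax element sps_chroma_format_idc in the coded data.
When sps_chroma_format_idc=1 (4:2:0), SubWidthC=2, SubHeightC=2
When sps_chroma_format_idc=2 (4:2:2), SubWidthC=2, SubHeightC=1
When sps_chroma_format_idc=3 (4:4:4), SubWidthC=1, SubHeightC=1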
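The switch between the downsampled luminance image (cccm_mode_flag==0) and the full-resolution luminance image (cccm_mode_flag==1), together with the sps_chroma_format_idc mapping, can be sketched as below. The function and argument names are hypothetical, arrays are assumed to be indexed `[x][y]`, parameters are assumed to be integers, and no boundary handling is done.

```python
# sps_chroma_format_idc -> (SubWidthC, SubHeightC)
SUBSAMPLING = {1: (2, 2), 2: (2, 1), 3: (1, 1)}


def predict_color_difference(x, y, cccm_mode_flag, chroma_format_idc,
                             ds_luma, luma, weights, b, shift_a,
                             offsets=((1, 0),)):
    """Predict one color difference sample at position (x, y).

    cccm_mode_flag == 0: single tap on the downsampled luminance ds_luma.
    cccm_mode_flag == 1: weights[0] on the co-located full-resolution pixel,
    weights[1:] on the neighbors given by offsets, as in formula (CCCM-2).
    """
    if cccm_mode_flag == 0:
        return ((weights[0] * ds_luma[x][y]) >> shift_a) + b
    sub_w, sub_h = SUBSAMPLING[chroma_format_idc]
    lx, ly = x * sub_w, y * sub_h
    acc = weights[0] * luma[lx][ly]
    for ak, (d_x, d_y) in zip(weights[1:], offsets):
        acc += ak * luma[lx + d_x][ly + d_y]
    return (acc >> shift_a) + b


# Usage with a 4x4 luminance block and its 2x2 downsampled counterpart.
luma = [[10 * x + y for y in range(4)] for x in range(4)]
ds_luma = [[100 + x for _ in range(2)] for x in range(2)]
single_tap = predict_color_difference(1, 0, 0, 1, ds_luma, luma, [2], 1, 1)
two_tap_420 = predict_color_difference(1, 1, 1, 1, ds_luma, luma, [1, 1], 0, 1)
two_tap_422 = predict_color_difference(1, 1, 1, 2, ds_luma, luma, [1, 1], 0, 1)
```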
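Furthermore, filtering may be performed using multiple models.
When IntraPredModeC==INTRA_LT_MMLM, INTRA_L_MMLM, or INTRA_T_MMLM (multi-model), the luminance signal may be classified by its magnitude, multiple CCLM prediction parameters derived according to the classification, and the predicted image derived according to these prediction parameters. For example, using a certain threshold thVal, pixels are classified into modelId according to the magnitude of refSamples as follows. Filtering is then performed using the CCLM prediction parameters a[modelId] and b[modelId] determined by modelId.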
modelId=0
else
modelId=1
predSamples[x][y] =((a0[modelId]*refSamples[x][y])>>shiftA)+b[modelId] (MMLM-1)
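The classification by thVal and the per-model prediction of formula (MMLM-1) can be sketched as follows. The comparison `<=` is an assumption, since the classification condition itself is not reproduced in the text, and the helper names are hypothetical. The threshold is taken here as the integer mean of sampled luminance values, which is one of the options the description mentions.

```python
def mean_threshold(luma_values):
    """Return thVal as the integer average of the given luminance values."""
    return sum(luma_values) // len(luma_values)


def mmlm_predict(ref_sample, th_val, a, b, shift_a):
    """Predict one color difference sample with a two-model linear predictor.

    a and b hold one (weight, bias) pair per model, indexed by modelId.
    Assumption: samples less than or equal to thVal belong to model 0.
    """
    model_id = 0 if ref_sample <= th_val else 1
    return ((a[model_id] * ref_sample) >> shift_a) + b[model_id]


# Usage: two models with slopes 1.0 and 2.0 at shiftA = 3.
th_val = mean_threshold([10, 20, 30, 40])
dark = mmlm_predict(20, th_val, a=[8, 16], b=[5, -10], shift_a=3)
bright = mmlm_predict(40, th_val, a=[8, 16], b=[5, -10], shift_a=3)
```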
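(Multi-model and multi-neighbor)
When IntraPredModeC==INTRA_LT_MMLM_CCCM, INTRA_L_MMLM_CCCM, or INTRA_T_MMLM_CCCM, that is, when MMLM and CCCM are used together, the following processing is performed.
Note that this configuration is not used in the exclusive configuration described above.
The CCLM prediction parameter derivation unit 310442 derives the CCLM prediction parameters using the downsampled luminance adjacent block pRefY (pRefDsY[][] in FIG. 9(d)) and the color difference adjacent block pRefC[][] (pRefC[][] in FIG. 9(e)) as input signals. The CCLM prediction parameter derivation unit 310442 outputs the derived CCLM prediction parameters (a, b) to the CCLM prediction filter unit 310443.
In the two-parameter case (when cccm_mode_flag==0 and IntraPredModeC is any of INTRA_LT_CCLM, INTRA_L_CCLM, INTRA_T_CCLM, INTRA_LT_MMLM, INTRA_L_MMLM, INTRA_T_MMLM), the CCLM prediction parameter derivation unit 310442 derives the CCLM prediction parameters (a,b) for linearly predicting the prediction block predSamples[][] of the target block from the reference block refSamples[][].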
b= C_MIN-(a*Y_MIN)
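When this (a,b) is used, shiftA=0 in formula (CCLM-1).
Here, when the color difference is Cb, (C_MAX,C_MIN) are the pixel values at (x1,y1) and (x2,y2) of the Cb adjacent block pRefCb[][], and when the color difference is Cr, (C_MAX,C_MIN) are the pixel values at (x1,y1) and (x2,y2) of the Cr adjacent block pRefCr[][].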
When diff != 0:
diffC = maxC - minC
x = Floor(Log2(diff))
normDiff = ((diff << 4) >> x) & 15
x += (normDiff!=0) ? 1 : 0
y = Abs(diffC)>0 ? Floor(Log2(Abs(diffC))) + 1 : 0
a = (diffC * (divSigTable[normDiff] | 8) + 2 * y - 1) >> y
shiftA = ((3 + x - y) < 1) ? 1 : 3 + x - y
a = ((3 + x - y ) < 1) ? Sign(a) * 15 : a
b = minC - ((a * minY) >> shiftA)
divSigTable[ ] = { 0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0 }
When diff == 0:
shiftA = 0
a = 0
b = minC
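The fixed-point derivation above can be traced with a short Python sketch. The function names `floor_log2` and `derive_cclm_int` are hypothetical, and `diff` is assumed to be maxY - minY because its definition is not shown in this passage. The rounding offset added before the shift by y is implemented as `(1 << y) >> 1`, the usual half-step rounding; this is an assumption about the intended term and should be checked against the original filing.

```python
DIV_SIG_TABLE = [0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0]


def floor_log2(v):
    """Floor(Log2(v)) for a positive integer v."""
    return v.bit_length() - 1


def derive_cclm_int(min_y, max_y, min_c, max_c):
    """Return (a, shiftA, b) without using a division."""
    diff = max_y - min_y  # assumption: diff is the luma range
    if diff == 0:
        return 0, 0, min_c
    diff_c = max_c - min_c
    x = floor_log2(diff)
    norm_diff = ((diff << 4) >> x) & 15
    x += 1 if norm_diff != 0 else 0
    y = floor_log2(abs(diff_c)) + 1 if abs(diff_c) > 0 else 0
    # Assumption: half-step rounding offset before the right shift by y.
    a = (diff_c * (DIV_SIG_TABLE[norm_diff] | 8) + ((1 << y) >> 1)) >> y
    shift_a = 1 if (3 + x - y) < 1 else 3 + x - y
    if (3 + x - y) < 1:
        sign = (a > 0) - (a < 0)  # Sign(a)
        a = sign * 15
    b = min_c - ((a * min_y) >> shift_a)
    return a, shift_a, b


params = derive_cclm_int(min_y=32, max_y=48, min_c=100, max_c=108)
```

With minY=32, maxY=48, minC=100, maxC=108 the sketch gives a=4 and shiftA=3, a slope of 4/8 = 0.5 that matches (108-100)/(48-32).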
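In the case of three or more parameters (cccm_mode_flag==1, with IntraPredModeC being any of INTRA_LT_CCCM_SINGLE, INTRA_L_CCCM_SINGLE, INTRA_T_CCCM_SINGLE), the CCLM prediction parameter derivation unit 310442 may derive CCLM prediction parameters (a0, a1, …, aN-2, b) consisting of N elements.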
refX[1][cnt]=pRefY[x+dX1][y+dY1]
refX[2][cnt]=pRefY[x+dX2][y+dY2]
…
refX[N-2][cnt]=pRefY[x+dXN-2][y+dYN-2]
refX[N-1][cnt]=1
refY[cnt] = pRefC[x/SubWidthC][y/SubHeightC]
cnt=cnt+1
The above is repeated for each (x, y) in the reference region of the target block, and cnt is incremented by 1 on each iteration.
sumXY[i] = ΣrefX[i][cnt]*refY[cnt]
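Here Σ denotes the sum over cnt. Alternatively, sumXX and sumXY may be derived by accumulating directly from pRefY and pRefC without using refX[][] and refY[].
In addition, a regularization term is added to the diagonal elements: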
sumXX[i][i] = sumXX[i][i] + (1<<(bitDepth-1))
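The CCLM prediction parameter derivation unit 310442 derives cparam[k], k=0..N-1, by a linear operation equivalent to cparam=sumXY*inverse(sumXX). Here inverse(X) is the inverse matrix of X. N is the number of parameters, here N>=3 (the number of neighboring pixels + 1).
a1 = cparam[1]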
…
b = cparam[N-1]
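The least-squares step can be sketched in Python as follows. Several details are assumptions: sumXX[i][j] is taken to be the sum over cnt of refX[i][cnt]*refX[j][cnt], refX[0][cnt] is taken to be the center luma sample, exact fractions replace the integer arithmetic a codec would use, and the names `solve_linear` and `derive_cccm_params` are hypothetical. Because sumXX is symmetric, sumXY*inverse(sumXX) is obtained by solving sumXX * cparam = sumXY.

```python
from fractions import Fraction


def solve_linear(mat, rhs):
    """Solve mat * x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(mat, rhs)]
    for col in range(n):
        # Move a row with a non-zero pivot into position (raises if singular).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def derive_cccm_params(ref_x, ref_y, reg):
    """ref_x[i][cnt]: i-th input term of sample cnt; ref_y[cnt]: chroma sample."""
    n, count = len(ref_x), len(ref_y)
    sum_xx = [[sum(ref_x[i][c] * ref_x[j][c] for c in range(count))
               for j in range(n)] for i in range(n)]
    sum_xy = [sum(ref_x[i][c] * ref_y[c] for c in range(count)) for i in range(n)]
    for i in range(n):
        sum_xx[i][i] += reg  # regularization term on the diagonal
    return solve_linear(sum_xx, sum_xy)


# N = 3: center luma, one neighboring luma, and the constant 1 for the offset b.
center = [1, 2, 3, 4]
neighbor = [2, 1, 4, 3]
chroma = [2 * c + n + 3 for c, n in zip(center, neighbor)]
cparam = derive_cccm_params([center, neighbor, [1] * 4], chroma, reg=0)
```

The example passes reg=0 so that the exact coefficients (a0, a1, b) = (2, 1, 3) are recovered; following the text, a decoder would pass 1<<(bitDepth-1), which biases the solution slightly but keeps sumXX invertible.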
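(Derivation of CCLM prediction parameters for MMLM)
In the case of MMLM, multiple sets of CCLM prediction parameters are derived for the target block. As shown in FIG. 10(b), a straight line connecting (Y_MAX,C_MAX) and (Y_MIN,C_MIN) is obtained on a graph with Y and C on the x and y axes, respectively. Unlike FIG. 10(a), however, there are multiple luma-chroma models, and a line connecting (Y_MAX,C_MAX) and (Y_MIN,C_MIN) is obtained for each model. Here, classification is performed according to the luma values of the reference region to derive modelId.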
modelId=0
else
modelId=1
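Here thVal is the threshold used for classification; the average luma value of the target block (or the average of the downsampled luma values, or the average of luma values sampled from the reference region) may be used.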
a1[modelId] = cparam[1]
…
b[modelId] = cparam[N-1]
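The per-model derivation for MMLM can be illustrated with a Python sketch that uses a two-parameter line for each model. It assumes that thVal is the integer mean of the reference luma samples, that samples less than or equal to thVal go to model 0, and that each model's line connects the points with the smallest and largest luma in its group; the function name `derive_mmlm_params` and the fallbacks for empty or flat groups are hypothetical.

```python
def derive_mmlm_params(luma, chroma):
    """Return (thVal, params) where params[modelId] = (a, b)."""
    th_val = sum(luma) // len(luma)  # assumption: integer mean of reference luma
    groups = {0: [], 1: []}
    for y_val, c_val in zip(luma, chroma):
        model_id = 0 if y_val <= th_val else 1  # assumption: <= selects model 0
        groups[model_id].append((y_val, c_val))
    params = {}
    for model_id, pts in groups.items():
        if not pts:
            params[model_id] = (0.0, 0.0)  # hypothetical fallback for an empty group
            continue
        y_min, c_min = min(pts)  # point with the smallest luma
        y_max, c_max = max(pts)  # point with the largest luma
        a = (c_max - c_min) / (y_max - y_min) if y_max != y_min else 0.0
        params[model_id] = (a, c_min - a * y_min)
    return th_val, params


th_val, params = derive_mmlm_params([10, 20, 60, 80], [30, 40, 100, 110])
```

Here the dark samples follow C = Y + 20 and the bright samples follow C = 0.5 * Y + 70, so a single line would fit neither group well.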
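(Hardware implementation and software implementation)
Each block of the video decoding device 31 and the video encoding device 11 described above may be implemented in hardware by logic circuits formed on an integrated circuit (IC chip), or in software using a CPU (Central Processing Unit).
301 Entropy decoding unit
302 Parameter decoding unit
303 Inter prediction parameter derivation unit
304 Intra prediction parameter derivation unit
308 Predicted image generation unit
309 Inter predicted image generation unit
310 Intra predicted image generation unit
3104 Prediction unit (intra prediction unit)
31044 CCLM prediction unit (predicted image generation device)
310441 Downsampling unit
310442 CCLM prediction parameter derivation unit (parameter derivation unit)
310443 CCLM prediction filter unit
311 Inverse quantization and inverse transform unit
312 Addition unit
320 Prediction parameter derivation unit
11 Image encoding device
101 Predicted image generation unit
102 Subtraction unit
103 Transform and quantization unit
104 Entropy encoding unit
105 Inverse quantization and inverse transform unit
107 Loop filter
110 Encoding parameter determination unit
111 Parameter encoding unit
112 Inter prediction parameter derivation unit
113 Intra prediction parameter derivation unit
120 Prediction parameter derivation unit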
Claims (8)
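- A CCLM prediction unit that generates a predicted image of a chroma image using a luma image, comprising:
a CCLM prediction parameter derivation unit that derives CCLM prediction parameters consisting of a first weight, a second weight, and a first offset value, using a reference pixel of a reference image adjacent to a target block and a pixel neighboring the reference pixel; and
a CCLM prediction filter unit that generates a chroma predicted image using two luma pixels, namely a target pixel of the target block and a pixel neighboring the target pixel, and the CCLM prediction parameters,
wherein the CCLM prediction filter unit derives the pixel value of a predicted pixel from the sum of the product of the target pixel and the first weight, the product of the pixel neighboring the target pixel and the second weight, and the first offset value.
- The CCLM prediction unit according to claim 1, wherein the position of the neighboring pixel in the reference image and the target image is the pixel (x+1, y) to the right of the target pixel (x, y).
- A video decoding device comprising:
the CCLM prediction unit according to claim 1; and
a parameter decoding unit that decodes, from coded data, an index indicating the position of the neighboring pixel,
wherein the CCLM prediction parameter derivation unit keeps the number of CCLM prediction parameters fixed regardless of the index and derives the CCLM prediction parameters by switching the position of the pixel neighboring the reference pixel according to the index, and
the CCLM filter unit derives the predicted image by switching the neighboring pixel according to the index.
- A video encoding device comprising:
the CCLM prediction unit according to claim 1; and
a parameter encoding unit that encodes an index indicating the position of the neighboring pixel derived from an image,
wherein the CCLM prediction parameter derivation unit keeps the number of CCLM prediction parameters fixed regardless of the index and derives the CCLM prediction parameters by switching the position of the pixel neighboring the reference pixel according to the index, and
the CCLM filter unit derives the predicted image by switching the neighboring pixel according to the index.
- The video decoding device according to claim 3, wherein the parameter decoding unit decodes the index from coded data of a sequence header, a slice header, or a CTU header,
derives, from the coded data, a flag indicating whether CCLM prediction is to be performed, and
the CCLM filter unit derives the predicted image of the target block.
- A CCLM prediction unit that generates a predicted image of a chroma image using a luma image, comprising:
a CCLM prediction parameter derivation unit that derives CCLM prediction parameters; and
a CCLM prediction filter unit that generates a chroma predicted image using a luma reference image and the CCLM prediction parameters,
wherein the CCLM prediction unit
comprises linear prediction means that derives two parameters, as a multiplication coefficient and a bias coefficient, and linear prediction means that derives three or more parameters,
comprises a multi-model that classifies the luma signal into groups by its magnitude and derives multiple kinds of CCLM prediction parameters according to the classification, and a single model that derives one kind of CCLM prediction parameters, and
does not derive three or more parameters in the case of the multi-model.
- The CCLM prediction unit according to claim 6, wherein the CCLM prediction unit decodes a syntax element indicating whether to classify into two groups and, when the syntax element indicates that one group is used, further decodes a syntax element indicating whether to derive three or more parameters.
- The CCLM prediction unit according to claim 6, wherein the CCLM prediction unit decodes a syntax element indicating whether to derive three or more parameters and, when the syntax element indicates two parameters, further decodes a syntax element indicating whether to classify into two groups.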
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23835494.8A EP4554212A1 (en) | 2022-07-04 | 2023-07-03 | Cclm prediction unit, video decoding device, and video encoding device |
| CN202380051858.8A CN119487838A (zh) | 2022-07-04 | 2023-07-03 | Cclm预测部、运动图像解码装置以及运动图像编码装置 |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022107511A JP2024006522A (ja) | 2022-07-04 | 2022-07-04 | 予測画像生成装置、動画像復号装置、および動画像符号化装置 |
| JP2022-107511 | 2022-07-04 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024009962A1 (ja) | 2024-01-11 |
Family
ID=89453371
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2023/024662 Ceased WO2024009962A1 (ja) | 2022-07-04 | 2023-07-03 | Cclm予測部、動画像復号装置、および動画像符号化装置 |
Country Status (4)
| Country | Link |
|---|---|
| EP (1) | EP4554212A1 (ja) |
| JP (1) | JP2024006522A (ja) |
| CN (1) | CN119487838A (ja) |
| WO (1) | WO2024009962A1 (ja) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2020129990A1 (ja) * | 2018-12-20 | 2020-06-25 | シャープ株式会社 | 予測画像生成装置、動画像復号装置、動画像符号化装置、および、予測画像生成方法 |
| JP2022024208A (ja) * | 2018-12-07 | 2022-02-09 | シャープ株式会社 | 動画像復号装置および動画像符号化装置 |
| JP2022516180A (ja) * | 2019-01-03 | 2022-02-24 | 華為技術有限公司 | クロマブロック予測方法及び装置 |
| JP2022107511A (ja) | 2021-01-08 | 2022-07-21 | 昭和アルミニウム缶株式会社 | フェイスシールド |
2022
- 2022-07-04 JP JP2022107511A patent/JP2024006522A/ja active Pending
2023
- 2023-07-03 CN CN202380051858.8A patent/CN119487838A/zh active Pending
- 2023-07-03 WO PCT/JP2023/024662 patent/WO2024009962A1/ja not_active Ceased
- 2023-07-03 EP EP23835494.8A patent/EP4554212A1/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN119487838A (zh) | 2025-02-18 |
| JP2024006522A (ja) | 2024-01-17 |
| EP4554212A1 (en) | 2025-05-14 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23835494; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202380051858.8; Country of ref document: CN |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023835494; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2023835494; Country of ref document: EP; Effective date: 20250204 |
| | WWP | Wipo information: published in national office | Ref document number: 202380051858.8; Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 2023835494; Country of ref document: EP |