WO2012081636A1 - Image decoding device, image encoding device, and data structure of encoded data
- Publication number
- WO2012081636A1 (PCT/JP2011/078953, priority JP2011078953W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- channel
- prediction
- decoded
- channels
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Using adaptive coding
- H04N19/102—Characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/169—Characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—The unit being a colour or a chrominance component
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/50—Using predictive coding
- H04N19/70—Characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the present invention relates to an image decoding device that performs intra prediction on image color differences, an image encoding device, and a data structure of encoded data.
- a moving image encoding device that generates encoded data by encoding a moving image, and a moving image decoding device that generates a decoded image by decoding the encoded data, are used.
- Specific examples of moving image encoding methods include the method used in H.264/MPEG-4 AVC (Non-Patent Document 1), the method adopted in KTA software, a codec for joint development by VCEG (Video Coding Expert Group), and the method adopted in TMuC (Test Model under Consideration) software, its successor codec (Non-Patent Document 2).
- in such encoding methods, a predicted image is usually generated based on a locally decoded image obtained by encoding and decoding the input image, and the prediction residual obtained by subtracting the predicted image from the input image (original image) is encoded. The prediction residual may also be referred to as a "difference image" or "residual image".
- examples of the method for generating a predicted image include inter-screen prediction (inter prediction) and intra-screen prediction (intra prediction).
- TMuC (Non-Patent Document 2) proposes dividing a moving image and managing it in the following hierarchical structure.
- an image (picture) constituting a moving image is divided into slices.
- each slice is in turn divided into largest coding units (LCUs), sometimes called macroblocks.
- the maximum coding unit can be divided into smaller coding units (Coding Unit) by quadtree division.
- a coding unit that cannot be further divided (a leaf CU) is treated as a transform unit and a prediction unit; these are sometimes called blocks.
- a unit called a partition is also defined, which is either the prediction unit used as it is or a further subdivision of it.
- intra prediction is performed in units of partitions.
- in H.264/MPEG-4 AVC (Non-Patent Document 1), a representation format in which a pixel is expressed as a combination of a luminance component Y and color difference components Cb and Cr is employed.
- luminance and color difference are inherently independent components. Therefore, in H.264/MPEG-4 AVC (Non-Patent Document 1), horizontal prediction, vertical prediction, DC prediction, and plane prediction are used for intra prediction of color difference signals.
- the human eye has a visual characteristic that it is sensitive to pixel luminance changes but insensitive to color changes.
- if the resolution of the chrominance pixels is lowered, the visual impact is smaller than when the resolution of the luminance pixels is lowered. Therefore, in moving image encoding, the resolution of the color difference pixels is made lower than that of the luminance pixels to reduce the data amount.
- FIG. 27 shows an example of a correspondence relationship between an original image, a luminance pixel, and a color difference pixel in the prior art.
- FIG. 27(a) shows the image (YUV) to be encoded, FIG. 27(b) shows the luminance pixels, and FIG. 27(c) shows the color difference pixels (illustrated for U).
- in this example, the resolution of the luminance pixels is 8 × 8, while the resolution of the color difference pixels is 4 × 4.
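The 8 × 8 luminance vs. 4 × 4 color difference relationship of FIG. 27 can be reproduced by halving a full-resolution chroma plane in each direction. The following is a minimal sketch under the assumption of simple 2 × 2 box averaging; real encoders may use other filters or sample positions, and the function name is illustrative, not from the patent.

```python
def downsample_chroma(plane):
    """Halve a chroma plane in each direction by 2x2 box averaging."""
    h, w = len(plane), len(plane[0])
    return [[(plane[r][c] + plane[r][c + 1] +
              plane[r + 1][c] + plane[r + 1][c + 1]) / 4
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

# An 8x8 chroma plane, as in FIG. 27(a); the result is 4x4, as in FIG. 27(c).
u_full = [[float(r + c) for c in range(8)] for r in range(8)]
u_sub = downsample_chroma(u_full)
```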
- Non-Patent Document 3 discloses predicting a color difference image (UV) from a luminance image (Y) by linear transformation, specifically according to the following formula (A1).
- PredC[xC, yC] = αC × RecY[xY, yY] + βC … (A1)
- where PredC is the predicted image (color difference); RecY is the decoded image (luminance); [xC, yC] and [xY, yY] are coordinates indicating the position of the same sample in the color difference and luminance images, respectively; and αC, βC are coefficients derived by the least squares method from the pixel values of the surrounding decoded images (hereinafter, the local decoded image).
- when the luminance and color difference resolutions differ, the coordinates [xC, yC] and [xY, yY] need to be converted appropriately.
- Non-Patent Document 3 also mentions how to take sample points when the resolution differs between the luminance image (Y) and the color difference images (U, V), as illustrated in FIG. 28.
- FIG. 28A shows the case where sample points are taken from a 2N ⁇ 2N size luminance image
- FIG. 28B shows the case where samples are taken from an N ⁇ N color difference image.
- Non-Patent Document 3 describes that the sample point smpl100 shown in (a) of FIG. 28 is associated with the sample point smpl200 shown in (b) of FIG.
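The least-squares fit behind formula (A1) can be sketched as follows. This is a minimal numpy sketch with hypothetical names; each chroma sample of the local decoded image is paired with one co-located luma sample, in the spirit of FIG. 28, and `numpy.polyfit` stands in for the least squares derivation of αC and βC.

```python
import numpy as np

def fit_linear(rec_y_samples, rec_c_samples):
    """Least-squares fit of alpha, beta in pred_c = alpha * rec_y + beta,
    from co-located luma/chroma samples of the local decoded image."""
    alpha, beta = np.polyfit(rec_y_samples, rec_c_samples, deg=1)
    return alpha, beta

def predict_chroma(rec_y_block, alpha, beta):
    """Formula (A1): PredC[xC, yC] = alpha * RecY[xY, yY] + beta."""
    return alpha * rec_y_block + beta

# Toy neighbourhood where chroma happens to track luma linearly.
y_nbr = np.array([16.0, 32.0, 64.0, 96.0])
c_nbr = 0.5 * y_nbr + 10.0
alpha, beta = fit_linear(y_nbr, c_nbr)
pred = predict_chroma(np.array([[40.0, 80.0]]), alpha, beta)
```

When the samples really do lie on one line, as here, the fit recovers α = 0.5 and β = 10 and the prediction is exact; the problem described next arises when they do not.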
- the conventional technology described above performs linear transformation by the least squares method in inter-channel prediction. For this reason, when the local decoded image used as the sample set is not well suited to linear transformation, the accuracy of inter-channel prediction may be insufficient.
- for example, when the samples fall into two separate regions (groups Gr101 and Gr102) and the distribution varies within each region, the error of the linear transformation may become large; sample points such as Smpl300 then lie far from the values approximated by formula (A1).
- the present invention has been made in view of the above problems. Its object is to realize an image decoding device that can improve the likelihood of obtaining higher prediction accuracy even when the components of the pixels included in the local decoded image vary in a way that degrades prediction between channels based on a linear correlation.
- in order to solve the above problems, an image decoding device according to the present invention decodes encoded image data by generating a predicted image for each of a plurality of channels, each indicating a component constituting an image, and adding a prediction residual to the generated predicted image. The device comprises: channel decoding means for decoding one or more of the plurality of channels for a processing target block; correlation derivation means for deriving, with reference to a local decoded image that is located around the processing target block and has been decoded for each of the plurality of channels, a nonlinear correlation between the one or more channels already decoded by the channel decoding means and another channel to be decoded; and predicted image generation means for generating, in accordance with the derived correlation, the predicted image of the other channel from the decoded image of the one or more already-decoded channels for the processing target block.
- a channel is a generalized component that constitutes an image.
- the luminance component and the color difference component correspond to channels. That is, in this example, the channel includes a luminance channel and a color difference channel.
- the color difference channel includes a U channel indicating the U component of the color difference and a V channel indicating the V component of the color difference.
- the channel may relate to the RGB color space.
- the image decoding process is performed for each channel.
- the local decoded image is a decoded image that has been decoded for the plurality of channels and is located around the block to be processed.
- Processing target block refers to various processing units in the decoding process.
- examples include a coding unit, a transform unit, and a prediction unit.
- the processing unit also includes units obtained by further subdividing the coding unit, the transform unit, or the prediction unit.
- the periphery of the processing target block includes, for example, a pixel adjacent to the target block, a block adjacent to the left side of the target block, a block adjacent to the upper side of the target block, and the like.
- the nonlinear correlation can be derived, for example, by examining the correspondence of points each consisting of a luminance value and a color difference value. In the case of the YUV color space, it can be derived from the correspondence between the luminance value and the color difference value of each pixel included in the local decoded image.
- for example, the correlation may be realized as a look-up table (LUT) in which values of a decoded channel are associated with values of the channel to be decoded.
- the correlation may be expressed by a function including a relational expression established between the decoded channel and the decoding target channel.
- the channel to be decoded in the processing target block is predicted from the channel that has been decoded in the processing target block according to the nonlinear correlation derived in this way.
- such prediction is also referred to as inter-channel prediction.
- the pixel value of the decoded image of the decoded channel is converted according to the nonlinear correlation, and the pixel value of the predicted image of the channel to be decoded is obtained.
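As a concrete illustration of the LUT realization mentioned above, one way to build the mapping from decoded luma values to predicted chroma values is to average, per luma value, the chroma values observed in the local decoded image. This is a minimal sketch only; the per-value averaging and the nearest-entry fallback for unseen luma values are illustrative assumptions, not the patent's specific derivation method.

```python
def derive_lut(luma_samples, chroma_samples):
    """Build a LUT: for each luma value seen in the local decoded image,
    record the mean of the co-located chroma values."""
    buckets = {}
    for y, c in zip(luma_samples, chroma_samples):
        buckets.setdefault(y, []).append(c)
    return {y: sum(cs) / len(cs) for y, cs in buckets.items()}

def predict_with_lut(lut, luma_value):
    """Inter-channel prediction: convert a decoded luma value to a chroma
    value via the LUT, falling back to the nearest recorded entry."""
    if luma_value in lut:
        return lut[luma_value]
    nearest = min(lut, key=lambda y: abs(y - luma_value))
    return lut[nearest]

# A saturating luma-to-chroma relation that a single linear fit of
# formula (A1) would approximate poorly.
lut = derive_lut([10, 10, 20, 200, 210], [40, 42, 60, 100, 100])
pred = [predict_with_lut(lut, y) for y in [10, 20, 205]]
```

Because the table records the observed correspondence directly, it can follow a relation with two separate clusters (such as Gr101 and Gr102 above) where one global line cannot.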
- the pixel value is a generalized value of a component constituting the image.
- in order to solve the above problems, an image encoding device according to the present invention encodes a prediction residual obtained by subtracting, from an original image, a predicted image generated for each of a plurality of channels, each indicating a component constituting the image. The device comprises: channel decoding means for decoding one or more of the plurality of channels for a processing target block; correlation derivation means for deriving, with reference to a local decoded image that is located around the processing target block and has been decoded for each of the plurality of channels, a nonlinear correlation between the one or more channels already decoded by the channel decoding means and another channel to be decoded; and predicted image generation means for generating, in accordance with the derived correlation, the predicted image of the other channel from the decoded image of the one or more already-decoded channels for the processing target block.
- in order to solve the above problems, a data structure of encoded data according to the present invention is a data structure of encoded data containing a prediction residual obtained by subtracting, from an original image, a predicted image generated for each of a plurality of channels, each indicating a component constituting the image. The data structure contains: channel decoding processing order information indicating the order in which the plurality of channels are decoded for a processing target block; and prediction source channel designation information designating from which of the decoded images of the one or more channels already decoded for the processing target block the predicted image of the other channel to be decoded is generated, in accordance with a nonlinear correlation between the one or more decoded channels and the other channel.
- with this data structure, the same effects as those of the image decoding device according to the present invention can be obtained.
- that is, the image encoding device may include, in the data structure of the encoded data, channel decoding processing order information indicating the order in which the plurality of channels are to be decoded, and prediction source channel information designating from which of the decoded channels the channel to be decoded is to be predicted.
- the image encoding device may encode the information in, for example, side information.
- as described above, the image decoding device according to the present invention comprises: channel decoding means for decoding one or more of a plurality of channels for a processing target block; correlation derivation means for deriving, with reference to a local decoded image that is located around the processing target block and has been decoded for each of the plurality of channels, a nonlinear correlation between the one or more channels already decoded by the channel decoding means and another channel to be decoded; and predicted image generation means for generating, in accordance with the derived correlation, the predicted image of the other channel from the decoded image of the one or more already-decoded channels for the processing target block.
- as described above, the image encoding device according to the present invention comprises: channel decoding means for decoding one or more of a plurality of channels for a processing target block; correlation derivation means for deriving, with reference to a local decoded image that is located around the processing target block and has been decoded for each of the plurality of channels, a nonlinear correlation between the one or more channels already decoded by the channel decoding means and another channel to be decoded; and predicted image generation means for generating, in accordance with the derived correlation, the predicted image of the other channel from the decoded image of the one or more already-decoded channels for the processing target block.
- as described above, the data structure of the encoded data according to the present invention is a data structure of encoded data to be decoded by an image decoding device that generates a predicted image for each of a plurality of channels and adds a prediction residual to the generated predicted image. It contains: channel decoding processing order information indicating the order in which the plurality of channels are decoded for a processing target block; and prediction source channel information specifying from which of the decoded images of the one or more channels already decoded for the processing target block the predicted image of the other channel is generated, in accordance with the nonlinear correlation.
- FIG. 4 is a diagram illustrating YUV image formats; (a) to (d) show the 4:2:0, 4:4:4, 4:2:2, and 4:1:1 formats, respectively.
- FIG. 6 is a diagram illustrating sample positions of color difference pixels; (a) to (c) show three sample positions. Another figure shows patterns of the processing order of the color difference channels, and a flowchart illustrates the schematic flow of the color difference predicted image generation process in the predicted image generation unit.
- FIG. 4 is a diagram illustrating an example of an image in which luminance-color difference distributions do not overlap and extend over two regions.
- FIG. 11 is a graph plotting the luminance (Y) and color difference (U) values of the pixels included in the image shown in the preceding figure.
- FIG. 12 is a flowchart showing a modification of the flow of the LUT derivation process by the LUT derivation unit, and another figure shows an example of the derived LUT.
- FIG. 5 is a graph plotting luminance (Y) -color difference (U) of an image, and two types of images are shown in (a) and (b).
- (a) shows a transmitting apparatus equipped with the moving picture encoding device, and (b) shows a receiving apparatus equipped with the moving picture decoding device. Another figure shows the configurations of a recording device equipped with the moving picture encoding device and a playback device equipped with the moving picture decoding device.
- (a) shows a recording apparatus equipped with the moving picture encoding device, and (b) shows a playback apparatus equipped with the moving picture decoding device.
- FIG. 2 is a functional block diagram showing a schematic configuration of the moving picture decoding apparatus 1.
- the moving image decoding device 1 receives encoded data #1 (having the data structure of encoded data) obtained by encoding a moving image with the moving image encoding device 2.
- the moving image decoding device 1 decodes the input encoded data #1 and outputs the moving image #2 to the outside.
- the configuration of the encoded data # 1 will be described below.
- the configuration of encoded data # 1 that is generated by the video encoding device 2 and decoded by the video decoding device 1 will be described with reference to FIG.
- the encoded data # 1 has a hierarchical structure including a sequence layer, a GOP (Group Of Pictures) layer, a picture layer, a slice layer, and a maximum coding unit (LCU: Large Coding Unit) layer.
- FIG. 3 shows the hierarchical structure below the picture layer in the encoded data # 1.
- FIGS. 3(a) to 3(f) show, respectively, the picture layer PICT, the slice layer S, the LCU layer LCU, a leaf CU included in the LCU (denoted CUL in FIG. 3(d)), the prediction information PI_Inter for an inter prediction (inter-screen prediction) partition, and the prediction information PI_Intra for an intra prediction (intra-screen prediction) partition.
- the picture layer PICT is a set of data referred to by the video decoding device 1 in order to decode a target picture that is a processing target picture.
- the picture layer PICT includes a picture header PH and slice layers S 1 to S NS (NS is the total number of slice layers included in the picture layer PICT).
- in the following, when it is not necessary to distinguish individual items, the subscripts of the reference numerals may be omitted. The same applies to the other subscripted data included in the encoded data #1.
- the picture header PH includes a coding parameter group referred to by the video decoding device 1 in order to determine a decoding method of the target picture.
- the encoding mode information (entropy_coding_mode_flag) indicating the variable-length encoding mode used in encoding by the moving image encoding device 2 is an example of an encoding parameter included in the picture header PH.
- when entropy_coding_mode_flag is 0, the target picture is encoded by CAVLC (Context-based Adaptive Variable Length Coding).
- Each slice layer S included in the picture layer PICT is a set of data referred to by the video decoding device 1 in order to decode a target slice that is a processing target slice.
- the slice layer S includes a slice header SH and LCU layers LCU 1 to LCU NC (NC is the total number of LCUs included in the slice S).
- the slice header SH includes a coding parameter group that the moving image decoding apparatus 1 refers to in order to determine a decoding method of the target slice.
- Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH.
- the slice header SH includes a filter parameter FP that is referred to by a loop filter included in the video decoding device 1.
- slice types that can be specified by the slice type designation information include: (1) an I slice that uses only intra prediction at the time of encoding; (2) a P slice that uses unidirectional prediction or intra prediction at the time of encoding; and (3) a B slice that uses unidirectional prediction, bidirectional prediction, or intra prediction at the time of encoding.
- Each LCU layer LCU included in the slice layer S is a set of data that the video decoding device 1 refers to in order to decode the target LCU that is the processing target LCU.
- as shown in FIG. 3(c), the LCU layer LCU contains an LCU header LCUH and a plurality of coding units (CU: Coding Unit) CU1 to CUNL obtained by quadtree division of the LCU.
- the size that each CU can take depends on the LCU size and the hierarchical depth included in the sequence parameter set SPS of the encoded data # 1. For example, when the size of the LCU is 128 ⁇ 128 pixels and the maximum hierarchical depth is 5, the CU included in the LCU has five types of sizes, that is, 128 ⁇ 128 pixels, 64 ⁇ 64 pixels, Any of 32 ⁇ 32 pixels, 16 ⁇ 16 pixels, and 8 ⁇ 8 pixels can be taken. A CU that is not further divided is called a leaf CU.
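The relationship between the LCU size, the maximum hierarchical depth, and the possible CU sizes can be sketched as follows. The helper name is hypothetical; each quadtree split simply halves the CU dimensions, so depth d yields a size of lcu_size >> d.

```python
def cu_sizes(lcu_size, max_depth):
    """Possible square CU sizes for a given LCU size and maximum
    hierarchical depth: one size per hierarchy level."""
    return [lcu_size >> d for d in range(max_depth)]

# The example from the text: LCU of 128x128 pixels, maximum depth 5.
sizes = cu_sizes(128, 5)  # [128, 64, 32, 16, 8]
```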
- the LCU header LCUH includes encoding parameters that the video decoding device 1 refers to in order to determine the decoding method of the target LCU. Specifically, as shown in FIG. 3(c), it includes CU partition information SP_CU that specifies a partition pattern of the target LCU into leaf CUs, and a quantization parameter difference Δqp (mb_qp_delta) that specifies the size of the quantization step.
- CU division information SP_CU is information that specifies the shape and size of each CU (and leaf CU) included in the target LCU, and the position in the target LCU.
- the CU partition information SP_CU does not necessarily need to explicitly include the shape and size of the leaf CU.
- the CU partition information SP_CU may be a set of flags (split_coding_unit_flag) indicating whether or not the entire LCU or a partial region of the LCU is divided into four. In that case, the shape and size of each leaf CU can be specified by using the shape and size of the LCU together.
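The idea that a set of split flags plus the LCU size suffices to recover each leaf CU's shape, size, and position can be sketched as follows. This is a minimal sketch with hypothetical names; the preorder flag ordering and the raster order of the four quadrants are assumptions, not taken from the patent.

```python
def parse_leaf_cus(flags, x=0, y=0, size=64, depth=0, max_depth=4, out=None):
    """Consume split_coding_unit_flag values in preorder and return the
    (x, y, size) of each leaf CU inside the LCU."""
    if out is None:
        out = []
    # At maximum depth no flag is coded and the CU cannot split further.
    if depth < max_depth and flags.pop(0) == 1:
        half = size // 2
        for dy in (0, half):          # quadrants in raster order
            for dx in (0, half):
                parse_leaf_cus(flags, x + dx, y + dy, half,
                               depth + 1, max_depth, out)
    else:
        out.append((x, y, size))
    return out

# Split the 64x64 LCU once, then split only its top-left 32x32 quadrant:
# 4 leaf CUs of 16x16 plus 3 leaf CUs of 32x32.
leaves = parse_leaf_cus([1, 1, 0, 0, 0, 0, 0, 0, 0], size=64)
```

Note that the leaf CUs always tile the LCU exactly, which is why the explicit shapes and sizes need not be coded.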
- the quantization parameter difference Δqp is the difference qp − qp′ between the quantization parameter qp of the target LCU and the quantization parameter qp′ of the LCU encoded immediately before it.
- Leaf CU A CU (leaf CU) that cannot be further divided is treated as a prediction unit (PU: Prediction Unit) and a transform unit (TU: Transform Unit).
- the leaf CU (denoted CUL in FIG. 3(d)) includes (1) PU information PUI that the moving image decoding device 1 refers to when generating a predicted image, and (2) TU information TUI that the moving image decoding device 1 refers to when decoding residual data.
- the PU information PUI may include a skip flag SKIP. When the value of the skip flag SKIP is 1, the TU information is omitted.
- the PU information PUI includes prediction type information PT and prediction information PI, as shown in FIG.
- the prediction type information PT is information that specifies whether intra prediction or inter prediction is used as a predicted image generation method for the target leaf CU (target PU).
- the prediction information PI includes intra prediction information PI_Intra or inter prediction information PI_Inter depending on which prediction method is specified by the prediction type information PT.
- a PU to which intra prediction is applied is also referred to as an intra PU
- a PU to which inter prediction is applied is also referred to as an inter PU.
- the PU information PUI includes information specifying the shape and size of each partition included in the target PU and the position in the target PU.
- the partition is one or a plurality of non-overlapping areas constituting the target leaf CU, and the generation of the predicted image is performed in units of partitions.
- the TU information TUI includes TU partition information SP_TU that specifies a partition pattern of the target leaf CU (target TU) into blocks, and quantized prediction residuals QD1 to QDNT (NT is the total number of blocks included in the target TU).
- TU partition information SP_TU is information that specifies the shape and size of each block included in the target TU and the position in the target TU.
- Each TU can be, for example, a size from 64 ⁇ 64 pixels to 2 ⁇ 2 pixels.
- the block is one or a plurality of non-overlapping areas constituting the target leaf CU, and prediction residual encoding / decoding is performed in units of TUs or blocks obtained by dividing TUs.
- Each quantized prediction residual QD is encoded data generated by the moving image encoding apparatus 2 performing the following processes 1 to 3 on a target block that is a processing target block.
- Process 1: DCT (Discrete Cosine Transform) is applied to the prediction residual obtained by subtracting the predicted image from the encoding target image.
- Process 2: the DCT coefficients obtained in Process 1 are quantized.
- Process 3: the DCT coefficients quantized in Process 2 are variable-length encoded.
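Processes 1 and 2 can be sketched in numpy as follows. This is an illustration only: the orthonormal DCT-II matrix and the flat quantization step are simplifications (real codecs use integer transforms and per-frequency quantization), and Process 3 (variable-length coding) is omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that Y = C @ X @ C.T."""
    k = np.arange(n)
    c = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0] *= 1 / np.sqrt(2)
    return c * np.sqrt(2 / n)

def encode_block(residual, qstep):
    c = dct_matrix(residual.shape[0])
    coeffs = c @ residual @ c.T          # Process 1: 2-D DCT
    return np.round(coeffs / qstep)      # Process 2: quantization

resid = np.arange(16, dtype=float).reshape(4, 4)  # toy prediction residual
q = encode_block(resid, qstep=3.0)
```

The decoder side of processes 1 and 2 (used later by the inverse quantization / inverse transform unit) is the mirror image: multiply by qstep, then apply the inverse transform C.T @ Y @ C.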
- the inter prediction information PI_Inter includes encoding parameters that the video decoding device 1 refers to when generating an inter predicted image by inter prediction. As shown in FIG. 3(e), PI_Inter includes inter-PU partition information SP_Inter that specifies a partition pattern of the target PU into partitions, and inter prediction parameters PP_Inter1 to PP_InterNe for each partition (Ne is the total number of inter prediction partitions included in the target PU).
- the inter-PU partition information SP_Inter is information for designating the shape and size of each inter prediction partition included in the target PU (inter PU) and the position in the target PU.
- an inter PU can be divided into eight types of partitions in total: four symmetric splittings of 2N × 2N, 2N × N, N × 2N, and N × N pixels, and four asymmetric splittings of 2N × nU, 2N × nD, nL × 2N, and nR × 2N pixels.
- the specific value of N is defined by the size of the CU to which the PU belongs, and the specific values of nU, nD, nL, and nR are determined according to the value of N.
- for example, a 128 × 128-pixel inter PU can be divided into inter prediction partitions of 128 × 128, 128 × 64, 64 × 128, 64 × 64, 128 × 32, 128 × 96, 32 × 128, and 96 × 128 pixels.
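The eight inter-PU partition modes can be enumerated as follows. The helper is illustrative, and it assumes nU = nD = nL = nR = N/2, which is consistent with the 128 × 32 / 128 × 96 partitions in the example above.

```python
def inter_pu_partitions(n):
    """Partition shapes (width, height) for each of the eight inter-PU
    modes of a 2N x 2N PU, assuming nU = nD = nL = nR = N/2."""
    half = n // 2
    return {
        "2Nx2N": [(2 * n, 2 * n)],
        "2NxN":  [(2 * n, n)] * 2,
        "Nx2N":  [(n, 2 * n)] * 2,
        "NxN":   [(n, n)] * 4,
        "2NxnU": [(2 * n, half), (2 * n, 2 * n - half)],
        "2NxnD": [(2 * n, 2 * n - half), (2 * n, half)],
        "nLx2N": [(half, 2 * n), (2 * n - half, 2 * n)],
        "nRx2N": [(2 * n - half, 2 * n), (half, 2 * n)],
    }

parts = inter_pu_partitions(64)  # a 128 x 128 inter PU, as in the example
```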
- the inter prediction parameter PP_Inter includes a reference image index RI, an estimated motion vector index PMVI, and a motion vector residual MVD.
- the intra prediction information PI_Intra includes an encoding parameter that is referred to when the video decoding device 1 generates an intra predicted image by intra prediction.
- as shown in FIG. 3(f), the intra prediction information PI_Intra includes intra-PU partition information SP_Intra that specifies a partition pattern of the target PU (intra PU) into partitions, and intra prediction parameters PP_Intra1 to PP_IntraNA for each partition (NA is the total number of intra prediction partitions included in the target PU).
- the intra-PU partition information SP_Intra is information that specifies the shape and size of each intra-predicted partition included in the target PU, and the position in the target PU.
- the intra PU split information SP_Intra includes an intra split flag (intra_split_flag) that specifies whether or not the target PU is split into partitions. If the intra partition flag is 1, the target PU is divided symmetrically into four partitions. If the intra partition flag is 0, the target PU is not divided and the target PU itself is one partition.
- here, N = 2^n (n is an arbitrary integer of 1 or more).
- a 128 ⁇ 128 pixel intra PU can be divided into 128 ⁇ 128 pixel and 64 ⁇ 64 pixel intra prediction partitions.
- the video decoding device 1 generates a predicted image for each partition, generates a decoded image # 2 by adding the generated predicted image and the prediction residual decoded from the encoded data # 1, and generates The decoded image # 2 is output to the outside.
- the generation of the predicted image is performed with reference to the encoding parameter obtained by decoding the encoded data # 1.
- the encoding parameters are parameters referred to in order to generate a predicted image. In addition to prediction parameters such as the motion vector referred to in inter-screen prediction and the prediction mode referred to in intra-screen prediction, they include the size and shape of partitions, the size and shape of blocks, and the residual data between the original image and the predicted image.
- the set of all information included in the encoding parameters other than the residual data is referred to as side information.
- in the following description, the prediction unit is assumed to be a partition constituting the LCU; however, the present embodiment is not limited to this, and the present invention can also be applied when the prediction unit is larger or smaller than the partition.
- a frame (picture), a slice, an LCU, a block, and a partition to be decoded are referred to as a target frame, a target slice, a target LCU, a target block, and a target partition, respectively.
- the LCU size is, for example, 64 ⁇ 64 pixels
- the partition size is, for example, 64 × 64 pixels, 32 × 32 pixels, 16 × 16 pixels, 8 × 8 pixels, 4 × 4 pixels, or the like. These sizes do not limit the present embodiment; the LCU and partition sizes may be other sizes.
- FIG. 2 is a functional block diagram showing a schematic configuration of the moving picture decoding apparatus 1.
- the moving picture decoding apparatus 1 includes a variable length code demultiplexing unit 11, an inverse quantization / inverse conversion unit 12, a predicted image generation unit (channel decoding unit) 13, and an adder (channel decoding unit) 14. And a frame memory 15.
- The variable-length code demultiplexing unit 11 demultiplexes one frame of the encoded data # 1 input to the video decoding device 1, and separates it into the various kinds of information included in the hierarchical structure shown in FIG.
- the variable length code demultiplexing unit 11 refers to information included in various headers, and sequentially separates the encoded data # 1 into slices and LCUs.
- the various headers include (1) information on the method of dividing the target frame into slices, and (2) information on the size, shape, and position of the LCU belonging to the target slice.
- variable length code demultiplexer 11 refers to the CU partition information SP_CU included in the encoded LCU header LCUH and divides the target LCU into leaf CUs. In addition, the variable-length code demultiplexing unit 11 acquires the TU information TUI and the PU information PUI for the target leaf CU: CUL.
- variable length code demultiplexing unit 11 supplies the TU information TUI obtained for the target leaf CU to the dequantization / inverse transform unit 12. Further, the variable length code demultiplexing unit 11 supplies the PU information PUI obtained for the target leaf CU to the predicted image generation unit 13.
- the inverse quantization / inverse transform unit 12 performs inverse quantization / inverse transform of the quantization prediction residual for each block for the target leaf CU.
- the inverse quantization / inverse transform unit 12 first decodes the TU partition information SP_TU from the TU information TUI about the target leaf CU supplied from the variable length code demultiplexer 11.
- the inverse quantization / inverse transform unit 12 divides the target leaf CU into one or a plurality of blocks according to the decoded TU partition information SP_TU.
- the inverse quantization / inverse transform unit 12 decodes the TU partition information SP_TU and the quantized prediction residual QD from the TU information TUI for each block.
- the inverse quantization / inverse transform unit 12 restores the prediction residual D for each pixel for each target partition by performing inverse quantization and inverse DCT transform (Inverse Discrete Cosine Transform).
- the inverse quantization / inverse transform unit 12 supplies the restored prediction residual D to the adder 14.
- For each partition included in the target leaf CU, the predicted image generation unit 13 refers to a local decoded image P ′, which is a decoded image around the partition, and generates a predicted image Pred by intra prediction or inter prediction.
- Intra prediction and inter prediction each include luminance prediction and color difference prediction.
- the predicted image generation unit 13 refers to a luminance decoded image.
- the video decoding device 1 may generate the predicted image Pred by inter prediction.
- Intra prediction is sometimes referred to as intra-screen prediction or spatial prediction, but in the following, the term intra prediction is used throughout.
- the local decoded image P ′ includes a luminance local decoded image P ′ Y related to luminance and a color difference local decoded image P ′ C related to color difference.
- the predicted image generation unit 13 operates as follows. First, the predicted image generation unit 13 decodes the PU information PUI for the target leaf CU supplied from the variable length code demultiplexing unit 11. Subsequently, the predicted image generation unit 13 determines a division pattern for each partition of the target leaf CU according to the PU information PUI. Further, the predicted image generation unit 13 selects a prediction mode of each partition according to the PU information PUI, and assigns each selected prediction mode to each partition.
- the predicted image generation unit 13 generates a predicted image Pred for each partition included in the target leaf CU with reference to the selected prediction mode and the pixel values of the local decoded image P ′ around the partition.
- the predicted image generation unit 13 supplies the predicted image Pred generated for the target leaf CU to the adder 14.
- the predicted image Pred specifically includes a luminance predicted image PredY related to luminance and a color difference predicted image PredC related to color difference.
- the color difference prediction image PredC includes a color difference prediction image PredU for the U channel and a color difference prediction image PredV for the V channel. Further, a more specific configuration of the predicted image generation unit 13 will be described later.
- The adder 14 generates the decoded image P for the target leaf CU by adding the predicted image Pred supplied from the predicted image generation unit 13 and the prediction residual D supplied from the inverse quantization / inverse transform unit 12.
- The decoded image P includes a luminance decoded image (hereinafter referred to as the luminance decoded image P Y) and a color difference decoded image.
- the decoded image P that has been decoded is sequentially recorded in the frame memory 15.
- In the frame memory 15, at the time of decoding the target LCU, decoded images corresponding to all the LCUs decoded before the target LCU are recorded.
- When the decoded image generation processing for each LCU has been completed for all the LCUs in the image, the moving image decoding apparatus 1 outputs to the outside the decoded image # 2 corresponding to the one frame of encoded data # 1 input to the moving image decoding apparatus 1.
- (Data structure of encoded data: intra prediction parameter PP)
- The intra prediction parameter PP illustratively includes an inter-channel prediction flag, a color difference channel processing order flag, and a second channel prediction source channel specifier.
- The inter-channel prediction flag is a flag indicating whether or not the color difference is predicted by inter-channel prediction. It is 1-bit information: for example, if the inter-channel prediction flag is “1”, the color difference is predicted by inter-channel prediction, and if it is “0”, the color difference is predicted without using inter-channel prediction.
- the color difference channel processing order flag is a flag for designating whether the prediction process is performed from the U channel or the V channel.
- the color difference channel processing order flag is, for example, 1-bit information indicating that processing is performed in the order of U and V if “0” and in the order of V and U if “1”.
- The second channel prediction source channel specifier is information designating from which channel the second predicted channel is predicted. That is, the second predicted channel can be predicted from the Y channel, from the first predicted channel, or from both. It is 1- or 2-bit information: for example, if the second channel prediction source channel specifier is “0”, prediction is performed from the Y channel; if it is “10”, prediction is performed from the first channel; and if it is “11”, prediction is performed from both the Y channel and the first channel.
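- As an illustrative sketch (not part of the claimed embodiment), the three syntax elements above could be parsed as follows; the `BitReader` helper and all names are hypothetical, and the bit widths follow the text:

```python
class BitReader:
    """Minimal MSB-first bit reader over a string of '0'/'1' characters."""
    def __init__(self, bits):
        self.bits, self.pos = bits, 0

    def read(self, n):
        value = int(self.bits[self.pos:self.pos + n], 2)
        self.pos += n
        return value

def parse_pp(reader):
    # inter-channel prediction flag: "1" = predict chroma by inter-channel prediction
    if reader.read(1) == 0:
        return {"inter_channel": False}
    # color difference channel processing order flag: "0" = U then V, "1" = V then U
    order = ("U", "V") if reader.read(1) == 0 else ("V", "U")
    # second channel prediction source channel specifier:
    # "0" = Y channel, "10" = first-predicted channel, "11" = both
    if reader.read(1) == 0:
        source = ("Y",)
    elif reader.read(1) == 0:
        source = ("first",)
    else:
        source = ("Y", "first")
    return {"inter_channel": True, "order": order, "second_source": source}
```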
- In the above, the information indicating the combination of the processing order and the prediction source channel is encoded as two separate pieces of information: the color difference channel processing order flag and the second channel prediction source channel specifier.
- FIG. 4 is a diagram for explaining the YUV color space, and shows the relationship between the luminance Y and the U component and V component, which are the components of the color difference.
- FIG. 5 is a diagram showing image formats of the YUV format. (a) to (d) show the 4 : 2 : 0 format, the 4 : 4 : 4 format, the 4 : 2 : 2 format, and the 4 : 1 : 1 format, respectively.
- the YUV color space will be described with reference to FIG.
- an image is expressed by luminance Y and U and V components that are color differences.
- The luminance Y is defined on an independent coordinate axis orthogonal to the U-V plane formed by the U component and the V component.
- the luminance Y, U component, and V component each take a value from 0 to 255.
- When the color difference value of the U component is close to 0, the image is generally green, and when it is close to 255, the image is generally red. In addition, when the color difference value of the V component is close to 0, the image is generally yellow, and when it is close to 255, the image is generally blue.
- In a local region of an image, the types of pixel values used are limited.
- the luminance Y has a correlation with each of the U component and the V component. Therefore, locally, it is possible to derive the U component and the V component from the luminance Y using this correlation.
- the luminance Y is called the Y channel
- the color difference consisting of the U component and the V component is called the color difference channel.
- a channel is a generalized concept of luminance Y, U component, and V component.
- When it is necessary to distinguish the U component and the V component in the color difference channel, they are referred to as the U channel and the V channel, respectively.
- Prediction of the U channel and the V channel from the Y channel using the correlation between the luminance Y and the U component and the V component is referred to as inter-channel prediction.
- Next, the image formats of luminance and color difference will be described. Even if the resolution of the color difference is lowered, the visual impact is smaller than when the luminance resolution is lowered. Therefore, the data amount can be reduced by lowering the resolution of the color difference.
- the data amount is reduced by the following data structure. Note that the left block of FIG. 5A shows the resolution of the luminance Y, and the right block shows the resolution of the U component and the V component. The same applies to (b) to (d) below.
- In the 4 : 2 : 0 format, the resolution of the color difference is 1 / 2 of the luminance resolution in both the horizontal and vertical directions. That is, as a whole, the resolution of the color difference is 1 / 4 of the resolution of the luminance.
- the 4: 2: 0 format is used in television broadcasting and consumer video equipment.
- In the 4 : 4 : 4 format, the luminance resolution and the color difference resolution are the same.
- the 4: 4: 4 format is used, for example, in specialized equipment for image processing when high image quality is required rather than reducing the amount of data.
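- The data reduction described above can be checked with a short calculation. The following sketch (illustrative only; the function name is hypothetical) counts the total number of Y, U, and V samples per frame for each format:

```python
def samples_per_frame(width, height, fmt):
    """Total number of Y, U, and V samples per frame for common YUV formats.
    The pairs give the chroma subsampling factors (horizontal, vertical)."""
    sub = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2), "4:1:1": (4, 1)}
    sh, sv = sub[fmt]
    luma = width * height
    chroma = (width // sh) * (height // sv)   # samples in each of U and V
    return luma + 2 * chroma
```

For example, a 1920 × 1080 frame in the 4 : 2 : 0 format carries half the samples of the same frame in the 4 : 4 : 4 format.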
- FIG. 1 is a functional block diagram illustrating an example of the configuration of the predicted image generation unit 13.
- the predicted image generation unit 13 includes a local image input unit 131, a luminance predicted image generation unit 132, an inter-channel prediction determination unit 133, an LUT derivation unit (correlation derivation unit) 134, a color difference prediction image generation unit (prediction image generation unit, processing information acquisition unit, prediction control unit) 135, and a prediction image output unit 136.
- the local image input unit 131 acquires the luminance local decoded image P ′ Y and the color difference local decoded image P ′ C from the local decoded image P ′.
- the local image input unit 131 transfers the luminance local decoded image P ′ Y to the luminance predicted image generation unit 132 and transfers the color difference local decoded image P ′ C to the inter-channel prediction determination unit 133.
- The luminance predicted image generation unit 132 generates a luminance predicted image PredY by performing prediction based on the PU information PUI with reference to the luminance local decoded image P ′ Y.
- The luminance predicted image generation unit 132 transmits the generated luminance predicted image PredY to the predicted image output unit 136.
- The inter-channel prediction determination unit 133 refers to the inter-channel prediction flag included in the intra prediction parameter PP, and determines whether or not the intra prediction of the color difference (hereinafter simply referred to as color difference prediction) is in the inter-channel prediction mode, in which a color difference prediction image is generated by inter-channel prediction.
- As a result of the determination, if it is the inter-channel prediction mode, the inter-channel prediction determination unit 133 notifies the LUT derivation unit 134 and the inter-channel prediction unit 351 (described later) of the color difference prediction image generation unit 135 that it is the inter-channel prediction mode. If it is not the inter-channel prediction mode, the inter-channel prediction determination unit 133 transfers the color difference local decoded image P ′ C to the intra-channel prediction unit 352 (described later) of the color difference prediction image generation unit 135.
- For each target partition, the LUT deriving unit 134 derives, based on the local decoded image P ′, an LUT (Look Up Table) for performing inter-channel prediction.
- The LUT derived by the LUT deriving unit is illustratively structured as follows. That is, the LUT stores, in association with the luminance value at a pixel position [x Y, y Y] of the luminance local decoded image P ′ Y, the color difference value at the corresponding pixel position [x C, y C] of the color difference local decoded image P ′ C.
- the LUT deriving unit 134 transmits the derived LUT to the inter-channel prediction unit 351 (predicted image generation means; described later) of the color difference predicted image generation unit 135. Details of the operation of the LUT deriving unit 134 will be described later.
- the color difference predicted image generation unit 135 predicts a color difference image and generates a color difference predicted image PredC. More specifically, the color difference predicted image generation unit 135 includes an inter-channel prediction unit 351 and an intra-channel prediction unit 352.
- When the color difference prediction is in the inter-channel prediction mode, the inter-channel prediction unit 351 generates the color difference prediction image PredC by performing prediction of the color difference image by inter-channel prediction with reference to the luminance decoded image P Y.
- When the color difference prediction is not in the inter-channel prediction mode, the intra-channel prediction unit 352 generates the color difference prediction image PredC by performing prediction of the color difference image with reference to the color difference local decoded image P ′ C.
- the prediction of the color difference image by the intra-channel prediction unit 352 is performed by, for example, direction prediction or DC prediction.
- Details of the operation of the inter-channel prediction unit 351 will be described later.
- the predicted image output unit 136 outputs the luminance predicted image PredY generated by the luminance predicted image generating unit 132 and the color difference predicted image PredC generated by the color difference predicted image generating unit 135 as the predicted image Pred.
- FIG. 6 is a diagram illustrating a correspondence relationship between luminance and color difference pixel positions in the 4: 2: 0 format.
- (a) of FIG. 6 shows the pixel positions of the luminance decoded image P Y.
- (b) of FIG. 6 shows the pixel positions of the color difference prediction image PredU to be predicted.
- Pixel positions of the luminance decoded image P Y and the color difference prediction image PredU are both represented by relative coordinates with the origin at the upper left of the block.
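- Assuming the correspondence of FIG. 6 is the usual 4 : 2 : 0 mapping, in which the chroma pixel [x C, y C] corresponds to the luma position [2 x C, 2 y C] (an assumption made for illustration, as are the offsets for the three sample positions of FIG. 7), the mapping can be sketched as:

```python
def luma_position(xc, yc, sample_pos="upper-left"):
    """Map a chroma pixel position [xC, yC] to the corresponding luma
    position in the 4:2:0 format. The fractional offsets per sample
    position are illustrative assumptions."""
    offsets = {
        "upper-left": (0.0, 0.0),   # (a): upper-left of the 2x2 luma block
        "left-center": (0.0, 0.5),  # (b): left edge, vertically centered
        "center": (0.5, 0.5),       # (c): center of the 2x2 luma block
    }
    dx, dy = offsets[sample_pos]
    return 2 * xc + dx, 2 * yc + dy
```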
- (a) to (c) of FIG. 7 are diagrams illustrating three patterns of sample positions of the color difference pixel.
- (a) of FIG. 7 shows the case already described above. That is, as shown in (a) of FIG. 7, the sample position of the color difference pixel may be set at the upper left of the block.
- Alternatively, as shown in (b) of FIG. 7, the sample position of the color difference pixel may be set to the left of the center of the block.
- Alternatively, as shown in (c) of FIG. 7, the sample position of the color difference pixel may be set at the center of the block.
- In these cases, the luminance value corresponding to the value of a certain color difference pixel is derived by filtering the luminance values in the vicinity of the pixel position obtained in accordance with the correspondence relationships shown in (a) to (c) of FIG. 7.
- Here, the vicinity of the pixel position consists of the coordinates obtained by rounding each coordinate of the pixel position obtained from the correspondence relation up and down. That is, in the example shown in (b) of FIG. 7, they are [x Y, y Y] and [x Y, y Y + 1]. Further, in the example shown in (c) of FIG. 7, they are [x Y, y Y], [x Y + 1, y Y], [x Y, y Y + 1], and [x Y + 1, y Y + 1].
- Note that, instead of performing filtering, the luminance value at the pixel position [x Y, y Y] may be used as the sample luminance value as it is.
- a smoothing filter is mentioned as an example of this filtering.
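- The filtering step above can be sketched as a simple averaging of the named neighbours (an illustrative assumption; the embodiment does not fix the filter taps):

```python
def sample_luma(rec_y, xc, yc, sample_pos="upper-left"):
    """Derive the luma value used when looking up the chroma of pixel
    [xC, yC], by averaging the luma neighbours named in the text.
    rec_y is a 2D list indexed as rec_y[y][x]."""
    xY, yY = 2 * xc, 2 * yc
    if sample_pos == "upper-left":          # integer position: use as-is
        taps = [(xY, yY)]
    elif sample_pos == "left-center":       # neighbours [xY, yY], [xY, yY+1]
        taps = [(xY, yY), (xY, yY + 1)]
    else:                                   # "center": four surrounding positions
        taps = [(xY, yY), (xY + 1, yY), (xY, yY + 1), (xY + 1, yY + 1)]
    return sum(rec_y[y][x] for x, y in taps) // len(taps)
```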
- FIG. 8 is a diagram showing a pattern of the color difference channel processing order when there is one prediction source channel.
- As shown in FIG. 8, the U channel and the V channel can be predicted using the luminance (Y) channel as a base point.
- In FIG. 8, there are the following three prediction patterns.
- the first is a pattern indicated by a solid line in FIG. That is, while predicting the U channel from the Y channel, the V channel is predicted from the Y channel. Which of the U channel and the V channel is predicted first can be arbitrarily selected. Further, the U channel and V channel prediction processing may be performed in parallel.
- the second is a pattern indicated by a dotted line in FIG. That is, the U channel is first predicted from the Y channel, and then the V channel is predicted from the predicted U channel.
- the third is the reverse of the second, and is a pattern indicated by a broken line in FIG. That is, the V channel is first predicted from the Y channel, and then the U channel is predicted from the predicted V channel.
- The color difference channel processing order flag, which indicates which processing order is used, and the second channel prediction source channel specifier, which specifies the prediction source channel of the second predicted channel, are encoded in the encoding process in the moving picture encoding device 2 and transmitted to the video decoding device 1. Then, the inter-channel prediction unit 351 of the video decoding device 1 performs inter-channel prediction according to the color difference channel processing order flag and the second channel prediction source channel specifier.
- The inter-channel prediction unit 351 performs inter-channel prediction of the U channel in accordance with the following equation (1), and generates the color difference prediction image PredU from the luminance decoded image P Y.
- PredU [x U, y U] = LUT U [RecY [x Y, y Y]] ... (1)
- Similarly, the inter-channel prediction unit 351 generates the color difference prediction image PredV in accordance with the following equation (2).
- PredV [x V, y V] = LUT V [RecY [x Y, y Y]] ... (2)
- the meaning of each symbol in the formula (2) is the same as that in the formula (1), and the description thereof is omitted.
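- Equations (1) and (2) amount to a per-pixel table lookup. A minimal sketch, assuming the 4 : 2 : 0 upper-left correspondence x Y = 2 x C, y Y = 2 y C (an assumption for illustration):

```python
def predict_chroma(rec_y, lut, width_c, height_c):
    """Generate one chroma prediction image by inter-channel prediction,
    following Pred[xC, yC] = LUT[RecY[xY, yY]] as in equations (1)/(2)."""
    pred = [[0] * width_c for _ in range(height_c)]
    for yc in range(height_c):
        for xc in range(width_c):
            # look up the chroma value associated with the decoded luma value
            pred[yc][xc] = lut[rec_y[2 * yc][2 * xc]]
    return pred
```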
- FIG. 9 is a flowchart illustrating a schematic flow of color difference predicted image generation processing in the predicted image generation unit 13.
- the inter-channel prediction determination unit 133 refers to the inter-channel prediction flag and determines whether or not the inter-channel prediction mode is set (S10).
- If it is not the inter-channel prediction mode, the intra-channel prediction unit 352 generates the color difference prediction image PredC without using inter-channel prediction (S11), and the process ends.
- If it is the inter-channel prediction mode, the LUT deriving unit 134 derives the LUT with reference to the local decoded image P ′ (S12). Then, the inter-channel prediction unit 351 generates a color difference prediction image PredC by inter-channel prediction with reference to the LUT derived by the LUT deriving unit 134 (S13). This is the end of the process.
- FIG. 10 is a diagram showing an example of an image B in which the luminance-color difference distributions do not overlap and extend over two regions.
- the image B shown in FIG. 10 is composed of six pixel areas.
- the pixel regions R1 to R3 are regions where the luminance value (Y) is low and the color difference value (U) is high.
- the pixel regions R4 to R6 are regions having a high luminance value (Y) and a low color difference value (U).
- the color difference value (U) gradually increases from the pixel region R1 to R3.
- the color difference value (U) decreases from the pixel region R4 to R6.
- a plot of the luminance (Y) -color difference (U) of the pixels included in such an image B is shown in the graph of FIG.
- the pixels included in the pixel regions R1 to R3 are plotted in the group Gr1 in the graph shown in FIG.
- the luminance (Y) of the sample is low and the color difference value (U) is high.
- the pixels included in the pixel regions R4 to R6 are plotted in the group Gr2 in the graph shown in FIG.
- the luminance (Y) of the sample is high and the color difference value (U) is low.
- Image B is not suitable for linear approximation because of the variations in luminance and color difference within the image. Such variations are often seen in images that include boundaries between multiple objects or textures of multiple colors.
- FIG. 12 is a flowchart showing an example of the flow of LUT derivation processing by the LUT derivation unit 134.
- First, the LUT deriving unit 134 initializes the LUT (S100). Initializing the LUT means setting all of its entries to the unregistered state.
- the LUT deriving unit 134 enters a loop LP11 of registration processing for each luminance pixel adjacent to the target partition (S101).
- For each luminance pixel, the LUT deriving unit 134 acquires its luminance value n and the corresponding color difference value m (S102), and determines whether or not a color difference value is already registered in LUT [n] (S103).
- If no color difference value is registered, the LUT deriving unit 134 registers the color difference value m acquired in step S102 in LUT [n] as it is, and returns to the beginning of the loop LP11 (S106).
- If a color difference value is already registered, the LUT deriving unit 134 calculates (m + LUT [n] + 1) / 2, that is, the average of the acquired color difference value m and the registered color difference value, and substitutes this average value for the color difference value m (S104). Subsequently, the LUT deriving unit 134 registers the color difference value m obtained in step S104 in LUT [n] (S105), and returns to the beginning of the loop LP11 (S106).
- Next, for each entry n, the LUT deriving unit 134 determines whether or not LUT [n] is unregistered (S108).
- If LUT [n] is unregistered, the LUT deriving unit 134 searches for the nearest registered entries before and after n, as shown in FIG. 13 (S109). Specifically, the LUT deriving unit 134 searches for the nearest registered entries before and after n as follows.
- The LUT deriving unit 134 searches forward of n, that is, at nL smaller than n, for a registered entry, with n as a base point. That is, with reference to FIG. 13, a search is made for the sample point Smpl1 ahead of n.
- Likewise, the LUT deriving unit 134 searches backward of n, that is, at nR larger than n, for a registered entry, with n as a base point. That is, with reference to FIG. 13, a search is made for the sample point Smpl2 behind n.
- the LUT derivation unit 134 registers a value obtained by linear interpolation between the LUT [nL] and the LUT [nR] in the LUT [n] (S110).
- In step S110, an interpolation process is performed that connects the nearest sample point Smpl1 ahead of n and the nearest sample point Smpl2 behind n by a straight line L1.
- Then, the value of the straight line L1 at n is registered in LUT [n].
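- The registration loop (S101 to S106) and the interpolation of unregistered entries (S108 to S110) can be sketched as follows, assuming 8-bit luminance; the flat extrapolation at the table ends is an assumption, since the text only describes interpolation between two registered entries:

```python
def derive_lut(neighbour_samples):
    """Derive the inter-channel prediction LUT from (luminance n, chroma m)
    pairs taken from decoded pixels adjacent to the target partition.
    Duplicates are merged by the rounded average (m + LUT[n] + 1) // 2, and
    unregistered entries are filled by linear interpolation (S100-S110)."""
    lut = [None] * 256                           # S100: all entries unregistered
    for n, m in neighbour_samples:               # loop LP11 (S101-S106)
        if lut[n] is None:
            lut[n] = m                           # register as-is (S105)
        else:
            lut[n] = (m + lut[n] + 1) // 2       # S104: average with registered value
    registered = [n for n in range(256) if lut[n] is not None]
    for n in range(256):                         # fill unregistered entries (S108-S110)
        if lut[n] is not None:
            continue
        left = [r for r in registered if r < n]
        right = [r for r in registered if r > n]
        if left and right:
            nL, nR = left[-1], right[0]          # nearest sample points Smpl1, Smpl2
            lut[n] = round(lut[nL] + (lut[nR] - lut[nL]) * (n - nL) / (nR - nL))
        elif left:                               # flat extrapolation (an assumption)
            lut[n] = lut[left[-1]]
        elif right:
            lut[n] = lut[right[0]]
    return lut
```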
- In the registration process, if an entry has already been registered, the average value of the acquired color difference value m and the registered color difference value is registered as the new entry in step S104. This is for the following reason.
- If, in step S104, the registered entry value were simply overwritten with the acquired color difference value m, then when the acquired color difference value m is statistical noise, the noise would be registered directly in the entry. On the other hand, such noise can be reduced by taking the average of the acquired color difference value m and the registered color difference value.
- Note that the average value obtained here amounts to a weighted average that depends on the order of registration. The present invention is not limited to this, and in step S104, a weighted average with arbitrarily set weights may be used.
- the moving average process may be performed on the entire table. By performing the moving average process on the entire table, a rapid change in color difference can be suppressed.
- FIG. 14 shows a graph of the LUT values derived from the image B, that is, entries.
- a graph L11 shown in FIG. 14 shows LUT entry values derived from the image B.
- the graph L11 is a line passing through each sample point.
- the entry between each sample point is created by linear interpolation. For example, a straight line connects the rightmost sample of the group Gr1 and the leftmost sample of the group Gr2.
- an average value is taken for sample points where n values overlap, and this value is used as an actual sample point for inter-channel prediction.
- In the above, an unregistered entry, that is, an entry between sample points, is created by linear interpolation, but the present invention is not limited to this.
- The entry may instead be created by cubic interpolation, whereby the prediction accuracy of the table can be improved.
- FIG. 15 is a flowchart illustrating an example of a flow of color difference prediction image generation processing by inter-channel prediction in the inter-channel prediction unit 351.
- First, the inter-channel prediction unit 351 sets the color difference A and the color difference A prediction source channel, and the color difference B and the color difference B prediction source channel, according to the intra prediction parameter PP (S120).
- the contents set by the inter-channel prediction unit 351 are specifically as follows. That is, the inter-channel prediction unit 351 sets the color difference A and the color difference B according to the color difference channel processing order flag. In the following, the color difference B is processed after the color difference A. For example, the inter-channel prediction unit 351 may set the color difference A as “U channel” and the color difference B as “V channel”.
- the inter-channel prediction unit 351 sets “Y channel” as the color difference A prediction source channel.
- the inter-channel prediction unit 351 sets a color difference B prediction source channel according to the second channel prediction source channel specifier. For example, the inter-channel prediction unit 351 may set the prediction source channel of the U channel with the color difference A as “Y channel” and set the prediction source channel of the V channel with the color difference B as “Y channel and U channel”. Good.
- the inter-channel prediction unit 351 selects a color difference A prediction mode (S121), and generates a color difference prediction image by inter-channel prediction for the color difference A according to the setting in step S120 (S122).
- the inter-channel prediction unit 351 selects a color difference B prediction mode (S123), and generates a color difference prediction image by inter-channel prediction for the color difference B according to the setting in step S120 (S124). The process ends as described above.
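- Steps S120 to S124 can be sketched as follows; `predict` stands in for the inter-channel prediction of one channel and, like the parameter names, is hypothetical:

```python
def predict_chroma_channels(pp, predict):
    """Sketch of S120-S124: determine the processing order and the prediction
    source channels from the intra prediction parameter PP, then predict the
    two chroma channels in that order. `predict(channel, sources)` is a
    hypothetical helper running inter-channel prediction for one channel."""
    # S120: color difference A is processed first, color difference B second
    a, b = ("U", "V") if pp["order_flag"] == 0 else ("V", "U")
    a_sources = ("Y",)                       # the first channel is predicted from Y
    b_sources = {                            # second channel prediction source specifier
        "Y": ("Y",), "first": (a,), "both": ("Y", a),
    }[pp["second_source"]]
    results = {}
    results[a] = predict(a, a_sources)       # S121-S122: color difference A
    results[b] = predict(b, b_sources)       # S123-S124: color difference B
    return results
```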
- As described above, the video decoding device 1 generates the prediction image Pred for each of the luminance (Y) channel and the color difference (U, V) channels, and adds the prediction residual D to the generated prediction image Pred.
- The video decoding device 1 includes the prediction image generation unit 13 and the adder 14, which decode the luminance (Y) channel and generate the luminance decoded image P Y for the target partition,
- an LUT deriving unit 134, which derives, as an LUT, a non-linear correlation between the decoded luminance (Y) channel and the color difference (U, V) channels to be decoded, with reference to the local decoded image P ′ located in the vicinity of the target partition,
- and an inter-channel prediction unit 351, which, for the target partition, generates the color difference prediction image PredC from the luminance decoded image P Y by inter-channel prediction according to the LUT.
- Thereby, even when the luminance component of each pixel included in the local decoded image P ′ varies, so that prediction between channels according to a linear correlation would reduce the prediction accuracy, there is an effect that the possibility of achieving higher prediction accuracy can be improved.
- FIG. 16 and FIG. 17 are diagrams showing the pattern of the color difference channel processing order when there are two prediction source channels for the second channel to be predicted.
- As shown in FIG. 16, (1) the U channel may be predicted using the Y channel as a base point, and then (2) the V channel may be predicted based on a combination of the Y channel and the U channel. Also, as shown in FIG. 17, (1) the V channel may be predicted using the Y channel as a base point, and then (2) the U channel may be predicted based on a combination of the Y channel and the V channel.
- In these cases, the LUT may be extended to two dimensions. Even when the LUT is extended to two dimensions, the same technique as in the one-dimensional LUT derivation can be adopted, and a known technique used when creating a two-dimensional LUT can also be adopted.
- In this case, the LUT deriving unit 134 derives the LUT as follows. That is, first, a primary table is derived for the luminance value Y and the color difference value U of the sample points as described above. Subsequently, the color difference value V of each sample point is registered in the table in association with the luminance Y and the color difference value U of that sample point. The same applies to the processing order shown in FIG. 17.
- the LUT can be configured to extend to more than two dimensions.
- Another example of a configuration that expands the LUT to two or more dimensions includes a configuration that uses luminance values Y in a plurality of pixels. More specifically, in the LUT entry, the color difference value U or the color difference value V may be looked up by a combination of adjacent luminance values.
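- A two-dimensional LUT of the kind described above can be sketched with a table keyed by a pair of values; the nearest-key fallback for unregistered pairs is an assumption made for illustration:

```python
def derive_2d_lut(neighbour_samples):
    """2-D LUT sketch for the FIG. 16 order: the V value of each neighbouring
    sample is registered under the key (luma Y, chroma U), merging duplicates
    by the rounded average as in the 1-D derivation."""
    lut = {}
    for y, u, v in neighbour_samples:
        lut[(y, u)] = v if (y, u) not in lut else (v + lut[(y, u)] + 1) // 2
    return lut

def predict_v(lut, y, u):
    """Look up V from the decoded Y and the already-predicted U; for an
    unregistered pair, fall back to the nearest registered key (assumption)."""
    if (y, u) in lut:
        return lut[(y, u)]
    key = min(lut, key=lambda k: abs(k[0] - y) + abs(k[1] - u))
    return lut[key]
```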
- Since the color difference can be predicted from a plurality of channels, or from a plurality of pixel values of the same channel, by using an LUT of two or more dimensions as described above, the accuracy of the predicted image can be improved.
- the intra prediction parameter PP is illustratively configured to include the inter-channel prediction flag, the color difference channel processing order flag, and the second channel prediction source channel specifier.
- the intra prediction parameter PP may be configured to include an inter-channel prediction index indicating a combination of processing order and prediction source channels as follows.
- the above-mentioned inter-channel prediction index is encoded and transmitted to the video decoding device 1 in the encoding process in the video encoding device 2. Then, the inter-channel prediction unit 351 of the video decoding device 1 performs inter-channel prediction according to the inter-channel prediction index transmitted from the video encoding device 2.
- in the above, information indicating the combination of the processing order and the prediction source channel is included in the intra prediction parameter PP, but the present invention is not limited to this.
- information indicating the combination of the processing order and the prediction source channel may be stored in the header of the processing unit other than the slice header, and the processing order or the like may be changed according to the processing unit.
- information indicating a combination of processing order and prediction source channel may be stored in a sequence header (SPS: Sequence Parameter Set) or a picture header (PPS: Picture Parameter Set).
- processing order or the like may be changed in units of processing smaller than slices, for example, in units of LCUs.
- the unit in which the LUT is derived can be changed within a range in which a correlation between luminance and color difference exists.
- the information indicating the combination of the processing order and the prediction source channel may be encoded in different processing units.
- the information indicating the processing order may be encoded in LCU units
- the information indicating the combination of prediction source channels may be encoded in PU units.
- PredU[xU, yU] is calculated, for example, as follows.
- PredU[xU, yU] = LUT[RecY[xY, yY] / 2]
- PredU[xU, yU] = (LUT[RecY[xY, yY] / 2] + LUT[RecY[xY, yY] / 2 + 1]) / 2
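- The two formulas above can be sketched as follows (a hedged illustration; the toy LUT, the clamping of n + 1 at the table boundary, and the integer arithmetic are assumptions, not taken from the embodiment):

```python
# Sketch of the two prediction formulas: the decoded luminance value
# RecY is halved to index a half-resolution LUT, either directly or by
# averaging the two neighboring LUT entries.

def pred_u_simple(lut, rec_y):
    # PredU = LUT[RecY / 2]
    return lut[rec_y // 2]

def pred_u_averaged(lut, rec_y):
    # PredU = (LUT[RecY / 2] + LUT[RecY / 2 + 1]) / 2
    n = rec_y // 2
    # Clamp so that n + 1 stays inside the table (an assumption here).
    n1 = min(n + 1, len(lut) - 1)
    return (lut[n] + lut[n1]) // 2

lut = list(range(0, 256, 2))   # toy 128-entry LUT: LUT[n] = 2 * n
```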
- FIG. 18 is a flowchart showing a modified example of the flow of the LUT derivation process by the LUT derivation unit 134.
- the LUT derivation unit 134 initializes the LUT (S130).
- the LUT deriving unit 134 enters a registration processing loop LP11A for each luminance pixel adjacent to the target partition (S131).
- the LUT deriving unit 134 substitutes sum [n] + m for sum [n] and substitutes count [n] +1 for count [n] (S133).
- the LUT deriving unit 134 determines whether count[n] is greater than 0 (S136).
- when count[n] is 0, that is, when no sample is registered for the luminance value n, the LUT deriving unit 134 returns to the top of the loop LP12A (S138) and continues processing for the next entry.
- when count[n] is greater than 0, the LUT deriving unit 134 substitutes sum[n] / count[n] into LUT[n] (S137). That is, in step S137, the arithmetic mean of the color difference values observed at the luminance value n is acquired and substituted into LUT[n].
- the LUT deriving unit 134 performs interpolation processing for unregistered entries (S139). Since the process in step S139 is the same as the process in steps S107 to S111 (loop LP13) shown in FIG. 12, the description thereof is omitted here. Thereafter, the LUT derivation process ends.
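- The flow of FIG. 18 (S130 to S137) can be sketched as follows (a hedged illustration with hypothetical names; the interpolation of unregistered entries in S139 is omitted):

```python
# Sketch of the derivation flow: for each adjacent (luminance n, color
# difference m) pair, sum[n] and count[n] are accumulated, and LUT[n] is
# set to the arithmetic mean sum[n] / count[n].

def derive_lut(pairs, size=256):
    sums = [0] * size      # sum[n]   (S130: initialization)
    counts = [0] * size    # count[n]
    for n, m in pairs:                     # loop LP11A (S131)
        sums[n] += m                       # S133: sum[n] + m
        counts[n] += 1                     # S133: count[n] + 1
    lut = [None] * size                    # None marks an unregistered entry
    for n in range(size):                  # loop LP12A
        if counts[n] > 0:                  # S136
            lut[n] = sums[n] // counts[n]  # S137: arithmetic mean
    return lut

lut = derive_lut([(10, 100), (10, 104), (20, 50)])
```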
- FIG. 19 is a diagram illustrating an example of the derived LUT.
- the LUT may hold only a set of samples until it is referred to.
- the LUT deriving unit 134 does not interpolate unregistered entries other than the entries for 16 sets of samples during the LUT deriving process.
- the LUT deriving unit 134 derives the referenced unregistered entry.
- the LUT deriving unit 134 derives the referenced unregistered entry n (nL ⁇ n ⁇ nR) by linear interpolation according to the following equation (3) as an example.
- LUT[n] = (LUT[nL] × (nR − n) + LUT[nR] × (n − nL)) / (nR − nL) … (3)
- nL and nR are the numbers of the most recently registered entries before and after n, respectively, as described with reference to FIG.
- a memory area may be prepared only for the registered entries. That is, when such a configuration is adopted, only a memory area proportional to the number of samples is consumed in creating the LUT.
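- The deferred derivation described above can be sketched as follows (a hedged illustration with hypothetical names): only registered samples are held, and an unregistered entry n is interpolated by equation (3) from the nearest registered entries nL and nR only when it is referred to:

```python
# Sketch of on-demand lookup: the table stores only registered samples,
# so memory grows with the number of samples, not the full value range.

def lookup(entries, n):
    """entries: dict mapping registered luminance values to chroma values."""
    if n in entries:
        return entries[n]
    lower = [k for k in entries if k < n]   # candidates for nL
    upper = [k for k in entries if k > n]   # candidates for nR
    if lower and upper:
        nl, nr = max(lower), min(upper)
        # Equation (3): linear interpolation between LUT[nL] and LUT[nR].
        return (entries[nl] * (nr - n) + entries[nr] * (n - nl)) // (nr - nl)
    # Only one side registered: reuse the nearest registered value.
    src = max(lower) if lower else min(upper)
    return entries[src]

entries = {16: 100, 32: 132}
```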
- the inter-channel prediction based on the conventional linear transformation and the inter-channel prediction based on the LUT may be switched.
- the method with the higher accuracy may be used for inter-channel prediction.
- a configuration using linear transformation can also be used.
- the moving image encoding device 2 is a device that generates and outputs encoded data # 1 by encoding the input image # 10.
- FIG. 21 is a functional block diagram showing the configuration of the moving image encoding device 2.
- the moving image encoding device 2 includes an encoding setting unit 21, an inverse quantization / inverse transform unit 22, a predicted image generation unit 23, an adder 24, a frame memory 25, a subtractor 26, a transform / quantization unit 27, and a variable length coding unit 28.
- the encoding setting unit 21 generates image data related to encoding and various setting information based on the input image # 10.
- the encoding setting unit 21 generates the next image data and setting information.
- the encoding setting unit 21 generates the leaf CU image # 100 for the target leaf CU by sequentially dividing the input image # 10 into slice units and LCU units.
- the encoding setting unit 21 generates header setting information H ′ based on the result of the division process.
- the header information H ′ includes (1) information about the size, shape, and position of the LCUs belonging to the target slice, and (2) CU information CU ′ about the size, shape, and position of the leaf CUs belonging to each LCU.
- the encoding setting unit 21 refers to the leaf CU image # 100 and the CU information CU 'to generate PU setting information PUI'.
- the PU setting information PUI ' includes information on all combinations of (1) possible division patterns for each partition of the target leaf CU and (2) prediction modes that can be assigned to each partition.
- the encoding setting unit 21 supplies the leaf CU image # 100 to the subtractor 26.
- the encoding setting unit 21 supplies the header information H ′ to the variable length encoding unit 28. Also, the encoding setting unit 21 supplies the PU setting information PUI ′ to the predicted image generation unit 23.
- the inverse quantization / inverse transform unit 22 performs inverse quantization and inverse DCT transform (Inverse Discrete Cosine Transform) on the quantized prediction residual for each block supplied from the transform / quantization unit 27 to restore the prediction residual for each block. Further, the inverse quantization / inverse transform unit 22 integrates the prediction residuals for the blocks according to the division pattern specified by the TU partition information to generate a prediction residual D for the target leaf CU, and supplies the generated prediction residual D to the adder 24.
- the predicted image generation unit 23 refers to the locally decoded image P ′ recorded in the frame memory 25 and the PU setting information PUI ′, and generates a predicted image Pred for the target leaf CU.
- the prediction image generation unit 23 also refers to the luminance decoded image PY.
- the predicted image generation unit 23 sets the prediction parameter obtained by the predicted image generation process in the PU setting information PUI ′, and transfers the set PU setting information PUI ′ to the variable length encoding unit 28. Note that the predicted image generation process performed by the predicted image generation unit 23 is the same as that performed by the predicted image generation unit 13 included in the video decoding device 1, and thus description thereof is omitted here.
- the adder 24 adds the predicted image Pred supplied from the predicted image generation unit 23 and the prediction residual D supplied from the inverse quantization / inverse transform unit 22 to generate a decoded image P for the target leaf CU.
- the generated decoded image P is sequentially recorded in the frame memory 25.
- in the frame memory 25, at the time of decoding the target LCU, decoded images corresponding to all the LCUs decoded before the target LCU (for example, all the LCUs preceding in raster scan order) are recorded.
- the subtracter 26 generates a prediction residual D for the target leaf CU by subtracting the prediction image Pred from the leaf CU image # 100.
- the subtractor 26 supplies the generated prediction residual D to the transform / quantization unit 27.
- the transform / quantization unit 27 performs a DCT transform (Discrete Cosine Transform) and quantization on the prediction residual D to generate a quantized prediction residual.
- the transform / quantization unit 27 refers to the leaf CU image # 100 and the CU information CU ', and determines the division pattern of the target leaf CU into one or a plurality of blocks. Further, according to the determined division pattern, the prediction residual D is divided into prediction residuals for each block.
- the transform / quantization unit 27 generates a prediction residual in the frequency domain by performing a DCT transform (Discrete Cosine Transform) on the prediction residual for each block, and then quantizes the prediction residual in the frequency domain to generate a quantized prediction residual for each block.
- the transform / quantization unit 27 generates TU setting information TUI ′ including the generated quantized prediction residual for each block, TU partition information that specifies the division pattern of the target leaf CU, and information on all possible division patterns for the blocks of the target leaf CU.
- the transform / quantization unit 27 supplies the generated TU setting information TUI 'to the inverse quantization / inverse transform unit 22 and the variable length coding unit 28.
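- The transform / quantization path of units 27 and 22 can be sketched as follows (a hedged, simplified illustration: a pure-Python orthonormal 4×4 DCT with a single scalar quantization step, which is an assumption and not the embodiment's actual transform or quantizer):

```python
# Sketch of the forward path (DCT + quantization, unit 27) and the
# inverse path (dequantization + inverse DCT, unit 22) on one block of
# prediction residuals. The scalar quantization step is an assumption.
import math

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix.
    c = [[math.cos(math.pi * (2 * j + 1) * i / (2 * n)) for j in range(n)]
         for i in range(n)]
    return [[(math.sqrt(1 / n) if i == 0 else math.sqrt(2 / n)) * c[i][j]
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def forward(block, step):
    d = dct_matrix(len(block))
    coeffs = matmul(matmul(d, block), transpose(d))           # 2-D DCT
    return [[round(v / step) for v in row] for row in coeffs]  # quantize

def inverse(q, step):
    d = dct_matrix(len(q))
    deq = [[v * step for v in row] for row in q]               # dequantize
    return matmul(matmul(transpose(d), deq), d)                # inverse DCT

residual = [[10, 12, 11, 9], [8, 9, 10, 11], [12, 11, 9, 8], [9, 10, 11, 12]]
restored = inverse(forward(residual, step=2), step=2)
```

The restored block approximates the original residual, with an error bounded by the quantization step; this mirrors how the adder 24 reconstructs the locally decoded image from the predicted image and the restored residual.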
- variable length encoding unit 28 generates and outputs encoded data # 1 based on the TU setting information TUI ′, the PU setting information PUI ′, and the header information H ′.
- FIG. 22 is a flowchart illustrating an example of the flow of the color difference channel processing order and the prediction source channel encoding process in the moving image encoding apparatus 2.
- the predicted image generation unit 23 selects a luminance prediction mode in the luminance predicted image generation process for the target partition (S200).
- the predicted image generation unit 23 generates a luminance predicted image based on the selected prediction mode (S201).
- the predicted image generation unit 23 enters the color difference predicted image creation processing loop LP21 for each pattern of the color difference channel processing order (S202).
- the predicted image generation unit 23 sets the color difference A and the prediction source channel of the color difference A, and the color difference B and the prediction source channel of the color difference B, according to each processing-order pattern described using FIG. 8, FIG. 16, and FIG. (S203).
- in step S203, the inter-channel prediction unit 351 sets one of the U channel and the V channel as the color difference A, for which estimation is performed first, and the other as the color difference B, for which estimation is performed after the color difference A. Further, the inter-channel prediction unit 351 sets one or a plurality of prediction source channels for each of the color difference A and the color difference B.
- the predicted image generation unit 23 selects a prediction mode for the color difference A (S204) and generates a color difference predicted image for the color difference A by inter-channel prediction according to the setting in step S203 (S205).
- the predicted image generation unit 23 selects a prediction mode for the color difference B (S206) and generates a color difference predicted image for the color difference B by inter-channel prediction according to the setting in step S203 (S207). The color difference predicted image creation process by inter-channel prediction then continues, returning to the top of the loop LP21 (S208).
- the predicted image generation unit 23 selects a color difference channel processing order and a combination of prediction source channels that are most suitable for encoding (S209).
- the prediction image generation unit 23 performs encoding by including the inter-channel prediction flag, the color difference channel processing order flag, and the second channel prediction source channel specifier in the intra prediction parameter PP (S210).
- the predicted image generation unit 23 may encode the inter-channel prediction index indicating the combination of the processing order and the prediction source channel in step S210.
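- The selection in loop LP21 and step S209 can be sketched as follows (a hedged illustration; the pattern names, the SAD cost, and the predict callback are hypothetical):

```python
# Sketch of trying every pattern of chroma processing order / prediction
# source channel and keeping the one with the lowest cost. A simple sum
# of absolute differences stands in for the encoder's real cost measure.

def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def select_pattern(patterns, original_u, original_v, predict):
    """predict(pattern) -> (pred_u, pred_v); returns (best pattern, cost)."""
    best, best_cost = None, None
    for p in patterns:                        # loop LP21
        pu, pv = predict(p)                   # S203-S207 for this pattern
        cost = sad(pu, original_u) + sad(pv, original_v)
        if best_cost is None or cost < best_cost:
            best, best_cost = p, cost         # S209: keep the best so far
    return best, best_cost

# Toy example with two hypothetical patterns.
orig_u, orig_v = [100, 102], [50, 52]
preds = {"U_from_Y_then_V": ([100, 101], [50, 50]),
         "V_from_Y_then_U": ([90, 90], [50, 52])}
best, cost = select_pattern(list(preds), orig_u, orig_v, lambda p: preds[p])
```

The selected combination would then be signaled via the flags (or the inter-channel prediction index) described in S210.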
- Embodiment 2 Another embodiment of the present invention will be described below with reference to FIGS. For convenience of explanation, members having the same functions as those in the drawings described in the first embodiment are denoted by the same reference numerals and description thereof is omitted.
- FIG. 23 is a functional block diagram illustrating another example of the configuration of the predicted image generation unit 13.
- the predicted image generation unit 13A initializes the LUT in units of LCU, and updates the LUT in each block (target partition) in the same LCU.
- the inter-channel prediction unit 351 is changed to an inter-channel prediction unit 351A.
- an LUT derivation unit (correlation derivation means) 16 is provided separately in the predicted image generation unit 13A.
- the LUT derivation unit 16 newly derives an LUT for each target LCU.
- the LUT deriving unit 16 updates the LUT in units of partitions.
- the difference between the LUT deriving unit 16 and the LUT deriving unit 134 is a unit for deriving this LUT.
- the inter-channel prediction unit 351A is changed to refer to the LUT deriving unit 16.
- FIG. 24 is a flowchart illustrating a schematic flow of color difference predicted image generation processing in the predicted image generation unit 13A.
- the LUT deriving unit 16 determines whether or not the target partition is a block that is first processed by the LCU (S30). When the target partition is a block processed first by the LCU (YES in S30), the LUT is initialized (S31).
- LUT_F[n] = 1 indicates that a sample exists for the entry n, and LUT_F[n] = 0 indicates that no sample exists.
- the LUT deriving unit 16 updates the LUT of each channel while referring to the local decoded image P ′ (S32). Details of the LUT update processing will be described later.
- when the target partition is not the first block processed in the LCU (NO in S30), the LUT deriving unit 16 executes the LUT update process without initializing the LUT (S32).
- the inter-channel prediction determination unit 133 refers to the inter-channel prediction flag and determines whether or not the inter-channel prediction mode is set (S33).
- when the inter-channel prediction mode is not set (NO in S33), the intra-channel prediction unit 352 generates the color difference prediction image PredC without using inter-channel prediction (S36), and the process ends.
- when the inter-channel prediction mode is set (YES in S33), the inter-channel prediction unit 351A refers to the LUT updated by the LUT deriving unit 16 and generates the color difference prediction image PredC by inter-channel prediction (S35). This is the end of the process.
- FIG. 25 is a flowchart showing an example of the flow of LUT update processing by the LUT deriving unit 16.
- the LUT deriving unit 16 enters a registration process loop LP41 for each luminance pixel adjacent to the target partition (S400).
- the LUT deriving unit 16 determines whether or not a color difference value is registered in the LUT [n] (S402). In other words, the LUT deriving unit 16 determines that the color difference value is registered if LUT_F [n] is 1 (with a sample), whereas if LUT_F [n] is 0 (no sample), the color difference is determined. It is determined that the value is not registered.
- when the color difference value is not registered (NO in S402), the LUT deriving unit 16 registers the color difference value m acquired in step S401 as it is in LUT[n]. At the same time, 1 (with sample) is substituted into LUT_F[n] (S404), and the process returns to the top of the loop LP41 (S405).
- when the color difference value is registered (YES in S402), the LUT deriving unit 16 calculates (m + LUT[n] + 1) / 2, that is, the average of the acquired color difference value m and the registered color difference value, and substitutes the result for the color difference value m (S403). Subsequently, the LUT deriving unit 16 registers the color difference value m obtained in step S403 in LUT[n], substitutes 1 (with sample) into LUT_F[n] (S404), and returns to the top of the loop LP41 (S405).
- the LUT deriving unit 16 determines whether or not LUT_F[n] is 0 (no sample) (S407).
- the LUT itself is retained while processing the same LCU.
- all entries are eventually registered by interpolation. Therefore, here, instead of confirming whether LUT[n] has been registered, it is confirmed whether a sample exists; an entry for which no sample exists is again subject to interpolation.
- when LUT_F[n] is not 0 (no sample), that is, when LUT[n] has already been registered (NO in S407), the process returns to the top of the loop LP42 (S410) and the interpolation process continues with the next entry.
- the sample points Smpl1 and Smpl2 are registered in the immediately preceding target partition and the interpolation process is performed, but the sample point Smpl3 is newly registered in the target partition.
- the LUT deriving unit 16 searches for entries with the latest sample before and after n.
- the LUT deriving unit 16 searches for registered entries for n forwards, that is, nL smaller than n, with n as a base point. That is, here, the sample point Smpl1 shown in FIG. 26 is searched.
- the LUT deriving unit 16 searches for an entry registered for n backward, that is, nR larger than n, with n as a base point.
- up to the immediately preceding target partition, the sample point Smpl2 shown in FIG. 26 was found by this search, but once the sample point Smpl3 is registered in the target partition, the sample point Smpl3 is found instead.
- the LUT deriving unit 16 registers a value obtained by linear interpolation between the LUT [nL] and the LUT [nR] in the LUT [n] (S409).
- if only one of nL and nR is detected as a result of the search in step S408, the value of the detected registered entry is registered in LUT[n].
- in step S403, the LUT deriving unit 16 calculates the average of the acquired color difference value m and the registered color difference value by calculating (m + LUT[n] + 1) / 2, but the calculation is not limited to this. For example, a 1:3 weighted average of the acquired color difference value m and the registered color difference value may be calculated.
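- The update in steps S402 to S404, together with the 1:3 weighted-average alternative mentioned above, can be sketched as follows (a hedged illustration; the rounding offsets are assumptions):

```python
# Sketch of the per-sample LUT update: a new sample m for luminance n
# either creates the entry LUT[n] or is merged with the registered value.

def update_entry(lut, flag, n, m):
    if flag[n] == 0:                     # no sample yet (NO in S402)
        lut[n] = m                       # register m as-is
    else:                                # sample exists (YES in S402)
        lut[n] = (m + lut[n] + 1) // 2   # S403: rounded average
    flag[n] = 1                          # S404: mark "with sample"

def update_entry_weighted(lut, flag, n, m):
    # Alternative: weight the new sample 1 and the registered value 3.
    # The +2 rounding offset is an assumption.
    if flag[n] == 0:
        lut[n] = m
    else:
        lut[n] = (m + 3 * lut[n] + 2) // 4
    flag[n] = 1

lut, flag = [0] * 256, [0] * 256
update_entry(lut, flag, 10, 100)   # first sample: registered directly
update_entry(lut, flag, 10, 105)   # merged: (105 + 100 + 1) // 2
```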
- the predicted image generation unit 13A is configured to initialize the LUT in units of LCUs and update the LUT in each block (target partition) in the same LCU.
- the LUT tends to increase in accuracy as the number of samples used for derivation increases. However, if samples are acquired from too wide a range, the correlation may be lost or become extremely small.
- the LCU is a range that is wider than the target partition and within which a correlation can still be assumed.
- the above-described moving image encoding device 2 and moving image decoding device 1 can be used by being mounted on various devices that perform transmission, reception, recording, and reproduction of moving images.
- the moving image may be a natural moving image captured by a camera or the like, or may be an artificial moving image (including CG and GUI) generated by a computer or the like.
- moving image encoding device 2 and moving image decoding device 1 can be used for transmission and reception of moving images.
- FIG. 30 (a) is a block diagram illustrating a configuration of a transmission device PROD_A in which the moving image encoding device 2 is mounted.
- the transmission device PROD_A includes an encoding unit PROD_A1 that obtains encoded data by encoding a moving image, a modulation unit PROD_A2 that obtains a modulated signal by modulating a carrier wave with the encoded data obtained by the encoding unit PROD_A1, and a transmission unit PROD_A3 that transmits the modulated signal obtained by the modulation unit PROD_A2.
- the moving image encoding apparatus 2 described above is used as the encoding unit PROD_A1.
- the transmission device PROD_A may further include, as supply sources of the moving image input to the encoding unit PROD_A1, a camera PROD_A4 that captures a moving image, a recording medium PROD_A5 on which the moving image is recorded, an input terminal PROD_A6 for inputting the moving image from the outside, and an image processing unit A7 that generates or processes images.
- FIG. 30A illustrates a configuration in which the transmission apparatus PROD_A includes all of these, but a part of the configuration may be omitted.
- the recording medium PROD_A5 may record a non-encoded moving image, or may record a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, a decoding unit (not shown) that decodes the encoded data read from the recording medium PROD_A5 according to the recording encoding scheme may be interposed between the recording medium PROD_A5 and the encoding unit PROD_A1.
- FIG. 30 (b) is a block diagram illustrating a configuration of the receiving device PROD_B in which the moving image decoding device 1 is mounted.
- the receiving device PROD_B includes a receiving unit PROD_B1 that receives a modulated signal, a demodulating unit PROD_B2 that obtains encoded data by demodulating the modulated signal received by the receiving unit PROD_B1, and a decoding unit PROD_B3 that obtains a moving image by decoding the encoded data obtained by the demodulating unit PROD_B2.
- the moving picture decoding apparatus 1 described above is used as the decoding unit PROD_B3.
- the receiving device PROD_B may further include, as supply destinations of the moving image output by the decoding unit PROD_B3, a display PROD_B4 that displays the moving image, a recording medium PROD_B5 for recording the moving image, and an output terminal PROD_B6 for outputting the moving image to the outside.
- FIG. 30B illustrates a configuration in which the reception device PROD_B includes all of these, but a part of the configuration may be omitted.
- the recording medium PROD_B5 may record a non-encoded moving image, or may record a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image acquired from the decoding unit PROD_B3 according to the recording encoding scheme may be interposed between the decoding unit PROD_B3 and the recording medium PROD_B5.
- the transmission medium for transmitting the modulation signal may be wireless or wired.
- the transmission mode for transmitting the modulated signal may be broadcasting (here referring to a transmission mode in which the transmission destination is not specified in advance) or communication (here referring to a transmission mode in which the transmission destination is specified in advance). That is, the transmission of the modulated signal may be realized by any of wireless broadcasting, wired broadcasting, wireless communication, and wired communication.
- a terrestrial digital broadcast broadcasting station (broadcasting equipment or the like) / receiving station (such as a television receiver) is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by wireless broadcasting.
- a broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) of cable television broadcasting is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by cable broadcasting.
- a server (workstation, etc.) / client (television receiver, personal computer, smartphone, etc.) of a VOD (Video On Demand) service or a video sharing service using the Internet is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives modulated signals by communication (usually, either a wireless or wired transmission medium is used in a LAN, and a wired transmission medium is used in a WAN).
- the personal computer includes a desktop PC, a laptop PC, and a tablet PC.
- the smartphone also includes a multi-function mobile phone terminal.
- the video sharing service client has a function of encoding a moving image captured by the camera and uploading it to the server. That is, the client of the video sharing service functions as both the transmission device PROD_A and the reception device PROD_B.
- moving picture encoding apparatus 2 and moving picture decoding apparatus 1 can be used for recording and reproduction of moving pictures.
- FIG. 31 (a) is a block diagram showing a configuration of a recording apparatus PROD_C in which the above-described moving picture encoding apparatus 2 is mounted.
- the recording device PROD_C includes an encoding unit PROD_C1 that obtains encoded data by encoding a moving image, and a writing unit PROD_C2 that writes the encoded data obtained by the encoding unit PROD_C1 to a recording medium PROD_M.
- the moving image encoding apparatus 2 described above is used as the encoding unit PROD_C1.
- the recording medium PROD_M may be (1) of a type built into the recording device PROD_C, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), (2) of a type connected to the recording device PROD_C, such as an SD memory card or USB (Universal Serial Bus) flash memory, or (3) of a type loaded into a drive device (not shown) built into the recording device PROD_C, such as a DVD (Digital Versatile Disc) or BD (Blu-ray Disc: registered trademark).
- the recording device PROD_C may further include, as supply sources of the moving image input to the encoding unit PROD_C1, a camera PROD_C3 that captures a moving image, an input terminal PROD_C4 for inputting the moving image from the outside, a receiving unit PROD_C5 for receiving the moving image, and an image processing unit C6 that generates or processes images.
- FIG. 31A illustrates a configuration in which the recording apparatus PROD_C includes all of these, but a part of the configuration may be omitted.
- the receiving unit PROD_C5 may receive a non-encoded moving image, or may receive encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, a transmission decoding unit (not shown) that decodes the encoded data encoded by the transmission encoding scheme may be interposed between the receiving unit PROD_C5 and the encoding unit PROD_C1.
- Examples of such a recording device PROD_C include a DVD recorder, a BD recorder, and an HDD (Hard Disk Drive) recorder (in this case, the input terminal PROD_C4 or the receiving unit PROD_C5 is a main supply source of moving images).
- a camcorder (in this case, the camera PROD_C3 is a main supply source of moving images), a personal computer (in this case, the receiving unit PROD_C5 or the image processing unit C6 is a main supply source of moving images), and a smartphone (in this case, the camera PROD_C3 or the receiving unit PROD_C5 is a main supply source of moving images) are also examples of such a recording device PROD_C.
- FIG. 31 (b) is a block diagram showing the configuration of a playback device PROD_D equipped with the above-described video decoding device 1.
- the playback device PROD_D includes a reading unit PROD_D1 that reads encoded data written to the recording medium PROD_M, and a decoding unit PROD_D2 that obtains a moving image by decoding the encoded data read by the reading unit PROD_D1.
- the moving picture decoding apparatus 1 described above is used as the decoding unit PROD_D2.
- the recording medium PROD_M may be (1) of a type built into the playback device PROD_D, such as an HDD or SSD, (2) of a type connected to the playback device PROD_D, such as an SD memory card or USB flash memory, or (3) of a type loaded into a drive device (not shown) built into the playback device PROD_D, such as a DVD or BD.
- the playback device PROD_D may further include, as supply destinations of the moving image output by the decoding unit PROD_D2, a display PROD_D3 that displays the moving image, an output terminal PROD_D4 for outputting the moving image to the outside, and a transmission unit PROD_D5 that transmits the moving image.
- FIG. 31B illustrates a configuration in which the playback apparatus PROD_D includes all of these, but some of them may be omitted.
- the transmission unit PROD_D5 may transmit a non-encoded moving image, or may transmit encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image with the transmission encoding scheme is preferably interposed between the decoding unit PROD_D2 and the transmission unit PROD_D5.
- Examples of such a playback device PROD_D include a DVD player, a BD player, and an HDD player (in this case, an output terminal PROD_D4 to which a television receiver or the like is connected is a main supply destination of moving images).
- a television receiver (in this case, the display PROD_D3 is a main supply destination of moving images) and a digital signage (also referred to as an electronic signboard or an electronic bulletin board; in this case, the display PROD_D3 or the transmission unit PROD_D5 is a main supply destination of moving images) are also examples of such a playback device PROD_D.
- each block of the moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 described above may be realized in hardware by a logic circuit formed on an integrated circuit (IC chip), or may be realized in software using a CPU (Central Processing Unit).
- in the latter case, each device includes a CPU that executes the instructions of a program realizing each function, a ROM (Read Only Memory) that stores the program, a RAM (Random Access Memory) into which the program is loaded, and a storage device (recording medium) such as a memory that stores the program and various data.
- the object of the present invention can also be achieved by supplying to each of the above devices a recording medium on which the program code (an executable program, an intermediate code program, or a source program) of the control program of each of the above devices, which is software realizing the above-described functions, is recorded in a computer-readable manner, and by the computer (or CPU or MPU) reading and executing the program code recorded on the recording medium.
- examples of the recording medium include tapes such as magnetic tapes and cassette tapes; magnetic disks such as floppy (registered trademark) disks / hard disks; optical disks such as CD-ROM / MO / MD / DVD / CD-R / Blu-ray Disc (registered trademark); cards such as IC cards (including memory cards) / optical cards; semiconductor memories such as mask ROM / EPROM / EEPROM / flash ROM; and logic circuits such as PLD (Programmable Logic Device) and FPGA (Field Programmable Gate Array).
- each of the above devices may be configured to be connectable to a communication network, and the program code may be supplied via the communication network.
- the communication network is not particularly limited as long as it can transmit the program code.
- for example, the Internet, an intranet, an extranet, a LAN, ISDN, a VAN, a CATV communication network, a virtual private network (Virtual Private Network), a telephone line network, a mobile communication network, a satellite communication network, and the like can be used.
- the transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type.
- for example, wired lines such as IEEE 1394, USB, power line carrier, cable TV lines, telephone lines, and ADSL (Asymmetric Digital Subscriber Line) lines can be used, as can wireless media such as infrared (IrDA, remote control), Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (High Data Rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance), mobile phone networks, satellite lines, and terrestrial digital networks.
- the present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied by electronic transmission.
- the image decoding device decodes a moving image from encoded data.
- the image decoding device is generally applicable regardless of whether the image is a moving image or a still image. The same applies to the image encoding device.
- LCU Large Coding Unit
- HEVC High Efficiency Video Coding
- a leaf CU is also called a CU (Coding Unit; a leaf of the coding tree).
- the PU and TU in the above embodiment correspond to the prediction tree (Prediction Tree) and the transform tree (Transform Tree) in HEVC, respectively.
- a partition of a PU in the above embodiment corresponds to a PU (Prediction Unit) in HEVC.
- a block obtained by dividing a TU corresponds to a TU (Transform Unit) in HEVC.
- this configuration improves the possibility that high prediction accuracy can be obtained even when the components of the pixels included in the locally decoded image vary and their linear correlation is weak, because prediction between the channels is performed according to a nonlinear correlation.
- the prediction image generation unit 13 includes a LUT derivation unit 134 that refers to the locally decoded image P′ located around the target partition and derives, as a LUT, a nonlinear correlation between the decoded luminance (Y) channel and the color difference (U, V) channels to be decoded, and an inter-channel prediction unit 351 that generates, for the target partition, a color difference prediction image PredC from the luminance decoded image PY according to the LUT derived by the LUT derivation unit 134.
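As a concrete illustration, LUT-based inter-channel prediction can be sketched as follows. The function names and the choice of averaging co-located chroma values per luma level are assumptions made for illustration; the embodiment only requires that a nonlinear luma-to-chroma correspondence be tabulated from the surrounding locally decoded image and then applied to the luma decoded image of the target partition.

```python
import numpy as np

def derive_lut(luma_nbr, chroma_nbr, n_levels=256):
    """Derive a nonlinear luma->chroma LUT from neighbouring decoded pixels.

    For each luma value observed in the neighbourhood, the LUT entry is the
    mean of the co-located chroma values (one simple aggregation rule; the
    text does not fix one). Entries with no sample are marked -1.
    """
    lut = np.full(n_levels, -1, dtype=np.int64)
    sums = np.zeros(n_levels, dtype=np.int64)
    counts = np.zeros(n_levels, dtype=np.int64)
    # Accumulate chroma sums and counts per observed luma level.
    np.add.at(sums, luma_nbr.ravel(), chroma_nbr.ravel())
    np.add.at(counts, luma_nbr.ravel(), 1)
    seen = counts > 0
    lut[seen] = sums[seen] // counts[seen]
    return lut

def predict_chroma(lut, luma_decoded):
    """Apply the derived LUT to the decoded luma image of the target partition."""
    return lut[luma_decoded]
```

Averaging co-located samples is one way to collapse a many-to-many point correspondence into a single-valued table; any monotone or non-monotone mapping can be captured, which is what makes the correlation nonlinear.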
- the image decoding device generates a prediction image for each of a plurality of channels indicating the components constituting an image, and decodes image data encoded by adding a prediction residual to the generated prediction image. For each block to be processed, the device includes: channel decoding means for decoding one or more channels among the plurality of channels; correlation deriving means for deriving, with reference to a locally decoded image that has been decoded for each of the plurality of channels and is located around the block to be processed, a nonlinear correlation between the one or more channels already decoded by the channel decoding means and another channel to be decoded; and prediction image generating means for generating, for the block to be processed, a prediction image of the other channel from the decoded image of the one or more decoded channels according to the derived correlation.
- a channel is a generalized component that constitutes an image.
- the luminance component and the color difference component correspond to channels. That is, in this example, the channel includes a luminance channel and a color difference channel.
- the color difference channel includes a U channel indicating the U component of the color difference and a V channel indicating the V component of the color difference.
- the channel may relate to the RGB color space.
- the image decoding process is performed for each channel.
- the local decoded image is a decoded image that has been decoded for the plurality of channels and is located around the block to be processed.
- Processing target block refers to various processing units in the decoding process.
- examples include a coding unit, a transform unit, and a prediction unit.
- the processing unit also includes units obtained by further subdividing the coding unit, the transform unit, and the prediction unit.
- the periphery of the processing target block includes, for example, a pixel adjacent to the target block, a block adjacent to the left side of the target block, a block adjacent to the upper side of the target block, and the like.
- the nonlinear correlation can be derived, for example, by examining the correspondence of sample points each composed of a luminance value and a color difference value.
- in the case of the YUV color space, for example, the nonlinear correlation can be derived from the correspondence between the luminance value and the color difference value of each pixel included in the locally decoded image.
- the correlation may be realized as an LUT in which a decoded channel and a decoding target channel are associated with each other.
- the correlation may be expressed by a function including a relational expression established between the decoded channel and the decoding target channel.
- the channel to be decoded in the processing target block is predicted from the channel that has been decoded in the processing target block according to the nonlinear correlation derived in this way.
- such prediction is also referred to as inter-channel prediction.
- the pixel value of the decoded image of the decoded channel is converted according to the nonlinear correlation, and the pixel value of the predicted image of the channel to be decoded is obtained.
- the pixel value is a generalized value of a component constituting the image.
- the correlation deriving unit preferably derives the nonlinear correlation by interpolation: when a pixel value of the decoded image of the decoded channel does not occur as a pixel value of any pixel included in the locally decoded image, interpolation is performed using the pixel values of pixels included in the locally decoded image whose values fall within a predetermined range of that pixel value.
- the pixel value of the image is the value of any component that forms the image.
- in the example of luminance values, "a pixel value of the decoded image of the decoded channel that does not exist as a corresponding pixel value of a pixel included in the locally decoded image" means a pixel value of the prediction-source decoded luminance channel that does not appear as the luminance value of any pixel included in the locally decoded image.
- the nonlinear correlation may be derived in advance, or may be derived at the time when such a pixel value is found not to exist in the locally decoded image.
- a correlation can be obtained by linear interpolation from the preceding and succeeding samples near the value. For example, for sample points each consisting of a luminance value and a color difference value, the correlation can be derived by linearly interpolating between adjacent points. As another example, the nonlinear correlation can be derived by approximating the sample points by cubic interpolation.
- accordingly, even for a value of the decoded channel that does not appear as a pixel value of any pixel included in the locally decoded image, the value of the decoding target channel can be predicted with high accuracy.
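The linear interpolation of missing sample points described above can be sketched as follows. Using `np.interp`, values outside the sampled range are clamped to the nearest sample, which is one plausible handling the text leaves open; entries marked -1 denote luma values never observed in the neighbourhood.

```python
import numpy as np

def fill_lut_by_interpolation(lut):
    """Fill LUT entries for luma values that never occur in the neighbourhood.

    Missing entries (marked -1) are filled by linear interpolation between
    the nearest preceding and following sample points; entries outside the
    sampled range are clamped to the nearest sample (np.interp's default).
    """
    lut = np.asarray(lut)
    xs = np.flatnonzero(lut >= 0)          # luma values that have a sample
    if xs.size == 0:
        return lut.copy()                  # nothing to interpolate from
    all_x = np.arange(lut.size)
    filled = np.interp(all_x, xs, lut[xs])
    return np.round(filled).astype(lut.dtype)
```

A cubic (spline) fit over the same sample points, as mentioned above, would only change the interpolation kernel; the table-filling structure stays the same.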
- the correlation deriving unit derives a relationship between a plurality of decoded channels and a decoding target channel as a correlation.
- the V channel to be decoded can be predicted from the decoded luminance channel and U channel by the above configuration.
- since the prediction between channels is performed using, as the correlation, the relationship between a plurality of decoded channels and the channel to be decoded, the prediction accuracy can be improved.
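One hypothetical way to realize prediction of the V channel from the two decoded channels (Y and U) is a table keyed by (Y, U) value pairs; the aggregation by averaging and the fallback value for unseen pairs are illustrative assumptions, not fixed by the text.

```python
import numpy as np

def derive_pair_lut(y_nbr, u_nbr, v_nbr):
    """Derive a correlation from two decoded channels (Y, U) to the V channel.

    Keys are (Y, U) value pairs observed in the neighbourhood; each maps to
    the mean of the co-located V values.
    """
    table = {}
    for y, u, v in zip(y_nbr.ravel(), u_nbr.ravel(), v_nbr.ravel()):
        table.setdefault((int(y), int(u)), []).append(int(v))
    return {k: sum(vs) // len(vs) for k, vs in table.items()}

def predict_v(pair_lut, y_dec, u_dec, default=128):
    """Predict V for the target partition; unseen (Y, U) pairs fall back to
    a default value (mid-grey here, as an assumption)."""
    out = np.empty_like(y_dec)
    for idx in np.ndindex(y_dec.shape):
        out[idx] = pair_lut.get((int(y_dec[idx]), int(u_dec[idx])), default)
    return out
```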
- the correlation deriving unit derives a correlation between a plurality of pixel values included in a locally decoded image of a decoded channel and a channel to be decoded.
- the above configuration derives a correlation between a plurality of luminance values included in the locally decoded image of the decoded luminance channel and the color difference to be decoded.
- the plurality of luminance values are luminance values within a predetermined range.
- it is preferable to include: processing information acquisition means for acquiring channel decoding processing order information indicating the order in which the plurality of channels are to be decoded, and prediction source channel information specifying from which of the decoded channels the channel to be decoded is to be predicted; and prediction control means for performing control so that the plurality of channels are decoded in the order indicated by the channel decoding processing order information and the channel to be decoded is predicted from the decoded channel specified in the prediction source channel information.
- inter-channel prediction can be controlled based on designation of channel decoding process order information and prediction source channel information.
- the channel decoding processing order information and the prediction source channel information are included in encoded data including encoded image data, for example. Therefore, for example, it is possible to cope with an image encoding device that encodes channel decoding process order information and prediction source channel information into encoded data and transmits the encoded data.
- preferably, the processing information acquisition means acquires the channel decoding processing order information and the prediction source channel information encoded in a predetermined processing unit of the decoding process, and the prediction control means performs the control in accordance with the channel decoding processing order information and the prediction source channel information so acquired.
- control can be changed according to the processing unit.
- for each processing unit, it is possible to set the actual decoding order and the prediction source channel.
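The control described above, decoding channels in the signalled order and predicting each target channel from its signalled source, can be sketched as a driver loop. Here `decode_channel` and `predict_from` are placeholders standing in for the actual channel decoding and inter-channel prediction routines; the names are illustrative.

```python
def decode_block(block, order, pred_source, decode_channel, predict_from):
    """Decode the channels of one block in the signalled order.

    order:       channel decoding processing order information, e.g. ["Y", "U", "V"]
    pred_source: prediction source channel information mapping a target
                 channel to the decoded channel it is predicted from
                 (absent/None = no inter-channel prediction for that channel)
    """
    decoded = {}
    for ch in order:
        src = pred_source.get(ch)
        if src is None:
            # Ordinary intra/inter decoding of this channel.
            decoded[ch] = decode_channel(block, ch)
        else:
            # Inter-channel prediction from an already decoded channel,
            # then reconstruction (prediction + residual) in the decoder.
            pred = predict_from(decoded[src], ch)
            decoded[ch] = decode_channel(block, ch, pred)
    return decoded
```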
- it is preferable that the correlation deriving means derives the correlation, for a processing target block included in a block group including a plurality of blocks, using the locally decoded image decoded in processed blocks included in the block group.
- the local decoded image “located around the processing target block” is the local decoded image decoded in “the processed block included in the processing target block group”.
- the more sample points there are, the higher the accuracy of prediction based on the above correlation. Therefore, if as many sample points as possible can be acquired from blocks having a spatial correlation, the prediction accuracy of inter-channel prediction can be improved.
- the image encoding device encodes the prediction residual obtained by subtracting, from the original image, the prediction image generated for each of a plurality of channels indicating the components constituting the image.
- the device includes: channel decoding means for decoding one or more channels among the plurality of channels; correlation deriving means for deriving, with reference to a locally decoded image that has been decoded for each of the plurality of channels, a nonlinear correlation between the one or more decoded channels and another channel; and prediction image generating means for generating, for the block to be processed, a prediction image of the other channel from the decoded image of the one or more decoded channels according to the derived correlation.
- the data structure of the encoded data according to the present invention is a data structure of encoded data generated by encoding the prediction residual obtained by subtracting, from the original image, the prediction image generated for each of a plurality of channels indicating the components constituting the image, and decoded by generating a prediction image for each of the plurality of channels and adding the prediction residual to the generated prediction image.
- the data structure includes: channel decoding processing order information indicating the order in which the plurality of channels are decoded for the block to be processed; and prediction source channel designation information designating whether or not, for the block to be processed, the prediction image of another channel is to be generated from the decoded image of one or more decoded channels according to a nonlinear correlation between the one or more decoded channels and the other channel to be decoded, the correlation being derived with reference to a locally decoded image that has been decoded for each of the plurality of channels and is located around the block to be processed.
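A minimal container for the two signalled fields might look as follows; the field names and the consistency check (a channel may only be predicted from a channel decoded earlier in the signalled order) are illustrative assumptions, not part of the claimed syntax.

```python
from dataclasses import dataclass

@dataclass
class BlockChannelSyntax:
    """Hypothetical per-block syntax holding the two signalled fields."""
    channel_order: list  # channel decoding processing order, e.g. ["Y", "U", "V"]
    pred_source: dict    # target channel -> decoded source channel, or None

    def is_consistent(self):
        """A channel may only be predicted from a channel decoded before it."""
        pos = {ch: i for i, ch in enumerate(self.channel_order)}
        return all(
            src is None or pos[src] < pos[tgt]
            for tgt, src in self.pred_source.items()
        )
```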
- the same effects as those of the image decoding device according to the present invention can be obtained.
- the image encoding device may include, in the data structure of the encoded data, channel decoding processing order information indicating the order in which the plurality of channels are to be decoded and prediction source channel information specifying from which of the decoded channels the channel to be decoded is to be predicted.
- the image encoding device may encode the information in, for example, side information.
- the present invention can be suitably applied to a decoding device that decodes encoded data and an encoding device that generates encoded data. Further, the present invention can be suitably applied to the data structure of encoded data generated by the encoding device and referenced by the decoding device.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
An image decoding device includes: a look-up table derivation unit (134) that refers to a locally decoded image (P′) located around a target partition and derives, as a look-up table, a nonlinear correlation between a decoded luminance (Y) channel and the color difference (U, V) channels to be decoded; and an inter-channel prediction unit (351) that, for the target partition, generates a color difference prediction image (PredC) from a luminance decoded image (PY) in accordance with the look-up table (LUT).
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2010-279454 | 2010-12-15 | ||
| JP2010279454 | 2010-12-15 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2012081636A1 true WO2012081636A1 (fr) | 2012-06-21 |
Family
ID=46244732
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2011/078953 Ceased WO2012081636A1 (fr) | 2010-12-15 | 2011-12-14 | Dispositif de décodage d'image, dispositif de codage d'image et structure de données de données codées |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2012081636A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2018107692A (ja) * | 2016-12-27 | 2018-07-05 | Kddi株式会社 | 動画像復号装置、動画像復号方法、動画像符号化装置、動画像符号化方法及びコンピュータ可読記録媒体 |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH03182171A (ja) * | 1989-12-11 | 1991-08-08 | Dainippon Printing Co Ltd | カラー画像情報の符号化及び再生方法 |
| JP2009534876A (ja) * | 2006-03-23 | 2009-09-24 | サムスン エレクトロニクス カンパニー リミテッド | 画像の符号化方法及び装置、復号化方法及び装置 |
- 2011
- 2011-12-14 WO PCT/JP2011/078953 patent/WO2012081636A1/fr not_active Ceased
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH03182171A (ja) * | 1989-12-11 | 1991-08-08 | Dainippon Printing Co Ltd | カラー画像情報の符号化及び再生方法 |
| JP2009534876A (ja) * | 2006-03-23 | 2009-09-24 | サムスン エレクトロニクス カンパニー リミテッド | 画像の符号化方法及び装置、復号化方法及び装置 |
Non-Patent Citations (1)
| Title |
|---|
| HARUHISA KATO ET AL.: "Adaptive Inter-channel Prediction for Intra Prediction Error in H.264", THE JOURNAL OF THE INSTITUTE OF IMAGE INFORMATION AND TELEVISION ENGINEERS, THE INSTITUTE OF IMAGE INFORMATION AND TELEVISION ENGINEERS, vol. 64, no. 11, 1 November 2010 (2010-11-01), pages 1711 - 1717 * |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2018107692A (ja) * | 2016-12-27 | 2018-07-05 | Kddi株式会社 | 動画像復号装置、動画像復号方法、動画像符号化装置、動画像符号化方法及びコンピュータ可読記録媒体 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7200320B2 (ja) | 画像フィルタ装置、フィルタ方法および動画像復号装置 | |
| JP7001768B2 (ja) | 算術復号装置 | |
| US10547861B2 (en) | Image decoding device | |
| US10136161B2 (en) | DMM prediction section, image decoding device, and image coding device | |
| JP5995448B2 (ja) | 画像復号装置、および画像符号化装置 | |
| WO2014115283A1 (fr) | Dispositif de décodage d'image et dispositif de codage d'image | |
| JPWO2013105622A1 (ja) | 画像復号装置、画像符号化装置、および符号化データのデータ構造 | |
| TW202133613A (zh) | 圖像解碼裝置及圖像編碼裝置 | |
| WO2017068856A1 (fr) | Dispositif de génération d'image prédictive, dispositif de décodage d'image et dispositif de codage d'image | |
| JP2013236358A (ja) | 画像フィルタ装置、画像復号装置、画像符号化装置、およびデータ構造 | |
| WO2017195532A1 (fr) | Dispositif de décodage d'image et dispositif de codage d'image | |
| CN114902669A (zh) | 使用颜色空间转换对图像编码/解码的方法和装置及发送比特流的方法 | |
| TWI848969B (zh) | 用於視訊寫碼及處理之解塊濾波器 | |
| JP2013141094A (ja) | 画像復号装置、画像符号化装置、画像フィルタ装置、および符号化データのデータ構造 | |
| WO2012121352A1 (fr) | Dispositif de décodage de vidéo, dispositif de codage de vidéo et structure de données | |
| CN119893101A (zh) | 图像编码/解码方法和用于发送比特流的方法 | |
| WO2012090962A1 (fr) | Dispositif de décodage d'image, dispositif de codage d'image, structure de données des données codées, dispositif de décodage arithmétique et dispositif de codage arithmétique | |
| WO2012077795A1 (fr) | Dispositif de codage d'images, dispositif de décodage d'images et structure de données | |
| WO2012081636A1 (fr) | Dispositif de décodage d'image, dispositif de codage d'image et structure de données de données codées | |
| JP6162289B2 (ja) | 画像復号装置および画像復号方法 | |
| US11627341B2 (en) | Method and device for signaling information relating to slice type in picture header in image/video coding system | |
| WO2014050554A1 (fr) | Dispositif de décodage d'image et dispositif de codage d'image | |
| JP2013251827A (ja) | 画像フィルタ装置、画像復号装置、画像符号化装置、およびデータ構造 | |
| JP2023142460A (ja) | 3dデータ符号化装置および3dデータ復号装置 | |
| WO2012043676A1 (fr) | Dispositif de décodage, dispositif de codage et structure de données |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11848672 Country of ref document: EP Kind code of ref document: A1 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 11848672 Country of ref document: EP Kind code of ref document: A1 |
|
| NENP | Non-entry into the national phase |
Ref country code: JP |